← Back to hard AIs

Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →

Models · DeepSeek

Feature-frozen. The creator has frozen feature development on this model (critical fixes only).

DeepSeek-V3.2

Model family: deepseek-v3-2

Size
frontier (671.0B params)
Context
131,072 tokens
Released
2025-11-30
Openness
open-weight
License
MIT License · commercial: yes
Cost tier
mixed
Rating
4.0 — A genuinely strong MIT-licensed reasoning model — the Speciale variant took competition-math gold — but now second-best inside its own lineup and being phased off the first-party API in favor of V4.
Modalities
text
Capabilities
chat, coding, instruction-following, long-context, math, reasoning, tool-use
Access
api-third-party, local-runtime-vllm, weights-download-hf

Quick Take

DeepSeek's prior flagship: an MIT-licensed reasoning modelA model trained to "think through" problems step by step before answering, often by producing internal reasoning that's either shown or hidden from the user. Reasoning models trade speed for accuracy on hard problems — they're slower and more expensive per answer, but markedly better at math, logic, and complex analysis. OpenAI's o1 series and Mistral's Magistral are reasoning models. strong enough that its high-compute variant took competition-math gold — now overtaken by V4 but still a solid self-host choice.

Plain-English Description

DeepSeek-V3.2 was DeepSeek's leading model before the V4 family arrived. It's a 671-billion-parameter mixture-of-experts system (activating about 37 billion parameters per tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words.) that came in a few flavors: the standard V3.2, an experimental V3.2-Exp that pioneered a cheaper way to handle long context, and a high-compute V3.2-Speciale tuned purely for hard reasoning. On DeepSeek's reporting, the standard model performed comparably to GPT-5, while Speciale pushed past it and reached gold-medal scores at the 2025 International Mathematical Olympiad and International Olympiad in Informatics — a striking result for a freely downloadable model.

For most of late 2025 and early 2026, V3.2 was also DeepSeek's budget API workhorse, sitting behind the deepseek-chat and deepseek-reasoner endpoints at prices that embarrassed Western competitors. That has changed: those endpoints now route to DeepSeek-V4-Flash, and V3.2 has stepped back into the role of a strong, free, MIT-licensed model you self-host or reach through a third-party provider.

If you're choosing a DeepSeek model today, V4 is the forward-looking pick. But V3.2 remains worth knowing about — it's proven, it's permissively licensed, and for reasoning and math specifically the Speciale variant is still formidable.

Best For

  • Self-hostedRunning a model on hardware you control — your own servers, your own cloud instance, or your own laptop — rather than paying to access it through someone else's API. Self-hosting gives you full control over data and predictable costs, but requires the hardware and operational effort to run the model. Only possible with open-weight models. reasoning and math workloads where you want a proven, MIT-licensed model and don't need V4's million-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. window.
  • Teams already running V3.2 who want to understand what they have before deciding whether to migrate to V4.
  • Research and experimentation where the Speciale variant's competition-grade reasoning is the draw.
  • Cost-sensitive deployments via third-party hosts that still serve V3.2 cheaply.

Not For

  • New deployments that would be better served by DeepSeek-V4-Flash or DeepSeek-V4-Pro — V3.2 is the previous generation.
  • Workloads needing more than a 128K-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. context — that's a key reason to move up to V4.
  • Laptop or single-GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. self-hosting — at 671B parameters it needs serious hardware.
  • Anyone who can't route data to China and isn't prepared to self-host.

License — Plain-English Summary

V3.2's weightsThe numerical values inside a trained model that encode everything it has learned. A model is, functionally, a giant list of weights — tens of billions of numbers for a mid-sized model, hundreds of billions for a frontier model. "Open-weight" means those numbers are published. "Downloading the weights" means getting the actual file you'd need to run the model yourself. are MIT-licensed: full commercial use, modification, and redistribution, with the only obligation being to keep the copyright notice. The licensing is clean and unrestrictive. The usual DeepSeek caveat applies — the license governs the weights, not where your data goes, so self-hosting (or a trusted third-party host) is what keeps a privacy-sensitive deployment clean.

How It Compares

Against the V4 family that replaced it, V3.2 has a smaller context windowThe maximum amount of text the model can "see" at once — prompt plus prior conversation plus any documents you give it. Measured in tokens (which are roughly three-quarters of a word each). A 128K context window is about 96,000 words of input — roughly a 400-page book. Larger context windows let the model work with bigger documents but cost more to run. (128K versus a million) and sits a step behind on most benchmarks, though the Speciale variant's reasoning remains competitive. Against DeepSeek-R1, V3.2 is the more general-purpose model where R1 is the dedicated reasoning specialist — and V3.2-Speciale arguably narrowed the gap by folding heavy reasoning into the V3 line. Against closed models like GPT-5, DeepSeek positioned standard V3.2 as comparable and Speciale as ahead on the hardest reasoning tasks, all while being open-weightA model where the trained weights are freely downloadable — you can run it yourself without contacting the creator. Llama, Mistral, Qwen, and Gemma are open-weight. Open-weight does not mean open-source: the training data and code often stay private. The license still governs what you can do with the weights, including whether you can use them commercially. and far cheaper — the same trade as ever: capability-per-dollar in exchange for the China-jurisdiction question.

Cost

Self-hosted cost
$0.00 beyond compute
Notes
V3.2 was DeepSeek's budget API workhorse, but the first-party deepseek-chat and deepseek-reasoner endpoints have since migrated to V4 Flash. V3.2 remains freely downloadable under MIT and is still served by third-party hosts; treat it now primarily as a self-host or third-party option rather than a first-party API product.

Hardware requirements

Min VRAM
320 GB
Recommended VRAM
640 GB
Runs on laptop
No
Notes
671B parameters; multi-GPU or multi-node self-hosting only.

Comparable models

Sources