Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →
DeepSeek-V3.2
Model family: deepseek-v3-2
- llm
- open-weight
- commercial-friendly
- frontier
- reasoning
- math
- china-based
- mixture-of-experts
Quick Take
DeepSeek's prior flagship: an MIT-licensed reasoning modelA model trained to "think through" problems step by step before answering, often by producing internal reasoning that's either shown or hidden from the user. Reasoning models trade speed for accuracy on hard problems — they're slower and more expensive per answer, but markedly better at math, logic, and complex analysis. OpenAI's o1 series and Mistral's Magistral are reasoning models. strong enough that its high-compute variant took competition-math gold — now overtaken by V4 but still a solid self-host choice.
Plain-English Description
DeepSeek-V3.2 was DeepSeek's leading model before the V4 family arrived. It's a 671-billion-parameter mixture-of-experts system (activating about 37 billion parameters per tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words.) that came in a few flavors: the standard V3.2, an experimental V3.2-Exp that pioneered a cheaper way to handle long context, and a high-compute V3.2-Speciale tuned purely for hard reasoning. On DeepSeek's reporting, the standard model performed comparably to GPT-5, while Speciale pushed past it and reached gold-medal scores at the 2025 International Mathematical Olympiad and International Olympiad in Informatics — a striking result for a freely downloadable model.
For most of late 2025 and early 2026, V3.2 was also DeepSeek's budget API workhorse, sitting behind the deepseek-chat and deepseek-reasoner endpoints at prices that embarrassed Western competitors. That has changed: those endpoints now route to DeepSeek-V4-Flash, and V3.2 has stepped back into the role of a strong, free, MIT-licensed model you self-host or reach through a third-party provider.
If you're choosing a DeepSeek model today, V4 is the forward-looking pick. But V3.2 remains worth knowing about — it's proven, it's permissively licensed, and for reasoning and math specifically the Speciale variant is still formidable.
Best For
- Self-hostedRunning a model on hardware you control — your own servers, your own cloud instance, or your own laptop — rather than paying to access it through someone else's API. Self-hosting gives you full control over data and predictable costs, but requires the hardware and operational effort to run the model. Only possible with open-weight models. reasoning and math workloads where you want a proven, MIT-licensed model and don't need V4's million-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. window.
- Teams already running V3.2 who want to understand what they have before deciding whether to migrate to V4.
- Research and experimentation where the Speciale variant's competition-grade reasoning is the draw.
- Cost-sensitive deployments via third-party hosts that still serve V3.2 cheaply.
Not For
- New deployments that would be better served by DeepSeek-V4-Flash or DeepSeek-V4-Pro — V3.2 is the previous generation.
- Workloads needing more than a 128K-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. context — that's a key reason to move up to V4.
- Laptop or single-GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. self-hosting — at 671B parameters it needs serious hardware.
- Anyone who can't route data to China and isn't prepared to self-host.
License — Plain-English Summary
V3.2's weightsThe numerical values inside a trained model that encode everything it has learned. A model is, functionally, a giant list of weights — tens of billions of numbers for a mid-sized model, hundreds of billions for a frontier model. "Open-weight" means those numbers are published. "Downloading the weights" means getting the actual file you'd need to run the model yourself. are MIT-licensed: full commercial use, modification, and redistribution, with the only obligation being to keep the copyright notice. The licensing is clean and unrestrictive. The usual DeepSeek caveat applies — the license governs the weights, not where your data goes, so self-hosting (or a trusted third-party host) is what keeps a privacy-sensitive deployment clean.
How It Compares
Against the V4 family that replaced it, V3.2 has a smaller context windowThe maximum amount of text the model can "see" at once — prompt plus prior conversation plus any documents you give it. Measured in tokens (which are roughly three-quarters of a word each). A 128K context window is about 96,000 words of input — roughly a 400-page book. Larger context windows let the model work with bigger documents but cost more to run. (128K versus a million) and sits a step behind on most benchmarks, though the Speciale variant's reasoning remains competitive. Against DeepSeek-R1, V3.2 is the more general-purpose model where R1 is the dedicated reasoning specialist — and V3.2-Speciale arguably narrowed the gap by folding heavy reasoning into the V3 line. Against closed models like GPT-5, DeepSeek positioned standard V3.2 as comparable and Speciale as ahead on the hardest reasoning tasks, all while being open-weightA model where the trained weights are freely downloadable — you can run it yourself without contacting the creator. Llama, Mistral, Qwen, and Gemma are open-weight. Open-weight does not mean open-source: the training data and code often stay private. The license still governs what you can do with the weights, including whether you can use them commercially. and far cheaper — the same trade as ever: capability-per-dollar in exchange for the China-jurisdiction question.
Cost
- Self-hosted cost
- $0.00 beyond compute
- Notes
- V3.2 was DeepSeek's budget API workhorse, but the first-party deepseek-chat and deepseek-reasoner endpoints have since migrated to V4 Flash. V3.2 remains freely downloadable under MIT and is still served by third-party hosts; treat it now primarily as a self-host or third-party option rather than a first-party API product.
Hardware requirements
- Min VRAM
- 320 GB
- Recommended VRAM
- 640 GB
- Runs on laptop
- No
- Notes
- 671B parameters; multi-GPU or multi-node self-hosting only.