Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →

Models · Qwen

Qwen3-Coder-30B-A3B

Model family: qwen3-coder

Size

mid (30.5B params)

Context

262,144 tokens

Released

2025-07-30

Openness

open-weight

License

Apache License 2.0 · commercial: yes

Cost tier

mixed

Rating

4.5 ★ — The open coding workhorse: excellent agentic-coding and tool-calling for its size, repo-scale context, runs on one GPU, Apache 2.0. Still the local-coding default because no newer Qwen-Coder has shipped.

Modalities

code, text

Capabilities

coding, function-calling, instruction-following, long-context, tool-use

Access

api-third-party, local-runtime-llama-cpp, local-runtime-lm-studio, local-runtime-mlx, local-runtime-ollama, local-runtime-vllm, weights-download-hf

llm
open-weight
commercial-friendly
coding
mid-size
long-context
self-hostable
china-based
apache-2-0
mixture-of-experts

Quick Take

Qwen's open coding workhorse: a 30B mixture-of-experts model tuned for agentic coding and tool-calling, with repo-scale context, that runs on a single GPU — under Apache 2.0.

Plain-English Description

Qwen3-Coder-30B-A3B is the coding specialist of the open Qwen lineup. It's a mixture-of-experts model — 30.5 billion parameters total, but only about 3.3 billion active per token — which is why it punches well above its memory footprint. The design goal is agentic coding: not just autocompleting a function, but operating inside coding tools (it ships with a function-call format built for platforms like Qwen Code and Cline), reading large codebases, and executing multi-step tasks.

Two practical strengths matter for business use. First, context: it handles 256,000 tokens natively and can stretch to about a million with a scaling technique called YaRN, which means it can hold an entire repository in working memory rather than squinting at one file at a time. Second, it runs locally — with only 3.3B active parameters it's a long-standing favorite for coding on a single consumer GPU or an Apple Silicon Mac via the MLX runtime.

It's a focused tool, not a general chatbot — it runs in a direct "non-thinking" mode without the step-by-step reasoning traces some models expose. And because no newer Qwen-Coder has shipped (there's no 3.6 or 3.7 Coder yet), this remains the go-to open coding model from Qwen in 2026.

Best For

Self-hosted coding assistants and agents wired into tools like Cline, Qwen Code, or your own IDE integration.
Repository-scale tasks — refactors, codebase Q&A, multi-file changes — that need the long context.
Local development on a single GPU or a Mac, with no per-token API costs and no code leaving your machine.
Teams that want an Apache 2.0 coding model they can fine-tune on their own codebase.

Not For

General chat, reasoning, or writing — it's tuned for code; use a generalist like Qwen3.6-27B for mixed work.
Tasks that benefit from visible chain-of-thought reasoning — this model runs in direct, non-thinking mode only.
Multimodal needs — it's text and code, no image or video input.
Anyone wanting the absolute strongest coding scores regardless of footprint — the larger flagships and the dense Qwen3.6-27B can edge it on some agentic-coding suites.

License — Plain-English Summary

Apache 2.0 — unrestricted commercial use, modification, fine-tuning, and redistribution, no royalties, no carve-outs. For a coding model that's especially valuable: you can fine-tune it on your proprietary codebase and ship the result in a commercial product with no licensing entanglements. Keep the notices, flag significant changes if you redistribute, and that's the whole obligation. Self-hosted, it also keeps your source code entirely in-house.

How It Compares

Against the dense Qwen3.6-27B, the Coder is more specialized for tool-driven coding workflows while the 27B is the stronger all-rounder of similar size — many teams now reach for the 27B for general agentic coding and keep the Coder for tooling-heavy pipelines. Against Qwen3-30B-A3B, the general MoE of the same size, the Coder trades broad capability for sharper code and tool-calling behavior. Against closed coding options, Qwen3-Coder's pitch is ownership: comparable everyday coding help that you self-host for free, versus renting a closed model by the token.

Cost

Self-hosted cost: $0.00 beyond compute
Notes: Free to self-host under Apache 2.0. 256K native context extends to ~1M via YaRN scaling for repository-scale work. Third-party hosts also serve it.

Hardware requirements

Min VRAM: 18 GB
Recommended VRAM: 24 GB
Runs on laptop: Yes
Notes: With only 3.3B active parameters it runs efficiently on a single consumer GPU; a long-time favorite for local coding on Apple Silicon via MLX.