Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →

Models · Qwen

Qwen3.6-27B

Model family: qwen3-6

Size

mid (27.0B params)

Context

262,144 tokens

Released

2026-04-21

Openness

open-weight

License

Apache License 2.0 · commercial: yes

Cost tier

mixed

Rating

4.5 ★ — The standout accessibility story of the whole lineup: near-frontier agentic coding on a single consumer GPU, under clean Apache 2.0. For self-hosters this is arguably the most practically useful model in the catalog.

Modalities

text

Capabilities

chat, coding, function-calling, instruction-following, long-context, reasoning, tool-use

Access

api-third-party, local-runtime-llama-cpp, local-runtime-lm-studio, local-runtime-mlx, local-runtime-ollama, local-runtime-vllm, weights-download-hf

llm
open-weight
commercial-friendly
mid-size
coding
long-context
self-hostable
on-device
china-based
apache-2-0

Quick Take

The best open model you can actually run yourself: a dense 27B that beats Qwen's own 397B flagship on agentic coding while fitting on a single consumer GPU, under Apache 2.0.

Plain-English Description

Qwen3.6-27B, released in April 2026, is the model to point most self-hosters at. It's a "dense" model — meaning every one of its 27 billion parameters is used on every token, unlike the mixture-of-experts designs that switch on only a slice. Dense models are simpler to run predictably, and at 27B this one is small enough to fit on a single 24GB consumer graphics card after quantization (about 16.8GB), or a high-end laptop.

The surprising part is how good it is for its size. On several agentic-coding benchmarks it actually edges out Qwen's own 397-billion-parameter flagship — for example, around 77.2 on SWE-bench Verified versus the bigger model's 76.2. That's the payoff of a focused, well-trained dense model: you get near-frontier coding capability without needing a server full of GPUs. For a business that wants a capable coding or agent model running entirely in-house, on hardware it already owns, this is close to ideal.

Like the rest of the open Qwen tier, it's Apache 2.0, so there are no licensing strings on commercial use, modification, or fine-tuning. The main thing to know is that it's a focused workhorse, not a do-everything frontier model — for the absolute top scores you'd reach for the closed Max flagship or the much larger 397B.

Best For

Self-hosted coding and agentic workflows where you want strong capability on hardware you control.
Privacy-sensitive teams that need everything to stay in-house on a single GPU — no API, no data leaving the building.
Local development, on-device assistants, and edge deployments where a 27B dense model is the sweet spot.
Cost-conscious teams avoiding per-token API fees entirely.
Fine-tuning on a budget — small enough to adapt without a cluster.

Not For

Maximum capability regardless of cost — the closed Qwen3.7-Max and the open Qwen3.5-397B-A17B both go higher.
Multimodal work — this is a text model; for image or video understanding use the 397B or a dedicated vision model.
Phone-class or very low-memory devices — for those, drop to the smaller Qwen3 dense sizes (8B, 4B, and down).
Teams that specifically want a mixture-of-experts model for throughput reasons.

License — Plain-English Summary

Apache 2.0, the same clean, permissive license as the rest of the open Qwen tier: commercial use, modification, fine-tuning, and redistribution all allowed, no royalties, no user-count carve-out. Keep the notices, flag significant changes if you redistribute, and you're done. Because it's so easy to self-host, this is also the Qwen model with the cleanest privacy story — run it on your own GPU and no data ever leaves your environment.

How It Compares

Against Qwen3.5-397B-A17B, the 27B is far easier to run (one GPU versus eight) and actually wins on some coding suites, while giving up the bigger model's multimodality and broad-knowledge edge. Against Qwen3-Coder-30B-A3B, the dedicated coder, the 27B is a stronger generalist of similar size — pick the Coder for pure code-tooling workflows and the 27B when you want one capable local model for mixed work. Against other "best local model" picks like the mid-size Llama and DeepSeek-R1 distills, Qwen3.6-27B currently sets the bar for near-frontier capability on a single consumer GPU.

Cost

Self-hosted cost: $0.00 beyond compute
Notes: Free to self-host under Apache 2.0; this is the model's whole pitch. Some third-party hosts also serve it for per-token pricing. Context window here follows the current Qwen generation's 256K-class standard.

Hardware requirements

Min VRAM: 18 GB
Recommended VRAM: 24 GB
Runs on laptop: Yes
Notes: Fits in about 16.8GB at Q4_K_M quantization — a single 24GB consumer GPU (e.g. an RTX-class card) runs it comfortably, and high-end laptops can manage it. This is the easiest near-frontier model to run on your own hardware.