Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →
Hermes 3 3B
fine-tune derivative of Llama 3.2 3B by Nous Research
Nous Research's post-training of Llama 3.2 3B into Hermes 3 — steerable instruction following in an edge-deployable size.
- llm
- open-weight
- small
- on-device
- fine-tune
- us-based
- llama-derivative
Quick Take
The smallest Hermes: a 3B fine-tuneA model that has been further trained on additional data to specialize it for a particular task, domain, or style. Fine-tuning a general model on medical literature produces a medical specialist; fine-tuning on your company's support tickets produces a support assistant that sounds like your team. Fine-tunes are much cheaper to create than training a model from scratch. of Llama 3.2 3B for on-deviceRunning a model directly on a consumer device — a laptop, a phone, a smart speaker — rather than in a data center. On-device inference keeps data private by never leaving the device, and works offline. Small models (under ~10B parameters, often quantized) can run on-device; larger models cannot yet. and edge use, with Hermes's steerable instruction following.
Plain-English Description
Hermes 3 3B brings the Hermes tuning — steerable, low-refusal, system-prompt-responsive instruction following — to an edge-deployable 3-billion-parameter size built on Llama 3.2 3B. It's the model to reach for when you want Hermes behavior on a phone, a laptop, or constrained hardware.
At 3B it's a small model: useful for on-deviceRunning a model directly on a consumer device — a laptop, a phone, a smart speaker — rather than in a data center. On-device inference keeps data private by never leaving the device, and works offline. Small models (under ~10B parameters, often quantized) can run on-device; larger models cannot yet. assistants, classification, and simple agentic tasks, but not for hard reasoning. Note it's built on Llama 3.2 (not 3.1), so its license is the Llama 3.2 Community License.
License details below.
Best For
- On-deviceRunning a model directly on a consumer device — a laptop, a phone, a smart speaker — rather than in a data center. On-device inference keeps data private by never leaving the device, and works offline. Small models (under ~10B parameters, often quantized) can run on-device; larger models cannot yet. and edge assistants wanting Hermes's steerability in a tiny footprint.
- Simple instruction-following, classification, and routing at low cost.
- Privacy-first local deployments.
Not For
- Hard reasoning or complex tasks — step up to a larger Hermes.
- Products near 700M MAU (Llama carve-out).
- MultimodalA model that can handle more than one type of input or output — typically text plus images, sometimes plus audio or video. "GPT-4 Vision" and "Llama 3.2 11B Vision" are multimodal models that accept both text and images. A text-only model is called "unimodal" but nobody uses that term; text-only is the assumed default. tasks — text only.
License — Plain-English Summary
Two layers. Nous's open weightsThe numerical values inside a trained model that encode everything it has learned. A model is, functionally, a giant list of weights — tens of billions of numbers for a mid-sized model, hundreds of billions for a frontier model. "Open-weight" means those numbers are published. "Downloading the weights" means getting the actual file you'd need to run the model yourself. sit on Meta's Llama 3.2 3B, so the Llama 3.2 Community License governs (note: 3.2, not 3.1) — commercial use with "Built with Llama" attribution and the 700M-MAU carve-out. Confirm you're reading the Llama 3.2 terms specifically.
How It Compares
Against the existing Hermes 3 — Llama 3.1 8B, the 3B is smaller and lighter — the 8B is more capable where hardware allows. Against its base Llama 3.2 3B, it's Nous's steerable tuning. Against DeepHermes 3 8B, that DeepHermes variant adds toggleable reasoning at a larger size.
Cost
- Self-hosted cost
- $0.00 beyond compute
- Notes
- Free to self-host; the base model's license governs commercial use (see License).
Comparable models
Commercial-use conditions
Nous releases the Hermes weights openly, but the base is Meta's Llama 3.2, so Meta's Llama 3.2 Community License governs the model — including the clause requiring a separate Meta license if your product exceeds 700 million monthly active users.