Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →
Hermes 3 70B
fine-tune derivative of Llama 3.1 70B by Nous Research
Nous Research's post-training of Llama 3.1 70B into Hermes 3 — agentic capability, structured output, steerable instruction following.
- llm
- open-weight
- large
- agentic
- self-hostable
- fine-tune
- us-based
- llama-derivative
Quick Take
The 70B Hermes 3: a steerable, agentic fine-tuneA model that has been further trained on additional data to specialize it for a particular task, domain, or style. Fine-tuning a general model on medical literature produces a medical specialist; fine-tuning on your company's support tickets produces a support assistant that sounds like your team. Fine-tunes are much cheaper to create than training a model from scratch. of Llama 3.1 70B from the 2024 generation, now succeeded by Hermes 4 70B.
Plain-English Description
Hermes 3 70B is the mid-large model of the Hermes 3 line — the same agentic, structured-output, steerable tuning as the 405B, at a size deployable on a single multi-GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. node. It was a popular open generalist in its generation.
Hermes 4 70B supersedes it with hybrid reasoning, so new projects should generally start there; this entry supports existing deployments and rounds out the Hermes 3 family in the catalog.
License is inherited from Llama (see below).
Best For
- Existing Hermes 3 70B deployments.
- Agentic and steerable workloads on a single multi-GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. node where the 3 line is already in use.
- Lineage reference between the 8B and 405B Hermes 3 sizes.
Not For
- New deployments — start on Hermes 4 70B.
- Laptop/single-GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. — use Hermes 3 — Llama 3.1 8B.
- Products near 700M MAU (Llama carve-out).
- MultimodalA model that can handle more than one type of input or output — typically text plus images, sometimes plus audio or video. "GPT-4 Vision" and "Llama 3.2 11B Vision" are multimodal models that accept both text and images. A text-only model is called "unimodal" but nobody uses that term; text-only is the assumed default. tasks — text only.
License — Plain-English Summary
Two layers — Nous's open Hermes 3 weightsThe numerical values inside a trained model that encode everything it has learned. A model is, functionally, a giant list of weights — tens of billions of numbers for a mid-sized model, hundreds of billions for a frontier model. "Open-weight" means those numbers are published. "Downloading the weights" means getting the actual file you'd need to run the model yourself. over Meta's Llama 3.1 70B, governed by the Llama 3.1 Community License (commercial use with "Built with Llama" attribution; 700M-MAU carve-out). Standard Llama-based Hermes terms.
How It Compares
Against Hermes 4 70B, this is the older 70B — Hermes 4 is the upgrade. Against Hermes 3 405B, it's the lighter same-generation sibling. Against its base Llama 3.1 70B, it's Nous's steerable tuning.
Cost
- Self-hosted cost
- $0.00 beyond compute
- Notes
- Free to self-host; the base model's license governs commercial use (see License).
Comparable models
Commercial-use conditions
Nous releases the Hermes weights openly, but the base is Meta's Llama 3.1, so Meta's Llama 3.1 Community License governs the model — including the clause requiring a separate Meta license if your product exceeds 700 million monthly active users.