Hermes 4.3 36B — full entry
Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →
Nous Research
4.0 ★ — Among the most respected names in open-source fine-tuning, with real research output (YaRN, DeMo, the Hermes family) and $65M in funding behind them. The half-point ding is that their work is all derivative — valuable, but dependent on upstream base models for foundational capability.
- derivative-author
- us-based
- ai-native-company
- open-source
- fine-tuning
Quick Take
A respected open-sourceA stricter standard than open-weight: the weights, the training code, and the training data are all released publicly. Very few large language models meet the full open-source bar — most "open" models in the AI world are actually open-weight. When in doubt, check the license file and the creator's documentation. AI research lab that specializes in high-quality fine-tunes of other people's foundation models — most notably the Hermes series built on Llama.
Who They Are
Nous Research is an open-sourceA stricter standard than open-weight: the weights, the training code, and the training data are all released publicly. Very few large language models meet the full open-source bar — most "open" models in the AI world are actually open-weight. When in doubt, check the license file and the creator's documentation. AI lab founded in 2023 by Jeffrey Quesnelle, Karan Malhotra, Teknium, and Shivani Mitra, headquartered in New York. Unlike Meta, Google, or OpenAI, they don't train foundation models from scratch. What they do is take existing open-weightA model where the trained weights are freely downloadable — you can run it yourself without contacting the creator. Llama, Mistral, Qwen, and Gemma are open-weight. Open-weight does not mean open-source: the training data and code often stay private. The license still governs what you can do with the weights, including whether you can use them commercially. foundations — historically the Llama family, and more recently Mistral and ByteDance's Seed models too — and fine-tuneA model that has been further trained on additional data to specialize it for a particular task, domain, or style. Fine-tuning a general model on medical literature produces a medical specialist; fine-tuning on your company's support tickets produces a support assistant that sounds like your team. Fine-tunes are much cheaper to create than training a model from scratch. them into models that often outperform the original creator's own instruction-tuned versions on specific capabilities. Their Hermes series — now on version 4, with a recent 4.3 release built on a non-Llama base — is their flagship line, downloaded over 33 million times.
The lab has real financial and research credibility behind it. They've raised $65 million in funding, including a $50M Series A led by Paradigm in April 2025, with participation from Together AI, Distributed Global, North Island Ventures, and others. Their published research isn't limited to fine-tuning recipes — they've contributed the YaRN context-extension method (used by Meta and DeepSeek among others), the DeMo optimization paper co-authored with OpenAI's Diederik Kingma, and several technical reports on their Hermes training process. This isn't a hobbyist Discord server releasing weekend projects; it's a funded, researched, and published operation.
Their specific niche is post-trainingAny training that happens after pretraining to make a base model useful for real tasks. Includes instruction tuning, chat tuning, and alignment work. Post-training is dramatically cheaper than pretraining — thousands to low millions rather than tens of millions. Most of what distinguishes GPT-4 from Llama 3.1 as a product, rather than as a base capability, is post-training. — the process of taking a pretrained model and tuning it for specific behaviors like instruction-following, function calling, roleplay, and structured output. Meta, Mistral, and other foundation model makers release their own instruction-tuned variants, but Nous's Hermes versions often trade blows with, or beat, the originals on specific tasks. For developers building on open-weight Llama models, Hermes is frequently the starting point instead of Meta's own Instruct release.
Model Philosophy
Nous leans hard into the "user steerability" direction of open-sourceA stricter standard than open-weight: the weights, the training code, and the training data are all released publicly. Very few large language models meet the full open-source bar — most "open" models in the AI world are actually open-weight. When in doubt, check the license file and the creator's documentation. AI. Their public positioning, visible in their model cards and technical reports, emphasizes that end users should have meaningful control over the models they run — guiding rules, roles, stylistic choices, and system-level behavior. In practice, this means Hermes models tend to be more willing to adopt strong personas, less heavy-handed with refusals, and more responsive to detailed system prompts than Meta's own Instruct versions.
They've also been early and active on decentralized training — their Psyche Network is an attempt to coordinate distributed GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. compute for model training across contributor hardware, using their DisTrO technology to reduce inter-GPU communication overhead. That bet has started to pay off: Hermes 4.3 (December 2025) was the first Hermes model trained on Psyche rather than a centralized cluster — a notable milestone, even if decentralized training at scale is still proving itself.
Pseudonymity is part of the culture. "Teknium," the Head of Post-TrainingAny training that happens after pretraining to make a base model useful for real tasks. Includes instruction tuning, chat tuning, and alignment work. Post-training is dramatically cheaper than pretraining — thousands to low millions rather than tens of millions. Most of what distinguishes GPT-4 from Llama 3.1 as a product, rather than as a base capability, is post-training., is publicly known only by that handle. That's worth flagging for businesses evaluating where their AI stack comes from — not as a red flag (the research is published, the funding is documented, the models are widely adopted and verified) but as a cultural fact that distinguishes Nous from a traditional corporate AI lab.
What To Know Before You Commit
Three practical considerations for a business considering a Nous Research model.
License inheritance matters. Nous doesn't train their own foundation models — they fine-tuneA model that has been further trained on additional data to specialize it for a particular task, domain, or style. Fine-tuning a general model on medical literature produces a medical specialist; fine-tuning on your company's support tickets produces a support assistant that sounds like your team. Fine-tunes are much cheaper to create than training a model from scratch. someone else's. That means every Nous model's license is inherited from its base. A Hermes-3 Llama fine-tune is governed by Meta's Llama Community License, not by some Nous-specific license. Before using any Nous model commercially, read the base modelA model straight out of pretraining, before any fine-tuning for chat or specific tasks. Base models predict the next token but don't follow instructions well — they'll continue your prompt rather than respond to it. Most people never use base models directly; they use the instruct-tuned or chat versions built on top. Useful mostly for researchers and people doing their own fine-tuning.'s license. This isn't a gotcha — it's how the open-sourceA stricter standard than open-weight: the weights, the training code, and the training data are all released publicly. Very few large language models meet the full open-source bar — most "open" models in the AI world are actually open-weight. When in doubt, check the license file and the creator's documentation. ecosystem works — but it's a question people sometimes skip.
Their niche is fine-tune quality, not foundation capability. A Nous fine-tune of Llama will not outperform a Llama model at tasks Llama fundamentally can't do. If Llama 3.1 8B is too small for your use case, Hermes-3 Llama 3.1 8B is also too small for your use case. Nous's value-add is in how the model responds, follows instructions, and handles edge cases — not in raw capability ceiling.
Steerability cuts both ways. Hermes models being more responsive to system prompts is genuinely useful for many applications and genuinely riskier for others. If you're deploying a consumer-facing chatbot in a regulated industry, the more tightly aligned behavior of a first-party Instruct model (Meta's own Llama-3.1-8B-Instruct, for example) may be the safer default. If you're building something where you want more control over the model's voice and behavior, Nous's approach is a feature.
Original Models
This creator has no original models in the catalog yet.
Derivatives Authored
Deephermes 3
DeepHermes 3 Mistral 24B — full entry
DeepHermes 3 8B — full entry
Hermes 3
Hermes 3 3B — full entry
Hermes 3 405B — full entry
Hermes 3 70B — full entry
Hermes 3 — Llama 3.1 8B — full entry