← Back to hard AIs

Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →

Models · Meta

Feature-frozen. The creator has frozen feature development on this model (critical fixes only).

Llama 3.1 405B Instruct

Model family: llama-3-1

Meta's Llama 3.1 405B chat modelShorthand for an instruct-tuned model specifically designed for back-and-forth conversation rather than single-shot tasks. Chat models remember earlier turns in the conversation (within the context window) and respond in a conversational register. GPT-4, Claude, and most Llama Instruct variants are chat models. In practice, "chat model" and "instruct-tuned model" often mean the same thing. — frontier-scale dense modelA model where every parameter is used for every input — the entire model runs on every token. Contrast with sparse or Mixture of Experts models, which activate only a fraction of the model per input. Dense models are simpler and more predictable; MoE models are more efficient at scale. with 128K context and strong multilingual reasoning. Practical access is through hosted APIAccessing a model by sending requests to the creator's (or a provider's) servers, typically pay-per-use. Hosted APIs handle all the operational work — scaling, hardware, uptime — in exchange for a per-token or per-request fee. Every closed-API model is hosted; many open-weight models are also available via hosted APIs from providers like Together, Fireworks, or Groq. providers; self-hosting requires a multi-GPUThe specialized chip that runs most AI models. Originally designed for 3D graphics, GPUs turned out to be excellent at the math AI requires. Nvidia dominates the AI GPU market; common datacenter models include the H100, H200, and B200. Running an AI model without a GPU is possible but painfully slow for anything but the smallest models. server cluster.

Identity

Creator
Meta
Model family
llama-3-1
Release date
2024-07-22

Technical specs

Parameter count
405B
Context window
131K tokens
Modalities
  • Text
Primary capabilities
  • Chat
  • Instruction Following
  • Long Context
  • Multilingual
  • Reasoning
  • Tool Use

License

License
Llama 3.1 Community License
Commercial use
  • Conditional

Free for commercial use unless the licensee's product has 700 million monthly active users measured at Llama 3.1 release date.

Terms
  • Modification
  • Redistribution
  • Attribution

Access

Openness
  • Open Weight
Access methods
  • Api Third Party
  • Local Runtime Vllm
  • Weights Download Direct
  • Weights Download Hf
Cost tier
  • Mixed

Full model card →