Verify critical details (pricing, licensing, availability) with the model's source before making business decisions.
Mistral Large 3 Base
Model family: mistral-large-3
Pretrained base variant of Mistral's flagship Large 3: a 675B-parameter MoE (mixture-of-experts) model under Apache 2.0, for teams that want to run their own instruction-tuning or domain adaptation. MoE is a model architecture that splits the model into many smaller specialized "expert" networks and activates only a handful per input rather than running the whole model every time. The practical effect: you get the knowledge capacity of a big model with the compute cost of a much smaller one. Mistral Large 3 and Mistral Small 4 are both MoE models.
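The "only a handful of experts per input" idea can be illustrated with a toy top-k router. This is a minimal sketch under stated assumptions, not Mistral's implementation: the expert count, the `top_k` value, the random router weights, and the scalar "experts" are all hypothetical and chosen only to keep the example small.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Toy "expert": multiplies every feature by a constant.
    # Real experts are feed-forward networks; this stand-in is an assumption.
    return lambda x: [scale * v for v in x]

def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs."""
    # One router score per expert: dot product of the input with that expert's row.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k experts; all other experts are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the selected scores into mixing weights that sum to 1.
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        out = [o + gate * v for o, v in zip(out, expert_out)]
    return out, chosen

random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # hypothetical sizes for illustration
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

x = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(x, router_weights, experts, top_k=2)
print(f"evaluated {len(active)} of {NUM_EXPERTS} experts: {active}")
```

Only the selected experts run for a given input, which is where the compute saving comes from; the full set of experts still has to exist as model weights.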
Listing Notes
This is the pretrained base (non-instruction-tuned) variant of Mistral Large 3. For direct chat, agent, or Q&A use, reach for Mistral Large 3 Instruct instead: the base model produces text continuations rather than answering questions, and does not behave like a typical assistant out of the box. A base model is a model straight out of pretraining, before any fine-tuning for chat or specific tasks. It predicts the next token but doesn't follow instructions well, so it will continue your prompt rather than respond to it. Most people never use base models directly; they use the instruct-tuned or chat versions built on top. Base models are primarily interesting to researchers and to teams planning their own instruction-tuning, reward-modeling, or domain-adaptation fine-tuning workflows. Same 675B MoE (mixture-of-experts) architecture, same 256K context (262,144 tokens, shown as 262K in the specs), same Apache 2.0 license as the Instruct variant.
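Because a base model continues text instead of answering, a common workaround is to frame the request as a document whose most natural continuation is the answer. The sketch below shows one way to do that with a few-shot Q/A layout. The helper names, the `Q:`/`A:` format, and the sample completion are hypothetical; the completion string is written by hand for illustration and is not real model output.

```python
def build_few_shot_prompt(examples, query):
    """Lay out Q/A pairs so the natural continuation is the answer to `query`."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # Leave the final answer slot open for the model to continue.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

def first_answer(completion):
    """Trim a raw continuation at the point where it starts inventing a new question."""
    return completion.split("\nQ:")[0].strip()

examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]
prompt = build_few_shot_prompt(examples, "What is the capital of Italy?")
print(prompt)

# Hand-written stand-in for a raw base-model continuation (assumption, not real output).
fake_completion = " Rome\n\nQ: What is the capital of Spain?\nA: Madrid"
print(first_answer(fake_completion))
```

The trimming step matters because a base model has no notion of "done": left alone, it tends to keep extending the pattern it was shown.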
Identity
- Creator: Mistral AI
- Model family: mistral-large-3
- Release date: 2025-12-01
Technical specs
- Parameter count: 675B
- Context window: 262K tokens
- Modalities: Image input, Text
- Primary capabilities: Long context, Multilingual
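For self-hosting, the parameter count translates directly into a lower bound on memory for the weights. The arithmetic below is a rough, weights-only estimate: the precision options are assumptions for illustration (the listing does not say which formats are published), and the figures ignore KV cache, activations, and runtime overhead.

```python
PARAMS = 675e9  # parameter count from the listing

# Bytes per parameter at common precisions (hypothetical choices, not confirmed formats).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}

def weight_footprint_gb(params, bytes_per_param):
    """Weights-only memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_footprint_gb(PARAMS, nbytes):,.0f} GB of weights")
```

In a typical MoE deployment every expert stays resident in memory even though only a few run per token, so the estimate uses the full 675B rather than the smaller active-parameter count.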
License
- License: Apache 2.0
- Commercial use: Allowed
- Terms: Modification ✓, Redistribution ✓, Attribution ✓
Access
- Openness: Open weight
- Access methods: Local runtime (vLLM), Weights download (direct), Weights download (Hugging Face)
- Cost tier: Self-hosted only
Tags
- llm
- open-weight
- commercial-friendly
- frontier
- long-context
- multimodal
- multilingual
- eu-based
- moe
- apache-licensed
- base-model