
Verify critical details (pricing, licensing, availability) with the model's source before making business decisions.

Models · Mistral AI

Mistral Embed

Model family: embeddings

Context: 8,192 tokens
Released: 2024-02-25
Openness: closed-api
License: proprietary
Cost tier: paid-api
Rating: 3.5. Solid general-purpose embeddings at competitive pricing with the EU-jurisdiction benefit that closed U.S. alternatives don't offer. Not category-leading on retrieval benchmarks (OpenAI's `text-embedding-3-large` and Voyage AI's specialist models generally edge it out on public MTEB evaluations), but perfectly adequate for most RAG and semantic-search workloads and the natural pick if you're already in the Mistral API ecosystem.
Modalities: text
Capabilities: classification, embeddings, multilingual, rag
Access: api-first-party, api-third-party

Quick Take

Mistral's general-purpose text embedding model: competitive with OpenAI and Cohere for standard RAG and semantic-search workloads, with the usual Mistral advantages of EU jurisdiction and aggressive pricing.

Plain-English Description

Mistral Embed (API model name mistral-embed) is the company's general-purpose text embedding model, the counterpart to the code-specialized Codestral Embed. It converts text into 1,024-dimensional vectors suitable for semantic search, retrieval-augmented generation (RAG), clustering, classification, and recommendation systems. This is one of Mistral's older models, released in February 2024, and it has been the steady workhorse of Mistral's embedding offering ever since.
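To make the semantic-search use concrete, here is a minimal sketch of ranking documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hypothetical stand-ins chosen for readability; real mistral-embed vectors have 1,024 dimensions, and a production system would typically use a vector database rather than a Python loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors; real embeddings would come from the API.
docs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "office-hours": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

ranking = rank_by_similarity(query, docs)
print(ranking[0][0])  # prints "refund-policy"
```

The same ranking step underlies RAG retrieval: embed the query, score it against stored document vectors, and pass the top results to the generation model.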

The spec sheet is straightforward: 8,192-token input context, 1,024 output dimensions, multilingual support, API-only access at $0.10 per million tokens. No frills, no Matryoshka variable-dimension support (unlike Codestral Embed), no open weights. The model's position in the Mistral lineup is essentially "the competent default for general embedding tasks": you reach for it when you need embeddings and you're already using Mistral for other things, or when you want EU-jurisdiction embeddings without paying OpenAI or Cohere.

Independent benchmark performance on MTEB (the standard embedding evaluation suite) is solid but not category-leading. OpenAI's text-embedding-3-large and Voyage AI's specialist models generally outperform Mistral Embed on public retrieval benchmarks. For most RAG and semantic-search workloads the difference is small enough that operational factors (provider relationship, pricing, jurisdiction, rate limits) dominate the selection. But if retrieval quality is the binding constraint, Mistral Embed is rarely the top-of-leaderboard pick.

Best For

  • Teams already in the Mistral API ecosystem. If you're using Mistral Small 4 or Medium 3.1 for generation, keeping embeddings in the same ecosystem reduces vendor count and simplifies billing.
  • Standard RAG pipelines where retrieval quality is adequate rather than bleeding-edge. Document search, knowledge-base retrieval, Q&A over private corpora. Mistral Embed is solidly capable at this tier.
  • Multilingual embedding workloads in European markets. The model handles major European languages cleanly and sits with a French vendor under EU jurisdiction.
  • Cost-optimized embedding at scale. $0.10/M tokens is competitive. For indexing very large corpora, the per-token cost matters.

Not For

  • Code-specific retrieval. Codestral Embed is Mistral's own specialized code embedder and substantially outperforms Mistral Embed on code tasks. If your content is code-heavy, route to Codestral Embed.
  • Teams that need state-of-the-art retrieval performance. On public MTEB benchmarks, specialist embedders from Voyage AI, Cohere, and OpenAI's latest releases typically outperform Mistral Embed. For retrieval-quality-critical workloads, those are the options to evaluate first.
  • Variable-dimension embeddings. Mistral Embed outputs 1,024 dimensions fixed. For Matryoshka-style nested dimensions that let you trade quality for storage cost, use Codestral Embed (for code) or OpenAI's embedding models (for text).
  • Open-weight requirements. Closed-API only. For open-weight general embeddings, look to community releases like Nomic Embed, BGE, or Jina Embeddings.
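For readers unfamiliar with the Matryoshka idea mentioned in the list above, the following sketch shows what "nested dimensions" usually means in practice: keep the first k components of a vector and rescale it to unit length. This only preserves quality for models trained with a Matryoshka-style objective. Mistral Embed is not described as one, so applying this to its vectors should be assumed to degrade retrieval. The function name and the toy vector are hypothetical.

```python
import math


def truncate_and_renormalize(vector, keep_dims):
    """Keep the leading keep_dims components and rescale to unit length."""
    if not 0 < keep_dims <= len(vector):
        raise ValueError("keep_dims must be between 1 and len(vector)")
    head = vector[:keep_dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("truncated vector has zero length")
    return [x / norm for x in head]


# Hypothetical 4-dimensional unit vector standing in for a full embedding.
full = [0.6, 0.0, 0.8, 0.0]
short = truncate_and_renormalize(full, 2)
print(short)  # prints [1.0, 0.0]
```

Halving the dimension count halves raw vector storage, which is the trade the list item refers to.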

License — Plain-English Summary

Proprietary closed-API model. Standard pay-per-token API access: you send text to Mistral's API and receive embedding vectors back. No downloadable weights, no self-serve self-hosting, no modification. Enterprise on-premise deployment is available on a negotiated basis.

How It Compares

  • vs. OpenAI text-embedding-3-large — OpenAI generally wins on public MTEB benchmarks for general retrieval. Mistral Embed is cheaper, has EU-jurisdiction benefits, and is easier to integrate if you're already using Mistral for generation.
  • vs. Cohere Embed v4.0 — Cohere is the multilingual embedding specialist and has deeper multilingual benchmark coverage. Mistral Embed is adequate multilingually but Cohere is stronger if multilingual retrieval quality is the priority.
  • vs. Codestral Embed — Sibling model from Mistral, code-specialized. Use Mistral Embed for text-heavy content, Codestral Embed for code-heavy content. For mixed content, either can work; the task-specific winner depends on the mix.
  • vs. Voyage AI embeddings — Voyage publishes task-specific embedders (retrieval, reranking, code, law, finance). For any given specialized task, Voyage often has a purpose-built model that outperforms Mistral Embed's general-purpose offering. For a single general-purpose embedder, Mistral is simpler.

Under the Hood

Mistral Embed outputs 1,024-dimensional float32 vectors by default. The model has been stable since its February 2024 release, with no publicly announced version updates. That is unusual for an actively marketed embedding model in 2026, given that competing providers have shipped newer embedding generations in the meantime. Whether this reflects Mistral's internal deprioritization of the general-purpose embedder in favor of specialists like Codestral Embed, or simply that the model performs well enough not to warrant a refresh, isn't publicly stated.
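The vector format translates directly into storage requirements. As a rough sketch, 1,024 float32 values at 4 bytes each come to 4,096 bytes per vector. The helper below is illustrative only and counts raw vector bytes; any real vector store adds index structures and metadata on top.

```python
BYTES_PER_FLOAT32 = 4
DIMENSIONS = 1024  # fixed output size stated for Mistral Embed


def raw_vector_bytes(num_vectors, dimensions=DIMENSIONS,
                     bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for num_vectors embeddings, excluding index overhead."""
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    return num_vectors * dimensions * bytes_per_value


per_vector = raw_vector_bytes(1)
one_million = raw_vector_bytes(1_000_000)

print(f"{per_vector} bytes per vector")                      # 4096 bytes per vector
print(f"{one_million / 1e9:.3f} GB for 1,000,000 vectors")   # 4.096 GB for 1,000,000 vectors
```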

Integration paths include Mistral's Python and TypeScript SDKs, Spring AI's Mistral integration, LangChain's MistralAIEmbeddings, LlamaIndex's Mistral connector, and OpenRouter's OpenAI-compatible embedding API. Fine-tuning is not publicly supported.
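For teams not using one of the SDKs, a direct HTTP call is also possible. The sketch below builds the request with only the Python standard library and does not send it. The endpoint URL, the request body shape, and the response layout are assumptions based on the OpenAI-style convention and should be checked against Mistral's API reference; the helper names and the placeholder key are hypothetical.

```python
import json
import os
import urllib.request

# Assumed endpoint; confirm against Mistral's API reference.
API_URL = "https://api.mistral.ai/v1/embeddings"


def build_embedding_request(texts, api_key, model="mistral-embed"):
    """Build (but do not send) a POST request for a batch of texts."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors in input order.

    Assumes a body shaped like {"data": [{"index": 0, "embedding": [...]}]}.
    """
    data = json.loads(response_body)["data"]
    ordered = sorted(data, key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


request = build_embedding_request(
    ["What is the refund policy?"],
    api_key=os.environ.get("MISTRAL_API_KEY", "placeholder-key"),
)

# Sending requires a valid key and network access:
# with urllib.request.urlopen(request) as resp:
#     vectors = parse_embeddings(resp.read())
```

Batching several texts into one `input` list keeps the number of round trips down when indexing a corpus.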

Cost

API input (per 1M tokens): $0.10
API providers: mistral, openrouter
Notes: Competitive pricing against OpenAI, Cohere, and Voyage general-purpose embedders. Outputs a 1,024-dimensional vector by default.
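As a worked example of the listed price, the sketch below estimates the one-time cost of embedding a corpus. The corpus size and average document length are hypothetical, and the price is the $0.10 per 1M tokens quoted here, which should be verified with the provider before budgeting.

```python
from decimal import Decimal

PRICE_PER_MILLION_TOKENS = Decimal("0.10")  # USD, as listed on this page


def embedding_cost_usd(total_tokens):
    """Estimated cost in USD to embed total_tokens input tokens once."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return Decimal(total_tokens) * PRICE_PER_MILLION_TOKENS / Decimal(1_000_000)


# Hypothetical corpus: 50,000 documents averaging 800 tokens each.
documents = 50_000
avg_tokens_per_document = 800
total_tokens = documents * avg_tokens_per_document  # 40,000,000 tokens

cost = embedding_cost_usd(total_tokens)
print(f"${cost:.2f}")  # prints $4.00
```

Re-indexing after a chunking change or model swap incurs the same cost again, so the figure is per full pass over the corpus.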
