Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →
Claude Opus 4.8
Model family: claude-opus-4
- llm
- closed-api
- frontier
- multimodal
- long-context
- coding
- agentic
- us-based
- proprietary
Quick Take
Anthropic's flagship: top-tier agentic coding and careful reasoning with a 1M-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. context windowThe maximum amount of text the model can "see" at once — prompt plus prior conversation plus any documents you give it. Measured in tokens (which are roughly three-quarters of a word each). A 128K context window is about 96,000 words of input — roughly a 400-page book. Larger context windows let the model work with bigger documents but cost more to run. — premium-priced, closed, and a common pick for high-stakes code and analysis.
Plain-English Description
Claude Opus 4.8, released in late May 2026, is the most capable model in Anthropic's lineup, aimed at the hardest work: autonomous coding agents, complex reasoning, and tasks where accuracy in legal, financial, or technical contexts is what you're paying for. It carries a 1-million-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. context windowThe maximum amount of text the model can "see" at once — prompt plus prior conversation plus any documents you give it. Measured in tokens (which are roughly three-quarters of a word each). A 128K context window is about 96,000 words of input — roughly a 400-page book. Larger context windows let the model work with bigger documents but cost more to run. and continues Anthropic's particular strength in coding reliability, with measurable gains over the previous flagship on agentic coding benchmarks.
It's priced as a flagship — $5 in / $25 out per million tokens — which is notable in that it matches GPT-5.5 on input and undercuts it on output, while Anthropic's aggressive prompt-caching (90% off cached input) can make long-context agents surprisingly affordable. There's also a higher-cost "fast mode" for latency-sensitive use.
The honest framing, which Anthropic itself echoes: most workloads don't need Opus. The mid-tier Sonnet handles the majority of production tasks well, and the flagship earns its premium specifically when autonomous reasoning, code quality, or high-stakes accuracy justifies it. And like all of Claude, it's closed — there's no version to download or run yourself.
Best For
- Autonomous coding agents and complex, multi-step engineering tasks.
- High-stakes reasoning where accuracy justifies the premium (legal, financial, technical analysis).
- Long-context work — large codebases or document sets in a single 1M-tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. window, made affordable by caching.
- Teams that value Claude's coding reliability and careful instruction-following.
Not For
- Everyday or high-volume work — Claude Sonnet 4.6 does most of it well at lower cost.
- Cost-minimizing budgets — it's a premium tier; route cheap tasks to Claude Haiku 4.5.
- Self-hosting or fully in-house data — Claude is closed with no self-host path; use an open-weightA model where the trained weights are freely downloadable — you can run it yourself without contacting the creator. Llama, Mistral, Qwen, and Gemma are open-weight. Open-weight does not mean open-source: the training data and code often stay private. The license still governs what you can do with the weights, including whether you can use them commercially. model.
- Buyers optimizing purely on tokenThe basic unit of text a model reads and writes. Tokens are roughly three-quarters of a word in English — so 100 tokens is about 75 words. Models don't see letters or words directly; they see tokens. Pricing is almost always quoted per million tokens, and context windows are measured in tokens rather than words. price — DeepSeek and open models go far cheaper.
License — Plain-English Summary
Proprietary and API-only. Claude Opus 4.8 is accessed through Anthropic's API (or Amazon Bedrock / Google Vertex AI) under Anthropic's commercial terms — you get commercial rights to what you build with the outputs, but no rights to the model: no weightsThe numerical values inside a trained model that encode everything it has learned. A model is, functionally, a giant list of weights — tens of billions of numbers for a mid-sized model, hundreds of billions for a frontier model. "Open-weight" means those numbers are published. "Downloading the weights" means getting the actual file you'd need to run the model yourself., no modification, no redistribution. Diligence items are Anthropic's usage policy and the data terms of whichever endpoint you use (enterprise and zero-retention options exist). There is no open or self-hostable version of any Claude model.
How It Compares
Against Claude Sonnet 4.6, Opus is more capable for the hardest tasks but several times the cost — Sonnet is the better default for most work. Against Claude Opus 4.7, 4.8 is a cost-neutral upgrade with real coding/agentic gains. Against GPT-5.5, the two trade blows at a similar flagship tier — Claude is often preferred for coding reliability and undercuts GPT-5.5 on output price, while GPT-5.5 brings a larger context windowThe maximum amount of text the model can "see" at once — prompt plus prior conversation plus any documents you give it. Measured in tokens (which are roughly three-quarters of a word each). A 128K context window is about 96,000 words of input — roughly a 400-page book. Larger context windows let the model work with bigger documents but cost more to run. and OpenAI's broader tooling. Against Google's Gemini 3.5 Flash, Gemini leads on multimodality and price; Claude on coding depth.
Cost
- API input (per 1M tokens)
- $5.00
- API output (per 1M tokens)
- $25.00
- API providers
- anthropic-api, amazon-bedrock, google-vertex-ai
- Notes
- $5.00 input / $25.00 output per million tokens (unchanged from Opus 4.7/4.6). A faster "fast mode" runs $10 / $50. Prompt caching cuts cached input ~90%; batch processing is ~50% off. 1M-token context at flat rate. No self-hosting.
Comparable models
Commercial-use conditions
Commercial use permitted through the Anthropic API (or Amazon Bedrock / Google Vertex AI) under Anthropic's terms. You buy access, not the model — no weights, no modification, no redistribution.