Claude Opus 4.8: AI price, benchmarks
Claude Opus 4.8 by Anthropic: 94.0 overall (preliminary), $20.00/M blended cost, 1M context. New 2026-05 release.
Claude Opus 4.8 is Anthropic's most capable model, released May 28, 2026. The Benchquill record lists it with a preliminary overall score of 94.0, a blended cost of $20.00 per 1M tokens, and a 1M-token context window; its full Benchquill sub-scores are pending review. On vendor-reported and public benchmarks it beats GPT-5.5 on Terminal-Bench 2.1, OSWorld-Verified (82.3%), and Online-Mind2Web (84%), and it is the first model to break 10% on the Legal Agent Bench all-pass metric. It also adds a 2.5x fast mode.
| Rank | Model | Provider | Overall | Blended cost | Context |
|---|---|---|---|---|---|
| 1 | GPT-5.5 | OpenAI | 94.6 | $23.75/M | 1.05M |
| 2 | Claude Opus 4.8 | Anthropic | 94.0 | $20.00/M | 1M |
| 3 | Claude Opus 4.7 | Anthropic | 93.8 | $20.00/M | 1M |
| 4 | Gemini 3.1 Pro Preview | Google | 92.4 | $9.50/M | 1M |
| 5 | GPT-5 | OpenAI | 91.2 | $7.81/M | 400k |
Claude Opus 4.8 is a closed-license 2026-05 release with a preliminary Benchquill overall score of 94.0, pending full review. Its blended cost is $20.00 per 1M tokens and its context window is 1M tokens.
| Metric | Score / value |
|---|---|
| Overall (Benchquill composite) | 94.0 / 100 (preliminary) |
| Coding | Pending review |
| Reasoning | Pending review |
| Math | Pending review |
| Vision / multimodal | Pending review |
| Speed (estimated) | 80 tokens/sec |
| Input price | $5.00 / 1M tokens |
| Output price | $25.00 / 1M tokens |
| Blended price | $20.00 / 1M tokens |
| Context window | 1M |
| License | Closed |
| Modalities | Text, Vision |
| Released | 2026-05 |
| Provider | Anthropic |
Preliminary estimate (June 2026): full Benchquill sub-scores are pending review, and the figures shown come from vendor and public benchmarks. Benchmarks tracked across Benchquill: SWE-Bench Verified, HumanEval, LiveBench, BFCL v3, GPQA Diamond, MMLU, MMMU, AIME 2025, MATH-500.
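The page does not say how the blended price is derived from the input and output prices. One weighting that reproduces the listed $20.00 figure from $5.00 input and $25.00 output is 1 part input to 3 parts output; the sketch below assumes that weighting, which may not be Benchquill's actual formula. The function names and the example token counts are hypothetical and chosen only for illustration.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: float = 1.0, output_weight: float = 3.0) -> float:
    """Weighted average of input and output prices, in USD per 1M tokens.

    The 1:3 default weighting is an assumption that happens to match the
    listed blended price; it is not confirmed by the source.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Cost in USD of one request, given per-1M-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


if __name__ == "__main__":
    # Listed prices for Claude Opus 4.8: $5.00 input, $25.00 output per 1M tokens.
    print(blended_price(5.00, 25.00))  # 20.0

    # Hypothetical request: 200,000 input tokens and 4,000 output tokens.
    print(request_cost(200_000, 4_000, 5.00, 25.00))
```

Because real workloads rarely match a fixed input-to-output ratio, computing the cost from actual token counts, as `request_cost` does, is usually a better estimate than the blended figure.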