Claude Opus 4.8: What Changed Since 4.7 and 4.6

What was announced#

Anthropic released Claude Opus 4.8 on May 28, 2026, positioning it as their most capable generally available model for "complex reasoning, long-horizon agentic coding, and high-autonomy work" (Anthropic announcement, Models overview). With this release, Opus 4.6 and Opus 4.7 move into the "legacy" section of Anthropic's model catalog. They remain available but are no longer the recommended default.

This post compares the three Opus 4.x releases against each other on the facts that matter for someone deciding whether to migrate: pricing, context, capabilities, benchmarks, and the API features attached to each version.

At-a-glance comparison#

All three models live at the same headline price point. The differences are elsewhere.

	Opus 4.6	Opus 4.7	Opus 4.8
Release date	Feb 5, 2026	Apr 16, 2026	May 28, 2026
Model ID	`claude-opus-4-6`	`claude-opus-4-7`	`claude-opus-4-8`
Input pricing	$5 / MTok	$5 / MTok	$5 / MTok
Output pricing	$25 / MTok	$25 / MTok	$25 / MTok
Context window	1M tokens	1M tokens	1M tokens (200k on Microsoft Foundry)
Max output (sync)	128k tokens	128k tokens	128k tokens
Max output (batch)	300k tokens (beta)	300k tokens (beta)	300k tokens (beta)
Extended thinking	Yes	No	No
Adaptive thinking	Yes	Yes	Yes
Reliable knowledge cutoff	May 2025	Jan 2026	Jan 2026
Status	Legacy	Legacy	Current default

Source: Anthropic Models overview, accessed May 28, 2026.

Pricing has not moved — but fast mode got cheaper#

The standard pricing for all three Opus 4.x models is $5 per million input tokens and $25 per million output tokens. Up to 90% prompt caching discount and 50% batch processing discount apply on Opus 4.8, as Anthropic notes in the Opus 4.8 announcement.

The change with Opus 4.8 is fast mode. Anthropic priced it at $10 input / $50 output per million tokens, which the announcement describes as "3× cheaper than previous fast mode." If you were running production workloads on Opus 4.7 fast mode, this is a meaningful operating-cost change. If you used standard mode, your bill stays the same.

Opus 4.6 carried an extended-context surcharge above 200k tokens: $10 / $37.50 per million input / output, plus a 1.1× multiplier for US-only inference (Opus 4.6 announcement). Anthropic's Opus 4.7 and 4.8 pages do not currently document an equivalent surcharge for >200k context use, so the long-context economics may have changed; verify against your invoice if you regularly push past 200k tokens.

Context window is the same — but the tokenizer changed#

All three Opus 4.x models advertise a 1M-token context window with 128k tokens maximum output in the synchronous Messages API, and 300k output tokens in the Message Batches API behind the output-300k-2026-03-24 beta header.

There is a footnote that matters for capacity planning. Per Anthropic's model docs, Opus 4.7 uses a new tokenizer versus 4.6, which means 1M tokens corresponds to roughly 555k words on 4.7 and 4.8, compared to roughly 750k words on 4.6. If you have prompt-budgeting code calibrated against character or word counts, expect it to behave differently between 4.6 and 4.7/4.8 even if the token ceiling is identical on paper.

One platform exception: on Microsoft Foundry, Opus 4.8 is capped at a 200k context window, not 1M. This is documented in a footnote on the Models overview page.

Thinking modes: the controversial change#

Opus 4.6 supported both adaptive thinking (the model decides when and how much to think) and extended thinking (developer-controlled extended reasoning). Starting with Opus 4.7, Anthropic dropped extended thinking and kept only adaptive thinking. Opus 4.8 follows the same model.

Practically this means if your code explicitly set extended thinking budgets on Opus 4.6, you cannot port that exact control to 4.7 or 4.8. The replacement is the effort parameter, with four levels — low, medium, high, max — plus an xhigh level introduced with Opus 4.7. On Opus 4.8 the effort parameter defaults to high on all surfaces including the API and Claude Code, per the model docs. You have to set it explicitly to use a different level.

What Anthropic claims got better in 4.8 versus 4.7#

Anthropic's Opus 4.8 announcement lists specific reliability gains over 4.7, not vague "smarter" claims. The notable ones:

Code review: Opus 4.8 is "around four times less likely" to let flaws in code pass undetected compared to 4.7. The post also notes that the verbose-commenting and tool-calling regressions some users reported in 4.7 are fixed.
Agentic tasks: better judgment and reliability across multi-step workflows, with more efficient tool calling (fewer steps to reach the same result).
Citation precision: improved on retrieval-heavy tasks.
Token efficiency on retrieval: improved, which means lower cost at the same input length when reading documents.

On benchmarks, Anthropic publishes:

Super-Agent benchmark: Opus 4.8 is "the only model to complete every case end-to-end," ahead of prior Opus models and GPT-5.5 at cost parity.
Legal Agent Benchmark: highest score recorded, "first model to break 10% overall on the all-pass standard."
Online-Mind2Web (computer-use / browser-agent): 84%.
OSWorld-Verified: 82.3%.
Terminal-Bench 2.1: outperforms GPT-5.5.

Independent reproductions of these numbers are not available the day of release. Treat them as vendor-reported.

What 4.7 brought over 4.6#

For completeness, the Opus 4.7 announcement cited the following gains over 4.6:

13% resolution lift on a 93-task internal coding benchmark versus Opus 4.6.
98.5% on Anthropic's visual-acuity benchmark for computer-use work, up from 4.6's 54.5%.
State-of-the-art on GDPval-AA, Anthropic's economically-valuable knowledge-work eval.
Vision input now accepts images up to 2,576 pixels on the long edge (~3.75 megapixels), "more than triple previous Claude models."
New xhigh effort level between high and max.
New /ultrareview slash command for dedicated code review sessions.
Task budgets in public beta.

4.7 also tightened instruction-following — Anthropic explicitly notes that users may need to re-tune existing prompts because the model interprets them more literally than 4.6 did.

What 4.6 brought as the starting point#

The Opus 4.6 announcement (February 5, 2026) was the first Opus 4.x release. Key facts:

1M-token context window in beta, originally limited to the Claude Platform.
128k output tokens.
GDPval-AA: outperformed GPT-5.2 by approximately 144 Elo points.
MRCR v2 (long-context retrieval): 76%, versus Sonnet 4.5's 18.5%.
BigLaw Bench: 90.2%.
New: adaptive thinking, four effort levels (low/medium/high/max), context compaction for long-running tasks, agent teams in Claude Code (research preview), Claude in PowerPoint (research preview).

New API features added with 4.8#

Three features ship alongside Opus 4.8, per the announcement:

Dynamic workflows with parallel subagents, available on enterprise plans.
Effort control exposed inside claude.ai and Cowork (previously API-only on some surfaces).
System entries mid-task in the Messages API — the ability to inject system-level instructions during an ongoing conversation rather than only at session start.

The last one is the kind of plumbing change that does not look exciting but matters for agentic workflows where the operator needs to hand off context or change behavior mid-run.

Should you migrate?#

Anthropic's own migration guide is the authoritative reference, but the pattern is straightforward:

From 4.7: Same pricing in standard mode, same context, same output cap, and concrete reliability gains in code review and agentic work. Effort defaults change to high on 4.8 — verify your code is not relying on a different implicit default. Fast-mode users see a 3× price drop.
From 4.6: You will lose extended-thinking control (replaced by the effort parameter), gain a newer knowledge cutoff (Jan 2026 versus May 2025), and pick up every improvement from both 4.7 and 4.8. Prompts may need re-tuning because 4.7 introduced stricter instruction interpretation that carries into 4.8.

For most production users on either 4.6 or 4.7, the migration is small. The only group that needs to plan carefully is anyone whose prompts explicitly request extended-thinking budgets, since that lever no longer exists.

What is not in this post#

A few things we deliberately do not claim:

We have not benchmarked Opus 4.8 ourselves on the day of release. Numbers cited are vendor-reported.
We do not know yet whether independent third-party evaluations (LMArena, SWE-bench Verified replications, Aider benchmarks) will confirm the agentic claims. Those typically arrive within two to four weeks of a frontier release.
The Opus 4.8 system card was not published as a separate PDF at the time of writing; Anthropic's release pattern suggests it will follow shortly.

If you operate AI in production, the right move is usually: pin to a dated model ID, run your own eval suite, and migrate when the numbers on your tasks justify it. Vendor announcements are a signal, not a verdict.

Sources#

Roland Hentschel

AI & Web Technology Expert

Web developer and AI enthusiast helping businesses navigate the rapidly evolving landscape of AI tools. Testing and comparing tools so you don't have to.

Claude Opus 4.8: What Changed Since 4.7 and 4.6

What was announced#

At-a-glance comparison#

Pricing has not moved — but fast mode got cheaper#

Context window is the same — but the tokenizer changed#

Thinking modes: the controversial change#

What Anthropic claims got better in 4.8 versus 4.7#

What 4.7 brought over 4.6#

What 4.6 brought as the starting point#

New API features added with 4.8#

Should you migrate?#

What is not in this post#

Sources#

Roland Hentschel

Tools Covered in This Post

Claude Guide 2026

Opus Clip Guide 2026

Tome Guide 2026

More from the Blog

4 New AI Tools: Content, AI Search, Websites, Video

3 Niche AI Tools Worth Knowing: Research, Agents, and E-Commerce

Open-Source AI Radar: 99 Rising GitHub Repos (July 2026)