
Claude Opus 4.7 Released: The Benchmarks, Pricing, and What's New

April 17, 2026 · 5 min read · By T.W. Ghost

Tags: Claude · Opus 4.7 · Anthropic · AI Models · Benchmarks · Coding · Vision · Claude Code

Release Summary

Anthropic made Claude Opus 4.7 generally available on April 16, 2026. The headline claim is "notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks." Pricing is unchanged. The model ID is claude-opus-4-7.

Available across Claude products, the Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.


Benchmarks

Representative gains over Opus 4.6:

Benchmark                          Opus 4.6    Opus 4.7                            Delta
CursorBench (coding)               58%         70%                                 +12 pts
XBOW (visual acuity / security)    54.5%       98.5%                               +44 pts
Rakuten SWE-Bench                  baseline    3x more production tasks resolved   -
Finance Agent evaluation           -           State-of-the-art                    -

The XBOW jump is the most dramatic number in the release. It measures how well the model can visually parse security-relevant content, which matters for agentic workflows that touch screenshots, PDFs, and UI automation.

The Rakuten SWE-Bench result is the one most teams will feel day-to-day. "3x more production tasks" is Anthropic's framing of how often the model fully resolves real-world bug reports and feature requests compared to 4.6.


Pricing

Unchanged from Opus 4.6:

  • Input: $5 per million tokens
  • Output: $25 per million tokens

Prompt caching discounts and batch API discounts continue to apply per the standard Anthropic pricing page.
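At those rates, per-request cost is simple arithmetic. A small sketch (the 40k-in / 4k-out request size is an illustrative assumption, not a figure from the release notes):

```python
# Cost estimate for one Opus 4.7 call at the published rates:
# $5 per million input tokens, $25 per million output tokens.
INPUT_PER_M = 5.00
OUTPUT_PER_M = 25.00

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars for a single request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A hypothetical agentic coding turn: 40k tokens in, 4k tokens out.
print(round(call_cost(40_000, 4_000), 4))  # 0.3
```

Note that prompt caching and batch discounts would reduce the input side of this figure; the sketch prices uncached, non-batch traffic only.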


New API Features

Effort levels. xhigh added. Previous levels were low, medium, high, and max. The new xhigh sits between high and max, giving developers finer-grained control over reasoning depth vs latency.

Task budgets (public beta). Developers can now set explicit caps on how many tokens Claude is allowed to spend on a task. Anthropic calls this "guiding Claude's token spend."
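The two knobs above might combine in a request body like the following. This is a hedged sketch only: the release notes name the features but not the wire format, so effort and task_budget_tokens are assumed field names, not confirmed API parameters.

```python
# Hedged sketch of a request using both new controls.
# "effort" and "task_budget_tokens" are ASSUMED field names --
# the announcement describes the features but not the exact API shape.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "effort": "xhigh",             # new level between "high" and "max"
    "task_budget_tokens": 50_000,  # hypothetical cap on total token spend
    "messages": [
        {"role": "user", "content": "Refactor the retry logic in client.py"}
    ],
}

# Sanity-check the effort value against the documented ladder.
EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"]
assert request["effort"] in EFFORT_LEVELS
```

The point of the budget cap is predictability: an agent loop that occasionally spirals into long reasoning chains gets a hard ceiling on spend rather than an open-ended bill.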

Tokenizer change. Opus 4.7 uses an updated tokenizer. The same input can map to more tokens, roughly 1.0 to 1.35 times as many depending on content type. Anthropic reports that total token usage across effort levels is still improved on their internal coding evaluations: the model uses fewer reasoning tokens to reach comparable or better answers, even though individual inputs tokenize to more units.
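For teams sizing prompts near a hard context limit, the practical move is to budget for the worst case. A quick estimate, using the 1.35 upper bound from the announcement:

```python
# Worst-case retokenization headroom: under the Opus 4.7 tokenizer,
# the same input can map to up to ~1.35x as many tokens.
RETOKENIZE_FACTOR_MAX = 1.35

def worst_case_tokens(opus_4_6_tokens: int) -> int:
    """Upper-bound estimate of the 4.7 token count for the same input."""
    return int(opus_4_6_tokens * RETOKENIZE_FACTOR_MAX)

# A 150k-token context measured under 4.6 could tokenize to ~202k under 4.7.
print(worst_case_tokens(150_000))  # 202500
```

The actual multiplier varies by content type, so measured counts from real traffic are the better guide; this bound just tells you when a prompt that barely fit before might stop fitting.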


Vision. Image inputs now accept up to 2,576 pixels on the long edge, around 3.75 megapixels. That is more than 3 times the previous ceiling.
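A helper for checking images against the new ceiling. The scale-to-fit behavior below is an assumption about how oversized inputs would be handled (a common convention); only the 2,576 px limit itself comes from the release notes.

```python
# Fit an image inside the 2,576 px long-edge ceiling, preserving aspect
# ratio. Scaling down oversized inputs is an ASSUMED convention; the
# release notes state only the pixel limit.
LONG_EDGE_MAX = 2576

def fit_to_limit(width: int, height: int) -> tuple[int, int]:
    """Return dimensions scaled so the long edge is at most LONG_EDGE_MAX."""
    long_edge = max(width, height)
    if long_edge <= LONG_EDGE_MAX:
        return width, height
    scale = LONG_EDGE_MAX / long_edge
    return round(width * scale), round(height * scale)

# A 4032x3024 phone photo scales down to 2576x1932 (~5.0 MP kept).
print(fit_to_limit(4032, 3024))
```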

Instruction following. Described as "substantially better" than 4.6 with no specific benchmark cited.


Claude Code Additions

/ultrareview slash command. A new dedicated bug-detection review mode in Claude Code. Designed to catch issues a standard review pass might miss.

Auto mode extended to Max users. Auto mode, which lets Claude handle permission prompts automatically, was previously restricted. It is now available to Max subscribers.


Safety and Cybersecurity

The alignment assessment describes Opus 4.7 as "largely well-aligned and trustworthy, though not fully ideal in its behavior." Safety profile is similar to 4.6 with improvements in honesty and prompt injection resistance, and a slight regression on harm-reduction advice.

Project Glasswing safeguards. Opus 4.7 ships with automatic detection and blocking of requests that indicate prohibited or high-risk cybersecurity uses.

Cyber Verification Program. A new program that lets legitimate security professionals access the model for authorized testing. Registration is required.


Upgrading

For the Claude Desktop app and Claude Code, the model selector now lists claude-opus-4-7. If it does not appear, update the application: the VS Code extension bundles its own Claude Code binary, separate from any CLI installed via npm, so updating one does not update the other.

For direct API use, point requests at the claude-opus-4-7 model string. Existing Opus 4.6 code paths remain valid. Opus 4.6 is not immediately deprecated.
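For most codebases the upgrade is a one-line change. A minimal sketch, assuming the model ID is read from configuration so a rollback does not require a redeploy (the environment-variable name is an illustrative choice, not an Anthropic convention):

```python
# Minimal upgrade path for direct API use: only the model string changes.
# Reading it from the environment keeps rollback to Opus 4.6 a config
# change rather than a code change.
import os

MODEL = os.environ.get("CLAUDE_MODEL", "claude-opus-4-7")

request = {
    "model": MODEL,
    "max_tokens": 4096,
    "messages": [{"role": "user", "content": "Hello"}],
}
print(request["model"])
```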


What to Watch Next

Three open questions after release day:

  • Does the 3x Rakuten SWE-Bench result hold up on other real-world benchmarks? Synthetic coding evals have diverged from real production outcomes before.
  • How does xhigh compare to max in practice? If the latency delta is small, xhigh becomes the new default for most agentic work.
  • Does the tokenizer change shift any existing prompt caching strategies? Cached blocks are recomputed if token boundaries move.

Opus 4.7 is available now. The factual case for upgrading is straightforward: same price, better on every benchmark Anthropic published, and the Claude Code tooling gets a genuine new capability in /ultrareview. Whether the real-world coding gains match the benchmark gains is the part that always takes a few weeks to settle.