- My Books: The Complete Guide to Go Programming | Hexagonal Architecture in Go
- My Tool: Hermes IDE — free, open-source AI shell wrapper for zsh/bash/fish
- Me: xGabriel.com | GitHub
READ MORE HERE -> https://dev.to/gabrielanhaia/claude-opus-47-just-dropped-i-tested-it-for-6-hours-straight-heres-what-changed-3k50
Anthropic dropped Claude Opus 4.7 today. Same price as Opus 4.6, but the numbers are hard to ignore: visual acuity jumped from 54.5% to 98.5%, image resolution tripled, coding benchmarks are up 13%, and it resolves tasks that neither Opus 4.6 nor Sonnet 4.6 could solve. Available right now across the API, Bedrock, Vertex AI, and Microsoft Foundry.
Here's everything that actually changed.
## The Numbers at a Glance
| Metric | Opus 4.6 | Opus 4.7 | Change |
|---|---|---|---|
| Visual acuity | 54.5% | 98.5% | +81% |
| Max image resolution | ~1.25 MP | ~3.75 MP | 3x |
| Document reasoning errors | baseline | — | -21% |
| Complex multi-step workflows | baseline | — | +14% |
| Tool call accuracy | baseline | — | +10-15% |
| Internal coding benchmark (93 tasks) | baseline | — | +13% |
| Finance Agent eval | — | State-of-the-art | — |
Pricing stays at $5/M input tokens and $25/M output tokens.
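For back-of-the-envelope budgeting at those rates, a quick sketch (the token counts in the example are made up for illustration, not benchmarks):

```python
# Opus 4.7 pricing, per million tokens (unchanged from Opus 4.6)
INPUT_PER_M = 5.00
OUTPUT_PER_M = 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single API call."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: a 10k-token prompt with a 2k-token response
print(round(request_cost(10_000, 2_000), 4))  # roughly $0.10
```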
## Vision: From "Kinda Works" to Production-Ready
The biggest jump in this release. Opus 4.7 processes images up to 2,576 pixels on the long edge — roughly 3.75 megapixels. That's 3x the resolution of any previous Claude model.
What this means in practice:
- Dense terminal screenshots are now readable. Small fonts, dimmed colors, all of it.
- Chemical structures and technical diagrams get parsed correctly instead of hallucinated.
- Computer-use agents can finally read real application UIs without squinting.
98.5% visual acuity compared to 54.5% on Opus 4.6. That's not a tuning improvement — it's a capability unlock for anyone building screen-reading or document-processing pipelines.
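If you want to point the new vision capability at your own screenshots, the standard Messages API image content block applies. Here's a sketch that only builds the request payload — no network call; the model ID is the one quoted in this post, and you'd pass the result to `client.messages.create(**payload)` with the official `anthropic` SDK:

```python
import base64

def build_image_request(image_bytes: bytes, question: str,
                        media_type: str = "image/png") -> dict:
    """Build a Messages API payload that attaches an image as base64.

    Returns a plain dict; feed it to client.messages.create(**payload).
    """
    data = base64.standard_b64encode(image_bytes).decode("ascii")
    return {
        "model": "claude-opus-4-7-20260416",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": media_type,
                            "data": data}},
                {"type": "text", "text": question},
            ],
        }],
    }
```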
## Coding: Solving What Previous Models Couldn't
13% improvement on Anthropic's internal 93-task coding benchmark. But the more interesting claim: Opus 4.7 resolves tasks that neither Opus 4.6 nor Sonnet 4.6 could solve. Not faster — previously impossible.
Early testers report resolving roughly 3x more production tasks on engineering benchmarks.
Specific improvements Anthropic highlights:
- Cleaner code output. Fewer unnecessary wrapper functions and over-abstractions. You ask for a function, you get a function.
- Better error recovery in agentic workflows. When the model hits a wrong path — bad file reference, unexpected schema — it self-corrects instead of doubling down.
- More creative reasoning. Better at logic, problem-framing, and finding non-obvious solutions on professional-grade tasks.
- Self-correcting during execution. The model catches its own mistakes mid-task and adjusts without human intervention.
## New: `xhigh` Effort Level

Opus 4.7 introduces a new effort parameter value: `xhigh`. It sits between `high` and `max`.
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=16000,  # must be larger than the thinking budget
    thinking={
        "type": "enabled",
        "budget_tokens": 8192,
        "effort": "xhigh",
    },
    messages=[{"role": "user", "content": "..."}],
)
```
Anthropic recommends `xhigh` as the default starting point for coding and agentic use cases. The logic: `high` sometimes under-thinks complex problems, while `max` over-spends tokens on simple ones. `xhigh` balances reasoning depth against latency.
## Task Budgets (Public Beta)
A new feature for guiding how the model allocates tokens across a complex task. If you've ever had an agent burn most of its budget on the easy setup steps and run out of gas on the hard part, this is the fix.
Still in public beta, but worth experimenting with for long-running agentic workflows.
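The beta API shape isn't documented in this post, so treat the following as a conceptual sketch of the idea — weighting a total token budget toward the hard phases of a task — rather than the actual interface:

```python
def allocate_budget(total_tokens: int, phase_weights: dict[str, float]) -> dict[str, int]:
    """Split a total token budget across task phases by weight.

    Conceptual sketch only; the real Task Budgets beta API may look
    nothing like this.
    """
    scale = total_tokens / sum(phase_weights.values())
    return {phase: int(weight * scale) for phase, weight in phase_weights.items()}

# Keep most of the budget for the hard part, not the easy setup steps
print(allocate_budget(32_000, {"setup": 1, "analysis": 3, "implementation": 4}))
```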
## Instruction Following: Way More Literal
This one needs a warning. Opus 4.7 follows instructions more literally than any previous Claude model. Anthropic explicitly recommends retuning existing prompts.
What this means: if your prompt says "always respond in JSON," Opus 4.6 might still give you a natural language preamble when it thought that was helpful. Opus 4.7 gives you JSON. Period. Every single time.
Good for production predictability. Potentially breaking for prompts that relied on the model interpreting intent charitably. Audit your system prompts before deploying.
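While migrating, it's still worth validating structured output defensively; a minimal guard (my own pattern, not Anthropic guidance):

```python
import json

def parse_strict_json(raw: str) -> dict:
    """Parse a model response that is expected to be pure JSON.

    Raises ValueError carrying the offending text, so a retry or a
    prompt fix is easy to wire up.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"Model response was not valid JSON: {raw[:80]!r}") from e

print(parse_strict_json('{"status": "ok"}'))  # {'status': 'ok'}
```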
## Memory and Multi-Session Work

Opus 4.7 makes better use of file system-based memory: it retains important information more reliably across multi-session work. If you're using Claude Code or building agents that span multiple interactions, context retention got a meaningful bump.
## Tokenizer Changes (Watch Your Bills)
The tokenizer was updated. Input tokens now increase by 1.0–1.35x depending on content. The per-token price didn't change, but the same text produces more tokens.
Check your:
- Rate limit calculations
- Context window budgets
- Cost monitoring dashboards
If you're running near the context limit, you might start hitting truncation you didn't see before.
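A quick way to stress-test existing prompts against the worst case (the 1.35x factor is the top of the range above; actual inflation depends on your content):

```python
TOKEN_INFLATION_WORST_CASE = 1.35  # upper end of the 1.0-1.35x range

def fits_after_inflation(old_token_count: int, context_limit: int) -> bool:
    """Check whether a prompt sized under the old tokenizer still fits
    if it inflates by the worst-case factor under the new one."""
    return old_token_count * TOKEN_INFLATION_WORST_CASE <= context_limit

# A 160k-token prompt that fit in a 200k window may no longer fit
print(fits_after_inflation(160_000, 200_000))  # False
```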
## Claude Code Updates
For Claude Code users, Opus 4.7 is already live. Two notable additions:
- `/ultrareview` — a dedicated deep code review command. Not linting — actual design-level review that identifies bugs and architectural issues a careful senior reviewer would catch. Pro and Max subscribers get three free ultrareviews per billing cycle.
- Auto mode for Max users — longer agentic sessions with fewer permission interruptions. Less babysitting, more shipping.
## Safety Profile
Largely unchanged from Opus 4.6:
- Low rates of deception, sycophancy, and misuse cooperation
- Improved honesty and prompt injection resistance
- Cybersecurity capabilities deliberately reduced versus Mythos Preview
- New Cyber Verification Program for legitimate security researchers who need higher-capability access
- Staged rollout approach before broader Mythos-class capabilities
Anthropic describes it as "largely well-aligned and trustworthy," with Mythos Preview still holding the crown for best-aligned model overall.
## Availability
Live today on all platforms:
| Platform | Model ID |
|---|---|
| Anthropic API | claude-opus-4-7-20260416 |
| Claude.ai | Available (web + desktop) |
| Amazon Bedrock | Available |
| Google Cloud Vertex AI | Available |
| Microsoft Foundry | Available |
Pricing: $5/M input, $25/M output. Same as Opus 4.6.
## Bottom Line
The vision upgrade alone makes this a significant release — going from 54.5% to 98.5% visual acuity opens up use cases that were genuinely blocked before. The coding improvements and stricter instruction following make it better for production. The tokenizer change means the same prompts now produce more tokens, so your bills can drift upward.
Update your model string. Test your prompts. Ship.
If you're working with AI in the terminal, check out Hermes IDE — free, open-source shell wrapper that layers AI completions, git management, and multi-project sessions on top of your existing shell. Works with Claude, Gemini, Aider, Codex, and Copilot.
For more: xGabriel.com.
