Originally published at news.skila.ai
Anthropic released Claude Opus 4.7 on April 16, 2026. The pitch is three words long: hand it off.
Hand off the refactor you've been dodging. Hand off the migration everyone punted on. Hand off the bug that took two senior engineers a full day last quarter. That is the framing Anthropic is using, and the benchmark numbers suggest it is not marketing fluff.
SWE-Bench Verified: 41.6%. CursorBench: 70%, up from 58% on Opus 4.6. Rakuten's internal SWE-Bench variant says Opus 4.7 resolves three times more production tasks than its predecessor. Box deployed it internally and measured a 56% drop in model calls and a 24% response speedup.
Same price as 4.6. $5 per million input tokens. $25 per million output tokens. No premium for the new capabilities.
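At those rates, per-request cost is simple arithmetic. A minimal sketch (the function name and example token counts are illustrative, not from any SDK):

```python
# Per-request cost at the published Opus 4.7 rates:
# $5 per million input tokens, $25 per million output tokens.
INPUT_RATE = 5.00 / 1_000_000    # USD per input token
OUTPUT_RATE = 25.00 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single call at Opus 4.7 list pricing."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 20k-token prompt producing a 4k-token completion
print(f"${request_cost(20_000, 4_000):.2f}")  # $0.20
```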
What actually changed between 4.6 and 4.7
Five months is a short gap for a flagship model. Three capability shifts stand out:
1. Coding benchmarks jumped across the board. SWE-Bench Verified is the industry's closest proxy for real software engineering work. Opus 4.7 hits 41.6% — ahead of GPT-5.4 and Gemini 3.1 Pro on the same benchmark.
2. Vision got a 3x resolution upgrade. The model now accepts images up to 2,576 pixels on the long edge — roughly 3.75 megapixels at a 16:9 aspect ratio. You can feed it a full-resolution Figma export or a QHD (2,560 × 1,440) dashboard screenshot without downsampling; a true 4K frame still needs a resize to fit under the cap.
3. File-system memory for long sessions. Opus 4.7 has improved multi-session memory tied to files. For devs running agent loops that span hours or days, the model holds context better across sessions.
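The vision limit in item 2 lends itself to a quick sanity check. A minimal sketch, assuming the cap applies to whichever edge is longer (the helper name is illustrative):

```python
LONG_EDGE_LIMIT = 2_576  # reported Opus 4.7 cap, in pixels, on the long edge

def fits_without_downsampling(width: int, height: int) -> bool:
    """True if the image's long edge is within the reported limit."""
    return max(width, height) <= LONG_EDGE_LIMIT

# A QHD dashboard screenshot fits under the cap
print(fits_without_downsampling(2_560, 1_440))  # True
```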
The benchmark numbers in context
Box ran its own evaluation after integrating Opus 4.7 into internal agent workflows:
- 56% reduction in total model calls per task
- 50% fewer tool calls per task
- 24% faster end-to-end response time
- 30% fewer AI Units consumed per completed task
Read that again. Fewer calls. Fewer tools invoked. Faster. Cheaper per finished task.
The catch: tokenizer changes
Opus 4.7 ships with an updated tokenizer. Anthropic says input token counts run 1.0 to 1.35 times those of 4.6 for the same prompt. At higher effort levels, output token counts climb as well.
What does that mean in practice? If you were spending $800 a month on Opus 4.6, your worst case on 4.7 is roughly $1,080 ($800 × 1.35), before accounting for the 30% fewer AI Units per completed task that Box measured. Net-net, teams running agent loops should see a cost drop.
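The back-of-envelope math, as a sketch. Applying the 1.35x input multiplier to the entire bill overstates the true worst case (output pricing is unchanged at low effort), and the 30% AI-Unit reduction is Box's figure, not a guarantee:

```python
def worst_case_spend(monthly_spend: float, token_multiplier: float = 1.35) -> float:
    """Pessimistic ceiling: apply the 1.35x input-token multiplier to everything."""
    return monthly_spend * token_multiplier

def net_spend(monthly_spend: float, token_multiplier: float = 1.35,
              unit_reduction: float = 0.30) -> float:
    """Fold in Box's reported 30% fewer AI Units per completed task."""
    return monthly_spend * token_multiplier * (1 - unit_reduction)

print(worst_case_spend(800))   # 1080.0 -- the $1,080 worst case
print(round(net_spend(800)))   # 756 -- below the original $800
```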
Where Opus 4.7 is available
Day-one availability:
- claude.ai and Claude Code — default model for Pro, Max, Team, and Enterprise
- Anthropic API — model ID claude-opus-4-7
- AWS Bedrock — us-east-1 and us-west-2 at launch
- Microsoft Foundry — global availability
- Google Vertex AI — publisher model, available on launch day
The agent architecture shift
Before Opus 4.7, most teams built agent loops with a cheaper reasoning model plus a more expensive model for hard steps. With Box's reported 56% drop in total model calls, running Opus 4.7 on every turn is often cheaper than the router setup because you stop paying for reasoning-model calls that never produced useful output.
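That claim can be sketched with illustrative numbers. The per-call costs, escalation rate, and turn count below are assumptions chosen for the comparison, not measured figures; only the 56% call reduction comes from Box's report:

```python
def router_cost(turns: int, cheap_per_call: float,
                escalation_rate: float, opus_per_call: float) -> float:
    """Router setup: cheap model on every turn; a fraction escalate to Opus."""
    return turns * cheap_per_call + turns * escalation_rate * opus_per_call

def opus_everywhere_cost(turns: int, opus_per_call: float,
                         call_reduction: float = 0.56) -> float:
    """Opus 4.7 on every turn, with Box's reported 56% drop in total calls."""
    return turns * (1 - call_reduction) * opus_per_call

# 100-turn agent task, $0.02 cheap calls, 40% escalation, $0.25 Opus calls
print(round(router_cost(100, 0.02, 0.40, 0.25), 2))  # 12.0
print(round(opus_everywhere_cost(100, 0.25), 2))     # 11.0
```

Under these assumptions the single-model setup comes out slightly cheaper; the break-even point shifts with the escalation rate, which is the variable worth measuring in your own loops.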
If you're building agent loops in 2026, this model changes the cost math enough to revisit your architecture assumptions.
Full article with benchmarks, cost math, and FAQ: news.skila.ai