Opus 4.7 dropped today. The headline numbers are real — 87.6% SWE-bench, +13% on coding benchmarks, triple the vision resolution. Worth upgrading.
But there are three changes in the release notes that compound quietly:
1. New tokenizer
The same input now maps to 1.0–1.35x as many tokens, depending on content type. On a typical codebase query, that's not nothing.
2. xhigh effort mode
Sits between high and max. Reasons longer per turn. More output tokens on every step.
3. /ultrareview
Spins up parallel multi-agent review. Impressive output. Expensive by design.
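To see how these three changes stack, here's a back-of-envelope cost model. The only figure sourced from the release notes is the 1.0–1.35x tokenizer range; the effort and review multipliers are illustrative guesses, not measured values.

```python
# Rough cost model for the 4.6 -> 4.7 delta. Only the tokenizer
# range (1.0-1.35x) comes from the release notes; the other two
# factors are hypothetical placeholders.

def estimate_tokens(base_tokens: int,
                    tokenizer_factor: float = 1.35,  # worst case from the notes
                    effort_factor: float = 1.5,      # guess: xhigh vs. high
                    review_factor: float = 1.0       # guess: /ultrareview fan-out
                    ) -> int:
    """Multiply out the three compounding changes."""
    return int(base_tokens * tokenizer_factor * effort_factor * review_factor)

base = 100_000  # tokens for a typical codebase query on 4.6
print(estimate_tokens(base))                     # tokenizer + xhigh: 202500
print(estimate_tokens(base, review_factor=3.0))  # plus a 3-agent review: 607500
```

The point isn't the exact numbers — it's that the factors multiply rather than add.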
None of this is a criticism — a model that reasons more deeply is worth more tokens. The issue is that the underlying pattern hasn't changed: Claude still spends the majority of its budget exploring your codebase before doing anything useful. Smarter reasoning applied to irrelevant files is still wasted tokens.
I've been working on the context side of this problem with vexp.dev — a local MCP context engine that pre-ranks relevant code before the agent starts. On 4.7 the savings compound more than on 4.6, because deeper reasoning benefits proportionally more from cleaner input.
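For anyone unfamiliar with the idea, here's a minimal sketch of context pre-ranking: score each file by lexical overlap with the query and hand the agent only the top-k. This is not vexp.dev's actual algorithm — real engines use embeddings and structural signals — just the smallest possible illustration.

```python
import re

# Sketch of context pre-ranking: keep only the files whose contents
# overlap most with the query. Toy scoring; not the real engine.

def rank_files(query: str, files: dict[str, str], top_k: int = 2) -> list[str]:
    q_terms = set(re.findall(r"\w+", query.lower()))

    def score(text: str) -> int:
        # Count distinct query terms that appear in the file.
        return len(q_terms & set(re.findall(r"\w+", text.lower())))

    ranked = sorted(files, key=lambda name: score(files[name]), reverse=True)
    return ranked[:top_k]

files = {
    "auth.py": "def login(user, password): validate session token",
    "billing.py": "def charge(card, amount): invoice receipt",
    "README.md": "project overview and setup instructions",
}
print(rank_files("fix the login session bug", files))  # auth.py ranks first
```

Even this crude filter keeps the agent from burning reasoning tokens on `billing.py` when the task is an auth bug.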
Anyone else measuring the 4.6→4.7 delta on real tasks?