Devon Torres
Opus 4.7 Uses 35% More Tokens Than 4.6. Here's What I'm Doing About It.

The new Claude Opus 4.7 tokenizer is silently eating your budget.

I ran the same prompts through both 4.6 and 4.7 last week. Identical code, identical context. 4.7 used 33-50% more tokens depending on the language mix. English text gets hit hardest — up to 47% inflation on prose-heavy prompts.

This isn't a bug. It's the new tokenizer.

the math

Same prompt, same output quality:

  • Opus 4.6: 1,000 input tokens → $0.005
  • Opus 4.7: 1,350 input tokens → $0.00675

That's a 35% effective price increase with no announcement. The per-token price didn't change ($5/$25 per million). But the same work costs more tokens.
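The arithmetic above can be sketched in a few lines. This assumes the $5-per-million input price and the 1,000 vs. 1,350 token counts quoted in the bullets; the function name is mine, not anything from the Anthropic SDK.

```python
# Effective cost increase from token inflation at a fixed per-token price.
PRICE_PER_MILLION = 5.00  # USD per million input tokens, unchanged between versions

def input_cost(tokens: int) -> float:
    """Dollar cost of an input of `tokens` tokens at the quoted price."""
    return tokens / 1_000_000 * PRICE_PER_MILLION

cost_46 = input_cost(1_000)                # $0.00500 on 4.6
cost_47 = input_cost(1_350)                # $0.00675 on 4.7
increase = (cost_47 - cost_46) / cost_46   # 0.35, i.e. 35%
print(f"${cost_46:.5f} -> ${cost_47:.5f} ({increase:.0%} more for the same work)")
```

The per-token price never changes in this calculation; only the token count does, which is why the increase never shows up on a pricing page.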

why this matters for daily users

If you're on the Max plan ($200/mo), your usage quota burns 35% faster. Multiple Reddit threads confirm this — people hitting limits in 19 minutes instead of hours.

If you're on API, your bill just went up 35% for the same work.

what I'm doing

I'm not abandoning 4.7. The reasoning improvements are real on complex tasks. But I'm being selective:

Tasks that stay on 4.6:

  • Code refactoring (code tokenization barely changed, and reasoning quality is comparable)
  • Simple completions and edits
  • Any task where the prompt is mostly code (code tokenization barely changed)

Tasks that get 4.7:

  • Multi-step debugging that requires deep reasoning chains
  • Architecture decisions where the reasoning quality improvement justifies the token premium
  • Anything where 4.6 was already struggling
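The split above is easy to encode as a routing table. A minimal sketch, assuming the model IDs used later in this post; the task labels and function name are my own invention, not anything the API defines.

```python
# Route task types to the model that earns its token cost.
MODEL_46 = "claude-opus-4-6"  # cheaper tokenizer, default
MODEL_47 = "claude-opus-4-7"  # better reasoning, ~35% more tokens

ROUTES = {
    "refactor": MODEL_46,      # tokenization of code barely changed
    "completion": MODEL_46,    # simple completions and edits
    "debugging": MODEL_47,     # multi-step reasoning chains
    "architecture": MODEL_47,  # reasoning quality justifies the premium
}

def pick_model(task_type: str) -> str:
    """Unclassified tasks fall back to the cheaper default."""
    return ROUTES.get(task_type, MODEL_46)
```

Defaulting unknown tasks to 4.6 is the key design choice: the expensive model has to be opted into, never drifted into.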

the practical setup

In Claude Code you can pin your model:

export ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6

This keeps 4.6 as default. When you need 4.7 for a specific task, use /model opus to switch temporarily.
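If you switch often, two tiny shell helpers save retyping the export. A sketch assuming the ANTHROPIC_DEFAULT_OPUS_MODEL variable from the snippet above is respected by Claude Code; the function names are mine.

```shell
# Toggle the pinned default model for new Claude Code sessions.
use46() { export ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6; }
use47() { export ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-7; }

use46  # pin the cheaper default
echo "$ANTHROPIC_DEFAULT_OPUS_MODEL"
```

Note this only affects sessions started after the export; inside a running session, /model opus is still the way to switch.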

On the API side, just specify the model ID explicitly:

model = "claude-opus-4-6"  # cheaper default
# switch to 4.7 only when the task needs the deeper reasoning
model = "claude-opus-4-7"

results after one week

My API bill dropped 28% compared to the first three days on 4.7 where I let everything default to the new model. Quality on complex tasks stayed the same because those still get 4.7.

The takeaway: 4.7 is better at reasoning but worse at token efficiency. Use both strategically instead of defaulting to the newest model.


Developer working on AI infrastructure. Previously: How I Cut My Claude API Bill 60%
