DEV Community

Fillip Kosorukov
Fillip Kosorukov

Posted on

Claude Code Is Silently Burning 10-20x Your Token Budget — Here's the Fix

If you're on Claude Max and wondering why your usage cap disappears in an hour, you're not alone. I went down this rabbit hole after burning through my entire daily quota in a single coding session.

The Problem

There are two independent bugs inflating Claude Code token usage right now:

Bug 1: Broken prompt caching in the standalone binary

The standalone Claude Code binary (installed via claude.ai/install.sh or npm install -g) ships with Anthropic's custom Bun fork. Baked into this fork is a native-layer string replacement that modifies the JSON request body after serialization but before TLS encryption. It targets a billing attribution sentinel (cch=24b72) and replaces it with a 5-character hash.

The issue: it replaces the first occurrence in the serialized body. Since messages[] comes before system[] in the JSON, if your conversation history contains the sentinel string (from discussing billing, reading CC source, or having it in your CLAUDE.md), the wrong instance gets replaced. This changes the message content on every request, breaking the cache prefix and forcing a full rebuild — roughly $0.04-0.15 per request wasted.

Bug 2: MCP connector tool schema injection

Cloud MCP connectors (Ahrefs, Supabase, Similarweb, etc.) inject their complete tool definition schemas into every API call, regardless of whether you're using them. Ahrefs alone adds 100+ tool definitions. That's tens of thousands of tokens of overhead on every single message.

The Fixes

Switch to npx to bypass the Bun binary:

# One-time alias setup
echo "alias claude='npx @anthropic-ai/claude-code'" >> ~/.bashrc
source ~/.bashrc
Enter fullscreen mode Exit fullscreen mode

This runs Claude Code through Node instead of the custom Bun fork, sidestepping the sentinel replacement bug entirely. Only downside is slightly slower startup (npx checks the registry each time).

Disconnect unused MCP connectors from Claude Settings → Connectors. Keep only what you actively use. Disconnected connectors show "Needs authentication" and stop injecting tool schemas.

Other mitigations:

  • Use /compact to compress context mid-session
  • Avoid --resume (can inherit broken cache from previous sessions)
  • Start fresh sessions frequently
  • Keep your CLAUDE.md lean

Status

Anthropic is aware of both issues. The cache bug is tracked at anthropics/claude-code#24147 and #40524. No public fix date yet, but the word is it's coming.

Credit to Paweł Huryn and Alex Volkov for surfacing this publicly on X.

Top comments (0)