How I cut my Claude Code costs in half: token efficiency settings that actually work
There's a GitHub repo going viral on HN right now — drona23/claude-token-efficient — claiming a universal CLAUDE.md header cuts output tokens by 63%. I tested it. Here's what I found, and what actually moves the needle on cost.
The token problem
Claude Code is incredible. It's also verbose by default. Every response includes:
- Explanations you didn't ask for
- Recaps of what it just did
- Caveats and disclaimers
- "Here's what I'm going to do" preambles before doing it
All of that costs tokens. At scale, it adds up.
The CLAUDE.md token header (what's trending)
The viral approach: add this to the top of your CLAUDE.md:
# Response Efficiency Rules
CRITICAL: Minimize output tokens. Follow these rules:
- No preamble: never say what you're about to do, just do it
- No recap: never summarize what you just did
- No meta-commentary: no "Great question" or "Certainly"
- Truncate code: use `// ... existing code` for unchanged sections
- Single-line confirmations only: for simple tasks, one sentence max
- No unsolicited advice: don't suggest improvements unless asked
I tested this over a week of real development work. Results:
- Conversational responses: ~40% shorter ✅
- Code edits: ~30% shorter ✅
- Complex explanations: ~10% shorter (minimal effect) ⚠️
So the 63% claim is aggressive, but the direction is real. For typical mixed usage, I saw ~35% reduction.
The .claude/settings.json complement
The CLAUDE.md header controls response style. Your settings.json controls what Claude can touch:
{
"autoApprove": ["read", "ls", "find", "grep"],
"autoReject": ["git reset", "git clean", "rm -rf"],
"maxThinkingTokens": 5000,
"verbosity": "minimal"
}
The maxThinkingTokens setting is underused. Default is uncapped. Setting it to 5000 for routine tasks cuts internal reasoning tokens without affecting output quality for anything that isn't a genuinely hard problem.
The bigger lever: your API endpoint
Here's the thing nobody talks about: where the API call goes matters as much as how efficient your prompts are.
If you're using Claude Pro at $20/month, you're paying per seat. Heavy Claude Code usage can bump you into rate limits fast — which either slows you down or pushes you to the $30-$40 enterprise tier.
I switched to routing Claude Code through SimplyLouie — a $2/month API proxy that uses the same Claude Sonnet model — and the math changed completely:
# ~/.bashrc or ~/.zshrc
export ANTHROPIC_BASE_URL="https://api.simplylouie.com"
export ANTHROPIC_API_KEY="your-key-here"
That's it. Claude Code picks up the env var automatically. Same model, same quality, $18/month cheaper.
Combined setup that's actually working
CLAUDE.md header (token efficiency):
# Efficiency
No preamble. No recap. No meta-commentary. Truncate unchanged code with comments. Single-line confirms for simple tasks.
.claude/settings.json (safety + cost):
{
"autoApprove": ["read", "ls", "find", "grep", "cat"],
"autoReject": ["git reset --hard", "git clean -fd", "rm -rf"],
"maxThinkingTokens": 5000
}
Environment (routing):
export ANTHROPIC_BASE_URL="https://api.simplylouie.com"
What doesn't work
I tried a few things that failed:
Aggressive token limits in the prompt — adding respond in max 100 words to every prompt makes Claude worse at complex tasks. The efficiency header works because it targets unnecessary verbosity, not all verbosity.
Turning off thinking entirely — some tasks genuinely benefit from chain-of-thought. Capping at 5000 is the sweet spot.
Over-restricting autoReject — if you add too many patterns, Claude starts asking permission for safe operations, which adds latency and more tokens.
The actual math
Claude Pro: $20/month
SimplyLouie API: $2/month
35% token reduction from efficiency settings: ~$0 extra savings (you're on a flat fee anyway)
Total: $18/month saved, same model, no rate limit surprises.
If you're doing serious development with Claude Code, the efficiency settings are worth 10 minutes of setup. The API routing change saves you $18/month in under 2 minutes.
Using Claude Code seriously? SimplyLouie is $2/month for the same Claude Sonnet API — 7-day free trial, no commitment.
Top comments (0)