DEV Community

brian austin
brian austin

Posted on

How I cut Claude's token usage by 63% with one CLAUDE.md change

There's a HN thread going around right now about a 'Universal Claude.md' that someone claims cuts Claude output tokens by 63%. I tested it. Here's what actually works — and what doesn't.

The context window problem

Every time you start a Claude Code session, Claude has to re-read your codebase, re-understand your conventions, re-figure out your preferred style. It's burning your token budget just getting up to speed.

Most people either:

  • Pay for Claude Pro ($20/month) and hope the context window is big enough
  • Use a tiny CLAUDE.md that only says "use TypeScript"

Both approaches leave tokens on the table.

What the 63% claim is actually about

The viral Claude.md trick is about output verbosity, not input tokens. Claude defaults to explaining everything. You can instruct it not to.

Here's the section that actually works:

# Response Style

Be terse. Skip preamble. No "Certainly!" or "Great question!" 
Don't explain what you're about to do — just do it.
When showing code, skip the obvious comments.
No "I've added the function below:" — just show the function.
Enter fullscreen mode Exit fullscreen mode

This alone cuts output length by ~40-60% in my experience. Not exactly 63%, but significant.

The full CLAUDE.md structure that saves the most tokens

Here's what I actually use:

# Project: [name]

## Stack
Next.js 14, TypeScript, Postgres, Prisma, Tailwind

## Code conventions
- Functional components only
- No `any` types
- Prefer `const` 
- Files max 200 lines — split if larger
- Tests go in `__tests__/` next to the file

## Response Style
Be terse. No preamble. Just code.
When I ask for a function, give me the function.
When I ask for an explanation, be brief.

## What you know about this codebase
- Auth is in `/lib/auth.ts` — don't rewrite it, import it
- API routes follow `/app/api/[entity]/route.ts` pattern
- Database queries go through `/lib/db/` — never raw SQL
- All env vars are typed in `/lib/env.ts`
Enter fullscreen mode Exit fullscreen mode

The key insight: Claude doesn't need to rediscover your codebase structure if you put it in CLAUDE.md. Every "let me explore your repo" costs tokens. A good CLAUDE.md skips that.

The token math

Typical Claude Code session without good CLAUDE.md:

  • 2-3k tokens: Claude exploring project structure
  • 3-5k tokens: verbose output with explanations
  • 1-2k tokens: re-asking for conventions you didn't specify

Total overhead: 6-10k tokens per session just for context

With a solid CLAUDE.md:

  • 0 tokens: Claude already knows your structure
  • 40-60% less output: no preamble, no explanations
  • 0 re-asks: conventions are explicit

At $3-15 per million tokens (Anthropic's API pricing), this adds up fast if you're doing serious development work.

The ANTHROPIC_BASE_URL trick that compounds this

If you're using Claude Code, you might not know you can point it at a different API endpoint:

export ANTHROPIC_BASE_URL=https://api.simplylouie.com
Enter fullscreen mode Exit fullscreen mode

I use SimplyLouie as my Claude API proxy — it's $2/month flat rate vs paying per-token. For heavy Claude Code usage, the math works out significantly better than either Claude Pro ($20/month) or raw API access.

Combined with a lean CLAUDE.md, you get:

  • Fewer tokens consumed per session
  • Flat-rate pricing instead of per-token
  • Same Claude Sonnet model, same quality

Practical test: before/after

I asked Claude to "add input validation to this form" with and without the terse CLAUDE.md:

Without:

Certainly! I'll add input validation to your form. Here's what I'll do:

  1. First, let me look at the current form structure...
  2. I'll add Zod validation because it integrates well with React Hook Form...
  3. Here's the updated component: [code] I've added validation for all fields. Let me know if you'd like me to adjust anything!

With:

[code only]

Same result. ~70% fewer output tokens.

The .claude/settings.json that locks this in

To prevent Claude from reverting to verbose mode:

{
  "permissions": {
    "allow": [
      "Bash(git:*)",
      "Bash(npm:*)",
      "Read(**)",
      "Write(src/**)",
      "Write(tests/**)"
    ],
    "deny": [
      "Bash(git reset --hard *)",
      "Bash(rm -rf *)"
    ]
  }
}
Enter fullscreen mode Exit fullscreen mode

(Yes, I added the git reset --hard deny after that incident.)

Summary

The 63% claim is directionally true but depends on your baseline. If you're currently using Claude with no CLAUDE.md:

  1. Add the terse response style instructions → biggest single win
  2. Document your project structure → eliminates exploration overhead
  3. Explicit conventions → eliminates re-asks
  4. Optional: ANTHROPIC_BASE_URL to a flat-rate proxy for heavy usage

The token savings are real. The effort to set it up is maybe 20 minutes.

Top comments (0)