Boucle

Posted on Mar 6

7 Ways to Cut Your Claude Code Token Usage

#claudecode #ai #devtools #productivity

Claude Code is powerful but expensive. Every file read, every command output, every re-exploration of code you already looked at costs tokens. After running an autonomous agent loop for weeks on Claude Code, I've tracked where tokens actually go and what you can do about it.

These are practical techniques, ordered from easiest to hardest.

1. Use .claudeignore

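Left to itself, Claude Code will read files you never want it to touch: build artifacts, lock files, generated code, vendored dependencies. Check that your version honors .claudeignore; if it does not, Read deny rules under permissions in .claude/settings.json serve the same purpose. The file uses .gitignore syntax: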

node_modules/
dist/
*.lock
*.min.js
target/
__pycache__/

This prevents Claude from reading these files when exploring your project. The savings compound: every time Claude would have read a 5,000-line lock file, you save those tokens.
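To get a feel for what an ignore list is worth, here is a minimal sketch that applies the patterns above to a file listing and estimates the tokens skipped. The matching is a simplified stand-in for gitignore rules, the function names are hypothetical, and the 4-characters-per-token figure is a rough heuristic rather than a real tokenizer.

```python
from fnmatch import fnmatch

# Patterns copied from the ignore list above.
IGNORE_PATTERNS = ["node_modules/", "dist/", "*.lock", "*.min.js", "target/", "__pycache__/"]

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not an exact tokenizer


def is_ignored(path: str, patterns=IGNORE_PATTERNS) -> bool:
    """Simplified gitignore-style match on a '/'-separated relative path.

    Patterns ending in '/' match any directory segment of the path;
    other patterns are globs matched against the file name only.
    """
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            return True
    return False


def estimated_tokens_skipped(files: dict[str, int]) -> int:
    """files maps relative path -> size in characters."""
    skipped_chars = sum(size for path, size in files.items() if is_ignored(path))
    return skipped_chars // CHARS_PER_TOKEN
```

One thing this makes visible: *.lock catches Cargo.lock and yarn.lock but not package-lock.json, so npm projects need that file listed by name.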

2. Be specific in your prompts

"Fix the bug in the authentication flow" makes Claude explore your entire codebase looking for auth-related code. "Fix the JWT validation in src/auth/validate.ts line 42 where expired tokens aren't rejected" sends it straight to the file.

The difference can be 10x in token usage. Claude reads files to understand context. The more context you give it upfront, the fewer files it needs to read.

3. Stop re-reading files

This is the single biggest source of token waste I've found. Claude Code re-reads the same files repeatedly within a session. It reads main.rs, makes a change, then reads main.rs again to verify. Then reads it again when working on a related file. Each read costs the full token count of the file.

For my agent loop, I built read-once, a Claude Code hook that tracks which files have been read in a session and blocks redundant re-reads. In testing, it cuts 60-90% of Read tool token usage.

Install it in one line:

curl -fsSL https://raw.githubusercontent.com/Bande-a-Bonnot/Boucle-framework/main/tools/read-once/install.sh | bash

It's open source and has no dependencies. It also supports a diff mode that, instead of blocking re-reads entirely, returns only the lines that changed since the last read. Useful when you need to verify edits.
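The core idea is small enough to sketch. The following is not read-once's actual code, just an illustration of the two behaviors described above under my own assumptions: it compares file content (a real hook might compare modification times or hashes instead), and ReadTracker and on_read are hypothetical names. Wiring it into a Claude Code hook is left out.

```python
import difflib


class ReadTracker:
    """Remembers what each file looked like the last time it was read in a session."""

    def __init__(self, diff_mode: bool = False):
        self.diff_mode = diff_mode
        self.seen: dict[str, str] = {}  # path -> content at last read

    def on_read(self, path: str, content: str) -> tuple[str, str]:
        """Return (action, payload); action is 'allow', 'block', or 'diff'."""
        previous = self.seen.get(path)
        self.seen[path] = content
        if previous is None:
            return "allow", content  # first read: pay the full cost once
        if previous == content:
            return "block", f"{path} is unchanged since the last read"
        if not self.diff_mode:
            return "allow", content  # file changed, so a fresh read is justified
        # Diff mode: return only the changed lines, with no surrounding context.
        diff = difflib.unified_diff(
            previous.splitlines(),
            content.splitlines(),
            fromfile=f"{path} (last read)",
            tofile=f"{path} (current)",
            lineterm="",
            n=0,
        )
        return "diff", "\n".join(diff)
```

In block mode a changed file still gets a full read, since the earlier copy in context is stale. Diff mode trades that full read for a handful of changed lines, which is where the savings on verify-after-edit reads would come from.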

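4. Use /compact during long explorations

/compact does not switch Claude into a terser mode. It summarizes the conversation so far and replaces the history with that summary, so every later turn carries fewer input tokens. Run it after a long exploration, once you have what you need from the files Claude read, and pass instructions (for example, /compact keep the auth findings) to steer what the summary retains.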

5. Structure your sessions

Starting a new Claude Code session means Claude re-learns your project context from scratch. But sessions that run too long fill the context window and Claude starts forgetting earlier work.

One session per logical task works best. One bug fix, one feature, one refactor. Don't try to fix three bugs and add two features in one conversation.

Between sessions, leave breadcrumbs. A CLAUDE.md file with key architecture decisions, file locations, and conventions means Claude spends fewer tokens rediscovering your project structure each time.

6. Limit command output

When Claude runs shell commands, the full output enters the context window. A git log with 200 commits, a test suite with verbose output, a build log with warnings: all tokens.

You can use hooks to truncate command output, or just be mindful when asking Claude to run commands. "Run the tests for the auth module" is better than "run all tests" if you're only working on auth.

Some people pipe command output through head or tail in their hooks to cap it automatically.
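head and tail each keep only one end of the output, but failures often show up at the bottom while the command and its setup are at the top. A minimal sketch of a cap that keeps both ends follows. cap_output is a hypothetical helper, and the 20-line defaults are arbitrary.

```python
def cap_output(text: str, head: int = 20, tail: int = 20) -> str:
    """Keep the first `head` and last `tail` lines, replacing the middle with a marker."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text  # short enough: pass through untouched
    omitted = len(lines) - head - tail
    marker = f"... [{omitted} lines omitted] ..."
    # Slice from an explicit index so tail=0 keeps nothing (lines[-0:] would keep everything).
    kept = lines[:head] + [marker] + lines[len(lines) - tail:]
    return "\n".join(kept)
```

The marker line matters: it tells Claude that output was cut, so it can ask for a narrower command instead of assuming it saw everything.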

7. Write CLAUDE.md well

This file is read at session start and persists in context. Every token in CLAUDE.md is consumed every session. Keep it lean:

Project architecture (what lives where)
Key conventions (naming, patterns, testing approach)
Common commands (build, test, deploy)
Known gotchas

Don't put your project's entire history in there. Don't include documentation that Claude can find by reading the relevant source file when needed. The goal is to spare Claude needless exploration, not to front-load every detail.
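Since the file is paid for every session, it helps to see which sections carry the weight. Here is a rough sketch, assuming CLAUDE.md uses markdown-style # headings and that 4 characters per token is a good enough estimate. section_costs is a hypothetical helper.

```python
def section_costs(text: str, chars_per_token: int = 4) -> dict[str, int]:
    """Estimate tokens per section, where a section starts at a line beginning with '#'.

    Limitation: a '# comment' line inside a code block is also treated as a heading.
    """
    chars: dict[str, int] = {}
    current = "(preamble)"
    for line in text.splitlines(keepends=True):
        if line.startswith("#"):
            current = line.strip("# \n")
        chars[current] = chars.get(current, 0) + len(line)
    return {name: count // chars_per_token for name, count in chars.items()}
```

Run it on your own file and look at the largest entry first. A section that dwarfs the rest is usually documentation that belongs next to the code instead.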

The real cost equation

Token usage in Claude Code is heavily skewed. A small number of behaviors cause most of the waste:

Re-reading files (often 40-60% of Read tokens)
Exploring irrelevant code (poor prompts or missing .claudeignore)
Verbose command output (test suites, build logs)

Fix those three and you'll likely cut your bill in half. The other techniques help at the margins.

I track my agent's token usage across every loop iteration. The single biggest improvement came from preventing redundant file reads. Everything else was incremental by comparison.
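If you want to check where your own tokens go, a tally by category is enough to start. The sketch below assumes you already log (category, tokens) pairs somewhere. The category names and the log format are hypothetical, not something Claude Code emits in this shape.

```python
from collections import Counter


def token_share(events: list[tuple[str, int]]) -> list[tuple[str, int, float]]:
    """Sum tokens per category and return (category, tokens, share), largest first."""
    totals: Counter[str] = Counter()
    for category, tokens in events:
        totals[category] += tokens
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [(cat, tok, tok / grand_total) for cat, tok in totals.most_common()]
```

Feeding it one loop iteration at a time shows whether a fix actually moved the largest category or just shuffled tokens between the smaller ones.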

Top comments (4)

Henry Godnick • Mar 29

The file re-reading stat is eye-opening — 40-60% of Read tokens going to redundant reads. That's wild but matches what I've seen.

One thing I'd add to the "awareness" side: just knowing how many tokens you're burning in real-time changes behavior. I started using TokenBar (tokenbar.site), a $5 macOS menu bar app that shows live token usage. It's like having a speedometer — you naturally slow down when you can see the number going up. Before that, I'd fire off vague prompts and not realize until I checked /cost that I'd burned through way more than expected.

Pairs well with your techniques — the structural optimizations (claudeignore, read-once) cut the floor, and ambient visibility keeps you from accidentally going above it.

Luc B. Perussault-Diallo • Apr 30

40-60% of Read tokens being re-reads. Brutal.

The thing is, the model re-reads files not because it forgot, but because it's uncertain about what matters. Block the re-reads and you might just push the uncertainty somewhere worse. Give it structural context upfront (what depends on what, what's actually in scope) and the re-reading drops on its own, because the model doesn't need to fish around anymore.

Your point about making costs visible changing behavior is spot on. I'd love to see that extended to context relevance, not just context volume. "You loaded 15 files but only 4 were connected to this change" would be a powerful nudge. (Sense - luuuc.github.io/sense if you're curious, I'm building in this direction.)
