Claude Code is powerful but expensive. Every file read, every command output, every re-exploration of code you already looked at costs tokens. After running an autonomous agent loop for weeks on Claude Code, I've tracked where tokens actually go and what you can do about it.
These are practical techniques, ordered from easiest to hardest.
1. Use .claudeignore
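Claude Code can end up reading files you never want it to touch: build artifacts, lock files, generated code, vendored dependencies. A .claudeignore file excludes them using .gitignore syntax (confirm that your Claude Code version honors it; if it does not, Read deny rules in the permission settings cover the same ground):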
node_modules/
dist/
*.lock
*.min.js
target/
__pycache__/
This prevents Claude from reading these files when exploring your project. The savings compound: every time Claude would have read a 5,000-line lock file, you save those tokens.
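To check which paths a pattern list like this covers, and roughly what it saves, a small sketch can help. The matcher below is a simplified stand-in for .gitignore semantics (a trailing-slash pattern matches a directory name anywhere in the path, anything else is treated as a filename glob), and four characters per token is a crude assumption rather than Claude's real tokenizer. The function names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Same patterns as the example ignore file.
PATTERNS = ["node_modules/", "dist/", "*.lock", "*.min.js", "target/", "__pycache__/"]


def is_ignored(path: str, patterns: list[str] = PATTERNS) -> bool:
    """Simplified ignore check; NOT a full .gitignore implementation."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    directories, filename = parts[:-1], parts[-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match the directory name at any depth.
            if pattern.rstrip("/") in directories:
                return True
        elif fnmatchcase(filename, pattern):
            # File pattern: glob against the filename only.
            return True
    return False


def estimate_tokens_saved(file_sizes: dict[str, int],
                          patterns: list[str] = PATTERNS,
                          chars_per_token: int = 4) -> int:
    """Rough token count of ignored files, given sizes in characters."""
    ignored_chars = sum(size for path, size in file_sizes.items()
                        if is_ignored(path, patterns))
    return ignored_chars // chars_per_token
```

One thing this makes visible: *.lock matches yarn.lock and Cargo.lock, but not package-lock.json or pnpm-lock.yaml. Those need their own entries.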
2. Be specific in your prompts
"Fix the bug in the authentication flow" makes Claude explore your entire codebase looking for auth-related code. "Fix the JWT validation in src/auth/validate.ts line 42 where expired tokens aren't rejected" sends it straight to the file.
The difference can be 10x in token usage. Claude reads files to understand context. The more context you give it upfront, the fewer files it needs to read.
3. Stop re-reading files
This is the single biggest source of token waste I've found. Claude Code re-reads the same files repeatedly within a session. It reads main.rs, makes a change, then reads main.rs again to verify. Then reads it again when working on a related file. Each read costs the full token count of the file.
For my agent loop, I built read-once, a Claude Code hook that tracks which files have been read in a session and blocks redundant re-reads. In testing, it cuts 60-90% of Read tool token usage.
Install it in one line:
curl -fsSL https://raw.githubusercontent.com/Bande-a-Bonnot/Boucle-framework/main/tools/read-once/install.sh | bash
It's open source and has no dependencies. It also supports a diff mode that, instead of blocking re-reads entirely, returns only the lines that changed since the last read. Useful when you need to verify edits.
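The core idea fits in a few lines. The sketch below is not read-once's actual implementation, only an illustration of the two behaviors described above. ReadTracker and its method are hypothetical names, it compares file contents where a real hook might compare modification times, and the wiring into a Claude Code hook is left out.

```python
import difflib


class ReadTracker:
    """Remembers what each file looked like at its last read (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: dict[str, str] = {}

    def check(self, path: str, content: str, diff_mode: bool = False) -> tuple[str, str]:
        """Return (action, payload) where action is 'allow', 'block', or 'diff'."""
        previous = self._last_seen.get(path)
        self._last_seen[path] = content

        if previous is None:
            # First read in this session: pay the full cost once.
            return ("allow", content)
        if previous == content:
            # Nothing changed, so a re-read would only repeat known tokens.
            return ("block", f"{path} is unchanged since the last read")
        if diff_mode:
            # File changed: return only the changed lines, not the whole file.
            diff = difflib.unified_diff(
                previous.splitlines(),
                content.splitlines(),
                fromfile=f"{path} (last read)",
                tofile=f"{path} (now)",
                lineterm="",
            )
            return ("diff", "\n".join(diff))
        # File changed and diff mode is off: allow a full re-read.
        return ("allow", content)
```

In this sketch a changed file is always readable again. Only reads of an unchanged file are treated as redundant.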
4. Use /compact after long exploration
When a session has accumulated a lot of exploration, use the /compact command. It summarizes the conversation so far and continues from that summary, so later turns carry less context. If you also want shorter responses, ask for them explicitly in your prompt.
5. Structure your sessions
Starting a new Claude Code session means Claude re-learns your project context from scratch. But sessions that run too long fill the context window, and Claude starts forgetting earlier work.
One session per logical task works best. One bug fix, one feature, one refactor. Don't try to fix three bugs and add two features in one conversation.
Between sessions, leave breadcrumbs. A CLAUDE.md file with key architecture decisions, file locations, and conventions means Claude spends fewer tokens rediscovering your project structure each time.
6. Limit command output
When Claude runs shell commands, the full output enters the context window. A git log with 200 commits, a test suite with verbose output, a build log with warnings: all tokens.
You can use hooks to truncate command output, or just be mindful when asking Claude to run commands. "Run the tests for the auth module" is better than "run all tests" if you're only working on auth.
Some people pipe command output through head or tail in their hooks to cap it automatically.
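The same head-and-tail idea can be sketched as a small filter. This is a minimal illustration, not a ready-made hook: truncate_output is a hypothetical name, the line limits are arbitrary defaults, and connecting it to Claude Code's hook mechanism is left out.

```python
def truncate_output(text: str, head: int = 20, tail: int = 20) -> str:
    """Keep the first `head` and last `tail` lines, replacing the middle with a marker."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        # Short output passes through untouched.
        return text
    omitted = len(lines) - head - tail
    # Guard tail == 0: lines[-0:] would return the whole list.
    kept_tail = lines[-tail:] if tail > 0 else []
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + kept_tail)
```

Keeping both ends matters for test and build logs: the start usually shows what ran, and the end usually carries the failure summary.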
7. Write CLAUDE.md well
This file is read at session start and persists in context. Every token in CLAUDE.md is consumed every session. Keep it lean:
- Project architecture (what lives where)
- Key conventions (naming, patterns, testing approach)
- Common commands (build, test, deploy)
- Known gotchas
Don't put your project's entire history in there. Don't include documentation that Claude can find by reading the relevant source file when needed. The goal is to prevent Claude from exploring, not to front-load every detail.
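Because the file is paid for on every session, it can be worth putting a rough number on it. The sketch below assumes about four characters per token, which is only a ballpark and not Claude's real tokenizer. The function names are hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate from character count (heuristic, not a tokenizer)."""
    return round(len(text) / chars_per_token)


def recurring_tokens(claude_md_text: str, sessions: int) -> int:
    """Tokens spent on CLAUDE.md alone across a number of sessions."""
    return estimate_tokens(claude_md_text) * sessions
```

Under that assumption, a 4,000-character CLAUDE.md costs about 1,000 tokens per session, or about 50,000 tokens over 50 sessions, before Claude has done any work.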
The real cost equation
Token usage in Claude Code is heavily skewed. A small number of behaviors cause most of the waste:
- Re-reading files (often 40-60% of Read tokens)
- Exploring irrelevant code (poor prompts or missing .claudeignore)
- Verbose command output (test suites, build logs)
Fix those three and you'll likely cut your bill in half. The other techniques help at the margins.
I track my agent's token usage across every loop iteration. The single biggest improvement came from preventing redundant file reads. Everything else was incremental by comparison.