DEV Community

myougaTheAxo
myougaTheAxo

Posted on • Originally published at zenn.dev

Cut Claude Code Token Costs by 70%: Practical Optimization Guide

Cut Claude Code Token Costs by 70%: Practical Optimization Guide

Why Costs Spiral Out of Control

Common reasons Claude Code gets expensive:

  1. Bloated CLAUDE.md - loaded in every conversation
  2. Reading entire files - only need specific sections
  3. Using Opus for everything - Haiku/Sonnet often sufficient
  4. Repeating the same research - no caching strategy

Tip 1: Minimize CLAUDE.md

CLAUDE.md is included in every conversation context. Keep it under 50 lines.

Before (300-line CLAUDE.md):

## Environment
OS: Windows 11
Python: 3.13
Node.js: 20.x
...50 lines of setup instructions...
...100 lines of coding conventions...
Enter fullscreen mode Exit fullscreen mode

After (50 lines max):

## Core Rules
- Minimal changes only
- Run tests after changes
- See: `docs/patterns.md`

## Stack
Python 3.13 / Node 20 / Windows 11
Enter fullscreen mode Exit fullscreen mode

Reference detailed docs when needed. Claude reads them on demand.

Tip 2: Use Models Strategically

Task Opus Sonnet Haiku
File search Too expensive OK Best
Implementation OK Best Too slow
Architecture Best OK No
grep/list No No Best

Haiku costs ~1/15 of Opus. Route all simple searches to Haiku.

Tip 3: Strategic /clear Usage

# After each task milestone
/clear

# Before clearing, save state to memory
Enter fullscreen mode Exit fullscreen mode
"Summarize current progress in memory/progress.md for next session handoff"
Enter fullscreen mode Exit fullscreen mode

Tip 4: Use Grep/Glob Over Read

Wrong: "Read all files under src/ and find..."
Right: "Search for `UserService` class using Grep"
Enter fullscreen mode Exit fullscreen mode
# Wrong: Read entire 2000-line file
# Right: Read specific lines
read(file_path="src/service.py", offset=150, limit=50)
Enter fullscreen mode Exit fullscreen mode

Tip 5: Cache Research Results

# memory/api-endpoints.md
## UserAPI endpoints (researched 2026-03-11)
- GET /users/:id → UserController.getById (src/controllers/user.ts:42)
- POST /users → UserController.create (src/controllers/user.ts:67)
Enter fullscreen mode Exit fullscreen mode

Never research the same thing twice.

Cost Reduction Summary

Strategy Reduction
Minimize CLAUDE.md 20-30%
Use Haiku 15-25%
Strategic /clear 10-15%
Grep/Glob first 10-20%
Memory caching 5-15%
Total 60-70%

Summary

Three principles: read less, use cheaper models, cache repeated work. Keeping CLAUDE.md under 50 lines gives immediate results.


This article is an excerpt from the Claude Code Complete Guide (7 chapters), available on note.com.
myouga (@myougatheaxo) - VTuber axolotl. Sharing practical AI development tips.

Top comments (0)