myougaTheAxo

Posted on Mar 11 • Originally published at zenn.dev

Cut Claude Code Token Costs by 70%: Practical Optimization Guide

#claudecode #ai #performance #tips

Cut Claude Code Token Costs by 70%: Practical Optimization Guide

Why Costs Spiral Out of Control

Common reasons Claude Code gets expensive:

Bloated CLAUDE.md - loaded in every conversation
Reading entire files - only need specific sections
Using Opus for everything - Haiku/Sonnet often sufficient
Repeating the same research - no caching strategy

Tip 1: Minimize CLAUDE.md

CLAUDE.md is included in every conversation context. Keep it under 50 lines.

Before (300-line CLAUDE.md):

## Environment
OS: Windows 11
Python: 3.13
Node.js: 20.x
...50 lines of setup instructions...
...100 lines of coding conventions...

After (50 lines max):

## Core Rules
- Minimal changes only
- Run tests after changes
- See: `docs/patterns.md`

## Stack
Python 3.13 / Node 20 / Windows 11

Reference detailed docs when needed. Claude reads them on demand.

Tip 2: Use Models Strategically

Task	Opus	Sonnet	Haiku
File search	Too expensive	OK	Best
Implementation	OK	Best	Too slow
Architecture	Best	OK	No
grep/list	No	No	Best

Haiku costs ~1/15 of Opus. Route all simple searches to Haiku.

Tip 3: Strategic /clear Usage

# After each task milestone
/clear

# Before clearing, save state to memory

"Summarize current progress in memory/progress.md for next session handoff"

Tip 4: Use Grep/Glob Over Read

Wrong: "Read all files under src/ and find..."
Right: "Search for `UserService` class using Grep"

# Wrong: Read entire 2000-line file
# Right: Read specific lines
read(file_path="src/service.py", offset=150, limit=50)

Tip 5: Cache Research Results

# memory/api-endpoints.md
## UserAPI endpoints (researched 2026-03-11)
- GET /users/:id → UserController.getById (src/controllers/user.ts:42)
- POST /users → UserController.create (src/controllers/user.ts:67)

Never research the same thing twice.

Cost Reduction Summary

Strategy	Reduction
Minimize CLAUDE.md	20-30%
Use Haiku	15-25%
Strategic /clear	10-15%
Grep/Glob first	10-20%
Memory caching	5-15%
Total	60-70%

Summary

Three principles: read less, use cheaper models, cache repeated work. Keeping CLAUDE.md under 50 lines gives immediate results.

This article is an excerpt from the Claude Code Complete Guide (7 chapters), available on note.com.
myouga (@myougatheaxo) - VTuber axolotl. Sharing practical AI development tips.

Top comments (3)

Harjot Singh • May 31

Practical and specific to Claude Code - good, because the generic "use a cheaper model" advice doesn't apply cleanly to an agentic coding tool where the spend is dominated by context re-sends across a long session. For Claude Code specifically the highest-leverage moves are usually: keep CLAUDE.md / project context lean (it rides along every turn), don't let it re-read files already in context, compact aggressively, and scope each task tightly instead of one mega-session that drags an ever-growing transcript.

The Claude-specific lever worth emphasizing: prompt caching. Claude's cache makes the stable parts of your context (system prompt, project files) dramatically cheaper on repeat, so structuring your session to maximize cache hits is often a bigger win than any single trim. That cache-aware structuring is part of how I keep Moonshift (a multi-agent pipeline that ships a prompt to a deployed SaaS) at ~$3 flat - the stable context is cached, only the deltas are full-price. Really useful guide. Did your 70% lean more on context trimming or on cache utilization? For Claude specifically I find caching is the sleeper win people underuse.

Henry Godnick • Mar 14

The model routing strategy is spot on. One thing that makes all of these tips way more actionable is having a live token counter running while you work. Knowing your CLAUDE.md is bloated is one thing, but actually watching the token count jump on every prompt because of it makes you trim it immediately. Same with the grep vs read tip -- when you can see 50k tokens burned on a full file read vs 2k on a targeted grep in real time, the habit change is instant. I built a macOS menu bar tool that does exactly this. DM me if you want the link.

Some comments may only be visible to logged-in visitors. Sign in to view all comments.