HIROKI II

Posted on Jun 8

CodeGraph — The Tool That Cut My Claude Code Token Usage by 64%

#ai #claude #codegraph #devtools

64% fewer tokens. 81% fewer tool calls. Zero file reads. Same answer quality.

That's what happened when I ran CodeGraph against VS Code's 10,000-file codebase. The Claude Code agent without CodeGraph took 21 tool calls and burned through 1.79 million tokens. With CodeGraph? 4 tool calls. 640K tokens. Same architectural question, same answer — 18% cheaper.

And it's not just VS Code. Across 7 codebases spanning TypeScript, Python, Rust, Java, Go, and Swift, the numbers average out to 47% fewer tokens, 58% fewer tool calls, and 16% lower costs.

Here's how it works — and why your AI coding agent has been bleeding money on something you probably never thought about.

The Hidden Tax: Where Your Agent's Tokens Actually Go

When you ask Claude Code "how does a payment request reach the database," here's what actually happens under the hood:

grep "payment"         → 47 results         → 800 tokens
read payment.service.ts → found something    → 1,200 tokens
grep "processPayment"  → 3 results          → 700 tokens
read order.handler.ts  → closer, but not it → 950 tokens
grep "db.query"        → 8 results          → 650 tokens
read db.repository.ts  → there it is        → 1,100 tokens

Total: 6 tool calls, 5,400 tokens, just to FIND the right files.

This is what I call the exploration tax. Your agent spends 50-70% of its token budget on discovering code — not understanding it, not writing it. Just finding it. Each grep is a tool call. Each read loads an entire file into context. When the agent doesn't know what it's looking for, it guesses, greps, reads, discovers it was wrong, and greps again.
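To make that tally concrete, here is a minimal sketch that replays the six-step trace and totals it. The token figures are the illustrative ones from the trace above, not measurements, and the `Step` type and `summarize` helper are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    tool: str    # "grep" or "read"
    target: str  # search pattern or file name
    tokens: int  # illustrative token cost taken from the trace above


TRACE = [
    Step("grep", "payment", 800),
    Step("read", "payment.service.ts", 1200),
    Step("grep", "processPayment", 700),
    Step("read", "order.handler.ts", 950),
    Step("grep", "db.query", 650),
    Step("read", "db.repository.ts", 1100),
]


def summarize(trace):
    """Return (tool_calls, total_tokens, tokens_by_tool) for a trace."""
    by_tool = {}
    for step in trace:
        by_tool[step.tool] = by_tool.get(step.tool, 0) + step.tokens
    return len(trace), sum(step.tokens for step in trace), by_tool


calls, total, by_tool = summarize(TRACE)
print(f"{calls} tool calls, {total:,} tokens")
print(by_tool)
```

Splitting the total by tool shows that the reads cost more than the greps here, which is the part a pre-built index is meant to avoid.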

Over a day of heavy usage? That's real money.

What CodeGraph Does Differently

Instead of scanning files at query time, CodeGraph pre-indexes your entire codebase into a knowledge graph — once. After that, your agent queries a local SQLite database instead of the filesystem.

Think of it this way:

| Without CodeGraph | With CodeGraph |
| --- | --- |
| Walk every library shelf, checking every book | Check the card catalog first, walk straight to the right shelf |
| grep → read → grep → read → grep → read | `codegraph_explore` → answer |
| 6-21 tool calls per question | 1-4 tool calls per question |
| Raw text from files | Structured: symbols + relationships + source code |

The difference isn't small. In the VS Code benchmark, the "without" agent made 9 file reads and 11 grep calls. The "with" agent made zero of either. It just asked the graph.
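The headline percentages follow directly from the raw VS Code figures quoted at the top of this post (21 vs. 4 tool calls, 1.79 million vs. roughly 640K tokens). A quick sketch of the arithmetic, with `percent_reduction` being a helper name invented for this example:

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Raw figures from the VS Code run; 640K is a rounded number,
# so the result is approximate.
token_cut = percent_reduction(1_790_000, 640_000)
call_cut = percent_reduction(21, 4)

print(f"tokens: -{token_cut:.0f}%, tool calls: -{call_cut:.0f}%")
```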

The 4-Layer Architecture

CodeGraph builds this knowledge graph in four stages:

Layer 1 — Source Code

codegraph init -i scans your project. It skips node_modules, build artifacts, and anything in .gitignore by default.

Layer 2 — tree-sitter AST Parsing

Each file gets parsed into an Abstract Syntax Tree. CodeGraph uses tree-sitter — an incremental parser that understands 20+ languages. It's not regex-based grep. It knows the difference between a function call and a variable named after a function.
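To illustrate why a syntax tree beats text matching, here is a small sketch. It uses Python's built-in `ast` module as a stand-in for tree-sitter (which needs third-party packages), so it is not CodeGraph's parser, only the same idea on a toy input. The sample source and the helper names `grep_hits` and `call_sites` are made up for this example.

```python
import ast
import re

SOURCE = '''
def process_payment(order):
    return order

process_payment_label = "process_payment"   # variable and string, not a call
handler = process_payment                    # reference, not a call
result = process_payment({"id": 1})          # the only real call
'''


def grep_hits(source: str, name: str) -> int:
    """Count raw textual matches, the way a plain grep would."""
    return len(re.findall(re.escape(name), source))


def call_sites(source: str, name: str) -> list[int]:
    """Return line numbers where `name` is actually called."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    ]


print("text matches:", grep_hits(SOURCE, "process_payment"))
print("call sites on lines:", call_sites(SOURCE, "process_payment"))
```

The text search reports five matches, while the syntax tree finds the single line where the function is actually invoked.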

Layer 3 — SQLite Knowledge Graph + FTS5

Extracted nodes (functions, classes, methods) and edges (calls, imports, extends, implements) go into a local SQLite database with full-text search. Everything stays on your machine — 100% local, zero data leakage.
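Here is a rough sketch of what a nodes-and-edges store with full-text search could look like in SQLite. The schema, table names, and the symbols `handlePaymentRequest` and `savePayment` are assumptions for illustration; CodeGraph's real schema may look quite different. It also assumes your Python's SQLite build includes the FTS5 extension, which most do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,      -- function, class, method
    name TEXT NOT NULL,
    file TEXT NOT NULL
);
CREATE TABLE edges (
    src  INTEGER NOT NULL REFERENCES nodes(id),
    dst  INTEGER NOT NULL REFERENCES nodes(id),
    kind TEXT NOT NULL       -- calls, imports, extends, implements
);
-- Hypothetical full-text index over symbol and file names.
CREATE VIRTUAL TABLE node_fts USING fts5(name, file);
""")

NODES = [
    (1, "function", "handlePaymentRequest", "order.handler.ts"),
    (2, "method", "processPayment", "payment.service.ts"),
    (3, "method", "savePayment", "db.repository.ts"),
]
EDGES = [(1, 2, "calls"), (2, 3, "calls")]

conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", NODES)
conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", EDGES)
conn.executemany(
    "INSERT INTO node_fts (rowid, name, file) VALUES (?, ?, ?)",
    [(node_id, name, file) for node_id, _kind, name, file in NODES],
)


def search(term: str) -> list[tuple[str, str]]:
    """Full-text lookup: which symbols match the search term?"""
    rows = conn.execute(
        "SELECT name, file FROM node_fts WHERE node_fts MATCH ? ORDER BY rank",
        (term,),
    )
    return rows.fetchall()


def callees(name: str) -> list[str]:
    """Follow outgoing 'calls' edges from the named symbol."""
    rows = conn.execute(
        """SELECT d.name FROM edges e
           JOIN nodes s ON s.id = e.src
           JOIN nodes d ON d.id = e.dst
           WHERE s.name = ? AND e.kind = 'calls'""",
        (name,),
    )
    return [row[0] for row in rows]


print(search("processPayment"))
print(callees("processPayment"))
```

The point of the two tables is that "where is it?" becomes an index lookup and "what does it touch?" becomes a join, with no file reads at query time.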

Layer 4 — MCP Server

When your agent starts, CodeGraph's MCP server connects automatically. Eight tools expose the graph:

| Tool | What it answers |
| --- | --- |
| `codegraph_explore` | "How does X work?" Returns relevant symbols + source grouped by file |
| `codegraph_search` | "Where is function X?" |
| `codegraph_callers` | "Who calls this?" |
| `codegraph_callees` | "What does this call?" |
| `codegraph_impact` | "What breaks if I change this?" |
| `codegraph_node` | "Show me this symbol's full source" |
| `codegraph_files` | "What's the file structure?" |
| `codegraph_status` | "Is the index up to date?" |

The key difference from grep: the graph tells you relationships, not just locations. grep says "this string appears in these files." CodeGraph says "this function is called by A, B, and C, calls D and E, and changing it impacts these 12 files."
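A sketch of how callers, callees, and impact questions can be answered once call edges are stored. This is one plausible approach using a recursive SQLite query, not CodeGraph's actual implementation; the `calls` table and the symbol names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("A", "target"), ("B", "target"), ("C", "target"),
        ("target", "D"), ("target", "E"),
        ("main", "A"), ("cli", "main"),
    ],
)


def callers(name):
    """Direct callers: one hop backwards along call edges."""
    rows = conn.execute(
        "SELECT caller FROM calls WHERE callee = ? ORDER BY caller", (name,)
    )
    return [row[0] for row in rows]


def callees(name):
    """Direct callees: one hop forwards along call edges."""
    rows = conn.execute(
        "SELECT callee FROM calls WHERE caller = ? ORDER BY callee", (name,)
    )
    return [row[0] for row in rows]


def impact(name):
    """Everything that transitively calls `name` (callers of callers)."""
    rows = conn.execute(
        """
        WITH RECURSIVE affected(symbol) AS (
            SELECT caller FROM calls WHERE callee = ?
            UNION
            SELECT c.caller FROM calls c JOIN affected a ON c.callee = a.symbol
        )
        SELECT symbol FROM affected ORDER BY symbol
        """,
        (name,),
    )
    return [row[0] for row in rows]


print("callers:", callers("target"))
print("callees:", callees("target"))
print("impact: ", impact("target"))
```

Using `UNION` rather than `UNION ALL` drops duplicates, which also keeps the recursion from looping forever if the call graph has cycles.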

The Numbers: 7 Real Codebases Tested

I ran the same architectural question across 7 open-source projects, comparing Claude Opus 4.8 with and without CodeGraph. Each test was run 4 times; the table shows the median.

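| Codebase | Language | Files | Token Reduction | Tool Call Reduction | Cost Savings |
| --- | --- | --- | --- | --- | --- |
| VS Code | TypeScript | ~10k | -64% | -81% | -18% |
| Alamofire | Swift | ~110 | -64% | -58% | -40% |
| Django | Python | ~3k | -60% | -77% | -8% |
| OkHttp | Java | ~645 | -54% | -50% | -25% |
| Tokio | Rust | ~790 | -38% | -57% | even |
| Gin | Go | ~110 | -23% | -44% | -19% |
| Excalidraw | TypeScript | ~640 | -25% | -40% | even |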

Three things jump out:

Bigger codebase, bigger savings. VS Code (~10k files) saw the largest drop in tool calls (81%) and tied for the largest drop in tokens (64%), which fits the idea that the exploration tax grows with project size.
Small projects benefit too. Alamofire (110 files) saved 40% on cost. You don't need a monorepo to see returns.
Cost stays flat-to-cheaper everywhere. Even the break-even cases (Tokio, Excalidraw) saw 40-57% fewer tool calls and 25-38% fewer tokens. The cost parity comes from CodeGraph's responses being slightly more verbose (it returns structured data with context), but the time and token savings are real regardless.
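If you want to check the cross-codebase averages yourself, this sketch recomputes them from the table. The one assumption is that "even" counts as a 0% cost saving.

```python
from statistics import mean

# (codebase, token reduction %, tool-call reduction %, cost savings %)
# Values copied from the table; "even" is recorded as 0.
RESULTS = [
    ("VS Code", 64, 81, 18),
    ("Alamofire", 64, 58, 40),
    ("Django", 60, 77, 8),
    ("OkHttp", 54, 50, 25),
    ("Tokio", 38, 57, 0),
    ("Gin", 23, 44, 19),
    ("Excalidraw", 25, 40, 0),
]

avg_tokens = mean(row[1] for row in RESULTS)
avg_calls = mean(row[2] for row in RESULTS)
avg_cost = mean(row[3] for row in RESULTS)

print(f"tokens: -{avg_tokens:.0f}%")
print(f"tool calls: -{avg_calls:.0f}%")
print(f"cost: -{avg_cost:.0f}%")
```

Rounded, these come out to the 47%, 58%, and 16% quoted in the introduction.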

Honest Talk: When You Do (and Don't) Need CodeGraph

Strongly Recommended

Projects over 500 files. The exploration tax grows with codebase size, and above roughly 500 files the grep-read loop becomes genuinely expensive.
Heavy Claude Code / Cursor / Codex users. These agents spawn Explore sub-agents that multiply the tool call overhead.
Cross-language projects. Swift+ObjC bridging, React Native JS+Native — grep can't cross language boundaries. CodeGraph's bridge support handles this.
Teams using CI/CD. The codegraph affected command tells you exactly which tests to run when files change.
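The idea behind selecting tests from changed files can be sketched with a plain dependency walk. This is not how `codegraph affected` is implemented (its flags and output format aren't covered in this post); the import graph, file names, and the `affected_tests` helper below are all hypothetical.

```python
from collections import deque

# Hypothetical import graph: file -> files it imports.
IMPORTS = {
    "src/db.py": [],
    "src/payment.py": ["src/db.py"],
    "src/orders.py": ["src/payment.py"],
    "src/auth.py": [],
    "tests/test_payment.py": ["src/payment.py"],
    "tests/test_orders.py": ["src/orders.py"],
    "tests/test_auth.py": ["src/auth.py"],
}


def affected_tests(changed, imports):
    """Return test files that transitively import any changed file."""
    # Invert the graph so we can walk from a file to its importers.
    importers = {}
    for file, deps in imports.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(file)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(f for f in seen if f.startswith("tests/"))


print(affected_tests(["src/db.py"], IMPORTS))
print(affected_tests(["src/auth.py"], IMPORTS))
```

In this toy graph, a change to the database layer selects the payment and order tests but leaves the auth tests alone.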

Probably Not Worth It

Micro-projects (< 50 files). The index overhead isn't justified.
Simple CRUD-only work. If you never ask architectural questions, you don't need an architecture map.
ChatGPT Web users. No MCP support, so CodeGraph can't connect.

The ROI Math

Setup: codegraph init -i takes 1-3 minutes on a large project
Running cost: $0 (local SQLite, no API, no external service)
Break-even: roughly 2-3 architectural questions

If you ask your agent 20 questions a day at $0.83 each (VS Code benchmark), going to $0.68 saves $3/day, $90/month, $1,080/year — from a tool that took 3 minutes to set up.
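The same math as a short sketch, so you can plug in your own numbers. It works in cents to avoid floating-point rounding, and it assumes a 30-day month and a 12-month year, which is what the figures above imply.

```python
QUESTIONS_PER_DAY = 20
COST_WITHOUT_CENTS = 83  # per question, VS Code benchmark
COST_WITH_CENTS = 68     # per question, with CodeGraph

saved_per_question_cents = COST_WITHOUT_CENTS - COST_WITH_CENTS
per_day = saved_per_question_cents * QUESTIONS_PER_DAY / 100  # dollars
per_month = per_day * 30   # assumes a 30-day month
per_year = per_month * 12  # assumes 12 such months

print(f"${per_day:.2f}/day, ${per_month:.2f}/month, ${per_year:.2f}/year")
```

Change `QUESTIONS_PER_DAY` and the two per-question costs to match your own usage; the savings scale linearly with question volume.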

5-Minute Setup

# 1. Install (no Node.js required — bundles its own runtime)
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh

# Windows (PowerShell)
irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex

# 2. Connect to your agent (auto-detects Claude Code, Cursor, Codex, etc.)
codegraph install

# 3. Initialize your project
cd your-project
codegraph init -i

# 4. Restart your agent — done!

Verify it's working:

codegraph status          # Check index health
codegraph query <symbol>  # Quick CLI search

Ask your agent an architectural question and watch — the tool calls should switch from grep and read to codegraph_explore.

Why This Matters Beyond Cost

The cost argument is compelling, but there's something deeper here.

When your agent spends 50-70% of its token budget on discovery, it has less "mental bandwidth" for reasoning. Context windows are finite: every grep result and file read consumes context that could have gone to deeper analysis.

CodeGraph shifts the ratio: less budget on finding, more on thinking.

And because it's 100% local — all SQLite, no API calls, no data leaving your machine — there's no privacy tradeoff. Your code never touches CodeGraph's servers because CodeGraph doesn't have servers.

The Bottom Line

CodeGraph isn't magic. It's a knowledge graph — your codebase, pre-indexed, queryable in milliseconds. Its value comes from a simple insight: grep is the wrong tool for understanding code structure. It tells you where words appear, not how things connect.

For Claude Code, Cursor, and Codex users working on non-trivial codebases, the math is straightforward: about 3 minutes of setup for a cost reduction that averaged 16% and reached 40% in these benchmarks.

Is it for everyone? No. If you're building a todo app in 30 files, skip it. But if your agent spends its first 30 seconds grepping through your monorepo every time you ask a question, you're paying for exploration you don't need.

Resources: