HIROKI II

Posted on May 29

CodeGraph: The Open-Source Tool That Cut My AI Coding Token Usage by 57%

#opensource #ai #devtools #programming

I hit the same wall a lot of developers hit with AI coding tools. Claude Code is brilliant at understanding isolated code, but on a mid-sized codebase, asking "how does the authentication flow work here?" triggers a cascade of file reads, symbol searches, and import tracing. By the time it returns an answer, a few hundred thousand tokens are gone. Do that a few times a day and the API bill starts feeling less like a tool and more like a tax.

The frustrating part isn't the cost itself. It's that most of those tokens are wasted on exploration, not reasoning. The AI isn't thinking hard about your question — it's just trying to find the relevant files.

CodeGraph solves this at the architectural level. The project has 32.1K GitHub stars and an MIT license, and it doesn't try to make the AI smarter. Instead, it gives the AI a pre-built map of your codebase.

How It Works

CodeGraph builds a knowledge graph by parsing your source code with tree-sitter and extracting symbols (functions, classes, methods) and edges (calls, imports, inheritance, implementations). Everything goes into a local SQLite database with FTS5 full-text search.
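To make the symbols-and-edges idea concrete, here is a minimal sketch of how such a graph could sit in SQLite. The table layout, column names, and file paths are my own assumptions for illustration, not CodeGraph's actual schema, and the FTS5 index is left out to keep the example short.

```python
import sqlite3

# Hypothetical schema for illustration; CodeGraph's real tables may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symbols (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL,   -- function, class, method
    file TEXT NOT NULL
);
CREATE TABLE edges (
    src  INTEGER NOT NULL REFERENCES symbols(id),
    dst  INTEGER NOT NULL REFERENCES symbols(id),
    kind TEXT NOT NULL    -- calls, imports, inherits, implements
);
""")

# Made-up symbols and file paths, just enough to run a query.
conn.executemany(
    "INSERT INTO symbols (id, name, kind, file) VALUES (?, ?, ?, ?)",
    [
        (1, "handleAuth", "function", "auth/handler.ts"),
        (2, "validateToken", "function", "auth/token.ts"),
        (3, "loginRoute", "function", "routes/login.ts"),
    ],
)
conn.executemany(
    "INSERT INTO edges (src, dst, kind) VALUES (?, ?, ?)",
    [(1, 2, "calls"), (3, 1, "calls")],
)


def callers_of(name: str) -> list[str]:
    """Return the names of symbols that have a 'calls' edge into `name`."""
    rows = conn.execute(
        """
        SELECT caller.name
        FROM edges
        JOIN symbols AS caller ON caller.id = edges.src
        JOIN symbols AS callee ON callee.id = edges.dst
        WHERE callee.name = ? AND edges.kind = 'calls'
        ORDER BY caller.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


print(callers_of("validateToken"))
```

With a layout like this, "what calls this function?" becomes a single indexed join rather than a search across files.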

tree-sitter is worth a brief mention because the choice matters. It's an incremental parsing library that originated at GitHub for the Atom editor and is now built into Neovim for syntax highlighting. Unlike a traditional compiler frontend, tree-sitter is fault-tolerant: when the code has syntax errors, it still produces a usable syntax tree, with error nodes marking the broken regions. It's also fast. CodeGraph chose it because real-world projects aren't always in a compilable state when you're exploring them.

When your AI agent asks "what calls this function?" or "trace the impact of changing this module", CodeGraph responds in a single MCP tool call with entry points, related symbols, and code snippets. No file-by-file exploration, no agent loops, no wasted context.

The installer auto-detects which AI tools you have and configures the MCP integration. Currently supported: Claude Code, Cursor, Codex CLI, OpenCode, Hermes Agent, Gemini CLI, Antigravity, and Kiro.

How It's Different From Cursor's Index

Cursor uses semantic search: vector embeddings that return similarity-ranked results. This is genuinely useful for discovery and exploration. But semantic search doesn't understand structure. It doesn't know that handleAuth() calls validateToken(), which imports from jwt_utils. It only knows that these functions sit close together in embedding space.

This means the AI still has to trace through files to verify relationships. It gets leads, not answers.

CodeGraph builds an explicit call graph. Function A → calls → Function B → imported from → Module C. No fuzzy matching, no ranking. The AI receives definitive structural answers rather than similarity scores. For code exploration tasks, this is a fundamentally different information architecture.
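Here is a rough sketch of what a deterministic structural answer looks like, using the function names from the example above. The edge list, the extra loginRoute and refreshSession symbols, and the impacted_by helper are all hypothetical; the point is only that a graph walk returns an exact set, not a ranked list.

```python
from collections import deque

# Hypothetical edges; each pair reads "source depends on target".
EDGES = [
    ("loginRoute", "handleAuth"),
    ("handleAuth", "validateToken"),
    ("validateToken", "jwt_utils"),
    ("refreshSession", "validateToken"),
]


def impacted_by(target: str) -> list[str]:
    """Everything that directly or transitively depends on `target`."""
    # Reverse the edges so we can walk from a symbol to its dependents.
    reverse: dict[str, list[str]] = {}
    for src, dst in EDGES:
        reverse.setdefault(dst, []).append(src)

    seen: set[str] = set()
    queue = deque([target])
    while queue:  # breadth-first walk over the reversed graph
        node = queue.popleft()
        for dependent in reverse.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


print(impacted_by("jwt_utils"))
# ['handleAuth', 'loginRoute', 'refreshSession', 'validateToken']
```

Nothing in the result needs verification by reading files: either an edge exists in the graph or it doesn't.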

Versus Other Code Intelligence Tools

It's worth situating CodeGraph among the other approaches to this problem.

Gemini Code Assist (formerly Duet AI for Developers) does cloud-hosted code understanding. It can index enormous repositories, but your code leaves your machine. For regulated industries or proprietary codebases, this is a non-starter.

Sourcegraph is powerful but heavy. It requires server deployment, indexer configuration, and infrastructure maintenance. CodeGraph is a single binary and a SQLite file.

GitHub Copilot's codebase indexing is still in beta and cloud-only, with limited rollout.

CodeGraph's niche is local-first, zero-config, structured-graph intelligence. No servers, no uploads, no API keys. And unlike semantic approaches, it returns deterministic call relationships — not probability scores.

The Benchmarks

The team benchmarked across seven real open-source projects in seven languages, using Claude Opus 4.7 in headless mode (claude -p), four runs per arm, median reported. Same tasks, same model, with and without CodeGraph.

Codebase	Language	Size	Token Reduction	Cost Savings	Speedup	Tool Calls Cut
VS Code	TypeScript	~10K files	78%	26%	52%	85%
Excalidraw	TypeScript	~640 files	90%	52%	73%	96%
Tokio	Rust	~790 files	86%	82%	71%	92%
Django	Python	~3K files	36%	12%	19%	53%
Alamofire	Swift	~110 files	64%	47%	48%	83%
OkHttp	Java	~645 files	13%	2%	31%	45%
Gin	Go	~110 files	34%	21%	27%	40%

Aggregate: ~35% cheaper, ~57% fewer tokens, ~46% faster, ~71% fewer tool calls.
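As a sanity check, the aggregate line is consistent with a plain unweighted average of the seven rows. Whether the team actually computed it that way is my assumption; the sketch below just shows that the numbers line up.

```python
from statistics import mean

# Percentages copied from the table above:
# (token reduction, cost savings, speedup, tool calls cut)
RESULTS = {
    "VS Code":    (78, 26, 52, 85),
    "Excalidraw": (90, 52, 73, 96),
    "Tokio":      (86, 82, 71, 92),
    "Django":     (36, 12, 19, 53),
    "Alamofire":  (64, 47, 48, 83),
    "OkHttp":     (13,  2, 31, 45),
    "Gin":        (34, 21, 27, 40),
}

METRICS = ("tokens", "cost", "speed", "tool_calls")


def aggregate() -> dict[str, int]:
    """Unweighted mean of each column, rounded to a whole percent."""
    columns = zip(*RESULTS.values())
    return {metric: round(mean(col)) for metric, col in zip(METRICS, columns)}


print(aggregate())
# {'tokens': 57, 'cost': 35, 'speed': 46, 'tool_calls': 71}
```

Because the average is unweighted, the small Gin and Alamofire repositories count as much as VS Code, which is worth keeping in mind when reading the headline 57%.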

A few patterns stand out:

Project size matters. On VS Code (~10K TypeScript files), tokens dropped from 2.8M to 601K per benchmark run. On Excalidraw, from 3.5M to 344K. The larger the haystack, the more time the AI spends searching without a map.

Rust benefits disproportionately. Tokio went from $2.41 to $0.42 — an 82% cost reduction. My theory: Rust's module system (mod, use, pub use, re-exports) creates labyrinthine import paths that exhaust AI agents during exploration. CodeGraph cuts through them.

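Some projects see less dramatic gains. OkHttp (Java, ~645 files) saved only 13% of tokens, and Gin (Go, ~110 files) saved 34%. Size alone doesn't explain it, though: Excalidraw is about the same size as OkHttp and saw 90%, and Alamofire is about the same size as Gin and saw 64%. Where the brute-force approach is already cheap, the graph adds less marginal value.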
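The before-and-after figures quoted in these patterns can be checked against the table with one line of arithmetic. The reduction helper is just an illustrative name.

```python
def reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`, to one decimal place."""
    return round((before - after) / before * 100, 1)


print(reduction(2_800_000, 601_000))  # VS Code tokens: 78.5
print(reduction(3_500_000, 344_000))  # Excalidraw tokens: 90.2
print(reduction(2.41, 0.42))          # Tokio cost in dollars: 82.6
```

All three land within rounding distance of the 78%, 90%, and 82% shown in the table.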

What's important: these aren't model improvements. Claude Opus 4.7 is the same model in both arms. Every token saved is a token that would have been burned on mechanical, repetitive file exploration.

Framework Route Recognition

CodeGraph natively recognizes routing patterns across 14 web frameworks, covering annotation-based (Spring, NestJS), decorator-based (Flask, FastAPI), DSL-based (Rails, Laravel), and file-convention-based (Django URLconf, SvelteKit) approaches.

The value: ask any supported AI tool "which handler processes POST /api/users?" and CodeGraph maps the URL directly to the function — navigating through middleware chains, route includes, and decorator stacks that would typically require multiple search-and-verify cycles.
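To illustrate the lookup side of this, here is a small sketch of resolving a method and path against a route table once the routes have been extracted. The route table, handler names, and the `{id}` placeholder syntax are hypothetical, and the sketch says nothing about how CodeGraph actually extracts routes from each framework.

```python
import re
from typing import Optional

# Hypothetical route table; handler names are made up for illustration.
ROUTES = [
    ("GET",  "/api/users",      "users.list_users"),
    ("POST", "/api/users",      "users.create_user"),
    ("GET",  "/api/users/{id}", "users.get_user"),
]


def compile_pattern(template: str) -> re.Pattern:
    """Turn '/api/users/{id}' into a regex where {id} matches one path segment."""
    parts = re.split(r"(\{\w+\})", template)
    regex = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile(f"^{regex}$")


def resolve(method: str, path: str) -> Optional[str]:
    """Return the handler registered for this method and path, if any."""
    for route_method, template, handler in ROUTES:
        if route_method == method.upper() and compile_pattern(template).match(path):
            return handler
    return None


print(resolve("POST", "/api/users"))    # users.create_user
print(resolve("GET", "/api/users/42"))  # users.get_user
```

The hard part in practice is building the table across middleware, includes, and decorators; once it exists, answering "which handler?" is a direct lookup.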

For backend developers working across microservices or large API surfaces, this alone can be worth the installation. It's not just about saving tokens — it's about not chasing routing configuration through three layers of abstraction.

Auto-Sync That Stays Out of Your Way

CodeGraph's index synchronization is a three-layer system:

Layer 1 — Native OS file watchers. macOS FSEvents, Windows ReadDirectoryChangesW, Linux inotify. These are the lowest-level file change notification mechanisms each OS provides. No polling. No CPU burn. The moment you save a file, the OS notifies CodeGraph.

Layer 2 — Debounced batching. A 2-second quiet window. If you save five files in rapid succession (common during refactoring), they collapse into a single incremental sync. No per-save rebuilds.
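A minimal sketch of the debouncing idea follows, assuming a quiet window measured from the most recent change. The DebouncedBatch class is hypothetical and takes timestamps as arguments so the behavior is easy to follow; a real watcher would use a clock and a timer.

```python
from typing import Optional


class DebouncedBatch:
    """Collect changed paths and release them only after a quiet window."""

    def __init__(self, quiet_seconds: float = 2.0) -> None:
        self.quiet_seconds = quiet_seconds
        self.pending: set[str] = set()
        self.last_event: Optional[float] = None

    def record(self, path: str, now: float) -> None:
        """Note a change; every new change restarts the quiet window."""
        self.pending.add(path)
        self.last_event = now

    def flush_if_quiet(self, now: float) -> list[str]:
        """Return the batch if nothing has changed for `quiet_seconds`."""
        if self.last_event is None or now - self.last_event < self.quiet_seconds:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        self.last_event = None
        return batch


batch = DebouncedBatch()
for i, name in enumerate(["a.ts", "b.ts", "c.ts", "d.ts", "e.ts"]):
    batch.record(name, now=i * 0.3)   # five saves within 1.2 seconds

print(batch.flush_if_quiet(now=2.0))  # [] (only 0.8s since the last save)
print(batch.flush_if_quiet(now=3.5))  # all five files in one batch
```

Saving the same file twice inside the window still produces one entry, because pending paths are kept in a set.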

Layer 3 — Staleness flags + reconnect reconciliation. Files that haven't been synced yet are explicitly marked as stale so AI agents know to read them directly. On reconnect, a fast (size, mtime) + content-hash check identifies what's changed.
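One plausible way to combine the two checks is to compare the cheap metadata first and hash the content only when the metadata differs. That ordering, along with the Fingerprint type and function names, is my assumption; the source only says that both checks are used.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """What the index remembers about a file (hypothetical layout)."""
    size: int
    mtime_ns: int
    sha256: str


def content_hash(path: str) -> str:
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def fingerprint(path: str) -> Fingerprint:
    stat = os.stat(path)
    return Fingerprint(stat.st_size, stat.st_mtime_ns, content_hash(path))


def needs_resync(path: str, indexed: Fingerprint) -> bool:
    """Cheap metadata check first; hash only when the metadata differs."""
    stat = os.stat(path)
    if (stat.st_size, stat.st_mtime_ns) == (indexed.size, indexed.mtime_ns):
        return False  # metadata unchanged: treat the content as unchanged
    # Metadata changed (touched or rewritten): confirm with the content hash.
    return content_hash(path) != indexed.sha256
```

The hash step means a file that was merely touched, or rewritten with identical content, doesn't trigger a re-parse.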

The result: you write code normally, the index updates silently in the background. No "wait, let me rebuild the index" moments.

Installation

No Node.js prerequisite. One command:

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh

# Windows (PowerShell)
irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex

npm works too: npx @colbymchenry/codegraph

Then in your project:

codegraph init -i

The -i flag builds the full graph on the first run. For very large projects (100K+ files), this can take 10 minutes or more, but it's a one-time cost. After that, everything is incremental.

A few practical notes:

macOS: install Xcode CLI tools first (xcode-select --install). Without them, CodeGraph falls back to compatibility mode at 5-10x slower speeds.
Default exclusions: node_modules, vendor, dist, build, target, .venv, .gitignore entries, and files > 1 MB.
The graph lives in .codegraph/codegraph.db — a single SQLite file. No scattered cache files, no background services.
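For a feel of what the default exclusions mean in practice, here is a rough sketch of the filter. It covers only the directory names and the size cap listed above; .gitignore handling is omitted, and both the should_index name and the reading of "1 MB" as 1,000,000 bytes are assumptions.

```python
from pathlib import PurePosixPath

# Directory names from the default exclusion list above.
EXCLUDED_DIRS = {"node_modules", "vendor", "dist", "build", "target", ".venv"}
MAX_FILE_BYTES = 1_000_000  # assumption: "1 MB" read as 10^6 bytes


def should_index(relative_path: str, size_bytes: int) -> bool:
    """Skip files under an excluded directory or above the size cap."""
    directories = PurePosixPath(relative_path).parts[:-1]
    if any(part in EXCLUDED_DIRS for part in directories):
        return False
    return size_bytes <= MAX_FILE_BYTES


print(should_index("src/app.ts", 2_048))                   # True
print(should_index("node_modules/react/index.js", 2_048))  # False
print(should_index("src/bundle.js", 5_000_000))            # False
```

Only directory components are matched, so a source file that happens to be named build.ts is still indexed.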

Current Limitations

After using it for a while, a few things to be aware of:

Initial indexing time. On massive codebases (100K+ files), the first index build can take 10+ minutes. This is bounded by tree-sitter's parsing speed and your disk I/O. Incremental syncs after that are fast.

Uneven language support. TypeScript, Python, Rust, and Go have the deepest coverage, although in the benchmarks above only the TypeScript and Rust projects posted the strongest results. Objective-C is listed as "partial support." If your project relies heavily on less common languages, test before committing.

It's an enhancer, not a replacement. CodeGraph saves tokens on code exploration. It doesn't help with reasoning-intensive tasks. If you're asking your AI "why is this code slow?" or "design a caching strategy," the heavy lifting is still on the model — CodeGraph just helps it gather context faster.

The Bigger Picture

The AI coding tool space spent the last year racing to build smarter models. Claude vs. GPT vs. Gemini — each release claims a benchmark lead. But as these tools move into daily development workflows, a different bottleneck is emerging.

A model that takes 50 file reads to answer a question, burns 3 million tokens, and runs for 2.5 minutes will never be used casually, no matter how smart it is. The friction is too high, the cost too visible.

The smarter play might be: give a reasonably smart model an accurate map and let it answer in 10 file reads, 600K tokens, and 1 minute.

CodeGraph provides the map.

GitHub: colbymchenry/codegraph