Akhilesh Pothuri

Posted on May 12 • Originally published at Medium

I Burned a Month's AI Budget in a Week — So I Built a Code Graph

#ai #tooling #mcp #typescript

Seven days into the month, I'd burned through 75% of my AI API budget. Nothing had changed about how I was working — same codebase, same questions, same tools. But the token meter was spinning like I'd left a garden hose running.

I dug in. The culprit wasn't my prompts. It was the context.

The Problem With File-Based Retrieval

When you ask Claude or GPT "how does my auth middleware work?", most tools respond by grabbing the entire auth.ts file and stuffing it into the prompt. Sometimes two or three files. That's 300–800 lines of code when you probably needed 30.

I call this the Confusion Tax — you're paying for tokens that actively make the AI worse. More irrelevant code means more noise, more hallucinations, and a higher bill.

Traditional RAG treats code like a document. It doesn't understand that validateToken() calls checkExpiry(), which in turn imports from crypto/utils.ts. It just sees text.

Code Isn't Text — It's a Graph

Every codebase is a directed graph. Functions call other functions. Classes extend other classes. Modules import from modules. This structure exists whether your tooling understands it or not.

If you want to answer "how does the auth middleware work?", you don't need the whole file. You need:

The authMiddleware function body
The functions it directly calls
The types/interfaces it depends on

That's a k-step neighbourhood traversal from an anchor node — not a file dump.
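
To make the traversal concrete, here is a minimal TypeScript sketch of a k-step walk over an in-memory edge list. It is an illustration under assumptions, not Nexus-Graph's actual code: the Edge shape and the names buildAdjacency and kStepNeighbourhood are hypothetical, and it only follows outgoing edges.

```typescript
// Hypothetical shapes for illustration; not Nexus-Graph's real API.
type EdgeType = "imports" | "calls" | "extends" | "implements";

interface Edge {
  from: string;
  to: string;
  type: EdgeType;
}

/** Group outgoing edges by their source symbol. */
function buildAdjacency(edges: Edge[]): Map<string, string[]> {
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    const targets = adjacency.get(edge.from) ?? [];
    targets.push(edge.to);
    adjacency.set(edge.from, targets);
  }
  return adjacency;
}

/**
 * Breadth-first walk along outgoing edges, stopping after k hops.
 * Returns each reached symbol with its hop distance from the anchor.
 */
function kStepNeighbourhood(
  adjacency: Map<string, string[]>,
  anchor: string,
  k: number,
): Map<string, number> {
  const distance = new Map<string, number>([[anchor, 0]]);
  let frontier: string[] = [anchor];
  for (let depth = 1; depth <= k && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbour of adjacency.get(node) ?? []) {
        if (!distance.has(neighbour)) {
          distance.set(neighbour, depth); // first visit is the shortest path
          next.push(neighbour);
        }
      }
    }
    frontier = next;
  }
  return distance;
}

// The auth example from above, written as a tiny edge list.
const exampleEdges: Edge[] = [
  { from: "authMiddleware", to: "validateToken", type: "calls" },
  { from: "authMiddleware", to: "AuthRequest", type: "imports" },
  { from: "validateToken", to: "checkExpiry", type: "calls" },
  { from: "checkExpiry", to: "crypto/utils.ts", type: "imports" },
];

const oneHop = kStepNeighbourhood(buildAdjacency(exampleEdges), "authMiddleware", 1);
console.log(Array.from(oneHop.keys()));
// With k = 1 this reaches authMiddleware, validateToken, and AuthRequest only.
```

Raising k to 2 would also pull in checkExpiry, while crypto/utils.ts stays out until k = 3. That cut-off is the point: the caller chooses how far the context spreads.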

What I Built: Nexus-Graph

Nexus-Graph is a local-first code intelligence engine that parses your codebase into a directed symbol graph and serves token-budgeted context to AI assistants via the Model Context Protocol (MCP).

It supports Python, TypeScript, and JavaScript out of the box using tree-sitter for parsing — so it understands your code structurally, not just lexically.

How it works

1. Indexing — Nexus walks your project and parses every .py, .ts, .tsx, .js, .jsx file into a graph of symbols and edges stored in SQLite:

-- Symbols: functions, classes, methods, variables, interfaces
CREATE TABLE symbols (
  id          TEXT PRIMARY KEY,  -- sha1(file:name:line)
  symbol_name TEXT,
  symbol_type TEXT,
  file_path   TEXT,
  start_line  INTEGER,
  end_line    INTEGER,
  signature   TEXT,
  body_hash   TEXT,
  edit_count  INTEGER
);

-- Directed edges between symbols
CREATE TABLE edges (
  from_id   TEXT,
  to_id     TEXT,
  edge_type TEXT  -- imports | calls | extends | implements
);
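
For readers who think in types rather than DDL, the sketch below mirrors the symbols table as a TypeScript interface and derives the id the way the schema comment describes, sha1(file:name:line). Everything beyond the column names is an assumption: the helper names are hypothetical, the exact symbol_type strings are guesses, and hashing the body with SHA-1 is my choice for the example, since the post does not say how body_hash is computed.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: the exact symbol_type strings are guesses based on the schema comment.
type SymbolType = "function" | "class" | "method" | "variable" | "interface";

// Mirrors the columns of the symbols table above.
interface SymbolRow {
  id: string;
  symbol_name: string;
  symbol_type: SymbolType;
  file_path: string;
  start_line: number;
  end_line: number;
  signature: string;
  body_hash: string;
  edit_count: number;
}

function sha1Hex(text: string): string {
  return createHash("sha1").update(text, "utf8").digest("hex");
}

/** id = sha1(file:name:line), following the comment in the schema. */
function symbolId(filePath: string, symbolName: string, startLine: number): string {
  return sha1Hex(`${filePath}:${symbolName}:${startLine}`);
}

/** Build a row for a freshly parsed symbol (hypothetical helper). */
function makeSymbolRow(input: {
  filePath: string;
  name: string;
  type: SymbolType;
  startLine: number;
  endLine: number;
  signature: string;
  body: string;
}): SymbolRow {
  return {
    id: symbolId(input.filePath, input.name, input.startLine),
    symbol_name: input.name,
    symbol_type: input.type,
    file_path: input.filePath,
    start_line: input.startLine,
    end_line: input.endLine,
    signature: input.signature,
    body_hash: sha1Hex(input.body), // ASSUMPTION: the hash algorithm is not stated in the post
    edit_count: 0,
  };
}

const row = makeSymbolRow({
  filePath: "src/auth.ts",
  name: "validateToken",
  type: "function",
  startLine: 12,
  endLine: 30,
  signature: "function validateToken(token: string): boolean",
  body: "function validateToken(token: string): boolean { return token.length > 0; }",
});
console.log(row.id.length); // a SHA-1 hex digest is 40 characters long
```

One consequence of including the line number in the id: a symbol that moves within its file gets a new id, so any edges pointing at the old id have to be rewritten when that file is re-indexed.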

2. Query — When your AI asks for context, Nexus:

Finds anchor symbols via FTS5 full-text search
BFS-traverses the graph k steps from those anchors
Scores nodes by proximity + recency (recently edited files score higher)
Fills a token budget greedily: full bodies first, then definition-only, then drops leaf nodes
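
The last two steps are easier to follow in code. The sketch below is one possible reading of "score, then fill greedily", not the Nexus-Graph implementation: the Candidate shape, the score weights, and the four-characters-per-token estimate are all assumptions made for the example.

```typescript
// Hypothetical candidate shape; field names are assumptions, not Nexus-Graph's API.
interface Candidate {
  name: string;
  distance: number;  // hops from the nearest anchor (from the BFS step)
  editCount: number; // stand-in for "recently edited"
  signature: string; // definition-only form
  body: string;      // full source of the symbol
}

interface ContextItem {
  name: string;
  mode: "full" | "definition";
  text: string;
}

// ASSUMPTION: roughly 4 characters per token; a real tokenizer would differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// ASSUMPTION: illustrative weights. Closer and more-edited symbols rank higher.
function scoreCandidate(c: Candidate): number {
  const proximity = 1 / (1 + c.distance);
  const recency = 0.05 * Math.min(c.editCount, 10);
  return proximity + recency;
}

/** Greedy fill: try the full body, fall back to the signature, otherwise drop. */
function fillBudget(candidates: Candidate[], budgetTokens: number): ContextItem[] {
  const ranked = candidates.slice().sort((a, b) => scoreCandidate(b) - scoreCandidate(a));
  const picked: ContextItem[] = [];
  let remaining = budgetTokens;
  for (const c of ranked) {
    const fullCost = estimateTokens(c.body);
    const definitionCost = estimateTokens(c.signature);
    if (fullCost <= remaining) {
      picked.push({ name: c.name, mode: "full", text: c.body });
      remaining -= fullCost;
    } else if (definitionCost <= remaining) {
      picked.push({ name: c.name, mode: "definition", text: c.signature });
      remaining -= definitionCost;
    }
    // Otherwise the symbol is dropped; low-ranked (far) nodes are the first to go.
  }
  return picked;
}

const context = fillBudget(
  [
    { name: "leaf", distance: 3, editCount: 0, signature: "s".repeat(80), body: "b".repeat(800) },
    { name: "anchor", distance: 0, editCount: 0, signature: "s".repeat(40), body: "b".repeat(400) },
  ],
  120,
);
console.log(context.map((item) => `${item.name}:${item.mode}`));
// The anchor's body (about 100 tokens) fits and the leaf's signature (about 20) still fits after it.
```

Because the loop walks candidates in score order, the nodes farthest from the anchor are the ones that degrade to a signature or disappear when the budget gets tight.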

3. Serve — Results go back to the AI via MCP — the open protocol that Claude Code, Cursor, and Gemini all speak natively.

The Numbers

After switching to graph-based retrieval:

70% fewer tokens per query — surgical context vs. whole-file dumps
5–10x smaller context blocks compared to file-based retrieval
Indexes 1,000+ files in under 15 seconds on an M1 MacBook Air
Sub-100ms context queries on an indexed project

Getting Started

# Install globally
npm install -g @costline/nexus-graph

# Index your project
nexus-graph index --project .

# Start the MCP server
nexus-graph server --project .

Then wire it into Claude Code:

// .mcp.json in your project root (Claude Code's project-scoped MCP config)
{
  "mcpServers": {
    "nexus": {
      "command": "nexus-graph",
      "args": ["server", "--project", "/path/to/your/project"]
    }
  }
}

Once the server is registered, Claude can call get_context_for_query on its own when a question touches your codebase. You can also ask for it explicitly:

use get_context_for_query for "how does the auth middleware work?"

It Also Watches for Changes

Nexus ships with a file watcher (--watch) that incrementally re-indexes changed files in real time. No need to re-run the full index after every edit — only the affected symbols and edges are updated.
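
One way to decide which symbols are "affected" is to compare content hashes before and after a file is re-parsed. The sketch below assumes the body_hash column is used for this, which the post does not state, and the IndexedSymbol shape and diffSymbols name are hypothetical.

```typescript
// Hypothetical shapes; a guess at how body_hash could drive incremental updates.
interface IndexedSymbol {
  id: string;
  bodyHash: string;
}

interface SymbolDiff {
  added: string[];   // ids present only in the new parse
  changed: string[]; // ids whose body hash differs
  removed: string[]; // ids present only in the old index
}

/** Compare the stored symbols of one file with a fresh parse of the same file. */
function diffSymbols(previous: IndexedSymbol[], current: IndexedSymbol[]): SymbolDiff {
  const before = new Map<string, string>();
  for (const s of previous) {
    before.set(s.id, s.bodyHash);
  }

  const diff: SymbolDiff = { added: [], changed: [], removed: [] };
  const seen = new Set<string>();

  for (const s of current) {
    seen.add(s.id);
    const oldHash = before.get(s.id);
    if (oldHash === undefined) {
      diff.added.push(s.id);
    } else if (oldHash !== s.bodyHash) {
      diff.changed.push(s.id);
    }
    // Equal hashes mean the symbol is untouched and needs no write.
  }

  before.forEach((_hash, id) => {
    if (!seen.has(id)) {
      diff.removed.push(id);
    }
  });

  return diff;
}

const change = diffSymbols(
  [
    { id: "a", bodyHash: "h1" },
    { id: "b", bodyHash: "h2" },
    { id: "c", bodyHash: "h3" },
  ],
  [
    { id: "a", bodyHash: "h1" },
    { id: "b", bodyHash: "h2-edited" },
    { id: "d", bodyHash: "h4" },
  ],
);
console.log(change);
// "d" is added, "b" is changed, "c" is removed, and "a" is left alone.
```

Only the added, changed, and removed ids would then need their rows and edges rewritten, which is what keeps a watcher cheap compared with a full re-index.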

Open Source

Nexus-Graph is MIT licensed and available on GitHub. Install from npm:

npm install -g @costline/nexus-graph

If you're spending too much on AI API costs or your assistant keeps losing track of your codebase, give it a try. The context problem is solvable — it just needs the right data structure.

Top comments (3)

Raju Dandigam • May 12

This is exactly the direction AI coding tools need to move in. Treating code as flat text works for demos, but it breaks down quickly in real systems where the relevant context is usually a call path, dependency chain, or type boundary rather than an entire file.

I like the k-step neighborhood idea because it maps well to how engineers actually debug: start from the symbol, walk callers/callees/imports, then expand only when needed. Serving that through MCP also makes the tool more composable instead of locking the context strategy into one assistant.

One area I’d be interested in is observability for the retrieval layer itself: logging which symbols were selected, why they were included, token budget pressure, and whether the final answer actually used the retrieved nodes. That feedback loop could make graph-based context even stronger over time.

HARD IN SOFT OUT • May 13

What if the graph could be pre-warmed per task and sent as a compact “context pack” alongside the prompt, so the model can request additional nodes on demand rather than guessing the entire scope upfront? That could make token usage near-optimal.

You burned the budget in a week and then built a solution instead of just cutting access. That’s a productive kind of anger. A code graph to trim context is a pragmatic idea that I haven’t seen enough people try.

One thing I’ve found when using code graphs: the graph is only as good as its dependency resolution. If your tool misjudges what’s “relevant”, it can omit the one function that’s actually critical.

How did you handle this — static analysis, runtime profiling, or a mix?

Akhilesh Pothuri • May 13

Really good points, and the pre-warming idea is something I've been thinking about too.

On dependency resolution: Nexus uses static analysis via tree-sitter, which means it understands the structure of your code (imports, calls, extends, implements) without executing it. The tradeoff you identified is real: static analysis can miss dynamic dispatch, runtime-injected dependencies, and decorator-wrapped functions where the call graph isn't obvious from the source.

The way I've handled the "omitted critical function" problem is layered:

FTS5 full-text search as the anchor: rather than guessing the entry point, it finds symbols by name/signature match first, then BFS-traverses from there. So if the critical function is mentioned in the query, it gets pulled in directly.
Recency scoring: recently edited files score higher, on the assumption that what you've been touching is what's currently relevant.
Graceful degradation: if the budget runs out, it drops leaf nodes first and keeps definitions of closer nodes, so the model at least knows the interface exists even if it can't see the full body.

That said, it's not perfect. If a critical function has a generic name or isn't reachable via static import edges, it can get missed. Runtime profiling would close that gap but adds significant complexity.

The pre-warmed context pack idea is interesting: essentially a session-scoped subgraph that gets refined as the task evolves rather than rebuilt per query. That's roughly where I want to take the createSession / scoreNodes layer. Happy to dig into that more if you want to compare notes.