DEV Community

Akhilesh Pothuri

Originally published at Medium

I Burned a Month's AI Budget in a Week — So I Built a Code Graph

Seven days into the month, I'd burned through 75% of my AI API budget. Nothing had changed about how I was working — same codebase, same questions, same tools. But the token meter was spinning like I'd left a garden hose running.

I dug in. The culprit wasn't my prompts. It was the context.


The Problem With File-Based Retrieval

When you ask Claude or GPT "how does my auth middleware work?", most tools respond by grabbing the entire auth.ts file and stuffing it into the prompt. Sometimes two or three files. That's 300–800 lines of code when you probably needed 30.

I call this the Confusion Tax — you're paying for tokens that actively make the AI worse. More irrelevant code means more noise, more hallucinations, and a higher bill.

Traditional RAG treats code like a document. It doesn't understand that validateToken() calls checkExpiry(), which in turn depends on an import from crypto/utils.ts. It just sees text.


Code Isn't Text — It's a Graph

Every codebase is a directed graph. Functions call other functions. Classes extend other classes. Modules import from modules. This structure exists whether your tooling understands it or not.

If you want to answer "how does the auth middleware work?", you don't need the whole file. You need:

  • The authMiddleware function body
  • The functions it directly calls
  • The types/interfaces it depends on

That's a k-step neighbourhood traversal from an anchor node — not a file dump.
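
To make the traversal concrete, here is a minimal sketch of a k-step breadth-first walk over a symbol graph. The adjacency map and the `AuthRequest` type are hypothetical and only echo the names used in this post; this is not Nexus-Graph's actual code.

```python
from collections import deque

# Hypothetical edges for illustration: each symbol maps to the symbols it depends on.
EDGES = {
    "authMiddleware": ["validateToken", "AuthRequest"],
    "validateToken": ["checkExpiry"],
    "checkExpiry": ["crypto/utils.ts"],
    "crypto/utils.ts": [],
    "AuthRequest": [],
}

def k_step_neighbourhood(edges, anchor, k):
    """Return {symbol: hop distance} for every symbol within k hops of the anchor."""
    distances = {anchor: 0}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        if distances[node] == k:
            continue  # hop limit reached, do not expand this node further
        for neighbour in edges.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances

print(k_step_neighbourhood(EDGES, "authMiddleware", 1))
# {'authMiddleware': 0, 'validateToken': 1, 'AuthRequest': 1}
```

With k = 1 the result is the middleware, the function it calls directly, and the type it depends on, which matches the three bullets above. Raising k to 2 pulls in checkExpiry as well.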


What I Built: Nexus-Graph

Nexus-Graph is a local-first code intelligence engine that parses your codebase into a directed symbol graph and serves token-budgeted context to AI assistants via the Model Context Protocol (MCP).

It supports Python, TypeScript, and JavaScript out of the box using tree-sitter for parsing — so it understands your code structurally, not just lexically.

How it works

1. Indexing — Nexus walks your project and parses every .py, .ts, .tsx, .js, .jsx file into a graph of symbols and edges stored in SQLite:

-- Symbols: functions, classes, methods, variables, interfaces
CREATE TABLE symbols (
  id          TEXT PRIMARY KEY,  -- sha1(file:name:line)
  symbol_name TEXT,
  symbol_type TEXT,
  file_path   TEXT,
  start_line  INTEGER,
  end_line    INTEGER,
  signature   TEXT,
  body_hash   TEXT,
  edit_count  INTEGER
);

-- Directed edges between symbols
CREATE TABLE edges (
  from_id   TEXT,
  to_id     TEXT,
  edge_type TEXT  -- imports | calls | extends | implements
);
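
As a rough illustration of how this schema can be queried, the sketch below loads it into an in-memory SQLite database with Python's standard sqlite3 module, inserts two made-up symbols, and looks up what authMiddleware calls. The rows, line numbers, and placeholder hashes are assumptions for the example only.

```python
import hashlib
import sqlite3

def symbol_id(file_path, name, line):
    """sha1 over "file:name:line", following the comment in the schema."""
    return hashlib.sha1(f"{file_path}:{name}:{line}".encode()).hexdigest()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symbols (id TEXT PRIMARY KEY, symbol_name TEXT, symbol_type TEXT,
                      file_path TEXT, start_line INTEGER, end_line INTEGER,
                      signature TEXT, body_hash TEXT, edit_count INTEGER);
CREATE TABLE edges (from_id TEXT, to_id TEXT, edge_type TEXT);
""")

# Hypothetical rows; "h1" and "h2" stand in for real body hashes.
mw = symbol_id("auth.ts", "authMiddleware", 10)
vt = symbol_id("auth.ts", "validateToken", 42)
conn.executemany(
    "INSERT INTO symbols VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (mw, "authMiddleware", "function", "auth.ts", 10, 40,
         "authMiddleware(req, res, next)", "h1", 3),
        (vt, "validateToken", "function", "auth.ts", 42, 60,
         "validateToken(token)", "h2", 1),
    ],
)
conn.execute("INSERT INTO edges VALUES (?, ?, ?)", (mw, vt, "calls"))

# Follow outgoing edges from authMiddleware to the symbols it points at.
callees = conn.execute(
    """SELECT s.symbol_name, e.edge_type
       FROM edges e JOIN symbols s ON s.id = e.to_id
       WHERE e.from_id = ?""",
    (mw,),
).fetchall()
print(callees)  # [('validateToken', 'calls')]
```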

2. Query — When your AI asks for context, Nexus:

  1. Finds anchor symbols via FTS5 full-text search
  2. BFS-traverses the graph k steps from those anchors
  3. Scores nodes by proximity + recency (recently edited files score higher)
  4. Fills a token budget greedily: full bodies first, then definitions only, and finally drops leaf nodes that no longer fit
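
Steps 3 and 4 can be sketched roughly as follows. The scoring formula, the recency weight, and the token counts are all assumptions made for illustration, and the fill is a simplified single pass (full body if it fits, otherwise signature only, otherwise drop) rather than the exact strategy Nexus-Graph uses.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    distance: int          # hops from the anchor symbol
    edit_count: int        # used here as a stand-in for recency
    body_tokens: int       # cost of including the full body
    signature_tokens: int  # cost of including the definition only

def score(c, recency_weight=0.1):
    # Assumed formula: closer symbols and more frequently edited ones rank higher.
    return 1.0 / (1 + c.distance) + recency_weight * c.edit_count

def fill_budget(candidates, budget):
    """Greedy fill in score order: full body, else signature only, else drop."""
    selected = []
    remaining = budget
    for c in sorted(candidates, key=score, reverse=True):
        if c.body_tokens <= remaining:
            selected.append((c.name, "body"))
            remaining -= c.body_tokens
        elif c.signature_tokens <= remaining:
            selected.append((c.name, "signature"))
            remaining -= c.signature_tokens
        # anything that fits neither way is dropped
    return selected, remaining

candidates = [
    Candidate("authMiddleware", 0, 3, 300, 20),
    Candidate("validateToken", 1, 1, 250, 15),
    Candidate("AuthRequest", 1, 0, 40, 40),
    Candidate("checkExpiry", 2, 0, 200, 10),
]
print(fill_budget(candidates, 360))
# ([('authMiddleware', 'body'), ('validateToken', 'signature'), ('AuthRequest', 'body')], 5)
```

In this made-up run the anchor gets its full body, validateToken is downgraded to its signature, and checkExpiry, the farthest node, is dropped once only 5 tokens remain.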

3. Serve — Results go back to the AI via MCP — the open protocol that Claude Code, Cursor, and Gemini all speak natively.


The Numbers

After switching to graph-based retrieval:

  • 70% fewer tokens per query — surgical context vs. whole-file dumps
  • 5–10x smaller context blocks compared to file-based retrieval
  • Indexes 1,000+ files in under 15 seconds on an M1 MacBook Air
  • Sub-100ms context queries on an indexed project

Getting Started

# Install globally
npm install -g @costline/nexus-graph

# Index your project
nexus-graph index --project .

# Start the MCP server
nexus-graph server --project .

Then wire it into Claude Code:

// ~/.claude/settings.json
{
  "mcpServers": {
    "nexus": {
      "command": "nexus-graph",
      "args": ["server", "--project", "/path/to/your/project"]
    }
  }
}

Claude will automatically call get_context_for_query before answering questions about your codebase. You can also call it explicitly:

use get_context_for_query for "how does the auth middleware work?"
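
Under the hood, an MCP client turns a tool invocation like this into a JSON-RPC 2.0 request using the protocol's `tools/call` method. The sketch below only builds that message. The tool name comes from this post, but the `query` argument name is an assumption, since the tool's input schema isn't shown here.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

request = build_tool_call(
    1,
    "get_context_for_query",
    {"query": "how does the auth middleware work?"},  # "query" is an assumed argument name
)
print(json.dumps(request, indent=2))
```

In practice the assistant sends this for you; seeing the message shape mostly helps when debugging a server that isn't responding as expected.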

It Also Watches for Changes

Nexus ships with a file watcher (--watch) that incrementally re-indexes changed files in real time. No need to re-run the full index after every edit — only the affected symbols and edges are updated.
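
One plausible way to decide which symbols are "affected" is to compare a fresh hash of each symbol body against the stored body_hash column. The sketch below assumes SHA-1 over the raw body text and uses hypothetical symbol names; the watcher's real logic may differ.

```python
import hashlib

def body_hash(source):
    # Assumption: the stored body_hash is a SHA-1 of the symbol's source text.
    return hashlib.sha1(source.encode()).hexdigest()

def diff_symbols(indexed, parsed):
    """Compare stored hashes {name: hash} with freshly parsed bodies {name: source}."""
    fresh = {name: body_hash(src) for name, src in parsed.items()}
    added = sorted(fresh.keys() - indexed.keys())
    removed = sorted(indexed.keys() - fresh.keys())
    changed = sorted(
        name for name in fresh.keys() & indexed.keys() if fresh[name] != indexed[name]
    )
    return added, changed, removed

# Hashes already in the index for one file (hypothetical contents).
indexed = {
    "validateToken": body_hash("return check(t)"),
    "checkExpiry": body_hash("return exp > now"),
    "legacyLogin": body_hash("return True"),
}
# Bodies parsed from the same file after an edit.
parsed = {
    "validateToken": "return check(t) and t.valid",
    "checkExpiry": "return exp > now",
    "refreshToken": "return issue(t)",
}
print(diff_symbols(indexed, parsed))
# (['refreshToken'], ['validateToken'], ['legacyLogin'])
```

Only the added, changed, and removed symbols (and their edges) would need to be rewritten; checkExpiry is untouched because its hash is unchanged.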


Open Source

Nexus-Graph is MIT licensed and available on GitHub. Install from npm:

npm install -g @costline/nexus-graph

If you're spending too much on AI API costs or your assistant keeps losing track of your codebase, give it a try. The context problem is solvable — it just needs the right data structure.

Top comments (1)

Raju Dandigam

This is exactly the direction AI coding tools need to move in. Treating code as flat text works for demos, but it breaks down quickly in real systems where the relevant context is usually a call path, dependency chain, or type boundary rather than an entire file.

I like the k-step neighborhood idea because it maps well to how engineers actually debug: start from the symbol, walk callers/callees/imports, then expand only when needed. Serving that through MCP also makes the tool more composable instead of locking the context strategy into one assistant.

One area I’d be interested in is observability for the retrieval layer itself: logging which symbols were selected, why they were included, token budget pressure, and whether the final answer actually used the retrieved nodes. That feedback loop could make graph-based context even stronger over time.