AI coding agents are powerful — but they're also blind. Every time Claude Code, Codex, or Gemini CLI needs to understand your codebase, they explore it file by file. Grep here, read there, grep again. For a simple question like "what calls ProcessOrder?", an agent might burn through 45,000 tokens just opening files and scanning for matches.
I built codebase-memory-mcp to fix this. It parses your codebase into a persistent knowledge graph — functions, classes, call chains, imports, HTTP routes — and exposes it through 14 MCP tools. The same question now costs ~200 tokens and answers in under 1ms.
## The Problem: File-by-File Exploration Doesn't Scale
Here's what actually happens when you ask an AI agent "trace the callers of ProcessOrder":
- The agent greps for `ProcessOrder` across all files (~15,000 tokens)
- Reads each matching file to understand context (~25,000 tokens)
- Follows imports to find indirect callers (~20,000 tokens)
- Gives up after hitting context limits, missing half the call chain
Multiply this by every question in a coding session and you're burning hundreds of thousands of tokens per hour — most of it reading files that aren't relevant.
## The Fix: Parse Once, Query Forever
codebase-memory-mcp runs a one-time indexing pass using tree-sitter AST parsing. It extracts every function, class, method, import, call relationship, and HTTP route into a SQLite-backed graph. After that, the graph stays fresh automatically via a background watcher that detects file changes.
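The tool's actual storage layout isn't documented here, but the core idea, a symbol-and-edge graph in SQLite that answers "who calls X?" with an indexed query instead of a file scan, can be sketched with the Python stdlib. Table and column names below are my assumptions, not the tool's real schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT,   -- 'function', 'class', 'route', ...
    name TEXT
);
CREATE TABLE edges (
    src  INTEGER,  -- caller node id
    dst  INTEGER,  -- callee node id
    kind TEXT      -- 'CALLS', 'IMPORTS', ...
);
CREATE INDEX edges_by_dst ON edges(dst, kind);
""")

# A tiny call graph: HttpRouter -> HandleCheckout -> ProcessOrder.
db.executemany("INSERT INTO nodes VALUES (?,?,?)", [
    (1, "function", "ProcessOrder"),
    (2, "function", "HandleCheckout"),
    (3, "function", "HttpRouter"),
])
db.executemany("INSERT INTO edges VALUES (?,?,'CALLS')", [(2, 1), (3, 2)])

# Inbound trace to depth 3: the graph-side equivalent of asking
# trace_call_path(function_name="ProcessOrder", direction="inbound").
chain = db.execute("""
WITH RECURSIVE callers(id, depth) AS (
    SELECT id, 0 FROM nodes WHERE name = 'ProcessOrder'
    UNION
    SELECT e.src, c.depth + 1
    FROM edges e JOIN callers c ON e.dst = c.id
    WHERE e.kind = 'CALLS' AND c.depth < 3
)
SELECT n.name, c.depth FROM callers c
JOIN nodes n ON n.id = c.id WHERE c.depth > 0
ORDER BY c.depth
""").fetchall()
print(chain)  # [('HandleCheckout', 1), ('HttpRouter', 2)]
```

The recursive CTE is why multi-hop traces stay cheap: the database walks indexed edges instead of an agent re-reading source files at every hop.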
```
You: "what calls ProcessOrder?"
Agent calls: trace_call_path(function_name="ProcessOrder", direction="inbound")
→ Returns structured call chain in ~200 tokens, <1ms
```
No LLM is embedded in the server. Your agent is the intelligence layer — it just gets precise structural answers instead of raw file contents.
## Benchmarks: 120x Token Reduction
I ran agent-vs-agent testing across 31 languages (372 questions). Five representative structural queries on a real multi-service project:
| Query Type | Knowledge Graph | File-by-File Search | Savings |
|---|---|---|---|
| Find function by pattern | ~200 tokens | ~45,000 tokens | 225x |
| Trace call chain (depth 3) | ~800 tokens | ~120,000 tokens | 150x |
| Dead code detection | ~500 tokens | ~85,000 tokens | 170x |
| List all HTTP routes | ~400 tokens | ~62,000 tokens | 155x |
| Architecture overview | ~1,500 tokens | ~100,000 tokens | 67x |
| Total | ~3,400 | ~412,000 | 121x |
That's a 99.2% reduction. The cost difference between graph queries and file exploration adds up fast over a full development session.
## It Handles the Linux Kernel
The stress test I'm most proud of: indexing the entire Linux kernel.
- 28 million lines of code, 75,000 files
- 2.1 million nodes, 4.9 million edges
- Indexed in about 1 minute on an M3 Pro in fast mode, or about 5 minutes with advanced indexing, which also covers large files and digs deeper. An average repo indexes in under a second, depending on your hardware (the more CPU cores, the better).
The pipeline is RAM-first: LZ4-compressed bulk read, in-memory SQLite, fused Aho-Corasick pattern matching, single dump at the end. Memory is released back to the OS after indexing completes. Average-sized repos index in milliseconds.
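The "in-memory SQLite, single dump at the end" step can be illustrated with Python's stdlib `sqlite3`: build everything in `:memory:`, then persist it in one pass with the backup API. This is a sketch of the pattern, not the tool's actual C implementation:

```python
import os
import sqlite3
import tempfile

# Build the graph entirely in RAM: no disk I/O during indexing.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT)")
mem.executemany("INSERT INTO nodes (name) VALUES (?)",
                [("ProcessOrder",), ("HandleCheckout",)])
mem.commit()

# One dump at the end: copy the whole in-memory database to disk.
path = os.path.join(tempfile.mkdtemp(), "graph.db")
disk = sqlite3.connect(path)
mem.backup(disk)
disk.close()
mem.close()  # RAM is released once indexing completes

# The on-disk copy is a complete, queryable database.
check = sqlite3.connect(path)
count = check.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
print(count)  # 2
```

Deferring all disk writes to a single bulk copy is what keeps the indexing pass bound by CPU and RAM bandwidth rather than fsync latency.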
## 64 Languages, Zero Dependencies
All 64 language grammars are vendored as C source and compiled into a single static binary. Nothing to install, nothing that breaks when tree-sitter updates a grammar upstream.
Programming languages (39): Python, Go, JavaScript, TypeScript, TSX, Rust, Java, C++, C#, C, PHP, Ruby, Kotlin, Scala, Swift, Dart, Zig, Elixir, Haskell, OCaml, Objective-C, Lua, Bash, Perl, Groovy, Erlang, R, Clojure, F#, Julia, Vim Script, Nix, Common Lisp, Elm, Fortran, CUDA, COBOL, Verilog, Emacs Lisp
Scientific (5): MATLAB, Lean 4, FORM, Magma, Wolfram
Config/markup (20): HTML, CSS, SCSS, YAML, TOML, HCL, SQL, Dockerfile, JSON, XML, Markdown, Makefile, CMake, Protobuf, GraphQL, Vue, Svelte, Meson, GLSL, INI
This matters because real-world codebases aren't monolingual. A typical project has Go backends, TypeScript frontends, SQL migrations, Dockerfiles, YAML configs, and shell scripts. One indexing pass captures all of it. We've also introduced more advanced indexing that uses LSP-like techniques, effectively an "LSP + tree-sitter" hybrid. It's currently supported for Go, C, and C++, with more languages on the way.
## 14 MCP Tools

The full tool surface:

| Tool | What it does |
|---|---|
| `search_graph` | Find functions/classes by name pattern, label, degree |
| `trace_call_path` | Follow callers/callees at configurable depth |
| `get_architecture` | Languages, packages, entry points, routes, hotspots, clusters |
| `detect_changes` | Map git diff to affected symbols with risk classification |
| `query_graph` | Raw Cypher queries (`MATCH (f:Function)-[:CALLS]->(g)...`) |
| `search_code` | Full-text search across indexed source |
| `get_code_snippet` | Read a specific function/class by qualified name |
| `get_graph_schema` | Inspect available node/edge types |
| `manage_adr` | Architecture Decision Records that persist across sessions |
| `index_repository` | Trigger initial index (auto-sync handles the rest) |
| `list_projects` | Show all indexed repos with stats |
| `delete_project` | Clean up a project's graph data |
| `index_status` | Check indexing progress |
| `ingest_traces` | Import OpenTelemetry traces into the graph |
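To make `query_graph` concrete, here is the kind of query an agent might send, completing the fragment shown in the table. The `Function` label and `CALLS` relationship come from that row; the `name` property is my assumption about the schema:

```cypher
MATCH (f:Function)-[:CALLS]->(g:Function)
WHERE g.name = "ProcessOrder"
RETURN f.name
```

The agent writes this itself from the schema exposed by `get_graph_schema`, which is why no embedded LLM is needed for query translation.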
## Works With 8 AI Agents
One install command auto-detects and configures all of these:
- Claude Code — full integration with skills + PreToolUse hooks
- Codex CLI — MCP config + AGENTS.md instructions
- Gemini CLI — MCP config + BeforeTool hooks
- Zed — JSONC settings integration
- OpenCode — MCP config + AGENTS.md
- Antigravity — MCP config + AGENTS.md
- Aider — CONVENTIONS.md instructions
- KiloCode — MCP settings + rules
The hooks are advisory — they remind agents to check the graph before reaching for grep/glob/read, without blocking anything.
## Setup: 3 Commands

```shell
# Download (or use the one-liner:
#   curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/scripts/setup.sh | bash)
tar xzf codebase-memory-mcp-*.tar.gz
mv codebase-memory-mcp ~/.local/bin/

# Auto-configure all detected agents
codebase-memory-mcp install

# Restart your agent, then:
"Index this project"
```
That's it. No Docker, no API keys, no npm install, no runtime dependencies. A ~15MB static binary for macOS (arm64/amd64), Linux (arm64/amd64), or Windows (amd64).
## Built-In Graph Visualization

If you download the UI variant, you get a 3D interactive graph explorer at `localhost:9749`.
It runs as a background thread alongside the MCP server — available whenever your agent is connected.
## The Design Philosophy
A lot of code intelligence tools embed an LLM for natural language → graph query translation. This means extra API keys, extra cost, and another model to configure and keep updated.
With MCP, the agent you're already talking to is the query translator. It reads tool descriptions, understands your question, and makes the right tool call. No intermediate LLM needed.
Similarly, the tool focuses on structural precision over semantic fuzziness. When an agent asks "what calls X?", it needs an exact answer — not a ranked list of "probably related" functions. The graph gives exact call chains with import-aware, type-inferred resolution.
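As a toy illustration of what import-aware resolution means (the actual resolver is far more involved, and every name in this sketch is hypothetical): a call site is resolved against the calling file's import table before falling back to same-file symbols, and an unresolvable call stays unresolved rather than being guessed at.

```python
def resolve_call(call_name, file_imports, file_symbols, global_index):
    """Resolve a call site to a definition site.

    file_imports: alias -> module        (e.g. {"orders": "shop.orders"})
    file_symbols: names defined in file  (e.g. {"helper"})
    global_index: qualified name -> definition site
    Hypothetical sketch, not the tool's actual resolver.
    """
    if "." in call_name:                      # e.g. orders.ProcessOrder()
        alias, _, member = call_name.partition(".")
        if alias in file_imports:
            qualified = f"{file_imports[alias]}.{member}"
            return global_index.get(qualified)
    if call_name in file_symbols:             # same-file call
        return global_index.get(f"this_file.{call_name}")
    return None                               # unresolved: don't guess

index = {"shop.orders.ProcessOrder": "orders.go:42",
         "this_file.helper": "api.go:7"}
print(resolve_call("orders.ProcessOrder",
                   {"orders": "shop.orders"}, {"helper"}, index))  # orders.go:42
```

The point of the exactness guarantee is the last line: returning `None` for an unknown call is more useful to an agent than a ranked list of plausible matches.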
## What's Next
- LSP-style hybrid type resolution — already live for Go, C, and C++ (more languages coming)
- Cross-service HTTP linking — discovers REST routes and matches them to HTTP call sites with confidence scoring
- Louvain community detection — automatically discovers functional modules by clustering call edges
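The route-matching idea above can be sketched in a few lines: a declared route like `GET /orders/{id}` becomes a regex, and concrete call sites found in client code are matched against it with a crude confidence score. Function names and the scoring rule here are hypothetical illustrations, not the tool's actual algorithm:

```python
import re

def route_to_regex(route: str) -> re.Pattern:
    # "/orders/{id}" -> matches "/orders/123" but not "/orders/1/items"
    return re.compile("^" + re.sub(r"\{[^}]+\}", "[^/]+", route) + "$")

def match_call_sites(routes, call_sites):
    """Pair declared (method, route) routes with (method, url) call sites.

    Confidence is a toy heuristic: an exact literal match scores higher
    than a match through a path parameter.
    """
    links = []
    for method, route in routes:
        rx = route_to_regex(route)
        for c_method, url in call_sites:
            if c_method != method:
                continue
            if url == route:
                links.append((route, url, 1.0))   # exact literal match
            elif rx.match(url):
                links.append((route, url, 0.8))   # parameterized match
    return links

routes = [("GET", "/orders/{id}"), ("POST", "/orders")]
calls = [("GET", "/orders/123"), ("POST", "/orders"), ("GET", "/users/1")]
print(match_call_sites(routes, calls))
# [('/orders/{id}', '/orders/123', 0.8), ('/orders', '/orders', 1.0)]
```

Linking edges like these is what lets a call-chain trace cross a service boundary instead of dead-ending at an HTTP client call.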
## Try It
- GitHub: github.com/DeusData/codebase-memory-mcp
- Website: deusdata.github.io/codebase-memory-mcp
- MIT licensed — use it commercially, fork it, contribute
If you're burning tokens on file-by-file exploration, give it a shot. Index your project and ask your agent a structural question — the difference is immediate.
Built with pure C, tree-sitter, and SQLite. No runtime dependencies. 780+ stars and growing. We built it for developers who use coding agents, and we want it to become the most performant solution in this space: more efficient coding for everyone means more good tools shipped faster, at a lower token cost.
