GDS K S
# I tested 4 codebase-to-AI tools on FastAPI (108k lines). Here are the token costs.

Stacklit on GitHub -- the tool I built after running these tests.


I have been using AI agents on real projects for the past year: Claude Code, Cursor, Aider. One problem never goes away: every session starts with the agent reading files to understand the codebase. Same files. Same tokens. Every time.

So I tested four tools that claim to solve this. I ran them on FastAPI (108,075 lines of Python, 1,131 files) and measured what actually came out.

## The four tools

Repomix (23k stars) -- packs your entire repo into one XML or Markdown file. Every line of source code in a single output.

Aider repo-map (part of Aider, 43k stars) -- generates an ephemeral text map of functions and classes ranked by relevance. Built into Aider, not available separately.

Codebase Memory MCP (1.4k stars) -- builds a SQLite knowledge graph from tree-sitter ASTs. 66 languages. Queryable through 14 MCP tools.

Stacklit (new, my project) -- generates a committed JSON index with module graph, exports, types, and activity. One file, committed to git.

## The big comparison table

| Feature | Repomix | Aider repo-map | CB Memory MCP | Stacklit |
|---|---|---|---|---|
| What it produces | Full code dump (XML/MD) | Ephemeral text map | SQLite knowledge graph | JSON index file |
| Output size (FastAPI) | ~800k tokens | ~8k-15k tokens | per-query | 4,142 tokens |
| Committed to repo | No (too large) | No (ephemeral) | No (local DB) | Yes |
| Works with Claude Code | Yes (paste) | No | Yes (MCP) | Yes (file + MCP) |
| Works with Cursor | Yes (paste) | No | Yes (MCP) | Yes (file + MCP) |
| Works with Aider | Yes (paste) | Built-in | No | Yes (reads file) |
| Works with Copilot | Manual paste | No | No | Yes (reads file) |
| Dependency graph | No | No | Yes | Yes |
| Module detection | No | No | Yes | Yes |
| Export/function signatures | Full source code | Function names only | Full signatures | Signatures with types |
| Type definitions | Full source code | No | Yes | Yes (struct/class fields) |
| Git activity heatmap | No | No | Partial | Yes (90-day) |
| Visual output | No | No | No | HTML with 4 views |
| MCP server | No | No | Yes (14 tools) | Yes (7 tools) |
| Monorepo aware | No | No | No | Yes (8 formats) |
| Incremental updates | No | No | Partial | Yes (Merkle hash) |
| Languages (full parsing) | N/A (dumps everything) | Many (tree-sitter) | 66 (tree-sitter) | 11 (tree-sitter) |
| Languages (basic) | N/A | N/A | N/A | Any (line count) |
| Runtime required | Node.js | Python | Running C binary | None |
| Install | `npx repomix` | Built into Aider | Download + run | `npx stacklit init` |
| Binary size | ~50MB (Node) | Python env | ~2MB (C) | 32MB (Go, no CGO) |
| Configuration | `repomix.config.json` | In Aider config | CLI flags | `stacklit.toml` |
| Open source | MIT | Apache 2.0 | MIT | MIT |
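The "Yes (Merkle hash)" row refers to content-hash-based change detection. As a rough illustration of the general idea (not Stacklit's actual implementation), a Merkle-style tree hash lets you skip re-indexing any subtree whose combined hash hasn't changed:

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """Hash one file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def tree_hash(root: Path) -> str:
    """Combine child hashes into one directory-level hash (Merkle-style).
    If the root hash matches the last run, nothing needs re-indexing;
    if it differs, recurse into children to find the dirty subtrees."""
    parts = []
    for child in sorted(root.iterdir()):
        if child.is_dir():
            parts.append(child.name + ":" + tree_hash(child))
        elif child.is_file():
            parts.append(child.name + ":" + file_hash(child))
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()
```

The payoff is that an unchanged repo costs one hash comparison instead of a full re-parse.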

## Token cost breakdown on FastAPI

This is the number that matters. I ran each tool on FastAPI and counted tokens with tiktoken using the cl100k_base encoding (GPT-4's tokenizer; Claude uses its own, but the magnitudes are comparable):

| Tool | Output tokens | Context windows used | Time |
|---|---|---|---|
| Repomix (XML) | ~800,000 | 4-6 windows (overflows 200k) | ~8s |
| Repomix (compressed) | ~400,000 | 2-3 windows | ~12s |
| Aider repo-map | ~8,000-15,000 | Fits in one | Per-prompt |
| CB Memory MCP | Varies per query | N/A (streaming) | Sub-ms per query |
| Stacklit | 4,142 | Fits in one | 0.4s |
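If you want to reproduce these counts yourself, the measurement is a few lines. A minimal sketch (with a crude chars/4 fallback for machines without tiktoken installed):

```python
def count_tokens(text: str) -> int:
    """Token count via tiktoken's cl100k_base encoding, with a rough
    chars/4 estimate as a fallback when tiktoken isn't installed."""
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except ImportError:
        return max(1, len(text) // 4)  # approximation only
```

Run it over each tool's output file, e.g. `count_tokens(Path("stacklit.json").read_text())`.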

Stacklit produces the smallest static output. It does not include source code. It includes structure: which modules exist, what they export, how they connect, what changed recently.
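To make "structure, not source" concrete, here is a hypothetical sketch of the shape such an index takes. Field names and values are illustrative, not Stacklit's actual schema:

```json
{
  "modules": [
    {
      "path": "fastapi/routing.py",
      "exports": ["APIRouter", "APIRoute"],
      "imports": ["fastapi.dependencies", "starlette.routing"],
      "lines": 1234,
      "recent_commits": 7
    }
  ],
  "graph": {
    "fastapi.routing": ["fastapi.dependencies", "starlette.routing"]
  }
}
```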

## Token cost across 4 projects

Not just FastAPI. I ran Stacklit on four popular open source repos:

| Project | Language | Files | Lines | Stacklit tokens |
|---|---|---|---|---|
| Express.js | JavaScript | 141 | 21,346 | 3,765 |
| FastAPI | Python | 1,131 | 108,075 | 4,142 |
| Gin | Go | 100 | 23,829 | 3,361 |
| Axum | Rust | 300 | 43,997 | 14,371 |

Full outputs are in the examples directory.

## When to use each tool

### Repomix: the brute force approach

Best for: pasting a small repo into ChatGPT for a one-shot question. Repos under 50 files where token cost does not matter.

The problem at scale: a 500-file repo produces 500,000+ tokens. That overflows most context windows. The agent gets all the code but no structural understanding. It still has to figure out the architecture from raw source.

### Aider repo-map: the smart but locked approach

Best for: people who already use Aider. The repo-map is genuinely good. It ranks code by relevance to your current task using a PageRank-style algorithm.

The catch: it only works inside Aider. You cannot use it with Claude Code, Cursor, or Copilot. The map regenerates every prompt and is not shareable.
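For intuition on what "PageRank-style" means here: symbols that many other modules reference rank highest, so they go into the map first. A toy power-iteration version over an import graph (pure Python, nothing to do with Aider's actual code):

```python
def pagerank(graph: dict[str, list[str]], damping: float = 0.85,
             iters: int = 50) -> dict[str, float]:
    """Toy PageRank over a dependency graph.
    `graph` maps a module to the modules it references;
    every module must appear as a key (use [] for leaves)."""
    nodes = set(graph) | {m for refs in graph.values() for m in refs}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, refs in graph.items():
            if refs:
                share = damping * rank[src] / len(refs)
                for dst in refs:
                    nxt[dst] += share
            else:
                # dangling node: spread its rank evenly
                for n in nodes:
                    nxt[n] += damping * rank[src] / len(nodes)
        rank = nxt
    return rank

# Heavily-imported modules (db, models) end up with the highest rank.
deps = {"app": ["models", "db"], "api": ["models", "db"],
        "models": ["db"], "db": []}
ranks = pagerank(deps)
```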

### Codebase Memory MCP: the power user approach

Best for: large codebases where you need deep semantic queries. Call path tracing, dead code detection, relationship traversal across 66 languages.

The trade-off: you run a server process. The knowledge graph lives in a local database. Switch machines or share with a teammate? They rebuild locally. There is no committed artifact.

### Stacklit: the committed index approach

Best for: teams where multiple people (or multiple agents) work on the same repo. The index is a JSON file you commit to git. Clone the repo, the index is there.

It works with every tool without per-tool configuration. Claude Code reads it as a file. Cursor reads it as a file. The MCP server is optional, for tools that prefer querying.

## What Stacklit extracts per language

| Language | Parser | What you get in the index |
|---|---|---|
| Go | stdlib AST | imports, exports with full signatures, struct fields, interface methods |
| TypeScript/JS | tree-sitter | ESM/CJS/dynamic imports, classes, interfaces, type aliases, enums |
| Python | tree-sitter | imports, classes with all methods, type hints, decorators, `__main__` |
| Rust | tree-sitter | use/mod/crate, pub items with generics, trait methods, struct fields |
| Java | tree-sitter | imports, public classes, method signatures with parameter types |
| C# | tree-sitter | using directives, public types, method signatures |
| Ruby | tree-sitter | require/require_relative, classes, modules, methods |
| PHP | tree-sitter | namespace use, classes, traits, public methods |
| Kotlin | tree-sitter | imports, classes, objects, data classes, functions |
| Swift | tree-sitter | imports, structs, classes, protocols, enums |
| C/C++ | tree-sitter | #include, functions, structs, unions, typedefs |

Everything else gets basic support: language detection and line count per module.
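That fallback tier is simple enough to sketch in a few lines. Here is a hypothetical version of extension-based detection plus line counting (illustrative only, not Stacklit's code; the extension map is an assumption):

```python
from pathlib import Path

# Minimal extension-to-language map; extend as needed (illustrative only).
EXT_LANG = {".ex": "Elixir", ".exs": "Elixir", ".hs": "Haskell",
            ".lua": "Lua", ".r": "R"}

def basic_module_info(path: Path) -> dict:
    """Fallback indexing: language from file extension, size from line count."""
    text = path.read_text(errors="replace")
    return {
        "path": str(path),
        "language": EXT_LANG.get(path.suffix.lower(), "unknown"),
        "lines": text.count("\n") + (1 if text and not text.endswith("\n") else 0),
    }
```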

## Where Stacklit falls short

I want to be upfront about this:

- 11 languages with full extraction, not 66. Codebase Memory MCP covers more languages deeply. If your stack is Elixir or Haskell, Stacklit gives you line counts, not full extraction.
- No function-level call graphs. Stacklit maps module dependencies, not "which function calls which." CB Memory MCP and Axon do this.
- No runtime queries. The index is a snapshot. It does not answer questions about the codebase on demand the way a running MCP server does. (Though Stacklit does have an MCP server that reads from the index.)
- No source code in the output. Repomix gives the agent actual code. Stacklit gives a map. Sometimes the agent needs the code and will still read files.

## My actual recommendation

Use Stacklit as a baseline for every repo. It takes 90 milliseconds to generate and costs nothing to maintain with a git hook.
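A hypothetical pre-commit hook for that, assuming `npx stacklit init` rewrites `stacklit.json` in place (the hook body is a sketch, not shipped with the tool):

```shell
#!/bin/sh
# .git/hooks/pre-commit (make it executable with chmod +x)
# Regenerate the index and stage it so every commit carries a fresh map.
npx stacklit init
git add stacklit.json
```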

Then layer other tools on top for specific needs:

- Repomix for one-shot full-codebase prompts
- Aider if that is your daily driver
- CB Memory MCP for deep semantic analysis on large codebases

They are not mutually exclusive. A committed `stacklit.json` makes every other tool work better, because the agent starts with context instead of from zero.

## Try it now

```shell
npx stacklit init
```

One command. Scans your codebase. Generates the index. Opens a visual map in your browser.

Works on macOS, Linux, Windows. MIT licensed. Zero runtime dependencies.

github.com/glincker/stacklit

The examples directory has full outputs from Express.js, FastAPI, Gin, and Axum so you can see what the index looks like before running it.


Which of these tools do you use? Have you tried combining them? Genuinely curious what setups people have landed on.
