The problem nobody admits
When you give Claude Code, Cursor, or Codex a task like "fix the login validation bug", here's what they usually do:
- Run
grep -l login src/→ 17 files - Read all 17 files top-to-bottom (because context is "free")
- Spend 80% of the model's context window on irrelevant imports, type aliases, and helper functions the bug doesn't touch
- Generate a fix using whatever 20% of attention is left
This works. Sort of. But it's wasteful — and on big codebases, it's wrong: the agent runs out of context before it sees the actual buggy function.
The instinct is to throw a bigger model at it. Bigger context window, fancier RAG, vector embeddings. All of which trade real cost for diminishing returns.
There's a better answer that's been sitting in classical CS the whole time: treat the repo as a graph.
The idea, in one paragraph
Your codebase already is a graph. Functions call functions. Modules import modules. Classes extend classes. Pick a node (the symbol your task is about), and the structurally-closest neighborhood is almost certainly what an agent needs to see.
So I built mincut-context — an npm package that:
- Parses your repo into a symbol graph (tree-sitter, supports TS/JS/Vue/Python/PHP)
- Derives seed nodes from your task description (keyword IDF on symbol names + file paths)
- Runs personalized PageRank with the seeds as the restart vector
- Picks the minimum-cut subgraph that fits a token budget you choose
The output: a list of files + line ranges that an agent should look at. Nothing more, nothing less.
Show me the numbers
I built an evaluation suite into the repo itself. 28 hand-labeled tasks across 3 real codebases at a 4,000-token budget:
| strategy | precision | recall | F1 | token-efficiency |
|---|---|---|---|---|
| mincut | 0.27 | 0.83 | 0.39 | 0.270 |
mincut + --embed (semantic) |
0.27 | 0.83 | 0.39 | 0.270 |
| grep keyword baseline | 0.11 | 0.42 | 0.16 | 0.105 |
| random selection (control) | 0.01 | 0.04 | 0.01 | 0.009 |
Per-repo breakdown:
| repo | tasks | mincut recall | grep recall | mincut F1 | grep F1 |
|---|---|---|---|---|---|
| mincut-context (self) | 12 | 0.97 | 0.56 | 0.44 | 0.30 |
| FluentForm (PHP+Vue+JS) | 8 | 0.88 | 0.13 | 0.43 | 0.04 |
| Fluent Player (TS/JSX) | 8 | 0.63 | 0.56 | 0.31 | 0.13 |
mincut catches ~2× more of the correct files than grep, at ~2.5× better token efficiency. Reproducible with npm run eval. Add your own labeled tasks under eval/fixtures/ to score against your own codebase.
The math, briefly
Given a symbol graph $G = (V, E, w)$ where:
- $V$ are code units (functions, classes, methods)
- $E$ are dependency edges (imports, calls, references)
- $w(v)$ is the token cost of including symbol $v$
- $B$ is your token budget
- $S \subseteq V$ are seed nodes derived from the task
Find $T \supseteq S$ with $\sum_{v \in T} w(v) \le B$ minimizing the boundary cut cost:
$$\text{cut}(T, V \setminus T) = \sum_{e \in E, \text{ crossing}} w(e)$$
In plain English: pick a connected, low-token region that has few "loose ends" pointing outside it. The inside of the cut is what the agent needs; the outside is safely ignorable.
The objective is submodular, so a greedy algorithm gives a $(1 - 1/e) \approx 0.63$ approximation guarantee. The full pseudocode is in the README; the implementation is ~200 lines in src/core/select.ts.
Three ways to use it
1. As an MCP server — recommended for agents
Drop this block into your Claude Code / Codex / Cursor settings:
{
"mcpServers": {
"mincut-context": {
"command": "npx",
"args": ["-y", "mincut-context", "mcp"]
}
}
}
Your agent now has six new tools: pack_context, expand_node, find_callers, find_callees, search_symbols, explain_selection. They operate on the cached graph from the most recent pack_context call — effectively free traversal after the first pack.
2. As a CLI
npm install -g mincut-context
mcx pack "fix the login validation bug" --budget 4000 # plain output
mcx pack "..." --format tree # directory-grouped
mcx pack "..." --format json | jq # pipe to anything
mcx pack "..." --interactive # Ink TUI: vim keys + preview
mcx pack "..." --embed # semantic seeding
mcx pack "..." --cache # 5× warm-run speedup
mcx watch "..." --debounce 300 # re-pack on file change
mcx doctor # environment self-check
mcx doctor is my favorite — it tells you in 6 lines what's installed and what isn't:
3. As a library
import { pack } from 'mincut-context';
const result = await pack({
task: 'fix the login validation bug',
repo: process.cwd(),
budget: 4000,
cache: true,
parallel: 4,
chunk: { enabled: true, maxTokens: 400 },
});
for (const f of result.files) {
console.log(f.path, f.score.toFixed(3), f.tokens, '·', f.reasons[0]);
}
// → src/auth/login.ts 0.541 612 · seed — matched directly by task
// → src/auth/session.ts 0.408 483 · attached (60%)
What I learned by building this
1. Embeddings are oversold for this problem
Adding semantic embeddings (--embed flag, via @xenova/transformers running locally) did not improve recall on any of my three eval task sets. Why? Because the labels were named honestly. When you label "stripe payment processor" → StripeProcessor.php, the keyword match catches it without help. Embeddings only earn their keep when your task vocabulary diverges from the code's — "centrality and ranking" → PageRank, that kind of gap.
I left --embed in because it doesn't hurt, and there are real users whose mental model doesn't match the code. But the marketing-friendly "AI-powered" framing for this stuff is mostly noise.
2. Greedy beats CELF for this objective
I implemented CELF (Cost-Effective Lazy Forward, Leskovec 2007) hoping for a free speedup over the naive greedy. It diverged — not just slower (8× slower on FluentForm) but wrong: it produced smaller, structurally weaker selections.
Why: our "no isolated nodes" acceptance rule (a candidate must have at least one edge into the current selection) breaks CELF's submodular-monotone assumption. A candidate's eligibility flips discontinuously when a node with an edge to it joins T. The lazy cache becomes unreliable.
I wrote the dead end up in eval/ALGORITHM-RESEARCH.md so nobody re-treads it. Honest negative results are worth shipping.
3. Sub-symbol chunking matters more than I expected
Big legacy codebases have huge functions. A 500-line function is one symbol in the graph, and if it gets selected, the whole thing eats your budget. So --chunk splits big functions at statement boundaries — each chunk becomes its own sub-symbol, individually selectable.
On FluentForm: indexing without chunking → 4,333 symbols. With --chunk → 4,878 symbols (+545 chunks). Same budget, much finer-grained selection. The greedy can pick just the relevant if/for/try block instead of all-or-nothing.
4. Test coverage of 88% isn't the whole story
The CI gates on 85% statements / 80% branches / 90% functions / 85% lines. But the genuinely-untestable files — worker scripts, lazy-loaded LSP clients — are excluded from the calc. Honest reporting means saying what is tested, not just the headline number.
The honest tradeoffs
| Honest tradeoff | What we do |
|---|---|
| True optimal min-cut is NP-hard | Greedy submodular — (1−1/e) bound |
| Tree-sitter symbols are syntactic, not type-aware |
--lsp refines TS/JS via typescript-language-server |
| Embedding model adds ~22 MB on first run | Opt-in behind --embed flag |
| LSP startup is slow (~1–5s) | Opt-in; cached after init |
| Cold start parses whole repo |
--cache (5× speedup) + --parallel n (2.7× speedup) |
What I'd build next if you asked
The roadmap that's not checked off yet:
- Pyright / Intelephense LSP adapters — type-aware calls for Python and PHP (~1–2 days each on the existing LSP infrastructure)
- Svelte / Rust / Go parsers — one file each on the parser template
-
Incremental neighborhood caching in the greedy — keep
attach(v, T)cached and update only when a node with an edge to v is added. Expected 3–5× speedup on graphs with bounded degree.
Each is bounded effort and additive. The core is done.
Stop building, start using
The hardest lesson: a tool's value comes from someone actually using it on real work, not from feature count. mincut-context is at v1.7.0 — 261 tests, 88.6% coverage, CI green on Ubuntu + macOS × Node 18/20/22. There's no honest "but it's not ready" excuse left.
If you've watched an AI agent burn 80% of its 200k-token context on imports it doesn't care about, install it now and tell me what breaks:
npm install -g mincut-context
🔗 GitHub: github.com/dhrupo/mincut-context
📦 npm: npmjs.com/package/mincut-context
📊 Reproducible benchmarks: eval/CROSS-REPO-RESULTS.md
I'd love feedback — especially "your numbers don't replicate on my codebase" feedback. That's literally what the eval suite is for.
If you got value from this, ⭐ the repo or drop a comment about a tooling problem you're solving. mincut-context is open-source MIT; the eval suite welcomes new fixtures.


Top comments (0)