...and we built something to stop it.
There's a pattern every developer using AI agents eventually notices. You ask the agent to find where authentication is handled. It opens a file. Skims 2,000 lines. Opens another file. Skims that. Opens a third. By the time it answers, it's consumed 40,000 tokens — most of them irrelevant — and your context window is half-gone before the real work starts.
We call this dumpster diving. The agent isn't reading strategically. It's digging through everything looking for something edible.
We've been watching this happen across millions of sessions with jCodeMunch and jDocMunch. And we built something to fix it: jMRI — the jMunch Retrieval Interface.
Today we're publishing the spec, the benchmark, and an open-source SDK. All Apache 2.0.
## The Numbers
We ran the benchmark against two real codebases: FastAPI and Flask. Three methods compared: naive file reading, chunk RAG, and jMRI retrieval via jCodeMunch. Ten queries per repo. Here's what came out:
### FastAPI (~950K source tokens)
| Method | Avg Tokens | Cost/Query | Precision |
|---|---|---|---|
| Naive (read all files) | 949,904 | $2.85 | 100% |
| Chunk RAG | 330,372 | $0.99 | 74% |
| jMRI | 480 | $0.0014 | 96% |
### Flask (~148K source tokens)
| Method | Avg Tokens | Cost/Query | Precision |
|---|---|---|---|
| Naive (read all files) | 147,854 | $0.44 | 100% |
| Chunk RAG | 55,251 | $0.17 | 80% |
| jMRI | 480 | $0.0014 | 96% |
jMRI uses 1,979x fewer tokens than naive on FastAPI. It also beats chunk RAG on precision — 96% vs 74%.
That last point matters. The usual assumption is that precision is the tradeoff you make for efficiency. Chunk RAG is cheaper than naive but misses more. jMRI is cheaper than both and misses less. That's not a coincidence — it's a consequence of using structure instead of text similarity.
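The Cost/Query column can be sanity-checked with simple arithmetic. A minimal sketch, assuming a flat $3.00 per million input tokens (a typical Sonnet-class input price; the benchmark's actual rate may differ):

```python
# Back-of-envelope check of the benchmark's Cost/Query column.
# The $3.00-per-million-input-tokens rate is an assumption.

PRICE_PER_TOKEN = 3.00 / 1_000_000

def query_cost(tokens: int) -> float:
    """Dollar cost of feeding `tokens` input tokens to the model."""
    return tokens * PRICE_PER_TOKEN

naive_fastapi = query_cost(949_904)
jmri = query_cost(480)

print(f"naive FastAPI: ${naive_fastapi:.2f}")   # ~$2.85
print(f"jMRI:          ${jmri:.4f}")            # ~$0.0014
print(f"ratio:         {949_904 / 480:.0f}x")   # ~1979x
```

The ratio is linear in token count, which is why the table's cost column tracks the token column exactly.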
Reproduce it yourself in under 5 minutes:

```shell
git clone https://github.com/jgravelle/mcp-retrieval-spec
cd mcp-retrieval-spec/benchmark/munch-benchmark
python benchmark.py --all
```
## Why Chunk RAG Loses on Precision
Chunk RAG splits files into overlapping windows of text and ranks them by keyword overlap or embedding similarity. A chunk boundary might fall in the middle of a function. The top-ranked chunk might contain the right words but not the right code. The retrieval is approximate by design.
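To see why boundaries matter, here is a minimal sketch (not the benchmark's code) of fixed-window chunking splitting a definition in half; the window and overlap sizes are illustrative:

```python
# Sketch of why fixed-window chunking breaks code: a window boundary
# can fall mid-function, so a retrieved chunk is half of one function
# glued to a fragment of the next.

SOURCE = '''def create_token(user):
    claims = {"sub": user.id}
    return sign(claims)

def verify_token(token):
    claims = unsign(token)
    return claims["sub"]
'''

def chunk(text: str, size: int = 80, overlap: int = 20) -> list[str]:
    """Overlapping fixed-size character windows, as naive chunkers do."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

for i, c in enumerate(chunk(SOURCE)):
    print(f"--- chunk {i} ---")
    print(c)
# The first window ends one character into `def verify_token`: a
# top-ranked chunk can contain the right keywords without containing
# a single complete definition.
```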
jMRI retrieval is structurally exact. jCodeMunch parses source files into an AST-derived index: every function, class, and method is a named, addressable symbol with a stable ID. When you search for "OAuth2 password bearer authentication", you get back IDs like fastapi/security/oauth2.py::OAuth2PasswordBearer#class. When you retrieve that ID, you get exactly the class — no more, no less. No boundary accidents. No half-functions.
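jCodeMunch's parser is proprietary and covers 30+ languages, but the core idea can be sketched for Python alone with the stdlib `ast` module; the ID scheme below merely mimics the `path::Name#kind` form from the example above:

```python
# Minimal sketch of structure-exact indexing using Python's stdlib ast.
# Every def and class becomes an addressable symbol whose source span
# is exact by construction -- no boundary accidents possible.

import ast

def index_symbols(path: str, source: str) -> dict[str, str]:
    """Map stable symbol IDs to their exact source text."""
    tree = ast.parse(source)
    symbols = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            symbol_id = f"{path}::{node.name}#{kind}"
            # get_source_segment returns the node's exact span in `source`
            symbols[symbol_id] = ast.get_source_segment(source, node)
    return symbols

source = '''class OAuth2PasswordBearer:
    def __call__(self, request):
        return request.headers.get("Authorization")
'''
idx = index_symbols("security/oauth2.py", source)
print(list(idx))
# Retrieving by ID returns exactly the class -- no more, no less.
print(idx["security/oauth2.py::OAuth2PasswordBearer#class"])
```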
The 96% precision figure reflects cases where the top search result was the correct symbol for the query. The 4% where it wasn't were genuinely ambiguous queries — where even a human would have debated the right answer.
## What Is jMRI?
jMRI (jMunch Retrieval Interface) is an open specification for MCP servers that do retrieval right.
Four operations. One response envelope. Two compliance levels.
```
Agent
 │
 ├─ discover()      → What knowledge sources are available?
 ├─ search(query)   → Which symbols/sections are relevant? (IDs + summaries only)
 ├─ retrieve(id)    → Give me the exact source for this ID.
 └─ metadata(id?)   → What would naive reading have cost?
```
Every response includes a _meta block:
```json
{
  "source": "def get_db():\n    db = SessionLocal()\n    try:\n        yield db\n    finally:\n        db.close()\n",
  "_meta": {
    "tokens_saved": 42318,
    "total_tokens_saved": 1284950,
    "cost_avoided": { "claude-sonnet-4-6": 0.127 },
    "timing_ms": 12
  }
}
```
The agent doesn't have to guess whether it's being efficient. It knows, on every call.
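The spec doesn't mandate how servers compute these fields. A plausible sketch, assuming a 4-characters-per-token estimate and a made-up price table (neither is part of jMRI):

```python
# Sketch of filling the _meta envelope from the spec. The token
# estimate and price table below are assumptions, not spec mandates.

PRICES = {"claude-sonnet-4-6": 3.00 / 1_000_000}  # $/input token (assumed)

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude chars/4 heuristic

def build_meta(returned: str, naive_corpus_chars: int,
               running_total: int, elapsed_ms: int) -> dict:
    naive = naive_corpus_chars // 4          # cost of reading everything
    saved = naive - estimate_tokens(returned)
    return {
        "tokens_saved": saved,
        "total_tokens_saved": running_total + saved,
        "cost_avoided": {m: round(saved * p, 4) for m, p in PRICES.items()},
        "timing_ms": elapsed_ms,
    }

meta = build_meta("def get_db(): ...", naive_corpus_chars=1_000_000,
                  running_total=0, elapsed_ms=12)
print(meta["tokens_saved"])
```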
The spec is deliberately minimal. We're not trying to build a platform. We're trying to name a pattern that already works at scale and make it easy for others to implement.
## The Implementations
The spec is open. The best implementations are commercial.
| Implementation | Domain | Stars | Install |
|---|---|---|---|
| jCodeMunch | Code (30+ languages) | 900+ | uvx jcodemunch-mcp |
| jDocMunch | Docs (MD, RST, HTML, notebooks) | 45+ | uvx jdocmunch-mcp |
Both implement jMRI-Full — the complete spec including batch retrieval, hash-based drift detection, byte-offset addressing, and the full _meta envelope.
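Hash-based drift detection can be sketched in a few lines; the class and field names here are illustrative, not jCodeMunch's actual implementation:

```python
# Sketch of hash-based drift detection: store a content hash at index
# time, re-hash on retrieve, and flag the result stale if the file
# changed underneath the index.

import hashlib

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class DriftDetector:
    def __init__(self):
        self._indexed: dict[str, str] = {}  # path -> hash at index time

    def record(self, path: str, text: str) -> None:
        self._indexed[path] = content_hash(text)

    def drifted(self, path: str, current_text: str) -> bool:
        """True if the file changed since it was indexed."""
        return self._indexed.get(path) != content_hash(current_text)

d = DriftDetector()
d.record("app/db.py", "def get_db(): ...")
print(d.drifted("app/db.py", "def get_db(): ..."))   # prints False
print(d.drifted("app/db.py", "def get_db(): pass"))  # prints True
```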
The two servers collectively saved over 18 billion tokens across user sessions in the first week of March 2026. That number is computed on-device from real session telemetry: every participating response reports tokens_saved derived from actual file sizes (via os.stat), not estimates.
## Getting Started
### Claude Code

Add to `~/.claude.json`:
```json
{
  "mcpServers": {
    "jcodemunch-mcp": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    },
    "jdocmunch-mcp": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}
```
### Python SDK

```shell
pip install jmri-sdk
```

```python
from jmri.client import MRIClient

client = MRIClient()

# What's indexed?
sources = client.discover()

# Find it
results = client.search("database session dependency injection", repo="fastapi/fastapi")

# Get exactly that
symbol = client.retrieve(results[0]["id"], repo="fastapi/fastapi")

print(symbol["source"])
print(f"Tokens saved this call: {symbol['_meta']['tokens_saved']:,}")
```
### TypeScript SDK

```typescript
import { MRIClient } from "mri-client";

const client = new MRIClient();

const results = await client.search("OAuth2 bearer auth", "fastapi/fastapi");
const symbol = await client.retrieve(results[0].id, "fastapi/fastapi");
```
## The Open Spec
Everything is at github.com/jgravelle/mcp-retrieval-spec.
- `SPEC.md` — the full jMRI v1.0 specification (Apache 2.0)
- `sdk/python/` — Python client helper
- `sdk/typescript/` — TypeScript client
- `reference/server.py` — a minimal jMRI-compliant MCP server
- `examples/` — Claude Code, Cursor, and generic agent integrations
PRs that improve examples or add language SDKs are welcome. PRs that extend the core interface need a strong argument.
If you're building a retrieval MCP server, implement jMRI-Core. Your users' agents will thank you.
— J. Gravelle, March 2026
Benchmark source: github.com/jgravelle/mcp-retrieval-spec/benchmark
SDK: `pip install jmri-sdk` | `npm install mri-client`
Spec: github.com/jgravelle/mcp-retrieval-spec