DEV Community

J. Gravelle

Your AI Agent Is Dumpster Diving Through Your Code...

...and we built something to stop it.


There's a pattern every developer using AI agents eventually notices. You ask the agent to find where authentication is handled. It opens a file. Skims 2,000 lines. Opens another file. Skims that. Opens a third. By the time it answers, it's consumed 40,000 tokens — most of them irrelevant — and your context window is half-gone before the real work starts.

We call this dumpster diving. The agent isn't reading strategically. It's digging through everything looking for something edible.

We've been watching this happen across millions of sessions with jCodeMunch and jDocMunch. And we built something to fix it: jMRI — the jMunch Retrieval Interface.

Today we're publishing the spec, the benchmark, and an open-source SDK. All Apache 2.0.


The Numbers

We ran the benchmark against two real codebases: FastAPI and Flask. Three methods compared: naive file reading, chunk RAG, and jMRI retrieval via jCodeMunch. Ten queries per repo. Here's what came out:

FastAPI (~950K source tokens)

Method                   Avg Tokens   Cost/Query   Precision
Naive (read all files)      949,904      $2.85          100%
Chunk RAG                   330,372      $0.99           74%
jMRI                            480      $0.0014         96%

Flask (~148K source tokens)

Method                   Avg Tokens   Cost/Query   Precision
Naive (read all files)      147,854      $0.44          100%
Chunk RAG                    55,251      $0.17           80%
jMRI                            480      $0.0014         96%

jMRI uses 1,979x fewer tokens than naive on FastAPI. It also beats chunk RAG on precision — 96% vs 74%.

That last point matters. The usual assumption is that precision is the tradeoff you make for efficiency. Chunk RAG is cheaper than naive but misses more. jMRI is cheaper than both and misses less. That's not a coincidence — it's a consequence of using structure instead of text similarity.

Reproduce it yourself in under 5 minutes:

git clone https://github.com/jgravelle/mcp-retrieval-spec
cd mcp-retrieval-spec/benchmark/munch-benchmark
python benchmark.py --all

Why Chunk RAG Loses on Precision

Chunk RAG splits files into overlapping windows of text and ranks them by keyword overlap or embedding similarity. A chunk boundary might fall in the middle of a function. The top-ranked chunk might contain the right words but not the right code. The retrieval is approximate by design.
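To see the boundary problem concretely, here is a toy fixed-size chunker cutting a small source file. This is hypothetical illustration code, not any particular RAG library; the cut lands wherever the character count says, not where the code structure does:

```python
# Toy fixed-size chunker illustrating the boundary problem --
# hypothetical code, not taken from any RAG library.

source = """\
def authenticate(token):
    user = decode(token)
    if user is None:
        raise AuthError("invalid token")
    return user

def decode(token):
    ...
"""

def chunk(text, size=80, overlap=20):
    """Split text into overlapping fixed-size windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk(source)
# Chunk 0 ends mid-way through the raise statement -- the retrieval
# unit is a window of characters, not a unit of code.
print(repr(chunks[0][-20:]))
```

A retriever ranking these windows can score the "right" chunk highly while still handing the agent half a function.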

jMRI retrieval is structurally exact. jCodeMunch parses source files into an AST-derived index: every function, class, and method is a named, addressable symbol with a stable ID. When you search for "OAuth2 password bearer authentication", you get back IDs like fastapi/security/oauth2.py::OAuth2PasswordBearer#class. When you retrieve that ID, you get exactly the class — no more, no less. No boundary accidents. No half-functions.
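The idea can be sketched with Python's stdlib `ast` module. This is a minimal illustration of AST-derived indexing, not jCodeMunch's actual implementation; only the `path::Name#kind` ID format follows the example above:

```python
import ast

def index_symbols(path, source):
    """Build a symbol index: stable ID -> exact source segment.

    A minimal sketch of AST-derived indexing, not jCodeMunch itself.
    IDs mimic the path::Name#kind format shown above.
    """
    tree = ast.parse(source)
    index = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue
        symbol_id = f"{path}::{node.name}#{kind}"
        # get_source_segment returns exactly the node's source text --
        # no chunk-boundary accidents, no half-functions.
        index[symbol_id] = ast.get_source_segment(source, node)
    return index

source = (
    "class OAuth2PasswordBearer:\n"
    "    def __call__(self, request):\n"
    "        return request\n"
)
idx = index_symbols("oauth2.py", source)
print(list(idx))
# ['oauth2.py::OAuth2PasswordBearer#class', 'oauth2.py::__call__#function']
```

Retrieving an ID from an index like this returns a complete, named unit of code by construction.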

The 96% precision figure reflects cases where the top search result was the correct symbol for the query. The 4% where it wasn't were genuinely ambiguous queries — where even a human would have debated the right answer.


What Is jMRI?

jMRI (jMunch Retrieval Interface) is an open specification for MCP servers that do retrieval right.

Four operations. One response envelope. Two compliance levels.

Agent
  │
  ├─ discover()    → What knowledge sources are available?
  ├─ search(query) → Which symbols/sections are relevant? (IDs + summaries only)
  ├─ retrieve(id)  → Give me the exact source for this ID.
  └─ metadata(id?) → What would naive reading have cost?
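In Python terms, the interface shape looks roughly like the following protocol. This is a hypothetical sketch for orientation only; the spec defines these as MCP tools, and SPEC.md is the authority on signatures:

```python
from typing import Any, Optional, Protocol, runtime_checkable

@runtime_checkable
class JMRIServer(Protocol):
    """The four jMRI operations as a Python protocol.

    A hypothetical sketch of the interface shape -- the real spec
    defines these as MCP tools, not Python methods.
    """

    def discover(self) -> list[dict[str, Any]]:
        """What knowledge sources are available?"""
        ...

    def search(self, query: str) -> list[dict[str, Any]]:
        """Which symbols are relevant? IDs and summaries only."""
        ...

    def retrieve(self, id: str) -> dict[str, Any]:
        """The exact source for one ID, plus its _meta block."""
        ...

    def metadata(self, id: Optional[str] = None) -> dict[str, Any]:
        """What would naive reading have cost?"""
        ...
```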

Every response includes a _meta block:

{
  "source": "def get_db():\n    db = SessionLocal()\n    try:\n        yield db\n    finally:\n        db.close()\n",
  "_meta": {
    "tokens_saved": 42318,
    "total_tokens_saved": 1284950,
    "cost_avoided": { "claude-sonnet-4-6": 0.127 },
    "timing_ms": 12
  }
}

The agent doesn't have to guess whether it's being efficient. It knows, on every call.
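Because every envelope carries `_meta`, client-side accounting needs no estimation of its own. A minimal sketch (hypothetical helper, not part of the SDK):

```python
# Hypothetical accounting helper: every jMRI response carries _meta,
# so a client can sum real per-call savings -- no estimation needed.

def track_savings(responses):
    """Sum tokens_saved across a sequence of jMRI response envelopes."""
    return sum(r.get("_meta", {}).get("tokens_saved", 0) for r in responses)

responses = [
    {"source": "...", "_meta": {"tokens_saved": 42_318, "timing_ms": 12}},
    {"source": "...", "_meta": {"tokens_saved": 9_877, "timing_ms": 8}},
]
print(track_savings(responses))  # 52195
```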

The spec is deliberately minimal. We're not trying to build a platform. We're trying to name a pattern that already works at scale and make it easy for others to implement.


The Implementations

The spec is open. The best implementations are commercial.

Implementation   Domain                            Stars   Install
jCodeMunch       Code (30+ languages)              900+    uvx jcodemunch-mcp
jDocMunch        Docs (MD, RST, HTML, notebooks)   45+     uvx jdocmunch-mcp

Both implement jMRI-Full — the complete spec including batch retrieval, hash-based drift detection, byte-offset addressing, and the full _meta envelope.
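Hash-based drift detection can be illustrated in a few lines: store a content hash at index time, compare it at retrieve time. This is a sketch of the general technique, not jCodeMunch's actual scheme:

```python
import hashlib

# Sketch of hash-based drift detection: the index stores a content
# hash per symbol, and a retrieve can cheaply verify the file hasn't
# changed since indexing. Illustrative only, not jCodeMunch's scheme.

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()[:16]

def check_drift(indexed_hash: str, current_source: str) -> bool:
    """Return True if the source drifted since it was indexed."""
    return content_hash(current_source) != indexed_hash

original = "def get_db():\n    yield SessionLocal()\n"
indexed = content_hash(original)
print(check_drift(indexed, original))             # False: unchanged
print(check_drift(indexed, original + "# edit\n"))  # True: drifted
```

When drift is detected, a server can re-index the file before answering rather than serving stale source.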

The two servers have collectively saved over 18 billion tokens across user sessions in the first week of March 2026 alone. That number is computed on-device from real session telemetry: every participating response measures tokens_saved against actual file sizes (via os.stat) rather than estimating it.


Getting Started

Claude Code

Add to ~/.claude.json:

{
  "mcpServers": {
    "jcodemunch-mcp": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    },
    "jdocmunch-mcp": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}

Python SDK

pip install jmri-sdk

from jmri.client import MRIClient

client = MRIClient()

# What's indexed?
sources = client.discover()

# Find it
results = client.search("database session dependency injection", repo="fastapi/fastapi")

# Get exactly that
symbol = client.retrieve(results[0]["id"], repo="fastapi/fastapi")
print(symbol["source"])
print(f"Tokens saved this call: {symbol['_meta']['tokens_saved']:,}")

TypeScript SDK

import { MRIClient } from "mri-client";

const client = new MRIClient();
const results = await client.search("OAuth2 bearer auth", "fastapi/fastapi");
const symbol = await client.retrieve(results[0].id, "fastapi/fastapi");

The Open Spec

Everything is at github.com/jgravelle/mcp-retrieval-spec.

  • SPEC.md — the full jMRI v1.0 specification (Apache 2.0)
  • sdk/python/ — Python client helper
  • sdk/typescript/ — TypeScript client
  • reference/server.py — minimal jMRI-compliant MCP server
  • examples/ — Claude Code, Cursor, and generic agent integrations

The spec is intentionally minimal. PRs that improve examples or add language SDKs are welcome. PRs that extend the core interface need a strong argument.

If you're building a retrieval MCP server, implement jMRI-Core. Your users' agents will thank you.


— J. Gravelle, March 2026

Benchmark source: github.com/jgravelle/mcp-retrieval-spec/benchmark
SDK: pip install jmri-sdk | npm install mri-client
Spec: github.com/jgravelle/mcp-retrieval-spec
