RAG for codebases is hard. Trusting the answer is harder.

#truth #ai #llm

There's a good post making the rounds — "RAG for Codebases Is Harder Than It Looks" — on why naive retrieval falls apart on code. Filtering out node_modules, chunking on AST boundaries instead of token counts, preserving file-path metadata, treating architecture questions as graph traversals rather than nearest-neighbor lookups. All correct. If you're building code RAG, read it.

But notice where the author lands at the very end:

Developer trust depends on retrieval quality, honest uncertainty acknowledgment, and verifiable sources — not just algorithmic sophistication.

That's the part I want to pull on. Because everything before it — better chunking, better embeddings, repo maps — makes the model's answer more likely to be right. None of it makes the answer checkable. And once an LLM is writing your code, "more likely to be right" is not the same problem as "did it actually do what it just told me it did."

The retrieval is upstream of a bigger trust gap

Good RAG gets the model better context. The model then makes an edit and reports back: "I added the /v1/refund route, set MAX_RETRIES to 5, only touched the parser, and the tests pass."

Every one of those is a factual claim about the repo as it exists right now. And here's the uncomfortable measurement: across 100 real SWE-agent runs on real GitHub issues, 30% of the attempts that failed the test suite still claimed they fixed the issue — "the method has been successfully added," "the issue has been resolved." Ground truth from the SWE-bench eval, claims judged by an LLM, reproducible.

Better retrieval doesn't fix that. A model with perfect context can still over-claim what it did. The hallucination moves from "wrong fact about the codebase" to "wrong fact about my own change" — which is harder to catch, because it sounds like a status update, not a guess.

What "verifiable sources" actually requires

The article's three closing words — honest uncertainty, verifiable sources, no speculation — are exactly the design constraints I've been building truth around. But I'd argue you can't get them from a better RAG pipeline, because they're not retrieval properties. They're verification properties:

Honest uncertainty isn't "the model hedges." It's a system that can say Refused to "this is cleaner" because there's no measurable subject — and mean it. A verifier that bluffs is worse than none.
Verifiable sources isn't "cite a chunk." It's a verdict that rests on a binary fact you read directly: a file present/absent in the git diff, an AST symbol that exists or doesn't, an exact value, a command's exit code. Cited to file:line.
No speculation isn't a prompt instruction the model can ignore. It's a hard architectural rule: a fixed-rule engine decides the verdict, not a model. The agent cannot talk it into a different answer.

So truth does the inverse of RAG. RAG retrieves context to help the model answer. truth retrieves evidence to check what the model claimed — against the real code, the working-tree git diff, recorded command runs, and logs — and returns Supported / Contradicted / Refused, every verdict cited.

$ truth verify-turn "I added the /v1/refund endpoint, set MAX_RETRIES to 3, I only changed src/api.rs, and tests pass"

  ✓ Supported     I added the /v1/refund endpoint  (src/api.rs)
  ✗ Contradicted  set MAX_RETRIES to 3             (src/config.rs:1)
  ✗ Contradicted  I only changed src/api.rs        (src/api.rs)
  ✗ Contradicted  tests pass                       (recorded 2026-06-11 09:14 UTC)

An LLM only parses the sentence into a structured claim. The verdict comes from a deterministic engine.

The asymmetry the article gestures at, made explicit

"Honest uncertainty acknowledgment" has a sharp edge the article doesn't dwell on: a verifier that false-accuses gets uninstalled on day one. Crying wolf on a truthful agent is the one unforgivable failure.

So truth only ever says Contradicted on a structured binary fact — never on a noisy count (a text scan over-counts comments; a log window samples). "Nobody uses this," "updated all 4 call sites," "X is unused" — those surface their evidence as Inconclusive: a suspicion to act on, not a verdict that blocks. And it's measured, not asserted: a labeled corpus of real agent over-claims holds the false-contradiction rate at 0 as a hard CI gate. When truth blocks, it means it.

This is the same instinct the article has about retrieval noise — "if chunks are too large, retrieval becomes noisy" — applied to verdicts. A noisy verifier is as useless as noisy retrieval. Worse, because it erodes the exact trust it was supposed to provide.

Where they meet

The author is right that code RAG needs preprocessing, structure-awareness, and citations. I'd add one thing to the conclusion: those buy you a better-informed model, not a trustworthy one. When the model is editing your repo, the last mile isn't retrieval quality — it's whether you can independently confirm what it told you, with a citation, deterministically, and without it ever crying wolf.

truth is that last mile. It runs entirely locally (a single SQLite file in .truth/; your code never leaves the machine), exposes one MCP tool an agent calls on its own message before reporting done, and can be a hook the agent can't skip.

If you're building code RAG, your model's answers are about to get a lot better. Make them checkable too.

→ github.com/blasrodri/truth

Top comments (1)

Alice • Jun 29

This is the right thing to pull on. The trap is treating trust as a function of answer quality — but a confident, well-retrieved answer is exactly the kind you stop checking, which is when it bites you.

The reframe that's helped me: trust shouldn't come from the model, it should come from verifiability. Concretely:

Make every claim carry its proof. The answer cites exact file:line for each assertion, and one that can't cite degrades to 'I couldn't confirm this' rather than smooth prose. No citation = treated as 'I don't know.'
Make uncertainty structural, not stylistic. 'Honest uncertainty acknowledgment' only works if it's calibrated — tie it to retrieval confidence so low confidence forces the answer to degrade ('here are the 3 files that might be relevant, unverified') instead of confabulating a clean narrative.
Strongest version: make answers checkable in seconds. For code that's 'here's the claim + the grep or test that confirms it.' Now trust lives in the proof, not the model.

Same principle I keep hitting with agents that take actions: the model is an untrusted planner, and all the trust has to live in the verification layer around it. Better retrieval raises the ceiling; verifiability is what actually earns the trust.