Mukunda Rao Katta

Posted on May 25

My agent cited a document that does not exist. Here is the library I wrote to catch that.

#hermeschallenge #ai #python #agents

I was debugging an agent that summarizes financial documents. The agent had done well on every test I threw at it. Then a user asked about Q3 revenue. The agent responded with three confident sentences and ended with "Source: Q3 2024 Earnings Report, page 4."

There was no Q3 2024 Earnings Report in the document set. The user had only uploaded the Q2 report and a board presentation. The agent had fabricated the citation entirely. Not the content, exactly, but the source. The claim about revenue was close to correct. The attributed document did not exist.

That is a specific, quietly terrible failure. The agent sounds credible. The output looks cited. Nothing is flagged as wrong. The user goes and looks for the Q3 report and either cannot find it or finds a different one, and now they do not know whether to trust the claim.

I needed a way to register sources as the agent processes them, attach citations to claims as the agent generates output, and then verify, before the response is sent, that every citation ID references a real source and that any quoted snippet actually appears in the source text.

That is agent-citation. It is on PyPI under the same name, with zero dependencies and 42 tests.

The shape of the fix

from agent_citation import CitationStore, Citation, attribute, validate

# Build the store from the documents the agent actually has access to
store = CitationStore()
store.add_source("doc-001", "Q2 2024 Earnings Report", text="Revenue for Q2 was $4.2B, up 8% from Q1.")
store.add_source("doc-002", "Board Presentation May 2024", text="Strategic priorities include expansion into APAC markets.")

# Agent generates output, attaching citations as it goes
output = attribute(
    text="Revenue for Q2 was $4.2B, up 8% from Q1.",
    citation=Citation(source_id="doc-001", snippet="Revenue for Q2 was $4.2B, up 8% from Q1.")
)

# Before sending the response, validate every citation
report = validate(output, store)

if not report.valid:
    for error in report.errors:
        print(error)

If the agent had cited "Q3 2024 Earnings Report" instead, the source ID would not exist in the store. The validation report would catch it. The agent would know before sending anything that the citation is broken.

What it does NOT do

It does not call the LLM. Validation is a pure structural check against data you already have.
It does not verify whether the claim is accurate. It verifies whether the source exists and whether the snippet (if provided) appears verbatim in the source text. Semantic accuracy checking is a different problem.
It does not fetch documents from URLs or file paths. You add sources to the store explicitly. What gets registered is what gets checked.
It does not score citation quality or relevance. A citation either exists and matches, or it does not.

Inside the lib: one design choice worth showing

The most important decision was to keep validation free of the LLM.

Citation verification is tempting to hand back to the model. Ask the model "does this claim match the source?" and the model will give you a confident answer. The problem is that you are using the model to check the model. You are back to trusting the same system that fabricated the citation in the first place.

Pure structural validation has different guarantees. It cannot tell you if the claim is accurate. It can tell you if the source ID exists. It can tell you if the snippet appears verbatim in the source text. Those two checks are deterministic, cheap, and do not require a round-trip to an API.

Here is what the validate function actually does:

from agent_citation import CitationStore, Citation, AttributedOutput, validate

store = CitationStore()
store.add_source("rpt-q2", "Q2 Report", text="Total revenue was $4.2B for the quarter.")

# Fabricated source ID
bad_output = AttributedOutput(
    text="Revenue grew 15% year over year.",
    citations=[Citation(source_id="rpt-q3", snippet="Revenue grew 15% year over year.")]
)

report = validate(bad_output, store)
print(report.valid)          # False
print(report.errors[0])      # "Source ID 'rpt-q3' not found in store"

# Correct source ID but snippet does not appear in the source text
misquote_output = AttributedOutput(
    text="Revenue grew 15% year over year.",
    citations=[Citation(source_id="rpt-q2", snippet="Revenue grew 15% year over year.")]
)

report2 = validate(misquote_output, store)
print(report2.valid)         # False
print(report2.errors[0])     # "Snippet not found in source 'rpt-q2'"

Two failure modes, two specific error messages, no LLM call, no ambiguity. If valid is True, every cited source exists in the store and every provided snippet appears in that source's text. That is the full guarantee.

The snippet check is a substring match, not semantic similarity. That is intentional. Semantic similarity would require an embedding call, would introduce a threshold judgment, and would reintroduce model-level trust into what is supposed to be a structural gate. If the snippet is even slightly rephrased from what the source says, validation fails. That strictness is the point.
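The two checks are small enough to sketch in plain Python. What follows is a minimal illustration, not the library's actual implementation: `check_citation`, the plain `sources` dict, and the exact error strings are stand-ins chosen for this example.

```python
from typing import Optional


def check_citation(sources: dict[str, str], source_id: str,
                   snippet: Optional[str] = None) -> list[str]:
    """Return structural errors for one citation (empty list if none).

    `sources` maps source IDs to registered text. Hypothetical helper for
    illustration only; agent-citation's internals may differ.
    """
    if source_id not in sources:
        # Check 1: the cited source must have been registered.
        return [f"Source ID {source_id!r} not found in store"]
    if snippet is not None and snippet not in sources[source_id]:
        # Check 2: a provided snippet must appear verbatim (substring match).
        return [f"Snippet not found in source {source_id!r}"]
    return []


sources = {"rpt-q2": "Total revenue was $4.2B for the quarter."}

print(check_citation(sources, "rpt-q2", "Total revenue was $4.2B"))        # []
print(check_citation(sources, "rpt-q3", "Revenue grew 15%"))               # unknown source ID
print(check_citation(sources, "rpt-q2", "Total revenue was 4.2 billion"))  # rephrased snippet
```

Rewriting "$4.2B" as "4.2 billion" is enough to fail the second check. No threshold is involved: the snippet is either a substring of the registered text or it is not.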

When this is useful

Your agent works with a bounded document set (uploaded files, retrieved chunks, a fixed knowledge base) and you need to guarantee that every citation points back to something in that set.
You are building a legal or compliance tool where "the model made it up" is an unacceptable failure mode.
You want to fail fast before sending a response rather than reviewing citations manually after the fact.
You are implementing a retrieval-augmented agent and want to close the loop: if the retrieval step found it, the citation step should be able to verify it.

When this is NOT what you want

You need to verify whether a claim is factually correct, not just whether the cited source exists. That requires a different tool, probably another LLM call or a separate fact-checking step.
Your agent works with dynamic web content and you cannot build a CitationStore at generation time. The store is a closed set at validation time. Sources that arrive after validation runs will not be found.
Your sources are too large to store as text. The snippet check is a string search over the source text you registered. Very large documents may need chunking before registration.
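For the large-document case, one option is to split the text into overlapping chunks and register each chunk as its own source. The sketch below is an assumption about how you might do that, not something the library provides: `chunk_text`, the default sizes, and the `doc-001#0` ID scheme are all hypothetical. The overlap matters because a substring check only finds a snippet that sits entirely inside one chunk.

```python
def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters, so any snippet of up to
    `overlap + 1` characters is fully contained in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


document = "".join(f"Line item {i}: revenue was ${i}M. " for i in range(500))

# Hypothetical ID scheme: parent document ID plus chunk index.
chunks = {f"doc-001#{i}": body for i, body in enumerate(chunk_text(document))}

snippet = "Line item 250: revenue was $250M."
matches = [cid for cid, body in chunks.items() if snippet in body]
print(matches)  # IDs of the chunk(s) that contain the snippet verbatim
```

With this layout a citation would point at a chunk ID rather than the parent document, and snippets longer than the overlap can still straddle a boundary and fail the check.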

Install

pip install agent-citation

Repo: https://github.com/MukundaKatta/agent-citation

Sibling libraries

Lib	Boundary	Repo
agent-citation	WHERE layer: citations for agent outputs	https://github.com/MukundaKatta/agent-citation
agent-decision-log	WHY layer: decision rationale	https://github.com/MukundaKatta/agent-decision-log
agentsnap	CALLS layer: tool-call trace snapshots	https://github.com/MukundaKatta/agentsnap
agenttrace	COST layer: cost and latency tracking	https://github.com/MukundaKatta/agenttrace
agentvet	Validate tool args before execution	https://github.com/MukundaKatta/agentvet
prompt-shield	Pattern-based prompt injection detection	https://github.com/MukundaKatta/prompt-shield

What's next

Two things on the backlog.

First, a CitationStore.from_chunks(chunks) convenience method that builds a store directly from the output of a retrieval pipeline. Most RAG pipelines already have a list of {id, text} chunks at generation time. Wiring those into a CitationStore should be one call.

Second, a validate_all(outputs, store) batch function for agents that generate multiple attributed outputs in a single run (for example, a document summarizer that produces one attributed paragraph per section). Right now you call validate once per output. Batch validation would produce a single report covering all outputs, with per-output error attribution.
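To make the second item concrete, here is one possible shape for a batch report. It is a sketch of the idea, not the planned API: `BatchReport`, this `validate_all` signature, and the `validate_one` callback are invented for the example, and the stand-in checker only tests source IDs.

```python
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class BatchReport:
    """Errors keyed by the index of the output that produced them."""
    errors_by_output: dict[int, list[str]] = field(default_factory=dict)

    @property
    def valid(self) -> bool:
        # Valid only if no output reported any error.
        return not self.errors_by_output


def validate_all(outputs: Sequence,
                 validate_one: Callable[[object], list[str]]) -> BatchReport:
    """Run `validate_one` on each output and collect errors per output index."""
    report = BatchReport()
    for index, output in enumerate(outputs):
        errors = validate_one(output)
        if errors:
            report.errors_by_output[index] = errors
    return report


# Stand-in data: each "output" is just the list of source IDs it cites.
known_sources = {"doc-001", "doc-002"}
outputs = [["doc-001"], ["doc-003"], ["doc-002", "doc-004"]]


def missing_sources(cited_ids):
    """Hypothetical per-output check: report every unregistered source ID."""
    return [f"Source ID {sid!r} not found in store"
            for sid in cited_ids if sid not in known_sources]


report = validate_all(outputs, missing_sources)
print(report.valid)             # False
print(report.errors_by_output)  # errors recorded for outputs 1 and 2 only
```

Keying errors by output index lets a caller regenerate only the paragraphs that failed instead of rejecting the whole run.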

The core library stays simple. Citation is a structural problem. The fix is a structural check. More features should follow that same principle.
