Mukunda Rao Katta

agent-citation: Track Where Every Agent Claim Came From

The Untraceable Summary Problem

Your agent calls three tools: a web search, a database query, and a document retriever. It produces a five-sentence summary. A user asks: "Which source says the number is 4.2 million?"

You have no answer. The summary looks correct, but you cannot point to a source for any individual claim. In a customer-facing product or regulated context, that is a serious gap.

This is not an LLM problem. The model may have returned citations as text in its output, but they are buried in prose. There is no structured way to query them, validate them, or attach them to specific claims in a way your application code can inspect.

agent-citation fixes this. It gives you a structured citation layer: attach a source to a claim, store both, and validate that every claim that needs a citation has one. The data stays in Python objects. You can serialize it, display it, or run checks on it before returning output to the user.


Main Code Example

Install:

pip install agent-citation

Basic attribution pattern:

import hashlib

from agent_citation import CitationStore, Citation

# search_tool, db_tool, and build_summary stand in for your own functions.
store = CitationStore()

# After your web search tool returns
search_result = search_tool("revenue 2024 global EV market")
citation = Citation(
    source_id="web-search-1",
    source_type="web",
    url=search_result.url,
    title=search_result.title,
    retrieved_at=search_result.timestamp,
)
store.attribute(
    claim="Global EV sales reached 4.2 million units in 2024.",
    citation=citation,
)

# After your database query tool returns
db_result = db_tool("SELECT revenue FROM ev_sales WHERE year=2024")
store.attribute(
    claim="Internal sales data shows consistent growth through Q3.",
    citation=Citation(
        source_id="db-query-1",
        source_type="database",
        table="ev_sales",
        # Use a stable digest: built-in hash() is salted per process for strings,
        # so its value changes between runs and cannot be compared in an audit.
        query_hash=hashlib.sha256(str(db_result.query).encode("utf-8")).hexdigest(),
    ),
)

# Validate before returning output
report = store.validate()
if report.uncited_claims:
    raise ValueError(f"Uncited claims found: {report.uncited_claims}")

# Serialize for API response
output = {
    "summary": build_summary(store.claims()),
    "citations": store.to_dict(),
}

Marking a claim as not requiring a citation:

store.attribute(
    claim="The agent processed 3 sources.",
    citation=None,
    requires_citation=False,
)

The validate() call checks every claim where requires_citation=True (the default) and returns a report listing any claims without citations. It does not raise by default. You decide whether to raise, log, or return a partial result.

Retrieving citations for a specific claim:

for entry in store.where(claim_contains="4.2 million"):
    print(entry.claim)
    print(entry.citation.url)

Exporting to JSONL for storage or audit:

with open("run-citations.jsonl", "w", encoding="utf-8") as f:
    for line in store.to_jsonl():
        f.write(line + "\n")

What It Does NOT Do

agent-citation does not generate citations. It does not ask the LLM "where did you get that?" and parse the answer. That approach is unreliable because models hallucinate citations as readily as they hallucinate facts.

It does not extract claims from unstructured LLM output. If your model returns a paragraph, you still need to split it into claims. The library stores and validates structured claim-citation pairs. Structuring the output is your job, not the library's.
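
As a starting point for that structuring step, here is a minimal sketch that splits a paragraph into sentence-level claims. The function name split_into_claims is hypothetical and not part of agent-citation, and the regex is deliberately naive: it will mis-split on abbreviations such as "e.g. " and should be replaced by a proper sentence segmenter or by structured model output in real use.

```python
import re


def split_into_claims(paragraph: str) -> list[str]:
    """Naively split a paragraph into sentence-level claims."""
    # Split after ., !, or ? only when whitespace follows, so a decimal
    # like "4.2" stays intact (its period is not followed by whitespace).
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    # Drop empty strings, which appear when the input is empty.
    return [s for s in sentences if s]


claims = split_into_claims(
    "Global EV sales reached 4.2 million units in 2024. "
    "Growth was steady through Q3."
)
for claim in claims:
    print(claim)
```

Each resulting string can then be paired with the source your tool pipeline recorded for it before it goes into the store.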

It does not deduplicate sources. If your search tool returns the same URL twice under different queries, you get two citations. You can normalize them before calling attribute(), or just keep both.
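
If you do want to normalize before calling attribute(), one possible approach is to reduce each URL to a canonical key and keep the first result per key. The sketch below uses only the standard library; normalize_url and dedupe_by_url are hypothetical helper names, and the result dicts with a "url" key are an assumed shape for your search tool's output.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return a canonical form of url suitable as a dedup key."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat "/report" and "/report/" as the same page; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment; keep the query string because it can change the page.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def dedupe_by_url(results: list[dict]) -> list[dict]:
    """Keep the first result for each normalized URL, preserving order."""
    seen: set[str] = set()
    unique = []
    for result in results:
        key = normalize_url(result["url"])
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique
```

Whether two URLs that differ only in case or trailing slash really are the same source depends on the site, so treat these rules as a default to adjust rather than a guarantee.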

It does not validate that the source actually supports the claim. That is a separate retrieval-augmented generation (RAG) evaluation problem. This library tracks what you said and where you said it came from. Whether the source actually says that is out of scope.


Design Reasoning

The driving insight is that citation is a bookkeeping problem, not a generation problem. You already know where your data came from. Your tool calls have return values. The source metadata is right there. The gap is that nothing captures the connection between "this data came from source X" and "this sentence in the output uses that data."

CitationStore is the bridge. You populate it during tool execution, before the LLM synthesizes output. That way the citation metadata is authoritative. It reflects what your code actually fetched, not what the LLM thinks it fetched.

The validate() method is the enforcement mechanism. It only reports, so the gate comes from what you do with the report: run it before returning output, raise when the report lists uncited claims, and no uncited claim makes it out. That is a stronger guarantee than reviewing LLM-generated citation prose.
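
To make the bookkeeping idea concrete, here is a toy stand-in written with dataclasses. It is not agent-citation's implementation, and ToyEntry, ToyStore, and gate are hypothetical names; the point is only to show that the whole mechanism is a list of claim-source pairs plus a check over that list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyEntry:
    claim: str
    source_id: Optional[str]  # None means no source was attached
    requires_citation: bool = True


@dataclass
class ToyStore:
    entries: list = field(default_factory=list)

    def attribute(self, claim, source_id=None, requires_citation=True):
        """Record a claim together with the source it came from, if any."""
        self.entries.append(ToyEntry(claim, source_id, requires_citation))

    def uncited(self):
        """Return the claims that need a citation but have none."""
        return [
            e.claim
            for e in self.entries
            if e.requires_citation and e.source_id is None
        ]


def gate(store: ToyStore) -> None:
    """Raise if any claim that needs a citation lacks one."""
    missing = store.uncited()
    if missing:
        raise ValueError(f"Uncited claims found: {missing}")
```

Calling gate(store) right before the response is built is what turns the report into a hard stop; logging the list instead would give a softer policy with the same data.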

The requires_citation=False flag exists because some claims are computational or structural. "The agent processed 3 sources" is derived from your code, not from a source. You still want it in the store for completeness, but it should not fail validation.

The JSONL export is designed for audit. If a user challenges a specific claim months later, you have a file that shows exactly what source was attributed to it at run time.
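
A later audit lookup might look like the sketch below. It assumes each exported line is a JSON object with a top-level "claim" string; that layout is an assumption for illustration, so check the actual records your version of the library writes. find_claims is a hypothetical helper, not a library function.

```python
import json


def find_claims(jsonl_path: str, needle: str) -> list[dict]:
    """Return every record whose claim text contains needle."""
    matches = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # Assumed layout: a top-level "claim" string per record.
            if needle in record.get("claim", ""):
                matches.append(record)
    return matches
```

For example, find_claims("run-citations.jsonl", "4.2 million") would return the stored records for any claim mentioning that figure, along with whatever source fields were written at run time.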


When This Applies (and When It Does Not)

This is a good fit when:

  • Your agent pulls from multiple tools and the output needs to be traceable to specific sources
  • You operate in a regulated industry where auditability of AI output is required
  • You are building a RAG system and want claim-level, not document-level, attribution
  • You want to enforce citation coverage as a hard gate before returning output to users

It is not a good fit when:

  • Your agent is purely generative with no tool calls (there are no external sources to cite)
  • You only need document-level attribution ("this answer came from these 3 docs") and do not need claim-level granularity
  • Your LLM returns structured output that already includes citation metadata in a schema you control

Install or Quick-Start

pip install agent-citation

Minimal example:

from agent_citation import CitationStore, Citation

store = CitationStore()
store.attribute(
    claim="Python was released in 1991.",
    citation=Citation(
        source_id="wiki-python",
        source_type="web",
        url="https://en.wikipedia.org/wiki/Python_(programming_language)",
    ),
)
report = store.validate()
print(report.is_valid)  # True
print(store.to_dict())

GitHub: MukundaKatta/agent-citation


Siblings Table

| Library | What it does | How it complements agent-citation |
| --- | --- | --- |
| agenttap | Wire-level capture of prompts and responses | Capture raw LLM output; use alongside citation store |
| agent-event-bus | In-process pub/sub for agent events | Publish a citation.added event on each attribute() call |
| agent-step-log | Per-step JSONL logger | Log citation store state at each turn boundary |
| conversation-codec | Persist conversation history with redaction | Attach citation JSONL alongside conversation JSONL |
| llm-structured-retry | Retry with error context injected | Retry claim extraction if the model omits required claims |

Enforcing Coverage as a Gate

The validate() call is useful on its own, but it is more useful when combined with a schema check on the model's structured output.

Here is a pattern for enforcing that every claim in a model's structured output has a citation before the response is returned to the user:

from agent_citation import CitationStore, Citation
from pydantic import BaseModel
from typing import List

class AgentClaim(BaseModel):
    text: str
    source_id: str

class AgentResponse(BaseModel):
    claims: List[AgentClaim]
    summary: str

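def build_response_with_citations(
    raw_claims: List[AgentClaim],
    sources_by_id: dict,
) -> dict:
    """Attach a recorded source to every claim, then gate on validation."""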
    store = CitationStore()

    for claim in raw_claims:
        source = sources_by_id.get(claim.source_id)
        if source is None:
            # claim references unknown source; fail hard
            raise ValueError(f"Claim references unknown source: {claim.source_id}")

        store.attribute(
            claim=claim.text,
            citation=Citation(
                source_id=claim.source_id,
                source_type=source["type"],
                url=source.get("url"),
            ),
        )

    report = store.validate()
    if not report.is_valid:
        raise ValueError(f"Validation failed: {report.uncited_claims}")

    return {
        "summary": " ".join(c.text for c in raw_claims),
        "citations": store.to_dict(),
    }

The key point here is that citation enforcement happens before you return the response, not after. That makes it a hard gate. If the model produced a claim whose source_id the tool pipeline never recorded, the lookup raises immediately, and validate() catches any claim that still ends up in the store without a citation. Either way the caller sees an error rather than an uncited output.

This is more reliable than asking the model to cite its sources in the response text. The model can and does hallucinate in-line citations. This approach only accepts citations that your tool pipeline explicitly recorded.


What Is Next

The next planned feature is a diff mode: compare citation stores across two runs of the same agent and surface which claims changed sources. This is useful when you update a retrieval index and want to know which answers are now backed by different data.
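
Since diff mode is not shipped yet, the following is only a sketch of what such a comparison could look like, not the planned API. It assumes each run has already been reduced to a plain dict mapping claim text to source_id; diff_sources is a hypothetical name.

```python
from typing import Optional


def diff_sources(
    before: dict[str, str],
    after: dict[str, str],
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Map each claim whose source changed to (old_source_id, new_source_id).

    Claims present in only one run are reported with None on the missing side.
    """
    changed = {}
    for claim in before.keys() | after.keys():
        old, new = before.get(claim), after.get(claim)
        if old != new:
            changed[claim] = (old, new)
    return changed
```

Matching on exact claim text is the simplest choice and also the most brittle one: if the model rephrases a claim between runs, it shows up as one removal and one addition rather than as a changed source.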

A Pydantic model export is also on the roadmap. Today to_dict() returns plain Python dicts. A Pydantic model per entry would make it easier to integrate with FastAPI response schemas.

A citation coverage score is another option under consideration. Instead of a binary valid/invalid report, the score would express what fraction of claims are cited. For exploratory agent builds where 100% coverage is too strict, a threshold like "90% of claims must be cited" is more practical.
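
Until such a score exists in the library, a threshold check is easy to approximate in your own code. The sketch below assumes you can produce a list of (claim, is_cited) pairs from your store; coverage and passes_threshold are hypothetical names, and treating an empty list as fully covered is a design choice you may want to reverse.

```python
def coverage(entries: list[tuple[str, bool]]) -> float:
    """Return the fraction of claims that have a citation."""
    if not entries:
        return 1.0  # nothing to cite; treated here as fully covered
    cited = sum(1 for _, is_cited in entries if is_cited)
    return cited / len(entries)


def passes_threshold(entries: list[tuple[str, bool]], threshold: float = 0.9) -> bool:
    """Return True when citation coverage meets or exceeds threshold."""
    return coverage(entries) >= threshold
```

With ten claims of which nine are cited, coverage is 0.9 and the default threshold passes; with one of two cited, it is 0.5 and the check fails.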

For the Hermes Agent Challenge sprint, agent-citation fits into the traceability and auditability pillar. If you are building an agent that summarizes external data and your users or stakeholders need to trace any claim back to its source, this library is the structured layer that makes that possible without post-hoc LLM parsing.
