Structured Citations for LLM Agents: Track Where Every Claim Came From

Your agent searches documents and writes a report. The report contains claims. A reviewer asks: where does this claim come from? You check the agent's output. The claim is there. The source is not.

The model included the information but dropped the citation. Or it cited "the document" without specifying which one. Or the citation exists in the model's response but your code discards it before the report is assembled.

agent-citation is a structured WHERE-layer: attach sources to claims, carry them through the agent loop, and render them in whatever format the output requires.

The Shape of the Fix

from agent_citation import CitationStore, Citation

store = CitationStore()

# Agent records citations as it processes sources
store.add(Citation(
    text="Q3 revenue was $2.4M, up 18% year-over-year.",
    source_id="doc-001",
    source_title="Q3 2025 Earnings Report",
    source_url="https://company.com/reports/q3-2025.pdf",
    page=4,
    excerpt="Revenue: $2.4M (Q3 2025) vs $2.03M (Q3 2024)",
))

store.add(Citation(
    text="Gross margin improved to 67%.",
    source_id="doc-001",
    source_title="Q3 2025 Earnings Report",
    source_url="https://company.com/reports/q3-2025.pdf",
    page=7,
    excerpt="Gross margin: 67% (prev: 61%)",
))

# Render for the final report
report = store.render_markdown()
footnotes = store.render_footnotes()

Claims are linked to sources. Sources are concrete: specific documents, pages, and verbatim excerpts. The rendered output carries the attribution.

What It Does NOT Do

agent-citation does not automatically extract citations from LLM output. You record citations explicitly when your agent processes sources. The library does not parse the model's text to find citations. If you want to extract citations from model output, use a follow-up prompt asking the model to list its sources and then parse that.
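The follow-up-prompt route could look roughly like the sketch below: ask the model to reply with a JSON array of its sources, then turn each entry into a citation record. Everything here is an assumption made for illustration. The JSON shape, the `parse_cited_sources` helper, and the sample reply are hypothetical, and the local `Citation` class is a stand-in that mirrors the fields shown in this post so the snippet runs without the package installed.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Citation:
    # Local stand-in mirroring the fields used in this post.
    text: str
    source_id: str
    source_title: str
    source_url: str = ""
    page: Optional[int] = None
    excerpt: str = ""
    metadata: dict = field(default_factory=dict)


REQUIRED_KEYS = ("text", "source_id", "source_title")


def parse_cited_sources(model_reply: str) -> list[Citation]:
    """Turn a JSON array of source entries into Citation records.

    Entries missing a required key are skipped rather than guessed at.
    """
    try:
        entries = json.loads(model_reply)
    except json.JSONDecodeError:
        return []  # The model did not return valid JSON.
    if not isinstance(entries, list):
        return []

    citations = []
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        if any(not entry.get(key) for key in REQUIRED_KEYS):
            continue
        citations.append(Citation(
            text=entry["text"],
            source_id=entry["source_id"],
            source_title=entry["source_title"],
            source_url=entry.get("source_url", ""),
            page=entry.get("page"),
            excerpt=entry.get("excerpt", ""),
        ))
    return citations


# Hypothetical model reply to "List your sources as a JSON array."
reply = json.dumps([
    {"text": "Gross margin improved to 67%.", "source_id": "doc-001",
     "source_title": "Q3 2025 Earnings Report", "page": 7},
    {"text": "A claim with no source attached."},
])
parsed = parse_cited_sources(reply)
```

The second entry is dropped because it names no source. Each surviving record can then be passed to store.add(). A model can still invent a source in this reply, so this step recovers citations but does not validate them.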

It does not verify that claims are accurate. It records that the agent associated a claim with a source. Whether the claim actually appears in the source, and whether the excerpt is accurate, is the agent's responsibility.
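If you want a cheap guard on the agent's side, one option is to check that the excerpt really occurs in the source text before recording the citation. The helper below is a minimal sketch, not part of agent-citation. `normalize` and `excerpt_in_source` are hypothetical names, and the match ignores case and whitespace differences only.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace runs and lowercase, so line wraps do not cause misses."""
    return re.sub(r"\s+", " ", text).strip().lower()


def excerpt_in_source(excerpt: str, source_text: str) -> bool:
    """Return True if the excerpt occurs in the source, ignoring case and spacing."""
    needle = normalize(excerpt)
    # An empty excerpt proves nothing, so treat it as not found.
    return bool(needle) and needle in normalize(source_text)


# Page text as it might come out of a PDF extractor, with a line wrap.
page_text = "Revenue:  $2.4M (Q3 2025)\nvs $2.03M (Q3 2024)"

ok = excerpt_in_source("Revenue: $2.4M (Q3 2025) vs $2.03M (Q3 2024)", page_text)
missing = excerpt_in_source("Revenue: $3.1M", page_text)
```

This confirms only that the excerpt exists in the source. Whether the excerpt supports the claim is still a judgment for the agent or a human reviewer.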

It does not handle citation deduplication automatically. If the same source is cited multiple times, each citation is recorded separately. Use store.sources() to get a deduplicated list of source documents.
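If you would rather number each source once and reuse that number across its claims, you can group by source_id yourself. The sketch below shows one way. `render_by_source` is a hypothetical helper, not a library method, and the small `Citation` class is a local stand-in with only the fields the sketch reads.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    # Stand-in with only the fields this sketch reads.
    text: str
    source_id: str
    source_title: str
    page: Optional[int] = None


def render_by_source(citations: list[Citation]) -> str:
    """Number each distinct source once and reuse that number for its claims."""
    source_numbers: dict[str, int] = {}
    titles: dict[str, str] = {}
    lines = []
    for cite in citations:
        if cite.source_id not in source_numbers:
            # First time this source appears: assign the next number.
            source_numbers[cite.source_id] = len(source_numbers) + 1
            titles[cite.source_id] = cite.source_title
        number = source_numbers[cite.source_id]
        page = f", p. {cite.page}" if cite.page is not None else ""
        lines.append(f"{cite.text} [{number}{page}]")

    lines.append("")
    lines.append("Sources")
    for source_id, number in source_numbers.items():
        lines.append(f"[{number}] {titles[source_id]}")
    return "\n".join(lines)


output = render_by_source([
    Citation("Q3 revenue was $2.4M, up 18% year-over-year.", "doc-001",
             "Q3 2025 Earnings Report", page=4),
    Citation("Gross margin improved to 67%.", "doc-001",
             "Q3 2025 Earnings Report", page=7),
])
print(output)
```

Both claims point at source [1] with their own page numbers, and the report title appears once in the source list.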

Inside the Library

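Citations are stored in a list, in the order they were added. A separate dict keeps the first citation seen for each source_id, which is what sources() returns. render_markdown() does not use that dict: it emits one source entry per citation.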

from dataclasses import dataclass, field

@dataclass
class Citation:
    text: str           # The claim being made
    source_id: str      # Stable ID for deduplication
    source_title: str   # Human-readable title
    source_url: str = ""
    page: int | None = None
    excerpt: str = ""   # Verbatim text from source
    metadata: dict = field(default_factory=dict)

class CitationStore:
    def __init__(self):
        self._citations: list[Citation] = []
        self._sources: dict[str, Citation] = {}  # source_id -> first citation

    def add(self, citation: Citation) -> int:
        idx = len(self._citations)
        self._citations.append(citation)
        if citation.source_id not in self._sources:
            self._sources[citation.source_id] = citation
        return idx

    def render_markdown(self) -> str:
        lines = []
        for i, cite in enumerate(self._citations, 1):
            source_ref = f"[{i}]"
            lines.append(f"{cite.text} {source_ref}")

        if self._citations:
            lines.append("\n---\n**Sources**\n")
            for i, cite in enumerate(self._citations, 1):
                location = f", p. {cite.page}" if cite.page is not None else ""
                url_part = f" ({cite.source_url})" if cite.source_url else ""
                lines.append(f"[{i}] {cite.source_title}{location}{url_part}")
                if cite.excerpt:
                    lines.append(f'    > "{cite.excerpt}"')

        return "\n".join(lines)

    def render_footnotes(self) -> list[dict]:
        return [
            {
                "index": i,
                "text": cite.text,
                "source_title": cite.source_title,
                "source_url": cite.source_url,
                "page": cite.page,
                "excerpt": cite.excerpt,
            }
            for i, cite in enumerate(self._citations, 1)
        ]

    def sources(self) -> list[Citation]:
        return list(self._sources.values())

    def to_dict(self) -> dict:
        return {
            "citations": [
                dict(vars(c))
                for c in self._citations
            ],
        }
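Because to_dict() returns plain dicts keyed by field name, the store's contents can be written to JSON between agent steps and rebuilt later. The sketch below illustrates that round trip under a few assumptions: `dump_citations` and `load_citations` are hypothetical helpers rather than library functions, the `Citation` class is a local copy of the dataclass above so the snippet runs on its own, and metadata values are assumed to be JSON-serializable.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Citation:
    # Local copy of the dataclass shown above.
    text: str
    source_id: str
    source_title: str
    source_url: str = ""
    page: Optional[int] = None
    excerpt: str = ""
    metadata: dict = field(default_factory=dict)


def dump_citations(citations: list[Citation]) -> str:
    """Serialize to the {"citations": [...]} shape that to_dict() produces."""
    return json.dumps({"citations": [asdict(c) for c in citations]})


def load_citations(payload: str) -> list[Citation]:
    """Rebuild Citation objects from a payload written by dump_citations()."""
    data = json.loads(payload)
    return [Citation(**fields) for fields in data["citations"]]


original = [Citation(
    text="Gross margin improved to 67%.",
    source_id="doc-001",
    source_title="Q3 2025 Earnings Report",
    page=7,
    excerpt="Gross margin: 67% (prev: 61%)",
)]
restored = load_citations(dump_citations(original))
```

Re-adding the restored citations to a fresh CitationStore in the same order preserves their footnote numbering, since numbering follows insertion order.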

When to Use It

Use it for research agents that aggregate information from multiple documents. Without structured citations, the agent's output is a set of unsourced claims. With citations, every claim links to the specific document and page it came from.

Use it for compliance-sensitive outputs. Regulatory, legal, and medical work typically requires source attribution, so an unsourced LLM output is not acceptable there. Structured citations make the attribution explicit and auditable.

Use it when building a human-in-the-loop review step. The reviewer sees both the claim and the specific excerpt it came from. They can verify the excerpt against the source without reading the whole document.
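For the review step, the dicts returned by render_footnotes() already contain what a reviewer needs. The sketch below turns them into a plain-text checklist. `review_checklist` is a hypothetical helper, and the sample input simply copies the key names from the render_footnotes() code shown earlier.

```python
def review_checklist(footnotes: list[dict]) -> str:
    """Format footnote dicts as a checklist a reviewer can tick off."""
    lines = []
    for note in footnotes:
        page = f", p. {note['page']}" if note.get("page") is not None else ""
        lines.append(f"[ ] {note['index']}. {note['text']}")
        lines.append(f"    Source: {note['source_title']}{page}")
        excerpt = note.get("excerpt")
        if excerpt:
            lines.append(f'    Excerpt: "{excerpt}"')
        else:
            # Make missing evidence visible instead of silently omitting it.
            lines.append("    Excerpt: (none recorded, check the source directly)")
    return "\n".join(lines)


# Same shape as the dicts built by render_footnotes().
sample = [{
    "index": 1,
    "text": "Gross margin improved to 67%.",
    "source_title": "Q3 2025 Earnings Report",
    "source_url": "https://company.com/reports/q3-2025.pdf",
    "page": 7,
    "excerpt": "Gross margin: 67% (prev: 61%)",
}]
checklist = review_checklist(sample)
print(checklist)
```

Citations without an excerpt are flagged explicitly, which tells the reviewer where they have to open the source document.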

Skip it for conversational agents that respond from training knowledge rather than retrieved documents. If the agent is not doing retrieval, there are no citations to record.

Install

pip install git+https://github.com/MukundaKatta/agent-citation

# Or from PyPI
pip install agent-citation

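A typical integration records a citation for every chunk a retrieval tool returns, then renders the store when the report is compiled. Here vector_db stands in for whatever retrieval client you use:

from agent_citation import CitationStore, Citation

store = CitationStore()

def search_and_cite(query: str) -> list[dict]:
    # vector_db is a placeholder; results are assumed to be dicts with
    # "chunk", "doc_id", "title", and optional "url" and "page" keys.
    results = vector_db.search(query, top_k=5)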
    cited_results = []

    for doc in results:
        store.add(Citation(
            text=doc["chunk"],
            source_id=doc["doc_id"],
            source_title=doc["title"],
            source_url=doc.get("url", ""),
            page=doc.get("page"),
            excerpt=doc["chunk"][:200],
        ))
        cited_results.append(doc)

    return cited_results

def compile_report(findings: list[str]) -> str:
    report_sections = "\n".join(findings)
    citations_md = store.render_markdown()

    return f"""# Research Report

## Findings

{report_sections}

## Sources

{citations_md}
"""

Sibling Libraries

| Library | What it solves |
| --- | --- |
| `agent-decision-log` | WHY-layer: record why the agent made each decision |
| `agent-step-log` | WHAT-layer: record each step the agent took |
| `agent-run-id` | Correlation IDs to link citations across a run |
| `prompt-eval-rubric` | Score outputs including citation completeness |
| `agenttap` | Wire-level capture of the retrieval calls that produced citations |

The attribution stack: agent-citation for source tracking, agent-decision-log for reasoning tracking, agent-run-id for tying both to a specific run.

What's Next

Citation confidence: an optional confidence: float field on Citation that reflects how strongly the agent believes the excerpt supports the claim. Useful for distinguishing direct quotes from inferred summaries.
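Since this field does not exist yet, the following is only a guess at how it might behave. `ScoredCitation`, `needs_review`, the 0.0 to 1.0 range, and the 0.8 threshold are all hypothetical choices for the sketch, not the planned API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredCitation:
    # Hypothetical shape; the real field may end up looking different.
    text: str
    source_id: str
    source_title: str
    excerpt: str = ""
    confidence: Optional[float] = None  # None means "not scored"

    def __post_init__(self) -> None:
        # Reject values outside the assumed 0.0 to 1.0 range.
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


def needs_review(citations: list[ScoredCitation],
                 threshold: float = 0.8) -> list[ScoredCitation]:
    """Return citations that are unscored or scored below the threshold."""
    return [
        c for c in citations
        if c.confidence is None or c.confidence < threshold
    ]


direct_quote = ScoredCitation(
    "Gross margin improved to 67%.", "doc-001", "Q3 2025 Earnings Report",
    excerpt="Gross margin: 67% (prev: 61%)", confidence=0.95,
)
inferred = ScoredCitation(
    "Q3 revenue was $2.4M, up 18% year-over-year.", "doc-001",
    "Q3 2025 Earnings Report",
    excerpt="Revenue: $2.4M (Q3 2025) vs $2.03M (Q3 2024)", confidence=0.6,
)
flagged = needs_review([direct_quote, inferred])
```

Here the margin claim restates the excerpt directly, while the 18% figure is computed from two numbers in the excerpt, so only the second lands in the review queue.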

Conflict detection: store.find_conflicts() that looks for two citations where the texts contradict each other (same source_id, different claims about the same quantity). Useful for research agents that aggregate conflicting data.

Export formats: store.render_apa(), store.render_chicago(), store.render_json_ld() for different output contexts. Academic papers, legal documents, and web pages all have different citation format requirements.

Built as part of the agent-stack family: composable Python primitives for production LLM agents.