## The problem RAG doesn't solve
RAG is good for question answering. It's bad for tasks that require knowing dependencies before you act.
When an AI coding agent asks "what breaks if I change RunnableSequence?" — RAG retrieves text chunks that mention RunnableSequence. Approximate. Probabilistic. It might miss the 23 modules that directly import it.
Same problem in life sciences. A PM asks an agent to draft a payer brief for GLP-1 coverage. The agent doesn't know that Insulin Resistance flows upstream to Metabolic Syndrome — which gates Prior Authorization — which determines formulary position. It retrieves similar text and approximates.
The wrong answer in both cases sounds right. That's the risk.
## What I built
ckg-mcp — an open-source Model Context Protocol server that delivers compact knowledge graphs (CKGs) to any MCP-compatible agent orchestrator as pre-action structural context.
```bash
pip install ckg-mcp
```
MCP config:
```json
{
  "mcpServers": {
    "ckg": {
      "command": "ckg-mcp"
    }
  }
}
```
The orchestrator calls tools before dispatching any worker agent:
```python
trace_upstream("RunnableSequence")   # everything this module depends on
trace_downstream("BaseRunnable")     # blast radius: everything that depends on this
find_path("Insulin Resistance", "Prior Authorization")  # causal chain
get_domain_summary()                 # full domain stats
```
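The traversal semantics behind these tools can be sketched in a few lines of plain Python. The graph below is a toy: the edges and the BFS helpers are illustrative only, not the real ckg-mcp implementation or LangChain's actual dependency data.

```python
from collections import deque

# Toy dependency graph: edge (a, b) means "a depends on b".
# Node names and edges are invented for illustration.
EDGES = [
    ("RunnableSequence", "BaseRunnable"),
    ("RunnableParallel", "BaseRunnable"),
    ("AgentExecutor", "RunnableSequence"),
    ("Chain", "RunnableSequence"),
]

def _bfs(start, adjacency):
    """Collect every node reachable from start, excluding start itself."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in adjacency.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen

def trace_upstream(node):
    """Everything `node` depends on, directly or transitively."""
    fwd = {}
    for a, b in EDGES:
        fwd.setdefault(a, []).append(b)
    return _bfs(node, fwd)

def trace_downstream(node):
    """Blast radius: everything that depends on `node`."""
    rev = {}
    for a, b in EDGES:
        rev.setdefault(b, []).append(a)
    return _bfs(node, rev)

print(trace_downstream("RunnableSequence"))  # {'AgentExecutor', 'Chain'}
print(trace_upstream("AgentExecutor"))       # {'RunnableSequence', 'BaseRunnable'}
```

The point of the sketch: every answer is an exact set of nodes reached over real edges, not a similarity-ranked list of text chunks.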
## Two demos
Codebase: Mapped LangChain Core — 180 modules, 650 dependency edges. Before a coding agent edits any module, it calls trace_upstream(module). Gets back the exact dependency subgraph. Then trace_downstream(module) for blast radius. Every hop is a real edge. No guessing.
Clinical: The GLP-1 Clinical Pathway is a 146-node graph with 200+ typed dependency edges: mechanism, market, regulatory, clinical. An agent drafting a payer brief queries the graph first and gets the causal chain from mechanism of action through prior authorization to formulary position. Then it writes.
Both use the same MCP interface. Swap the CSV file, everything else stays the same.
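To make the CSV-swap idea concrete, here is a minimal sketch of loading a typed edge list and answering a `find_path` query over it. The three-column schema (`source,target,type`) and the clinical rows are assumptions for illustration; the real ckg-mcp CSV format may differ.

```python
import csv
import io
from collections import deque

# Hypothetical edge-list CSV in the spirit of the GLP-1 pathway demo.
CLINICAL_CSV = """source,target,type
Insulin Resistance,Metabolic Syndrome,mechanism
Metabolic Syndrome,Prior Authorization,regulatory
Prior Authorization,Formulary Position,market
"""

def load_graph(csv_text):
    """Build an adjacency map from a typed edge-list CSV."""
    graph = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph.setdefault(row["source"], []).append((row["target"], row["type"]))
    return graph

def find_path(graph, start, goal):
    """Shortest dependency chain from start to goal, via BFS."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt, _edge_type in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

graph = load_graph(CLINICAL_CSV)
print(" -> ".join(find_path(graph, "Insulin Resistance", "Formulary Position")))
# Insulin Resistance -> Metabolic Syndrome -> Prior Authorization -> Formulary Position
```

Swapping domains then really is just swapping the file: a codebase CSV would carry `imports` edges instead of `mechanism` or `regulatory` ones, and the same queries apply unchanged.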
## Why it's more efficient
| System | BERT F1 | Cost/Correct Answer | Tokens/Query |
|---|---|---|---|
| CKG | 0.857 | $0.000506 | 274 |
| RAG | 0.817 | $0.013046 | ~3,100 |
| GraphRAG | 0.825 | $0.020098 | ~10,000 |
Per the table: roughly 11x more token-efficient than RAG (274 vs ~3,100 tokens per query), about 26x cheaper per correct answer than RAG, about 40x cheaper than Microsoft GraphRAG, and a higher BERT F1 than both.
The token difference is structural: CKG retrieves exactly what was asked for. RAG passes large text chunks that need synthesis. The model cost difference follows directly.
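As a sanity check, the efficiency ratios fall straight out of the table above:

```python
# Figures copied from the benchmark table.
systems = {
    "CKG":      {"cost": 0.000506, "tokens": 274},
    "RAG":      {"cost": 0.013046, "tokens": 3100},
    "GraphRAG": {"cost": 0.020098, "tokens": 10000},
}

ckg = systems["CKG"]
for name in ("RAG", "GraphRAG"):
    other = systems[name]
    print(f"{name}: {other['tokens'] / ckg['tokens']:.1f}x tokens, "
          f"{other['cost'] / ckg['cost']:.1f}x cost per correct answer")
# RAG: 11.3x tokens, 25.8x cost per correct answer
# GraphRAG: 36.5x tokens, 39.7x cost per correct answer
```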
Tested across 8,121 queries and 47 domains; answers scored with BERTScore (roberta-large).
## Domain-agnostic
The same MCP tools work for:
- Software codebases — blast radius before any edit
- Clinical pathways — causal chain before any payer document
- Regulatory frameworks — dependency chain before any compliance draft
- Financial instruments — prerequisite structure before any analysis
- Educational curricula — learning graph before any content generation
52 domains live. Switch by swapping the CSV file.
## Try it
- Live demo: huggingface.co/spaces/danyarm/ckg-demo
- GitHub: github.com/Yarmoluk/ckg-mcp
- Benchmark: github.com/Yarmoluk/ckg-benchmark
- Site (mapped by persona): graphifymd.com
- Questions: graphifymd@protonmail.com