Standard RAG has a ceiling. If your query requires connecting information across multiple documents — "How did decision A lead to outcome B, which caused problem C?" — vector similarity search fails.
GraphRAG, released by Microsoft Research in 2024, solves this by building a knowledge graph from your documents before any query runs.
Why Standard RAG Fails at Multi-Hop Questions
Vector search retrieves chunks that are semantically similar to the query. But similarity ≠ relationship. Questions like these fail consistently:
❌ "What are all the indirect effects of policy X across departments?"
❌ "Which entities are connected to both A and B?"
❌ "What's the overall theme across this entire document corpus?"
These require traversing relationships between entities — exactly what graphs are built for.
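To make the distinction concrete, here is a minimal sketch in plain Python of the two-hop traversal a graph supports and embedding similarity does not. All entity names and edges are invented for illustration:

# Toy knowledge graph: entity -> list of (relation, entity) edges.
graph = {
    "policy_x": [("affects", "dept_finance"), ("affects", "dept_hr")],
    "dept_finance": [("introduced", "new_approval_flow")],
    "dept_hr": [("cut", "onboarding_budget")],
}

def two_hop_effects(start: str) -> list[tuple[str, str]]:
    """Collect everything reachable in two hops -- the 'indirect effects'."""
    effects = []
    for _, mid in graph.get(start, []):
        for rel, end in graph.get(mid, []):
            effects.append((mid, f"{rel} {end}"))
    return effects

print(two_hop_effects("policy_x"))
# [('dept_finance', 'introduced new_approval_flow'),
#  ('dept_hr', 'cut onboarding_budget')]

No chunk needs to mention policy_x and the onboarding budget together; the connection exists only in the edges. That is the gap GraphRAG fills.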
How GraphRAG Works
Standard RAG:
Document → Chunks → Embeddings → Nearest-neighbor search → Answer
GraphRAG:
Document → Entity extraction (LLM) → Relationship extraction (LLM)
→ Knowledge graph → Community detection (Leiden algorithm)
→ Community summaries (LLM) → stored in Parquet
Query → Graph traversal OR community summary aggregation → Answer
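The community detection step groups related entities into clusters that later get summarized. A minimal sketch of that stage in isolation, using networkx plus graspologic's hierarchical Leiden (the clustering library GraphRAG itself depends on); the toy edges below are invented:

# pip install networkx graspologic
import networkx as nx
from graspologic.partition import hierarchical_leiden

# Toy entity graph; in real GraphRAG these edges come from LLM extraction.
g = nx.Graph()
g.add_edges_from([
    ("alice", "acme"), ("bob", "acme"), ("acme", "merger_2024"),
    ("merger_2024", "regulator"), ("regulator", "policy_x"),
    ("policy_x", "dept_finance"),
])

# Hierarchical Leiden assigns every node to a community at each level
# (level 0 = coarsest); GraphRAG then summarizes each community with the LLM.
for part in hierarchical_leiden(g, max_cluster_size=4):
    print(part.level, part.cluster, part.node)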
Two Query Modes
| Mode | Mechanism | Best For |
|---|---|---|
| Local Search | Traverse subgraph around specific entities | "Who is X?", "What's X's relationship to Y?" |
| Global Search | Aggregate community summaries hierarchically | "What are the main themes?", "Give me the big picture" |
Setup (5 Minutes)
pip install graphrag
mkdir project && cd project
python -m graphrag init --root .
mkdir input && cp your_docs/*.txt input/
echo "GRAPHRAG_API_KEY=sk-..." > .env
Key config in settings.yaml:
llm:
  model: gpt-4o-mini  # Cost-efficient; use gpt-4o for higher quality
  api_key: ${GRAPHRAG_API_KEY}
embeddings:
  llm:
    model: text-embedding-3-small  # $0.02/1M tokens
chunks:
  size: 1200
  overlap: 100
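size and overlap are measured in tokens, and each new chunk advances by size minus overlap. A quick sanity check of how many chunks (and therefore extraction calls) a corpus produces:

import math

def chunk_count(corpus_tokens: int, size: int = 1200, overlap: int = 100) -> int:
    """Chunks cover [0, size), [stride, stride + size), ... with stride = size - overlap."""
    if corpus_tokens <= size:
        return 1
    stride = size - overlap
    return 1 + math.ceil((corpus_tokens - size) / stride)

# ~100 pages at roughly 500 tokens/page
print(chunk_count(50_000))  # 46 chunks, each one an LLM extraction call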
Build the index:
python -m graphrag index --root .
# This calls the LLM to extract entities + relationships + build communities
# ~$0.50-5 per 100 pages (gpt-4o-mini)
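Where does that range come from? A heavily hedged back-of-envelope: total LLM traffic during indexing is some large multiple of corpus size (gleaning passes, entity summaries, community reports). Both numbers below are assumptions, not measurements; run a small sample of your corpus to calibrate:

def index_cost_usd(corpus_tokens: int,
                   traffic_multiplier: float = 40.0,   # assumed: extraction + gleanings + summaries
                   usd_per_mtok: float = 0.25) -> float:  # assumed blended gpt-4o-mini rate
    """Rough estimate only; measure on a sample before indexing everything."""
    return corpus_tokens * traffic_multiplier * usd_per_mtok / 1e6

print(f"${index_cost_usd(50_000):.2f}")  # ~$0.50 for ~100 pages, the low end above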
Running Queries
import asyncio
import pathlib

import pandas as pd
import yaml

import graphrag.api as api
from graphrag.config import GraphRagConfig

# Note: pydantic's model_validate won't expand ${GRAPHRAG_API_KEY};
# recent graphrag releases ship a load_config helper that handles that.
config = GraphRagConfig.model_validate(
    yaml.safe_load(pathlib.Path("settings.yaml").read_text())
)

# Pre-load the graph artifacts written by the indexer.
# File names vary by graphrag version (older releases prefix them
# with create_final_, e.g. create_final_entities.parquet).
output_dir = pathlib.Path("output")
nodes = pd.read_parquet(output_dir / "nodes.parquet")
entities = pd.read_parquet(output_dir / "entities.parquet")
community_reports = pd.read_parquet(output_dir / "community_reports.parquet")
text_units = pd.read_parquet(output_dir / "text_units.parquet")
relationships = pd.read_parquet(output_dir / "relationships.parquet")
async def local_search(query: str) -> str:
    # api.local_search returns a (response, context_data) tuple
    response, _context = await api.local_search(
        config=config,
        nodes=nodes,
        entities=entities,
        community_reports=community_reports,
        text_units=text_units,
        relationships=relationships,
        covariates=None,
        community_level=2,  # which level of the community hierarchy to use
        response_type="Single Paragraph",
        query=query,
    )
    return response
async def global_search(query: str) -> str:
    response, _context = await api.global_search(
        config=config,
        nodes=nodes,
        entities=entities,
        community_reports=community_reports,
        community_level=2,
        dynamic_community_selection=False,  # True lets the LLM pick relevant communities
        response_type="Multiple Paragraphs",
        query=query,
    )
    return response
# Examples
specific = asyncio.run(local_search("What is the relationship between GraphRAG and knowledge graphs?"))
overview = asyncio.run(global_search("Summarize the main themes in this research corpus"))
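The parquet artifacts are also useful on their own. For example, a quick look at the most connected entities; the column names (source, target) follow recent graphrag releases and may differ in yours:

# Which entities sit at the center of the graph?
degree = (
    pd.concat([relationships["source"], relationships["target"]])
    .value_counts()
    .head(10)
)
print(degree)  # top-10 entity names by number of relationships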
LightRAG: Simpler Alternative
If the full Microsoft GraphRAG pipeline is too heavy, LightRAG offers a lightweight alternative:
# pip install lightrag-hku
import asyncio

from lightrag import LightRAG, QueryParam
# Note: these import paths have moved in newer lightrag releases
from lightrag.llm import gpt_4o_mini_complete, openai_embedding

rag = LightRAG(
    working_dir="./cache",
    llm_model_func=gpt_4o_mini_complete,
    embedding_func=openai_embedding,
)

async def main() -> None:
    await rag.ainsert(open("docs.txt").read())

    # Four modes in one API
    naive = await rag.aquery("question", param=QueryParam(mode="naive"))     # Standard RAG
    local = await rag.aquery("question", param=QueryParam(mode="local"))     # Local graph
    global_ = await rag.aquery("question", param=QueryParam(mode="global"))  # Global summaries
    hybrid = await rag.aquery("question", param=QueryParam(mode="hybrid"))   # Best of both

asyncio.run(main())
GraphRAG vs Standard RAG: Decision Matrix
| Factor | Standard RAG | GraphRAG |
|---|---|---|
| Corpus size | Up to ~500 pages | 500–10,000+ pages |
| Query type | Factual lookup | Relational, multi-hop |
| Latency | < 2 seconds | 5–30 seconds |
| Index cost | Low (embeddings only) | High (LLM extraction) |
| Maintenance | Easy (re-embed on update) | Complex (re-extract on update) |
| Sweet spot | FAQ, manuals, support docs | Research corpora, legal docs, knowledge bases |
Rule of thumb: Start with standard RAG. If multi-hop queries fail consistently, add GraphRAG for those query types.
Combining Both: Agentic Graph-RAG
The most powerful pattern heading into 2026 routes queries dynamically:
import asyncio

from langchain.tools import tool
from langchain_openai import ChatOpenAI

@tool
def graph_search(query: str) -> str:
    """Use when the question involves relationships, causality, or the big picture."""
    return asyncio.run(global_search(query))  # global_search() from the GraphRAG section above

@tool
def vector_search(query: str) -> str:
    """Use when the question asks for specific facts or recent information."""
    # retriever = your existing vector-store retriever; join docs into one string
    return "\n\n".join(d.page_content for d in retriever.invoke(query))

# Agent selects the right tool based on the question
from langchain.agents import AgentExecutor, create_react_agent

agent = create_react_agent(
    llm=ChatOpenAI(model="gpt-4o"),
    tools=[graph_search, vector_search],
    prompt=agent_prompt,  # a ReAct prompt template, e.g. from the LangChain hub
)
executor = AgentExecutor(agent=agent, tools=[graph_search, vector_search])
# Complex relational question → graph_search
# Simple factual question → vector_search
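A hypothetical invocation, assuming the executor wiring above (the question is invented):

result = executor.invoke(
    {"input": "How did the 2023 reorg affect every team that touches billing?"}
)
print(result["output"])  # the agent picks graph_search for this relational question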
The Honest Tradeoff
GraphRAG is genuinely better for relationship-heavy corpora. But it's not a drop-in upgrade:
- Index build time: Minutes to hours depending on corpus size
- Rebuild cost: Any document update requires re-running extraction (expensive)
- Latency: Global search can take 15–30s — not suitable for real-time chat
For most teams: use standard RAG for 90% of queries and GraphRAG specifically for the "tell me about everything related to X" class of questions.
Explore 471+ AI tools including GraphRAG, LightRAG, and every major RAG infrastructure option at AgDex.ai