Standard RAG has a ceiling. If your query requires connecting information across multiple documents — "How did decision A lead to outcome B, which caused problem C?" — vector similarity search fails.
GraphRAG, released by Microsoft Research in 2024, solves this by building a knowledge graph from your documents before any query runs.
Why Standard RAG Fails at Multi-Hop Questions
Vector search retrieves chunks that are semantically similar to the query. But similarity ≠ relationship. Questions like these fail consistently:
❌ "What are all the indirect effects of policy X across departments?"
❌ "Which entities are connected to both A and B?"
❌ "What's the overall theme across this entire document corpus?"
These require traversing relationships between entities — exactly what graphs are built for.
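To make the distinction concrete, here is a minimal sketch in plain Python of the two-hop traversal a graph supports and embedding similarity does not. All entity names and edges are invented for illustration:

# Toy knowledge graph: entity -> list of (relation, entity) edges.
graph = {
    "policy_x": [("affects", "dept_finance"), ("affects", "dept_hr")],
    "dept_finance": [("introduced", "new_approval_flow")],
    "dept_hr": [("cut", "onboarding_budget")],
}

def two_hop_effects(start: str) -> list[tuple[str, str]]:
    """Collect everything reachable in two hops -- the 'indirect effects'."""
    effects = []
    for _, mid in graph.get(start, []):
        for rel, end in graph.get(mid, []):
            effects.append((mid, f"{rel} {end}"))
    return effects

print(two_hop_effects("policy_x"))
# [('dept_finance', 'introduced new_approval_flow'),
#  ('dept_hr', 'cut onboarding_budget')]

No chunk needs to mention policy_x and the onboarding budget together; the connection exists only in the edges. That is the gap GraphRAG fills.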
How GraphRAG Works
Standard RAG:
Document → Chunks → Embeddings → Nearest-neighbor search → Answer
GraphRAG:
Document → Entity extraction (LLM) → Relationship extraction (LLM)
→ Knowledge graph → Community detection (Leiden algorithm)
→ Community summaries (LLM) → stored in Parquet
Query → Graph traversal OR community summary aggregation → Answer
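The community detection step groups related entities into clusters that later get summarized. A minimal sketch of that stage in isolation, using networkx plus graspologic's hierarchical Leiden (the clustering library GraphRAG itself depends on); the toy edges below are invented:

# pip install networkx graspologic
import networkx as nx
from graspologic.partition import hierarchical_leiden

# Toy entity graph; in real GraphRAG these edges come from LLM extraction.
g = nx.Graph()
g.add_edges_from([
    ("alice", "acme"), ("bob", "acme"), ("acme", "merger_2024"),
    ("merger_2024", "regulator"), ("regulator", "policy_x"),
    ("policy_x", "dept_finance"),
])

# Hierarchical Leiden assigns every node to a community at each level
# (level 0 = coarsest); GraphRAG then summarizes each community with the LLM.
for part in hierarchical_leiden(g, max_cluster_size=4):
    print(part.level, part.cluster, part.node)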
Two Query Modes
| Mode | Mechanism | Best For |
|---|---|---|
| Local Search | Traverse subgraph around specific entities | "Who is X?", "What's X's relationship to Y?" |
| Global Search | Aggregate community summaries hierarchically | "What are the main themes?", "Give me the big picture" |
Setup (5 Minutes)
pip install graphrag
mkdir project && cd project
python -m graphrag init --root .
mkdir input && cp your_docs/*.txt input/
echo "GRAPHRAG_API_KEY=sk-..." > .env
Key config in settings.yaml:
llm:
  model: gpt-4o-mini  # Cost-efficient; use gpt-4o for higher quality
  api_key: ${GRAPHRAG_API_KEY}
embeddings:
  llm:
    model: text-embedding-3-small  # $0.02/1M tokens
chunks:
  size: 1200
  overlap: 100
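size and overlap are measured in tokens, and each new chunk advances by size minus overlap. A quick sanity check of how many chunks (and therefore extraction calls) a corpus produces:

import math

def chunk_count(corpus_tokens: int, size: int = 1200, overlap: int = 100) -> int:
    """Chunks cover [0, size), [stride, stride + size), ... with stride = size - overlap."""
    if corpus_tokens <= size:
        return 1
    stride = size - overlap
    return 1 + math.ceil((corpus_tokens - size) / stride)

# ~100 pages at roughly 500 tokens/page
print(chunk_count(50_000))  # 46 chunks, each one an LLM extraction call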
Build the index:
python -m graphrag index --root .
# This calls the LLM to extract entities + relationships + build communities
# ~$0.50-5 per 100 pages (gpt-4o-mini)
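Where does that range come from? A heavily hedged back-of-envelope: total LLM traffic during indexing is some large multiple of corpus size (gleaning passes, entity summaries, community reports). Both numbers below are assumptions, not measurements; run a small sample of your corpus to calibrate:

def index_cost_usd(corpus_tokens: int,
                   traffic_multiplier: float = 40.0,   # assumed: extraction + gleanings + summaries
                   usd_per_mtok: float = 0.25) -> float:  # assumed blended gpt-4o-mini rate
    """Rough estimate only; measure on a sample before indexing everything."""
    return corpus_tokens * traffic_multiplier * usd_per_mtok / 1e6

print(f"${index_cost_usd(50_000):.2f}")  # ~$0.50 for ~100 pages, the low end above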
Running Queries
import asyncio
import pathlib

import pandas as pd
import yaml

import graphrag.api as api
from graphrag.config import GraphRagConfig

# Note: pydantic's model_validate won't expand ${GRAPHRAG_API_KEY};
# recent graphrag releases ship a load_config helper that handles that.
config = GraphRagConfig.model_validate(
    yaml.safe_load(pathlib.Path("settings.yaml").read_text())
)

# Pre-load the graph artifacts written by the indexer.
# File names vary by graphrag version (older releases prefix them
# with create_final_, e.g. create_final_entities.parquet).
output_dir = pathlib.Path("output")
nodes = pd.read_parquet(output_dir / "nodes.parquet")
entities = pd.read_parquet(output_dir / "entities.parquet")
community_reports = pd.read_parquet(output_dir / "community_reports.parquet")
text_units = pd.read_parquet(output_dir / "text_units.parquet")
relationships = pd.read_parquet(output_dir / "relationships.parquet")
async def local_search(query: str) -> str:
    # api.local_search returns a (response, context_data) tuple
    response, _context = await api.local_search(
        config=config,
        nodes=nodes,
        entities=entities,
        community_reports=community_reports,
        text_units=text_units,
        relationships=relationships,
        covariates=None,
        community_level=2,  # which level of the community hierarchy to use
        response_type="Single Paragraph",
        query=query,
    )
    return response
async def global_search(query: str) -> str:
    response, _context = await api.global_search(
        config=config,
        nodes=nodes,
        entities=entities,
        community_reports=community_reports,
        community_level=2,
        dynamic_community_selection=False,  # True lets the LLM pick relevant communities
        response_type="Multiple Paragraphs",
        query=query,
    )
    return response
# Examples
specific = asyncio.run(local_search("What is the relationship between GraphRAG and knowledge graphs?"))
overview = asyncio.run(global_search("Summarize the main themes in this research corpus"))
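The parquet artifacts are also useful on their own. For example, a quick look at the most connected entities; the column names (source, target) follow recent graphrag releases and may differ in yours:

# Which entities sit at the center of the graph?
degree = (
    pd.concat([relationships["source"], relationships["target"]])
    .value_counts()
    .head(10)
)
print(degree)  # top-10 entity names by number of relationships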
LightRAG: Simpler Alternative
If the full Microsoft GraphRAG pipeline is too heavy, LightRAG offers a lightweight alternative:
# pip install lightrag-hku
import asyncio

from lightrag import LightRAG, QueryParam
# Note: these import paths have moved in newer lightrag releases
from lightrag.llm import gpt_4o_mini_complete, openai_embedding

rag = LightRAG(
    working_dir="./cache",
    llm_model_func=gpt_4o_mini_complete,
    embedding_func=openai_embedding,
)

async def main() -> None:
    await rag.ainsert(open("docs.txt").read())

    # Four modes in one API
    naive = await rag.aquery("question", param=QueryParam(mode="naive"))     # Standard RAG
    local = await rag.aquery("question", param=QueryParam(mode="local"))     # Local graph
    global_ = await rag.aquery("question", param=QueryParam(mode="global"))  # Global summaries
    hybrid = await rag.aquery("question", param=QueryParam(mode="hybrid"))   # Best of both

asyncio.run(main())
GraphRAG vs Standard RAG: Decision Matrix
| Factor | Standard RAG | GraphRAG |
|---|---|---|
| Corpus size | Up to ~500 pages | 500–10,000+ pages |
| Query type | Factual lookup | Relational, multi-hop |
| Latency | < 2 seconds | 5–30 seconds |
| Index cost | Low (embeddings only) | High (LLM extraction) |
| Maintenance | Easy (re-embed on update) | Complex (re-extract on update) |
| Sweet spot | FAQ, manuals, support docs | Research corpora, legal docs, knowledge bases |
Rule of thumb: Start with standard RAG. If multi-hop queries fail consistently, add GraphRAG for those query types.
Combining Both: Agentic Graph-RAG
The most powerful pattern heading into 2026 routes queries dynamically:
import asyncio

from langchain.tools import tool
from langchain_openai import ChatOpenAI

@tool
def graph_search(query: str) -> str:
    """Use when the question involves relationships, causality, or the big picture."""
    return asyncio.run(global_search(query))  # global_search() from the GraphRAG section above

@tool
def vector_search(query: str) -> str:
    """Use when the question asks for specific facts or recent information."""
    # retriever = your existing vector-store retriever; join docs into one string
    return "\n\n".join(d.page_content for d in retriever.invoke(query))

# Agent selects the right tool based on the question
from langchain.agents import AgentExecutor, create_react_agent

agent = create_react_agent(
    llm=ChatOpenAI(model="gpt-4o"),
    tools=[graph_search, vector_search],
    prompt=agent_prompt,  # a ReAct prompt template, e.g. from the LangChain hub
)
executor = AgentExecutor(agent=agent, tools=[graph_search, vector_search])
# Complex relational question → graph_search
# Simple factual question → vector_search
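A hypothetical invocation, assuming the executor wiring above (the question is invented):

result = executor.invoke(
    {"input": "How did the 2023 reorg affect every team that touches billing?"}
)
print(result["output"])  # the agent picks graph_search for this relational question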
The Honest Tradeoff
GraphRAG is genuinely better for relationship-heavy corpora. But it's not a drop-in upgrade:
- Index build time: Minutes to hours depending on corpus size
- Rebuild cost: Any document update requires re-running extraction (expensive)
- Latency: Global search can take 15–30s — not suitable for real-time chat
For most teams: use standard RAG for 90% of queries and GraphRAG specifically for the "tell me about everything related to X" class of questions.
Explore 471+ AI tools including GraphRAG, LightRAG, and every major RAG infrastructure option at AgDex.ai