Introduction
I introduced RAG for LLM inference in the previous post in this series. As I mentioned, RAG has some limitations, so there are several advanced methods to overcome them. One of the most well-known approaches is to utilize graphs. This post will cover what a Knowledge Graph is, how RAG utilizes KGs, and how to build one.
Knowledge Graph
Back in the early days of the Internet, search engines such as Yahoo, AltaVista, and Lycos relied mainly on full-text search. This worked well enough until the volume of online information exploded. As information accumulated, answering a query meant scanning enormous numbers of pages, checking whether each one contained relevant content, and often returning far too many results.
Google and PageRank
To solve this problem, Google introduced PageRank, first described in 1998.


It was a technology that modeled the web as a graph and ranked the most relevant documents at the top of search results, using graph algorithms based on probability distributions. It made Google one of the fastest-growing companies in the world at the time and beat the pants off companies still relying on plain text search. Google dominated internet search with this algorithm until 2012.
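The core idea fits in a few lines. Below is a toy power-iteration sketch of PageRank on a four-page web graph; the page names, link structure, and the 0.85 damping factor are illustrative, and real PageRank of course runs at web scale with sparse-matrix methods.

```python
# Toy PageRank via power iteration.
# Each iteration: every page keeps a baseline (1 - damping) / n of rank,
# then receives a damped share of rank from each page linking to it.

def pagerank(links: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            share = rank[page] / len(outlinks) if outlinks else 0.0
            for target in outlinks:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

if __name__ == "__main__":
    # A, B, D all link to C; C links back to A, so C accumulates the most rank.
    web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

Because every page here has at least one out-link, the total rank mass stays at 1.0 across iterations; pages with many incoming links (C) float to the top, which is exactly the "relevance from link structure" intuition.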
Google and Knowledge Graph
What beat Google was Google. Google published a famous blog post in 2012: Introducing the Knowledge Graph: things, not strings
What does this mean? During the PageRank era, the search engine mostly showed relevant documents that frequently referenced the keywords in your query or were statistically related to them. But the Knowledge Graph helps search engines understand your queries. It captures the concepts behind words. That is what Google meant by "things, not strings."

These days, when you search for something on Google, it doesn't only return information about the exact thing you searched for. It also shows related information, such as family members, coworkers, accomplishments, books, and related locations.
So, what is a Knowledge Graph and how do we build it?
What Is a Knowledge Graph?
A Knowledge Graph is a structured representation of information where real-world objects or concepts are stored as entities, and the connections between them are stored as relationships.
For example:
(Tom Hanks) -[:ACTED_IN]-> (Forrest Gump)
(Forrest Gump) -[:DIRECTED_BY]-> (Robert Zemeckis)
When you enter a query into Google, instead of simply asking "Which documents contain the words in your query?", Google might ask "Which entities are related to this entity, and how are they connected?"
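That entity-centric question can be made concrete with a toy in-memory triple store; the triples here are just the examples above plus one extra, and the lookup is deliberately naive.

```python
# Minimal in-memory knowledge graph: (subject, predicate, object) triples
# plus a neighbor lookup, mirroring the "things, not strings" idea.

TRIPLES = [
    ("Tom Hanks", "ACTED_IN", "Forrest Gump"),
    ("Forrest Gump", "DIRECTED_BY", "Robert Zemeckis"),
    ("Tom Hanks", "ACTED_IN", "Cast Away"),
]

def related(entity: str) -> list[tuple[str, str, str]]:
    """Return every triple in which the entity appears as subject or object."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]

if __name__ == "__main__":
    for triple in related("Forrest Gump"):
        print(triple)
```

Asking `related("Forrest Gump")` answers "which entities are connected to this one, and how" rather than "which documents contain this string" — the same shift a real KG-backed search engine makes, just at microscopic scale.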
Core Knowledge Graph Concepts
Entity
An entity is a real-world object or concept.
Examples:
Person: "Marie Curie"
Organization: "OpenAI"
Product: "iPhone"
Disease: "Diabetes"
Concept: "Retrieval-Augmented Generation"
In popular graph database systems such as Neo4j, entities are usually represented as nodes. Neo4j's property graph model represents domain objects as nodes and relationships as directed connections, with additional information stored as properties.
Relationship
A relationship describes how two entities are connected.
Examples:
Marie Curie DISCOVERED Radium
OpenAI DEVELOPED ChatGPT
Employee WORKS_AT Company
Paper CITES Paper
Triple
A triple is a simple way to represent a fact:
(subject, predicate, object)
Example:
(Marie Curie) -[:DISCOVERED]-> (Radium)
Ontology and Schema
An ontology defines the types of entities and relationships allowed in your graph.
For example, in a medical KG:
Entity types:
- Patient
- Disease
- Medication
- Symptom
Relationship types:
- HAS_SYMPTOM
- DIAGNOSED_WITH
- PRESCRIBED
- INTERACTS_WITH
A schema is the implementation-level structure of the graph: labels, relationship types, property names, constraints, and indexes. Neo4j describes schema as the prescribed property existence and data types for nodes and relationships.
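A minimal way to enforce such an ontology in code is to whitelist the allowed (head type, relation, tail type) combinations and check every candidate triple before inserting it. This is a hand-rolled sketch, not Neo4j's constraint mechanism; the allowed set reuses the medical example above.

```python
# Ontology check: only triples whose type signature appears in ALLOWED
# are accepted into the graph.

ALLOWED = {
    ("Patient", "HAS_SYMPTOM", "Symptom"),
    ("Patient", "DIAGNOSED_WITH", "Disease"),
    ("Patient", "PRESCRIBED", "Medication"),
    ("Medication", "INTERACTS_WITH", "Medication"),
}

def is_valid(head_type: str, relation: str, tail_type: str) -> bool:
    return (head_type, relation, tail_type) in ALLOWED

if __name__ == "__main__":
    print(is_valid("Patient", "DIAGNOSED_WITH", "Disease"))  # valid per ontology
    print(is_valid("Symptom", "PRESCRIBED", "Patient"))      # rejected
```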
Entity and Relation Extraction Methods
So, how can we extract entities and relations from text?
A good KG extraction result should contain not only (head, relation, tail), but also entity types, evidence text, source document/chunk ID, confidence score, and sometimes properties.
Example:
```json
{
  "head": {"id": "Brian Chesky", "type": "Person"},
  "relation": "FOUNDED",
  "tail": {"id": "Airbnb", "type": "Company"},
  "evidence": "Brian Chesky founded Airbnb in San Francisco in 2008.",
  "source_chunk_id": "doc1_chunk3",
  "confidence": 0.91
}
```
There are several methods for extracting entities and relations.
Extraction Method 1: spaCy
spaCy is best when you want a fast, local, deterministic, production-friendly NLP pipeline. Compared with an LLM-based approach, it is simpler and cheaper.
spaCy processes raw text into structured linguistic annotations through the following pipeline:
```text
Raw text
  -> tokenizer
  -> tok2vec / transformer
  -> tagger / morphologizer
  -> dependency parser
  -> NER
  -> custom components, if any
```
How does it recognize entities and relations?
Entity Recognition: The simplest way is to use a neural model. spaCy uses a lightweight deep learning model, such as the one included in en_core_web_sm. The official docs describe EntityRecognizer as a transition-based NER component that identifies non-overlapping labeled spans and stores them in Doc.ents. Therefore, its entity types are restricted to the pretrained label set.
You can easily extract entities from text with this NER pipeline.
```python
import spacy
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start_char: int
    end_char: int


def extract_entities_spacy(text: str) -> list[Entity]:
    """Extract named entities with spaCy's pretrained NER pipeline."""
    nlp = spacy.load("en_core_web_sm")
    doc = nlp(text)
    return [
        Entity(
            text=ent.text,
            label=ent.label_,
            start_char=ent.start_char,
            end_char=ent.end_char,
        )
        for ent in doc.ents
    ]


if __name__ == "__main__":
    text = "Brian Chesky founded Airbnb in San Francisco in 2008."
    for ent in extract_entities_spacy(text):
        print(ent)
```
Output:
```text
Entity(text='Brian Chesky', label='PERSON', start_char=0, end_char=12)
Entity(text='Airbnb', label='ORG', start_char=21, end_char=27)
Entity(text='San Francisco', label='GPE', start_char=31, end_char=44)
Entity(text='2008', label='DATE', start_char=48, end_char=52)
```
Another option is a non-neural, rule-based component such as EntityRuler, which relies on explicit patterns.
Relation Recognition: For relations, spaCy usually relies on dependency parsing and rule-based patterns. You can also use a custom-trained relation extraction model.
As you can see in the example, it can easily handle a sentence like "Brian Chesky founded Airbnb in San Francisco in 2008."
But it may struggle with sentences like these: "Brian Chesky is one of the people behind Airbnb.", "The company was started in 2008 by Brian Chesky and others."
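To see why, here is a simplified sketch of the subject-verb-object rule that a dependency parse enables. A real pipeline would run spaCy's DependencyMatcher over an actual parse; here each token is hand-annotated as `(text, dependency label, head index)` so the logic stands alone. The rule fires on "Brian Chesky founded Airbnb" but extracts nothing from a copular paraphrase, because there is no `nsubj`/`dobj` pair hanging off a founding verb.

```python
# Simplified dependency-based relation extraction: for every nsubj arc,
# pair the subject with any dobj sharing the same head verb.

Token = tuple[str, str, int]  # (text, dependency label, index of head token)

def extract_svo(tokens: list[Token]) -> list[tuple[str, str, str]]:
    """Extract (subject, VERB, object) triples from nsubj/dobj arcs."""
    triples = []
    for text, dep, head in tokens:
        if dep == "nsubj":
            verb = tokens[head][0]
            objs = [t for t, d, h in tokens if d == "dobj" and h == head]
            triples.extend((text, verb.upper(), obj) for obj in objs)
    return triples

if __name__ == "__main__":
    # "Brian Chesky founded Airbnb", with the entity spans merged into one token.
    parsed = [
        ("Brian Chesky", "nsubj", 1),
        ("founded", "ROOT", 1),
        ("Airbnb", "dobj", 1),
    ]
    print(extract_svo(parsed))  # [('Brian Chesky', 'FOUNDED', 'Airbnb')]
```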
Extraction Method 2: GLiNER / GLiNER2
GLiNER is a lightweight NER framework designed to identify arbitrary entity types using label descriptions, rather than being limited to a fixed NER label set (one limitation of spaCy). Its documentation describes it as a practical middle ground between traditional NER and expensive LLM-based extraction.
GLiNER2 extends this idea. It is designed as a schema-based information extraction framework that supports multiple tasks such as NER, text classification, structured extraction, and relation extraction.
The GLiNER2 GitHub page says it is a unified schema-based information extraction (IE) model for entity extraction, classification, structured extraction, and relation extraction in one efficient model.
So you can define your own entity types and relation types with GLiNER2.
Simple Example:
```python
from gliner2 import GLiNER2


def extract_graph_gliner2(text: str) -> dict:
    """Extract entities and relations with a GLiNER2 schema."""
    extractor = GLiNER2.from_pretrained("fastino/gliner2-base-v1")
    schema = (
        extractor.create_schema()
        .entities(
            {
                "person": "Names of people, founders, executives, or researchers",
                "company": "Organizations, companies, startups, or labs",
                "location": "Cities, regions, or countries",
                "date": "Years or specific dates",
            }
        )
        .relations(
            {
                "founded": "Relationship where a person created or co-created a company",
                "located_in": "Relationship where an organization is based in a location",
                "works_for": "Employment or affiliation relationship",
            }
        )
    )
    return extractor.extract(text, schema, include_confidence=True)


if __name__ == "__main__":
    text = "Brian Chesky founded Airbnb in San Francisco in 2008."
    result = extract_graph_gliner2(text)
    print(result)
```
Output:
```python
{
    'entities': {
        'person': [{'text': 'Brian Chesky', 'confidence': 0.9999997615814209}],
        'company': [{'text': 'Airbnb', 'confidence': 0.9999885559082031}],
        'location': [{'text': 'San Francisco', 'confidence': 0.9999994039535522}],
        'date': [{'text': '2008', 'confidence': 0.9999909400939941}],
    },
    'relation_extraction': {
        'founded': [{'head': {'text': 'Brian Chesky', 'confidence': 0.999990701675415},
                     'tail': {'text': 'Airbnb', 'confidence': 0.9999719858169556}}],
        'located_in': [{'head': {'text': 'Airbnb', 'confidence': 0.9986342787742615},
                        'tail': {'text': 'San Francisco', 'confidence': 0.9998704195022583}}],
        'works_for': [],
    },
}
```
Extraction Method 3: LLM-based Extraction
LLM-based extraction is currently the most flexible and widely used method for KG construction, especially when documents are complex, relationships are implicit, or the ontology is still evolving. Neo4j’s docs say modern LLMs can be instructed with prompts, examples, schemas, existing entities, and output formatting to extract and deduplicate entities and relationships from unstructured text.
If you define a structured output with Pydantic, the LLM can return data in a structure that fits KG triples.
Example of structured output:
```python
from typing import Literal

from pydantic import BaseModel, Field

EntityType = Literal["Person", "Company", "Location", "Date"]
RelationType = Literal["FOUNDED", "LOCATED_IN", "WORKS_AT"]


class Entity(BaseModel):
    id: str = Field(description="Canonical entity name")
    type: EntityType
    evidence: str = Field(description="Text span supporting the entity")


class Relation(BaseModel):
    head: str = Field(description="Canonical head entity id")
    relation: RelationType
    tail: str = Field(description="Canonical tail entity id")
    evidence: str = Field(description="Exact sentence or phrase supporting the relation")


class KGExtraction(BaseModel):
    entities: list[Entity]
    relations: list[Relation]
```
Note: Even if you define a structured output schema, you should validate and deserialize the output. LLMs can still ignore the structured format and return invalid output. This happens more often with smaller models.
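With Pydantic v2, that validation step can look like the sketch below. The `parse_extraction` helper and the fall-back-to-empty behavior are my own illustration, not a library API; the schema is a trimmed version of the one above.

```python
# Defensive parsing of LLM output: validate against the Pydantic schema and
# degrade gracefully when the model returns invalid JSON or a bad enum value.

from typing import Literal

from pydantic import BaseModel, ValidationError

EntityType = Literal["Person", "Company", "Location", "Date"]

class Entity(BaseModel):
    id: str
    type: EntityType

class KGExtraction(BaseModel):
    entities: list[Entity]

def parse_extraction(raw: str) -> KGExtraction:
    try:
        return KGExtraction.model_validate_json(raw)
    except ValidationError:
        # Smaller models sometimes ignore the schema; return an empty extraction.
        return KGExtraction(entities=[])

if __name__ == "__main__":
    good = '{"entities": [{"id": "Airbnb", "type": "Company"}]}'
    bad = '{"entities": [{"id": "Airbnb", "type": "Startup"}]}'
    print(len(parse_extraction(good).entities))  # 1
    print(len(parse_extraction(bad).entities))   # 0
```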
The LangChain framework provides an LLMGraphTransformer class in langchain_experimental. You can use it like this:
```python
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer

load_dotenv()

llm = init_chat_model("openai:gpt-5-nano")

documents = [
    Document(
        page_content="Brian Chesky founded Airbnb in San Francisco in 2008.",
        metadata={"source": "example_doc_1"},
    )
]

llm_transformer = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=["Person", "Company", "Location", "Date"],
    allowed_relationships=[
        ("Person", "FOUNDED", "Company"),
        ("Company", "LOCATED_IN", "Location"),
    ],
    strict_mode=False,
    node_properties=["name"],
    relationship_properties=["evidence"],
    additional_instructions=(
        "Extract only relationships explicitly stated in the text. "
        "Use canonical entity names. "
        "Do not create vague nodes such as 'startup' or 'company'."
    ),
)

graph_documents = llm_transformer.convert_to_graph_documents(documents)
for graph_doc in graph_documents:
    print("Nodes:", graph_doc.nodes)
    print("Relationships:", graph_doc.relationships)
```
Output:
```text
Nodes: [Node(id='Brian Chesky', type='Person', properties={'name': 'Brian Chesky'}), Node(id='Airbnb', type='Company', properties={'name': 'Airbnb'}), Node(id='San Francisco', type='Location', properties={'name': 'San Francisco'})]
Relationships: [Relationship(source=Node(id='Brian Chesky', type='Person', properties={}), target=Node(id='Airbnb', type='Company', properties={}), type='FOUNDED', properties={'evidence': '2008-01-01'}), Relationship(source=Node(id='Airbnb', type='Company', properties={}), target=Node(id='San Francisco', type='Location', properties={}), type='LOCATED_IN', properties={})]
```
Neo4j’s current guide shows essentially this pattern: use LLMGraphTransformer, pass allowed_nodes, allowed_relationships, node_properties, convert documents to graph documents, then add them to Neo4j with include_source=True.
Comparison: spaCy vs. GLiNER2 vs. LLM-based Extraction
| Method | Strengths | Weaknesses | Best use case |
|---|---|---|---|
| spaCy | Fast, mature, low-cost, production-friendly | Fixed labels unless custom-trained; limited relation extraction | Standard NER, preprocessing, high-throughput pipelines |
| GLiNER2 | Flexible schema-based extraction; lighter than LLMs | Newer ecosystem; may need evaluation for your domain | Custom entity extraction and lightweight IE |
| LLM-based extraction | Strong semantic understanding; flexible schema and relation extraction | Higher cost, higher latency, hallucination risk | Complex KG construction from messy documents |
| Hybrid | Balances speed, cost, and quality | More engineering complexity | Production GraphRAG systems |
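The hybrid row can be sketched as a simple router: run the cheap extractor first and escalate to the LLM only when it returns nothing or is unsure. The extractor callables below are stand-ins for real spaCy/GLiNER2/LLM calls, and the threshold is an arbitrary illustration.

```python
# Hybrid extraction router: fast path for confident cheap results,
# slow (LLM) path otherwise.

from typing import Callable

Extraction = list[dict]

def hybrid_extract(
    text: str,
    cheap: Callable[[str], Extraction],
    llm: Callable[[str], Extraction],
    min_confidence: float = 0.7,
) -> Extraction:
    results = cheap(text)
    if results and all(r["confidence"] >= min_confidence for r in results):
        return results  # fast path: the cheap extractor was confident
    return llm(text)    # slow path: escalate to the LLM

if __name__ == "__main__":
    cheap = lambda t: [{"text": "Airbnb", "confidence": 0.95}] if "Airbnb" in t else []
    llm = lambda t: [{"text": "the company", "confidence": 0.8}]
    print(hybrid_extract("Brian Chesky founded Airbnb.", cheap, llm))
    print(hybrid_extract("The company was started in 2008.", cheap, llm))
```

In production you would also log which path each chunk took, since the escalation rate drives your LLM bill.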
GraphRAG and GraphDB
What is GraphRAG? In February 2024, Microsoft published a blog post about GraphRAG: GraphRAG Blog Post; arXiv Paper
This approach is often described as using LLM-Derived Knowledge Graphs. It utilizes a Knowledge Graph and lets LLMs traverse the graph to retrieve information. So, the concept is simple: Connect the LLM to a Knowledge Graph. But how?
One approach is to have the model generate queries in Cypher, a graph query language. If you want to build your own graph database, you need to learn Cypher. (Admittedly, I'm not that proficient at it. 😒 But it's not so different from other DB query languages.) Then, where should you store the graph? There are several graph DB systems, and one of the most popular is Neo4j. Others include AWS Neptune, Azure Cosmos DB, Memgraph, and NebulaGraph. Neo4j is open source only in its Community Edition, so keep that in mind if you need an enterprise graph DB service.
How to build GraphDB
I will demonstrate GraphDB construction using Neo4j AuraDB. You can build a Neo4j graph locally, but AuraDB is convenient for a demo. (Of course, it is not free for enterprise production use.)
As in the first post, I will use the "Demon Slayer" series for the demonstration.
First, when you create an AuraDB instance, it will show you the connection details for your graph, such as the username, password, database name, and URI.
DB Connection Functions:
```python
import os
from dataclasses import dataclass

from dotenv import load_dotenv
from langchain_neo4j import Neo4jGraph

# Load the AuraDB credentials from .env before reading them below.
load_dotenv()


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    username: str
    password: str
    database: str = "neo4j"


def get_neo4j_settings() -> Neo4jSettings:
    uri = os.getenv("NEO4J_URI") or os.getenv("NEO4J_URL")
    username = os.getenv("NEO4J_USERNAME") or os.getenv("NEO4J_USER")
    password = os.getenv("NEO4J_PASSWORD")
    database = os.getenv("NEO4J_DATABASE", "neo4j")
    missing = [
        name
        for name, value in {
            "NEO4J_URI": uri,
            "NEO4J_USERNAME": username,
            "NEO4J_PASSWORD": password,
        }.items()
        if not value
    ]
    if missing:
        raise RuntimeError(
            f"Missing required Neo4j environment variables: {', '.join(missing)}"
        )
    return Neo4jSettings(uri=uri, username=username, password=password, database=database)


def get_graph(refresh_schema: bool = False, enhanced_schema: bool = False) -> Neo4jGraph:
    settings = get_neo4j_settings()
    return Neo4jGraph(
        url=settings.uri,
        username=settings.username,
        password=settings.password,
        database=settings.database,
        refresh_schema=refresh_schema,
        enhanced_schema=enhanced_schema,
        sanitize=True,
    )
```
As shown earlier, LangChain's LLMGraphTransformer lets you transform documents into graph-compatible triples very easily.
```python
from langchain_experimental.graph_transformers import LLMGraphTransformer

# get_llm(), batch, and graph come from the surrounding pipeline code.
llm_transformer = LLMGraphTransformer(
    llm=get_llm(),
    node_properties=["description"],
    relationship_properties=["evidence"],
)
graph_documents = llm_transformer.convert_to_graph_documents(batch)
graph.add_graph_documents(graph_documents, include_source=True, baseEntityLabel=True)
```

You can search the graph with a query like this:
```cypher
MATCH (n1)-[r]-(n2)
WHERE n1.id =~ ".*Tanjirou.*" AND NOT n2:Document
RETURN n1, r, n2
```
The result can look cluttered because it includes many minor objects, such as doors and stones. You can restrict the node and relationship types. I set allowed_nodes like this:
```python
llm_transformer = LLMGraphTransformer(
    llm=get_llm(model="xai:grok-4.3"),
    node_properties=["description"],
    relationship_properties=["evidence"],
    allowed_nodes=[
        "Person",
        "Organization",
        "Location",
        "Breathing",
        "Weapon",
        "Demon",
        "Object",
    ],
)
```
Then I fed in the Season 1, Episode 1 document. Let's see how the LLM generated the graph structure and built the graph. I'm tracking all my LLM calls in LangSmith, so let's look at some of them.
I used a total of 56.1k tokens, and the process took 367 seconds for one episode with Grok-4.3 model. Unsolicited tip, but the API cost for this brand-new frontier model is so low that it might be the best model for this kind of demo or personal project. Thanks, Elon! 😊



You can see the default instructions in LLMGraphTransformer, and you can also override them with your own prompt. The default prompt is very simple, but I think it is enough to create a suitable graph. For production-level projects, though, you'd be better off writing your own prompt meticulously. If you read the prompt closely, you may notice that it doesn't directly mention the allowed nodes; the transformer just checks and trims the output after generation. As we all know: garbage in, garbage out.
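To illustrate what such post-generation trimming can look like, here is my own sketch (not LLMGraphTransformer's internal code): drop extracted nodes whose type is outside the allowed set, plus any relationship touching a dropped node. Plain dicts stand in for LangChain's Node and Relationship objects.

```python
# Post-hoc trimming of LLM extraction output against an allowed node-type set.

ALLOWED_NODES = {"Person", "Company", "Location"}

def trim(nodes: list[dict], relationships: list[dict]) -> tuple[list[dict], list[dict]]:
    kept = [n for n in nodes if n["type"] in ALLOWED_NODES]
    kept_ids = {n["id"] for n in kept}
    kept_rels = [
        r for r in relationships
        if r["source"] in kept_ids and r["target"] in kept_ids
    ]
    return kept, kept_rels

if __name__ == "__main__":
    nodes = [
        {"id": "Brian Chesky", "type": "Person"},
        {"id": "startup", "type": "Concept"},  # disallowed type -> trimmed
    ]
    rels = [{"source": "Brian Chesky", "target": "startup", "type": "FOUNDED"}]
    print(trim(nodes, rels))  # the vague node and its relationship disappear
```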
Anyway, LLMs are powerful for this kind of job, but if you want to reduce costs, a hybrid method combining spaCy, GLiNER2, and LLMs might be the best fit for you.
How to Traverse and Retrieve from a Graph
Last but not least, you need to traverse the graph and retrieve information relevant to your query. Whenever I try GraphRAG projects, I always realize that this part is the most difficult. You can build your graph very easily thanks to LLMs, but traversal is another story.
Of course, LangChain provides a retrieval class called GraphCypherQAChain, but it is not always smart or reliable enough out of the box. Let me show you.
```python
from langchain_neo4j import GraphCypherQAChain


def query_llm():
    return get_llm("xai:grok-4.3")


# get_llm, get_graph, top_k, and question come from the surrounding script.
graph = get_graph(refresh_schema=True)
chain = GraphCypherQAChain.from_llm(
    llm=query_llm(),
    graph=graph,
    verbose=True,
    validate_cypher=True,
    return_intermediate_steps=True,
    top_k=top_k,
    allow_dangerous_requests=True,
)
result = chain.invoke({"query": question})
answer = result["result"]
```
When you retrieve from the graph and invoke the LLM in this way, the result may not be very satisfactory. If anything, it can be almost unusable.
When I run this query: `uv run python -m graph_rag.query --question "Why did Giyu send Tanjirou to Mt.Sagiri?"`, it returns the following output:
```text
=== Generated Cypher ===
MATCH (g:Person)-[r:INSTRUCTS|REFERS]->(t:Person)-[:TRAVELS_TO|GOES_TO|VISITS]->(l:Location)
WHERE (g.id CONTAINS "Giyu" OR g.description CONTAINS "Giyu") AND (t.id CONTAINS "Tanjirou" OR t.description CONTAINS "Tanjirou") AND (l.id CONTAINS "Sagiri" OR l.description CONTAINS "Sagiri")
RETURN r.evidence AS reason

=== Cypher Context ===
[]

=== Final Answer ===
I don't know the answer to that.
```
As you can see, it fails to retrieve any information. That is because the chain simply generates a Cypher query and sends it to AuraDB; used as-is, it doesn't fully understand your graph structure or domain. To solve this, you need to build your own pipeline and write targeted queries. I tweaked the pipeline like this.
Pipeline Code:
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableConfig

# Helpers used below (get_graph, query_llm, chunk_embeddings, cosine_similarity,
# extract_question_entities, lucene_query, GraphContext, trace, langsmith_project)
# are defined elsewhere in the project.

ENTITY_QUERY = """
CALL db.index.fulltext.queryNodes("graph_entity_fulltext", $lucene_query, {limit: $limit})
YIELD node, score
RETURN elementId(node) AS element_id, node.id AS id, labels(node) AS labels, score
ORDER BY score DESC
"""

GRAPH_CONTEXT_QUERY = """
MATCH (seed)
WHERE elementId(seed) IN $seed_ids
CALL {
    WITH seed
    MATCH p = (seed)-[*1..2]-(neighbor)
    WHERE all(rel IN relationships(p) WHERE type(rel) <> "MENTIONS")
    UNWIND relationships(p) AS rel
    RETURN DISTINCT
        coalesce(startNode(rel).id, startNode(rel).name) AS source,
        labels(startNode(rel)) AS source_labels,
        type(rel) AS relationship,
        coalesce(endNode(rel).id, endNode(rel).name) AS target,
        labels(endNode(rel)) AS target_labels,
        properties(rel) AS properties
    LIMIT $relationship_limit
}
RETURN source, source_labels, relationship, target, target_labels, properties
"""

SOURCE_CONTEXT_QUERY = """
MATCH (seed)
WHERE elementId(seed) IN $seed_ids
MATCH (doc:Document)-[:MENTIONS]->(seed)
WITH doc, collect(DISTINCT seed.id) AS matched_entities, count(DISTINCT seed) AS entity_hits
RETURN DISTINCT
    doc.id AS id,
    doc.source AS source,
    doc.chunk_index AS chunk_index,
    coalesce(doc.text, doc.page_content, doc.content, "") AS text,
    matched_entities,
    entity_hits
ORDER BY entity_hits DESC
LIMIT $chunk_limit
"""


def rerank_chunks_by_similarity(
    question: str,
    chunks: list[dict],
    chunk_limit: int,
    embedding_model: str,
) -> list[dict]:
    text_chunks = [chunk for chunk in chunks if chunk.get("text")]
    if not text_chunks:
        return []
    embeddings = chunk_embeddings(embedding_model)
    query_vector = embeddings.embed_query(question)
    chunk_vectors = embeddings.embed_documents([chunk["text"] for chunk in text_chunks])
    scored_chunks = []
    for chunk, chunk_vector in zip(text_chunks, chunk_vectors):
        scored_chunk = dict(chunk)
        scored_chunk["similarity"] = cosine_similarity(query_vector, chunk_vector)
        scored_chunks.append(scored_chunk)
    return sorted(scored_chunks, key=lambda chunk: chunk["similarity"], reverse=True)[:chunk_limit]


def retrieve_context(
    graph,
    question: str,
    entity_limit: int,
    relationship_limit: int,
    chunk_candidate_limit: int,
    chunk_limit: int,
    embedding_model: str,
) -> GraphContext:
    entity_names = extract_question_entities(question)
    lexical = lucene_query(" ".join(entity_names) if entity_names else question)
    entities = graph.query(ENTITY_QUERY, {"lucene_query": lexical, "limit": entity_limit})
    if entity_names:
        exact_entities = graph.query(
            """
            MATCH (n:__Entity__)
            WHERE toLower(n.id) IN $entity_names
            RETURN elementId(n) AS element_id, n.id AS id, labels(n) AS labels, 100.0 AS score
            LIMIT $limit
            """,
            {"entity_names": [name.lower() for name in entity_names], "limit": entity_limit},
        )
        # Deduplicate by element_id, preferring exact matches over full-text hits.
        seen = set()
        entities = [
            entity
            for entity in exact_entities + entities
            if not (entity["element_id"] in seen or seen.add(entity["element_id"]))
        ][:entity_limit]
    seed_ids = [entity["element_id"] for entity in entities]
    if not seed_ids:
        return GraphContext(entities=[], relationships=[], chunks=[])
    relationships = graph.query(
        GRAPH_CONTEXT_QUERY,
        {"seed_ids": seed_ids, "relationship_limit": relationship_limit},
    )
    chunks = graph.query(
        SOURCE_CONTEXT_QUERY,
        {"seed_ids": seed_ids, "chunk_limit": chunk_candidate_limit},
    )
    chunks = rerank_chunks_by_similarity(question, chunks, chunk_limit, embedding_model)
    return GraphContext(entities=entities, relationships=relationships, chunks=chunks)


def format_context(context: GraphContext) -> str:
    entity_lines = [
        f"- {entity['id']} labels={entity['labels']} score={entity['score']:.2f}"
        for entity in context.entities
    ]
    relationship_lines = [
        (
            f"- ({rel['source']})-[:{rel['relationship']}]->({rel['target']}) "
            f"properties={rel['properties']}"
        )
        for rel in context.relationships
    ]
    chunk_lines = [
        (
            f"- {chunk.get('source')} chunk={chunk.get('chunk_index')} "
            f"similarity={chunk.get('similarity', 0.0):.4f}\n"
            f"  {chunk.get('text', '')}"
        )
        for chunk in context.chunks
        if chunk.get("text")
    ]
    return "\n".join(
        [
            "Matched entities:",
            *(entity_lines or ["- None"]),
            "",
            "Graph relationships:",
            *(relationship_lines or ["- None"]),
            "",
            "Source chunks:",
            *(chunk_lines or ["- None"]),
        ]
    )


def answer_question_manual(
    question: str,
    entity_limit: int,
    relationship_limit: int,
    chunk_candidate_limit: int,
    chunk_limit: int,
    embedding_model: str,
) -> None:
    trace_metadata = {
        "mode": "manual",
        "entity_limit": entity_limit,
        "relationship_limit": relationship_limit,
        "chunk_candidate_limit": chunk_candidate_limit,
        "chunk_limit": chunk_limit,
        "embedding_model": embedding_model,
    }
    with trace(
        "graph_rag.query",
        run_type="chain",
        inputs={"question": question},
        project_name=langsmith_project(),
        tags=["graph-rag", "query"],
        metadata=trace_metadata,
    ) as query_run:
        graph = get_graph(refresh_schema=False)
        context = retrieve_context(
            graph,
            question,
            entity_limit,
            relationship_limit,
            chunk_candidate_limit,
            chunk_limit,
            embedding_model,
        )
        formatted_context = format_context(context)
        prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You will get some parts of a certain series' story. "
                    "Considering this context, answer the user's question. "
                    "Don't hallucinate or guess with insufficient information.",
                ),
                (
                    "human",
                    "Question:\n{question}\n\nGraph context:\n{context}\n\nAnswer:",
                ),
            ]
        )
        chain = prompt | query_llm() | StrOutputParser()
        answer_config: RunnableConfig = {
            "run_name": "graph_rag.query.answer",
            "tags": ["graph-rag", "query", "answer-generation"],
            "metadata": {
                "matched_entity_count": len(context.entities),
                "relationship_count": len(context.relationships),
                "chunk_count": len(context.chunks),
            },
        }
        answer = chain.invoke(
            {"question": question, "context": formatted_context},
            config=answer_config,
        )
        query_run.end(
            outputs={
                "answer": answer,
                "matched_entities": [entity["id"] for entity in context.entities],
                "relationship_count": len(context.relationships),
                "chunk_count": len(context.chunks),
            }
        )
    print("\n=== Entity Search Cypher ===")
    print(ENTITY_QUERY.strip())
    print("\n=== Graph Traversal Cypher ===")
    print(GRAPH_CONTEXT_QUERY.strip())
    print("\n=== Graph Context ===")
    print(formatted_context)
    print("\n=== Final Answer ===")
    print(answer)
```
Long code short, the process is like this:
- It uses an LLM graph transformer to extract entity names from the question.
- It searches Neo4j's `graph_entity_fulltext` index for matching graph entities.
- The matched entities become seed nodes for retrieval.
- From those seed nodes, it traverses graph relationships up to two hops away, excluding MENTIONS relationships.
- It also finds source document chunks that mention the seed entities.
- Finally, it embeds the question and candidate chunks, scores them with cosine similarity, and keeps the most relevant chunks.
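The rerank step at the end is worth seeing in isolation. This is a minimal cosine-similarity rerank with toy 3-dimensional vectors; the real pipeline gets its vectors from an embedding model rather than hard-coding them.

```python
# Cosine-similarity rerank in miniature: score each (text, vector) chunk
# against the query vector and keep the top_k highest-scoring chunks.

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rerank(query_vec: list[float], chunks: list[tuple[str, list[float]]], top_k: int):
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in chunks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

if __name__ == "__main__":
    query = [1.0, 0.0, 0.0]
    chunks = [
        ("about Giyu", [0.9, 0.1, 0.0]),   # points roughly the same way as the query
        ("about doors", [0.0, 0.2, 0.9]),  # nearly orthogonal to the query
    ]
    print(rerank(query, chunks, top_k=1))  # "about Giyu" ranks first
```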
This is the process I built: GraphRAG combined with dense RAG. Let's see if it works. I ran the same query `uv run python -m graph_rag.query --question "Why did Giyu send Tanjirou to Mt.Sagiri?"` and this time, it returned:
```text
=== Final Answer ===
Answer:
Giyu Tomioka (Tomioka) sends Tanjirou to Mt. Sagiri specifically so he can meet the old man Sakonji Urokodaki who lives at its foot. Tomioka instructs Tanjirou to tell Urokodaki that “Giyuu Tomioka sent you,” after Nezuko has been turned into a demon. He also warns Tanjirou never to let her be exposed to sunlight (though she is currently safe because it is cloudy). This direction is given right after Tomioka knocks Nezuko out, places a bamboo gag on her, and decides not to kill her, implying Urokodaki is meant to provide guidance, training, or protection for both Tanjirou and the demon Nezuko.
(The provided chunks contain no further details about Urokodaki’s exact role or what happens after they reach him.)
```
It answered very well.
You can try your own approaches. You can write your own Cypher queries, tweak the prompt, combine other methods, use a specific model for generating queries, try an agentic approach, and whatnot.
This might be the most stressful part, and even coding tools can't give you a definitive answer. You should take into account your domain, data, information, budget, and other constraints, and understand the whole process. This is the most important part, and it is what humans still have to deal with firsthand.
> "You can outsource your thinking, but you can't outsource your understanding." — Andrej Karpathy
Conclusion
In this post, I introduced Knowledge Graphs and GraphRAG. The basic algorithms were invented decades ago, but as LLMs have started using them, they have resurfaced. Sparse RAG and dense RAG are more straightforward and closer to plug-and-play, but GraphRAG requires more groundwork and scaffolding, as well as extra, sometimes frustrating steps. Still, it can be worthwhile in certain cases.
