This tutorial walks through setting up and using LightRAG, a retrieval-augmented generation system that combines knowledge graphs with vector search for document retrieval.
What is LightRAG?
LightRAG is a RAG (Retrieval-Augmented Generation) system that builds knowledge graphs from your documents. Unlike classical RAG systems that rely solely on vector similarity search, LightRAG extracts entities and relationships from documents to create a structured knowledge graph, then uses both the graph and vector search for retrieval.
How LightRAG Differs from Classical RAG
Classical RAG:
- Uses vector embeddings to find semantically similar document chunks
- Retrieval is based on cosine similarity between query and document vectors
- No structured understanding of entities or relationships
LightRAG:
- Extracts entities (people, organizations, concepts) and relationships from documents
- Builds a knowledge graph that captures these relationships
- Uses both graph traversal and vector search for retrieval
- Offers multiple query modes: naive, local, global, hybrid, and mix
The knowledge graph enables more precise retrieval by understanding not just what documents are similar, but how concepts relate to each other. This can improve accuracy for queries that require understanding relationships between entities.
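The distinction can be sketched in a few lines. This toy example (not LightRAG's actual implementation) contrasts the two retrieval styles: vector retrieval ranks chunks by embedding similarity, while graph retrieval follows explicit edges between extracted entities. The embeddings and graph below are made up for illustration.

```python
# Toy contrast of vector retrieval vs. graph retrieval.
# NOT LightRAG's implementation -- just a sketch of the idea.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

# Classical RAG: rank chunks purely by embedding similarity.
chunks = {
    "chunk1": [0.9, 0.1],  # pretend 2-d embeddings
    "chunk2": [0.2, 0.8],
}
query_vec = [0.85, 0.15]
best_chunk = max(chunks, key=lambda c: cosine(query_vec, chunks[c]))

# Graph-augmented RAG: additionally follow explicit entity relationships.
graph = {
    "LightRAG": {"Knowledge Graph", "Vector Search"},
    "Knowledge Graph": {"Entity-Relationship Extraction"},
}

def neighbors(entity):
    """Entities directly related to the given entity."""
    return graph.get(entity, set())

print(best_chunk)             # chunk most similar to the query
print(neighbors("LightRAG"))  # entities related to "LightRAG"
```

Vector search answers "which chunks look like the query?"; the graph answers "what is this entity connected to?" LightRAG combines both signals.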
Prerequisites
- Docker and Docker Compose
- An LLM provider (OpenAI, Gemini, Ollama, Azure OpenAI, AWS Bedrock, or Jina)
- An embedding model (supports OpenAI, Gemini, Ollama, Jina, and others)
- Optional: A reranker model
Setup
1. Clone the Repository
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
2. Configure Environment Variables
Copy the example environment file and configure it:
cp env.example .env
Edit .env with your configuration. Here's an example using Ollama for both LLM and embeddings:
LLM_BINDING=ollama
LLM_MODEL=llama3.2:latest
LLM_BINDING_HOST=http://host.docker.internal:11434
EMBEDDING_BINDING=ollama
EMBEDDING_MODEL=bge-m3:latest
EMBEDDING_DIM=1024
EMBEDDING_BINDING_HOST=http://host.docker.internal:11434
RERANK_BINDING=null
PORT=9621
3. Start the Server
docker compose up -d
The web UI will be available at http://localhost:9621/webui/
Using LightRAG: Step-by-Step Tutorial
Step 1: Access the Web UI
Navigate to http://localhost:9621/webui/ in your browser. You'll see the main interface with tabs for Documents, Knowledge Graph, Retrieval, and API.
Step 2: Upload a Document
Click on the Documents tab. Initially, you'll see "No Documents" if this is a fresh installation.
Place your document in the data/inputs/ directory, then click Scan/Retry. LightRAG will detect the document and queue it for processing.
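The same scan can be triggered over the server's REST API instead of the web UI. The endpoint path below (`/documents/scan`) is an assumption based on the server's API tab; verify it against your running instance before relying on it.

```python
# Sketch: trigger a document scan via the REST API instead of the web UI.
# The /documents/scan endpoint path is an assumption -- check the API tab
# of your LightRAG instance to confirm it.
import urllib.request

BASE_URL = "http://localhost:9621"

def build_scan_request(base_url=BASE_URL):
    """Build the POST request that asks LightRAG to scan data/inputs/."""
    return urllib.request.Request(
        f"{base_url}/documents/scan",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_scan_request()
# urllib.request.urlopen(req)  # uncomment with the server running
print(req.full_url, req.method)
```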
Step 3: Document Processing
LightRAG processes the document by:
- Chunking the text
- Extracting entities and relationships
- Building embeddings
- Constructing the knowledge graph
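The pipeline above can be mimicked in miniature. Real LightRAG uses an LLM to extract entities and relationships and chunks by tokens; this sketch uses fixed-size character chunks and a hard-coded pattern, purely to illustrate the data flow from raw text to graph.

```python
# Toy version of the processing pipeline: chunk -> extract -> build graph.
# LightRAG uses an LLM for extraction; the regex here is a stand-in
# that only recognizes "X uses Y" statements.
import re

def chunk_text(text, size=100):
    """Split text into fixed-size character chunks (LightRAG chunks by tokens)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def extract_relations(chunk):
    """Stand-in for LLM extraction: find 'X uses Y' style statements."""
    return re.findall(r"(\w+) uses (\w+)", chunk)

def build_graph(chunks):
    """Accumulate extracted relations into an adjacency mapping."""
    graph = {}
    for chunk in chunks:
        for src, dst in extract_relations(chunk):
            graph.setdefault(src, set()).add(dst)
    return graph

doc = "LightRAG uses graphs. LightRAG uses embeddings."
print(build_graph(chunk_text(doc)))
```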
You can monitor the processing status. Once complete, you'll see "Completed" with the number of chunks processed.
Step 4: Explore the Knowledge Graph
This is where LightRAG's key differentiator becomes visible. Click on the Knowledge Graph tab to explore the extracted entities and relationships.
Click on the Search node name dropdown to see all extracted entities. In our example, we can see entities like:
- LightRAG
- Knowledge Graph
- Entity-Relationship Extraction
- Query Modes
- LLM Providers
- And many more...
Select a node (e.g., "LightRAG") to view its details and relationships.
The visualization shows:
- Node Details Panel (right side): Displays the node's ID, labels, degree (number of connections), properties, and relations
- Graph Canvas (center): Visual representation of the knowledge graph showing nodes and their connections
In this example, the "LightRAG" node has:
- Degree: 14 - It's connected to 14 other entities
- Relations: Including "Retrieval-Augmented Generation (RAG)", "Knowledge Graph", "Entity-Relationship Extraction", "Vector Search", "Query Modes", "LLM Providers", and others
You can click on neighbor nodes to expand the graph and explore relationships:
This visualization makes it clear how LightRAG understands relationships between concepts, not just document similarity.
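The "degree" shown in the node details panel is simply the number of edges touching a node. Given an edge list (the entities below are a small subset of the example graph), it can be computed like this:

```python
# Degree = number of edges touching a node. The edge list below is a
# made-up subset of the example graph, just to show the computation.
from collections import Counter

edges = [
    ("LightRAG", "Knowledge Graph"),
    ("LightRAG", "Vector Search"),
    ("LightRAG", "Query Modes"),
    ("Knowledge Graph", "Entity-Relationship Extraction"),
]

degree = Counter()
for a, b in edges:  # each edge contributes to both endpoints
    degree[a] += 1
    degree[b] += 1

print(degree["LightRAG"])         # 3
print(degree["Knowledge Graph"])  # 2
```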
Step 5: Query the System
Switch to the Retrieval tab to query your documents. You can:
- Enter your query
- Select a query mode (naive, local, global, hybrid, or mix)
- Optionally provide custom instructions for the LLM
- Adjust token limits and other parameters
Example query: "What is LightRAG? Please provide a short answer (2-3 sentences maximum)."
The system retrieves relevant context using both the knowledge graph and vector search, then generates a response that follows your instructions.
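Queries can also be issued programmatically. The `/query` endpoint and the `query`/`mode` payload fields below are assumptions based on the server's API tab; confirm them against your instance before relying on this sketch.

```python
# Sketch: query LightRAG over the REST API instead of the Retrieval tab.
# The /query endpoint and payload fields are assumptions -- confirm them
# in the API tab of your running instance.
import json
import urllib.request

def build_query_request(question, mode="hybrid", base_url="http://localhost:9621"):
    """Build a POST /query request for the given question and query mode."""
    payload = json.dumps({"query": question, "mode": mode}).encode()
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request(
    "What is LightRAG? Please provide a short answer (2-3 sentences maximum)."
)
# response = urllib.request.urlopen(req)  # uncomment with the server running
print(req.full_url)
```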
Query Modes Explained
LightRAG offers five query modes:
- Naive: Simple retrieval without graph traversal
- Local: Uses local subgraph around query entities
- Global: Considers the entire knowledge graph
- Hybrid: Combines local and global approaches
- Mix: Advanced combination strategy
Each mode trades speed against retrieval depth, so the best choice depends on your use case: naive is the fastest but ignores the graph, while the graph-aware modes are slower but can surface relationships that similarity search alone would miss.
Key Observations
Knowledge Graph Construction: LightRAG automatically extracts entities and relationships during document processing. This happens in the background and doesn't require manual annotation.
Visual Exploration: The knowledge graph visualization makes it easy to understand how your documents are structured and how concepts relate to each other.
Dual Retrieval: LightRAG uses both graph-based and vector-based retrieval, which can provide more accurate results than vector search alone.
Flexible Configuration: Supports multiple LLM and embedding providers, making it adaptable to different infrastructure setups.
Configuration Tips
LLM Requirements: LightRAG recommends ≥32B parameter models for knowledge graph extraction. Smaller models may not extract relationships as effectively.
Embedding Models: Popular choices include BAAI/bge-m3, text-embedding-3-large, and gemini-embedding-001.
Local Setup: Using Ollama for both LLM and embeddings provides a fully local setup without API dependencies.
Production: For production deployments, LightRAG supports PostgreSQL, MongoDB, Neo4j, and other enterprise databases instead of JSON storage.
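As an illustration, a PostgreSQL-backed deployment is selected through storage variables in .env. The variable names and storage class values below follow the repository's env.example at the time of writing; verify them against your version before use.

```
# Storage backend selection -- names taken from env.example; verify
# against your LightRAG version before relying on them.
LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage
LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage

# PostgreSQL connection settings
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=lightrag
POSTGRES_PASSWORD=your_password
POSTGRES_DATABASE=lightrag
```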
Conclusion
LightRAG provides a practical approach to RAG that combines the benefits of knowledge graphs with vector search. The automatic entity and relationship extraction, combined with the visual graph exploration, makes it easier to understand how your documents are structured and how concepts relate to each other.
The knowledge graph visualization is particularly useful for:
- Understanding document structure
- Debugging retrieval results
- Exploring relationships between concepts
- Validating that entities and relationships were extracted correctly
While classical RAG systems work well for many use cases, LightRAG's knowledge graph approach can provide advantages when queries require understanding relationships between entities or when you need more precise retrieval based on structured knowledge.
Resources
- GitHub Repository: https://github.com/HKUDS/LightRAG
- PyPI Package: pip install lightrag-hku
- Documentation: See the repository README for detailed API documentation and advanced configuration options