Retrieval-Augmented Generation (RAG) sounds complex, but the core idea is simple: give your LLM access to your own documents. Here's how to build one in 50 lines.
## What is RAG?
Instead of relying solely on the LLM's training data, RAG retrieves relevant documents first, then feeds them as context to the LLM. This means your AI can answer questions about YOUR data.
## The Setup

```bash
pip install openai chromadb sentence-transformers
```
## The Code

```python
import chromadb
import openai
from sentence_transformers import SentenceTransformer

# 1. Initialize embedding model and vector DB
embedder = SentenceTransformer("all-MiniLM-L6-v2")
chroma_client = chromadb.Client()  # in-memory; data is lost on restart
collection = chroma_client.create_collection("docs")

# 2. Add your documents
docs = [
    "Python 3.12 introduced type parameter syntax.",
    "FastAPI is built on Starlette and Pydantic.",
    "Docker containers share the host OS kernel.",
    "PostgreSQL supports JSONB for document storage.",
    "Redis can be used as a message broker with Pub/Sub.",
]
embeddings = embedder.encode(docs).tolist()
collection.add(
    documents=docs,
    embeddings=embeddings,
    ids=[f"doc_{i}" for i in range(len(docs))],
)

# 3. Query function: retrieve the closest documents, then ask the LLM
def ask(question, n_results=2):
    q_embedding = embedder.encode([question]).tolist()
    results = collection.query(query_embeddings=q_embedding, n_results=n_results)
    context = "\n".join(results["documents"][0])
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Answer based on this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

# 4. Use it (requires OPENAI_API_KEY in your environment)
print(ask("What is FastAPI built on?"))
```
## How It Works
- Documents are converted to vectors (embeddings)
- When you ask a question, it's also converted to a vector
- ChromaDB finds the most similar documents
- Those documents are passed as context to the LLM
- The LLM answers based on YOUR data, not just its training data
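The "finds the most similar documents" step boils down to vector math: by default, comparing the angle between the query vector and each document vector. Here's a minimal sketch with made-up 3-dimensional vectors (real all-MiniLM-L6-v2 embeddings have 384 dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" — the values are illustrative, not real model output
doc_vecs = {
    "fastapi": [0.9, 0.1, 0.0],
    "docker":  [0.0, 0.8, 0.2],
}
query_vec = [0.8, 0.2, 0.1]  # pretend this encodes "What is FastAPI built on?"

best = max(doc_vecs, key=lambda name: cosine_similarity(query_vec, doc_vecs[name]))
print(best)  # "fastapi" — the closest vector wins
```

ChromaDB does the same comparison (plus indexing tricks so it stays fast at scale) when you call `collection.query`.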
## Scaling Up
For production, swap in:
- Pinecone or Weaviate (or Chroma's persistent client) instead of the in-memory client, so your index survives restarts
- Chunking for large documents (split into 500-token chunks)
- Reranking to improve retrieval quality
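Chunking is the simplest of the three to sketch. Here's a naive version that splits on words as a rough stand-in for tokens (the 500/50 numbers are just illustrative; production code would use a real tokenizer like `tiktoken`):

```python
def chunk_text(text, max_words=500, overlap=50):
    """Split text into overlapping word-based chunks.

    Overlap keeps a sentence that straddles a boundary retrievable
    from at least one chunk.
    """
    words = text.split()
    step = max_words - overlap  # assumes overlap < max_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

# A 1200-word document becomes 3 chunks of 500, 500, and 300 words
doc = " ".join(f"w{i}" for i in range(1200))
print([len(c.split()) for c in chunk_text(doc)])  # [500, 500, 300]
```

Each chunk then gets its own embedding and ID in `collection.add`, so retrieval returns the relevant passage instead of a whole document.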
But this 50-line version is enough to understand the concept and prototype quickly.
🚀 Level up your AI workflow! Check out my AI Developer Mega Prompt Pack — 80 battle-tested prompts for developers. $9.99