A Retrieval-Augmented Generation (RAG) agent combines document retrieval with LLM-based response generation to provide intelligent, context-aware answers. In this guide, you'll build a RAG system using LangChain, ChromaDB, and OpenAI or HuggingFace.
Tech Stack
Python
LangChain
ChromaDB
OpenAI or HuggingFace LLMs
SentenceTransformers (all-MiniLM-L6-v2)
Install Dependencies
pip install langchain chromadb sentence-transformers openai
Note: this guide uses the classic langchain import paths (pre-0.1). On LangChain 0.1+, the same classes live in langchain-community (pip install langchain-community), with the old paths kept as deprecated aliases.
Folder Structure
.
├── rag_chroma_db/        # Chroma vector store
├── docs/
│   └── my_corpus.txt     # Your source document
└── rag_agent.py          # Main script
Code: RAG Agent with ChromaDB
# rag_agent.py
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # You can also use HuggingFaceHub (see the sketch after the code)
# 1. Load documents
loader = TextLoader("docs/my_corpus.txt")
documents = loader.load()
# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
# 3. Embed and store in Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(documents=chunks, embedding=embedding, persist_directory="rag_chroma_db")
vectordb.persist()
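# (Optional) On later runs, reload the persisted store instead of re-embedding;
# the embedding_function must match the model the store was built with:
# vectordb = Chroma(persist_directory="rag_chroma_db", embedding_function=embedding)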
# 4. Set up retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 3})
# 5. Set up LLM
llm = OpenAI(temperature=0)
# 6. Create RAG chain
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)
# 7. Ask questions
query = "What is the main topic of the document?"
result = qa({"query": query})
print("Answer:", result["result"])
print("Sources:", result["source_documents"])
Set Your API Key
Make sure your OpenAI key is set in the environment:
export OPENAI_API_KEY="your-api-key"
Or in Python:
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
What's Next?
Add a PDF loader with PyMuPDF or pdfminer.six (see the first sketch below)
Add a UI with Streamlit or FastAPI (see the second sketch below)
Wrap the retriever as a LangChain Tool + Agent
Run offline using HuggingFace LLMs
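For the PDF loader, here is a minimal sketch using LangChain's PyMuPDFLoader; it assumes pip install pymupdf, and docs/report.pdf is a hypothetical path:
from langchain.document_loaders import PyMuPDFLoader
# Drop-in replacement for TextLoader in step 1; the rest of the pipeline is unchanged.
loader = PyMuPDFLoader("docs/report.pdf")  # hypothetical path
documents = loader.load()  # returns one Document per page, with page metadata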
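For the UI, a minimal Streamlit sketch; it assumes you refactor rag_agent.py so the chain is importable (the qa import below is that assumption):
# app.py - run with: streamlit run app.py
import streamlit as st
from rag_agent import qa  # assumption: rag_agent.py exposes the RetrievalQA chain

st.title("RAG Agent")
query = st.text_input("Ask a question about your corpus")
if query:
    result = qa({"query": query})
    st.write(result["result"])
    with st.expander("Sources"):
        for doc in result["source_documents"]:
            st.write(doc.page_content[:200])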
Summary
You now have a working Retrieval-Augmented Generation (RAG) agent:
A local document, chunked and embedded with SentenceTransformers
Stored in a ChromaDB vector store
Queried through LangChain's RetrievalQA chain
Answered by an OpenAI LLM (or a HuggingFace model)