Chandrani Mukherjee

Next-Gen Q&A: Retrieval-Augmented AI with Chroma Vector Store

A Retrieval-Augmented Generation (RAG) agent combines document retrieval with LLM-based response generation to provide intelligent, context-aware answers. In this guide, you'll build a RAG system using LangChain, ChromaDB, and OpenAI or HuggingFace.

๐Ÿ› ๏ธ Tech Stack
Python

LangChain

ChromaDB

OpenAI or HuggingFace LLMs

SentenceTransformers (all-MiniLM-L6-v2)

📦 Install Dependencies

pip install langchain chromadb sentence-transformers openai

🧱 Folder Structure

.
├── rag_chroma_db/        # Chroma vector store
├── docs/
│   └── my_corpus.txt     # Your source document
└── rag_agent.py          # Main script


📄 Code: RAG Agent with ChromaDB

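# Note: these are the classic (pre-0.2) LangChain import paths;
# newer releases move most of these classes into the langchain_community package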
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # You can also use HuggingFaceHub

# 1. Load documents
loader = TextLoader("docs/my_corpus.txt")
documents = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# 3. Embed and store in Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(documents=chunks, embedding=embedding, persist_directory="rag_chroma_db")
vectordb.persist()

# 4. Set up retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 3})

# 5. Set up LLM
llm = OpenAI(temperature=0)

# 6. Create RAG chain
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)

# 7. Ask questions
query = "What is the main topic of the document?"
result = qa({"query": query})

print("Answer:", result["result"])
print("Sources:", result["source_documents"])
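
On later runs you don't need to re-embed the corpus. Here is a minimal sketch, assuming the same rag_chroma_db directory and embedding model as above, that reopens the persisted store and rebuilds the retriever:

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

# Reopen the existing vector store instead of re-running Chroma.from_documents
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma(persist_directory="rag_chroma_db", embedding_function=embedding)
retriever = vectordb.as_retriever(search_kwargs={"k": 3})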

๐Ÿ” Set Your API Key
Make sure your environment is set with the OpenAI key:

export OPENAI_API_KEY="your-api-key"

Or in Python:

import os
os.environ["OPENAI_API_KEY"] = "your-api-key"

🔄 What's Next?
📄 Add a PDF loader with PyMuPDF or pdfminer.six (see the first sketch after this list)

🖥️ Add a UI with Streamlit or FastAPI (a minimal Streamlit sketch follows below)

🤖 Wrap the retriever as a LangChain Tool + Agent (sketched below)

🔌 Run offline using HuggingFace LLMs (see the last sketch below)
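
For the PDF loader, a minimal sketch using PyMuPDF (requires pip install pymupdf; docs/my_report.pdf is a placeholder path) that swaps out step 1 of the script:

from langchain.document_loaders import PyMuPDFLoader

# Each PDF page becomes a Document; the splitting and embedding steps stay the same
loader = PyMuPDFLoader("docs/my_report.pdf")
documents = loader.load()

For a quick UI, a rough Streamlit sketch, assuming rag_agent.py builds the qa chain at module level so it can be imported:

import streamlit as st

from rag_agent import qa  # hypothetical import of the chain built above

st.title("RAG Q&A with Chroma")
query = st.text_input("Ask a question about the corpus")

if query:
    result = qa({"query": query})
    st.write(result["result"])
    with st.expander("Sources"):
        for doc in result["source_documents"]:
            st.write(doc.page_content)

Save the snippet as app.py and launch it with streamlit run app.py.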

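To wrap the chain as a tool for a LangChain agent, a hedged sketch using the classic agents API (the tool name and description are made up for illustration, and llm / qa come from the script above):

from langchain.agents import AgentType, Tool, initialize_agent

corpus_tool = Tool(
    name="corpus_qa",
    func=lambda q: qa({"query": q})["result"],
    description="Answers questions about the contents of my_corpus.txt",
)

# Zero-shot ReAct agent that decides when to call the corpus_qa tool
agent = initialize_agent([corpus_tool], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What is the main topic of the document?")

And to run without the OpenAI API, one option is a local HuggingFace pipeline. A minimal sketch (google/flan-t5-base is just an example model; it needs transformers installed and downloads weights on first use):

from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-base",
    task="text2text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)

# Drop-in replacement for the OpenAI LLM in step 5 of the script
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)
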
💡 Summary
You now have a working Retrieval-Augmented Generation (RAG) agent using:

A local document chunked and embedded with SentenceTransformers

Chunks stored in a ChromaDB vector store

Queries answered through LangChain's RetrievalQA chain

Responses generated by an OpenAI GPT model

Top comments (2)

Aiden Benjamin

Super clean integration of Chroma! This makes RAG pipelines much more manageable and fast to deploy.

Lucas Henry

The real-world use case you mentioned gave me some great ideas for internal enterprise tools.