As a CS student obsessed with LLMs, I wanted to build something that actually solved a real problem, not just another chatbot that hallucinated answers. Law felt like the perfect domain. The stakes are high, the documents are dense, and people genuinely need accurate answers from specific sources. So I built a RAG-based Law Assistant that answers legal questions directly from PDF documents.
What is RAG and Why Does It Matter Here?
RAG (Retrieval-Augmented Generation) is a pattern where you don't just ask an LLM a question cold. Instead, you first retrieve relevant chunks from your own documents, then pass those chunks as context to the model. The model answers based on what's actually in your documents, not what it vaguely remembers from training.
For legal use cases, this is critical. You don't want the model guessing. You want it citing the right clause from the right document.
The Stack:
LangChain — orchestration and chaining
FAISS — vector store for fast similarity search
OpenAI / HuggingFace Embeddings — to convert text into vectors
PyPDF2 / pdfplumber — to extract text from PDFs
Python — to glue everything together
How It Works:
Load and Split the PDFs
The first step is getting the text out of the PDF documents and splitting it into manageable chunks.
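In the project this is handled by PyPDF2 or pdfplumber plus a LangChain text splitter; as a self-contained sketch of the chunking step itself (function name and parameters are illustrative, not from the original code), fixed-size chunks with overlap look like this:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks.

    The overlap means a clause cut at a chunk boundary still appears
    with surrounding context in the next chunk.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
        start += chunk_size - overlap
    return chunks
```

LangChain's `RecursiveCharacterTextSplitter` does the same job but tries to break on paragraph and sentence boundaries first, which matters a lot for dense legal prose.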
Embed and Store in FAISS
Next, each chunk gets converted into a vector embedding and stored in a FAISS index for fast retrieval later.
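To make the idea concrete without needing an API key, here is a toy stand-in for what the embed-and-search step does (the bag-of-words "embedding" and class name are illustrative; the real project uses OpenAI/HuggingFace embeddings and a FAISS index):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": term frequencies of lowercase words.
    # Real embeddings are dense vectors from a trained model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    """Stores one vector per chunk; search ranks chunks by similarity."""

    def __init__(self, chunks: list[str]):
        self.chunks = chunks
        self.vectors = [embed(c) for c in chunks]

    def search(self, query: str, k: int = 4) -> list[str]:
        qv = embed(query)
        scored = sorted(zip(self.vectors, self.chunks),
                        key=lambda pair: cosine(qv, pair[0]),
                        reverse=True)
        return [chunk for _, chunk in scored[:k]]
```

FAISS does exactly this ranking, but over dense vectors and with index structures that stay fast at millions of chunks.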
FAISS is fast, lightweight, and runs entirely locally — no external database needed. For a student project, that's a big win.
Build the Retrieval Chain
This is where LangChain shines. You wire up the retriever and the LLM into a chain that handles everything automatically.
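Under the hood, a "stuff"-style QA chain does roughly the following; this sketch uses stand-in `retriever` and `llm` callables (in the project they would be the FAISS retriever and a chat model with temperature=0), and the prompt wording is my own illustration:

```python
PROMPT = (
    "Answer using ONLY the context below. If the answer is not in the "
    "context, say you don't know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

def answer(question, retriever, llm, k=4):
    chunks = retriever(question, k)    # top-k relevant chunks from the index
    context = "\n\n".join(chunks)      # "stuff" them into one context block
    return llm(PROMPT.format(context=context, question=question))
```

A handy trick while debugging: pass an echo function as `llm` so you can inspect exactly what prompt the model would receive.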
Setting temperature=0 keeps the model grounded; for legal Q&A, you want deterministic, factual answers, not creative ones.
Ask Questions
The retriever pulls the 4 most relevant chunks from the FAISS index, passes them to the LLM as context, and the model answers based solely on those chunks.
What I Learned:
Chunking strategy is everything. The quality of your answers is directly tied to how well you split your documents. I spent more time tuning chunk size and overlap than on anything else.
Use temperature = 0 for factual domains. Any creativity from the model is a liability when you're answering legal questions.
FAISS is surprisingly powerful for local projects; No cloud setup, no API calls, just fast vector search on your machine.
RAG isn't magic; it's garbage in, garbage out. If your PDF extraction is messy (and legal PDFs often are), your retrieval will be messy too. Invest in clean extraction before anything else.
What's Next: I want to extend this with:
- A proper UI using Streamlit or React
- Support for multiple documents simultaneously
- Source citation — showing exactly which page and clause the answer came from
If you're a student trying to build something real with LLMs, RAG is one of the best patterns to learn first. It's practical, it's in demand, and it forces you to think about the full pipeline and not just prompt engineering.
Feel free to connect or drop questions in the comments.