Hey everyone,
I put together a small project that lets you upload legal PDFs and ask questions about them. The AI gives you answers and explains how it got there, so it's not just spitting out random stuff.
If you've ever had to read through a long legal document and thought, “I wish someone could just explain this part,” this might help.
You can check out the demo here:
https://shorturl.at/tZoEu
What It Does
- Upload a legal document (PDF)
- Ask natural language questions
- Get an AI-generated answer with reasoning
- All responses are grounded in the actual content of the document
It’s using Retrieval-Augmented Generation (RAG) under the hood, which means the app looks up the most relevant passages in your file and hands them to the model before it answers — so the response comes from your document, not just the model's memory.
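As a rough mental model (not the project's actual code), the retrieve-then-answer loop looks something like this — word overlap stands in for the real embedding similarity, and all names are illustrative:

```python
# Toy sketch of the RAG flow: retrieve the most relevant chunk,
# then hand it to the model as context. Word overlap stands in
# for the real embedding similarity; all names are illustrative.

def score(question, chunk):
    """Count how many question words also appear in the chunk."""
    return len(set(question.lower().split()) & set(chunk.lower().split()))

def retrieve(question, chunks):
    """Pick the chunk that best matches the question."""
    return max(chunks, key=lambda c: score(question, c))

chunks = [
    "The tenant must give 30 days notice before vacating.",
    "The landlord is responsible for structural repairs.",
]
context = retrieve("How much notice must the tenant give?", chunks)
# `context` would then be prepended to the prompt sent to the LLM.
```

In the real app, FAISS does this lookup over Ollama embeddings instead of counting words, but the shape of the step is the same.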
Run It Locally
Requirements
- Python 3.8 or higher
- Ollama (for generating embeddings)
- A Groq API key (for the language model)
Setup
Clone the repo:
git clone https://github.com/Hassan123j/AI-Reasoning-Chatbot.git
cd AI-Reasoning-Chatbot
Create a virtual environment and install the dependencies:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
If you don’t have the requirements.txt file, install manually:
pip install streamlit langchain-groq langchain-community \
langchain-text-splitters langchain-ollama langchain-core \
faiss-cpu python-dotenv
Ollama Setup
Install Ollama and pull the embedding model:
ollama pull all-minilm
Groq API Key
Sign up at groq.com and get your API key. You can either:
- Put it in a .env file:
GROQ_API_KEY="your_api_key_here"
- Or export it in your terminal:
export GROQ_API_KEY="your_api_key_here"
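Either way, your own scripts end up reading the key from the environment at runtime. A minimal sketch (the real project uses python-dotenv's `load_dotenv()` to pull the .env file into the environment first; the placeholder value below is simulated):

```python
import os

# load_dotenv() from python-dotenv would normally populate this from
# the .env file; here we simulate an exported variable directly.
os.environ.setdefault("GROQ_API_KEY", "your_api_key_here")

api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("GROQ_API_KEY is not set")
```

Failing fast like this beats a cryptic authentication error from the Groq client later on.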
Add Your PDFs
Make a folder for your documents:
mkdir pdfs
Drop any legal PDFs you want to query into that folder.
Build the Vector Database
This step processes your PDFs into searchable chunks:
python vector_database.py
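The splitting step in a script like this cuts each PDF's text into overlapping chunks before embedding them, so a sentence sliced at one boundary still appears whole in the next chunk. A toy sketch of that logic (the project itself uses langchain-text-splitters; the sizes below are made-up defaults):

```python
def split_text(text, chunk_size=100, overlap=20):
    """Split text into fixed-size chunks that overlap, so a sentence
    cut at one boundary still appears whole in the next chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(250))
chunks = split_text(text)
# 250 chars with step 80 -> chunks starting at offsets 0, 80, 160, 240
```

Chunk size and overlap are the main knobs: bigger chunks give the model more context per hit, smaller ones make retrieval more precise.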
Launch the App
Start the Streamlit app:
streamlit run frontend.py
This will open up a local web interface. From there:
- Upload your PDF
- Type a question
- Hit Ask
You’ll get an answer and a quick breakdown of how the AI found it.
File Overview
- frontend.py: the Streamlit UI
- rag_pipeline.py: where the AI logic happens
- vector_database.py: breaks down PDFs and builds embeddings
- pdfs/: your uploaded documents
- vectorstore/: saved vector data for retrieval
Want to Customize It?
Everything is modular. You can:
- Swap out the AI model
- Change how PDFs are chunked
- Tweak the prompt/response format
Just dive into the code and experiment.
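For example, the prompt format is ultimately just a string the retrieved context gets slotted into, so it's easy to tweak. A hypothetical template, not the project's actual one (which lives in rag_pipeline.py):

```python
# Hypothetical prompt template; the real one lives in rag_pipeline.py.
PROMPT = """Answer using only the context below.
If the answer is not in the context, say so.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context, question):
    """Slot the retrieved context and the user's question into the template."""
    return PROMPT.format(context=context, question=question)

prompt = build_prompt("Clause 4: rent is due on the 1st.", "When is rent due?")
```

Changing the instructions at the top of the template is the quickest way to change the tone or strictness of the answers.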
Contributions Welcome
If you’ve got ideas, feedback, bug reports, or want to help improve the project:
- Fork it
- Open an issue
- Submit a PR
Here’s the repo:
https://github.com/Hassan123j/AI-Reasoning-Chatbot