Hey everyone,
I put together a small project that lets you upload legal PDFs and ask questions about them. The AI gives you answers and explains how it got there, so it's not just spitting out random stuff.
If you've ever had to read through a long legal document and thought, “I wish someone could just explain this part,” this might help.
You can check out the demo here:
https://shorturl.at/tZoEu
What It Does
- Upload a legal document (PDF)
- Ask natural language questions
- Get an AI-generated answer with reasoning
- All responses are grounded in the actual content of the document
It’s using Retrieval-Augmented Generation (RAG) under the hood, which means the app looks up the most relevant passages in your file and hands them to the model before it answers — so the response comes from your document, not just the model's memory.
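As a rough mental model (not the project's actual code), the retrieve-then-answer loop looks something like this — word overlap stands in for the real embedding similarity, and all names are illustrative:

```python
# Toy sketch of the RAG flow: retrieve the most relevant chunk,
# then hand it to the model as context. Word overlap stands in
# for the real embedding similarity; all names are illustrative.

def score(question, chunk):
    """Count how many question words also appear in the chunk."""
    return len(set(question.lower().split()) & set(chunk.lower().split()))

def retrieve(question, chunks):
    """Pick the chunk that best matches the question."""
    return max(chunks, key=lambda c: score(question, c))

chunks = [
    "The tenant must give 30 days notice before vacating.",
    "The landlord is responsible for structural repairs.",
]
context = retrieve("How much notice must the tenant give?", chunks)
# `context` would then be prepended to the prompt sent to the LLM.
```

In the real app, FAISS does this lookup over Ollama embeddings instead of counting words, but the shape of the step is the same.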
Run It Locally
Requirements
- Python 3.8 or higher
- Ollama (for generating embeddings)
- A Groq API key (for the language model)
Setup
Clone the repo:
git clone https://github.com/Hassan123j/AI-Reasoning-Chatbot.git
cd AI-Reasoning-Chatbot
Create a virtual environment and install the dependencies:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
If you don’t have the requirements.txt file, install manually:
pip install streamlit langchain-groq langchain-community \
langchain-text-splitters langchain-ollama langchain-core \
faiss-cpu python-dotenv
Ollama Setup
Install Ollama and pull the embedding model:
ollama pull all-minilm
Groq API Key
Sign up at groq.com and get your API key. You can either:
- Put it in a .env file:
GROQ_API_KEY="your_api_key_here"
- Or export it in your terminal:
export GROQ_API_KEY="your_api_key_here"
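Either way, your own scripts end up reading the key from the environment at runtime. A minimal sketch (the real project uses python-dotenv's `load_dotenv()` to pull the .env file into the environment first; the placeholder value below is simulated):

```python
import os

# load_dotenv() from python-dotenv would normally populate this from
# the .env file; here we simulate an exported variable directly.
os.environ.setdefault("GROQ_API_KEY", "your_api_key_here")

api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("GROQ_API_KEY is not set")
```

Failing fast like this beats a cryptic authentication error from the Groq client later on.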
Add Your PDFs
Make a folder for your documents:
mkdir pdfs
Drop any legal PDFs you want to query into that folder.
Build the Vector Database
This step processes your PDFs into searchable chunks:
python vector_database.py
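The splitting step in a script like this cuts each PDF's text into overlapping chunks before embedding them, so a sentence sliced at one boundary still appears whole in the next chunk. A toy sketch of that logic (the project itself uses langchain-text-splitters; the sizes below are made-up defaults):

```python
def split_text(text, chunk_size=100, overlap=20):
    """Split text into fixed-size chunks that overlap, so a sentence
    cut at one boundary still appears whole in the next chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(250))
chunks = split_text(text)
# 250 chars with step 80 -> chunks starting at offsets 0, 80, 160, 240
```

Chunk size and overlap are the main knobs: bigger chunks give the model more context per hit, smaller ones make retrieval more precise.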
Launch the App
Start the Streamlit app:
streamlit run frontend.py
This will open up a local web interface. From there:
- Upload your PDF
- Type a question
- Hit Ask
You’ll get an answer and a quick breakdown of how the AI found it.
File Overview
- frontend.py: the Streamlit UI
- rag_pipeline.py: where the AI logic happens
- vector_database.py: breaks down PDFs and builds embeddings
- pdfs/: your uploaded documents
- vectorstore/: saved vector data for retrieval
Want to Customize It?
Everything is modular. You can:
- Swap out the AI model
- Change how PDFs are chunked
- Tweak the prompt/response format
Just dive into the code and experiment.
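For example, the prompt format is ultimately just a string the retrieved context gets slotted into, so it's easy to tweak. A hypothetical template, not the project's actual one (which lives in rag_pipeline.py):

```python
# Hypothetical prompt template; the real one lives in rag_pipeline.py.
PROMPT = """Answer using only the context below.
If the answer is not in the context, say so.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context, question):
    """Slot the retrieved context and the user's question into the template."""
    return PROMPT.format(context=context, question=question)

prompt = build_prompt("Clause 4: rent is due on the 1st.", "When is rent due?")
```

Changing the instructions at the top of the template is the quickest way to change the tone or strictness of the answers.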
Contributions Welcome
If you’ve got ideas, feedback, bug reports, or want to help improve the project:
- Fork it
- Open an issue
- Submit a PR
Here’s the repo:
https://github.com/Hassan123j/AI-Reasoning-Chatbot