DEV Community

Hassan Ahmed

Built a Legal PDF Question-Answering Tool with AI (RAG + Streamlit)

Hey everyone,

I put together a small project that lets you upload legal PDFs and ask questions about them. The AI gives you an answer and explains its reasoning, so you can check the answer against the document instead of taking it on faith.

If you've ever had to read through a long legal document and thought, “I wish someone could just explain this part,” this might help.

You can check out the demo here:
https://shorturl.at/tZoEu


What It Does

  • Upload a legal document (PDF)
  • Ask natural language questions
  • Get an AI-generated answer with reasoning
  • All responses are grounded in the actual content of the document

It uses Retrieval-Augmented Generation (RAG) under the hood, which means the app first pulls the passages from your file that are most relevant to your question, then hands them to the language model along with the question.
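To make the retrieval idea concrete, here's a toy sketch in plain Python. It is not the project's code: judging by the dependency list, the real app uses Ollama embeddings and a FAISS index, while this stand-in scores chunks with simple word-count cosine similarity. The function names (tokenize, cosine_similarity, retrieve) and the sample sentences are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(c))), c) for c in chunks]
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _score, chunk in scored[:top_k]]


# Hypothetical chunks standing in for pieces of a lease agreement.
chunks = [
    "The tenant shall pay rent on the first day of each month.",
    "Either party may terminate this agreement with thirty days written notice.",
    "The landlord is responsible for structural repairs to the property.",
]

top = retrieve("When must the tenant pay rent?", chunks, top_k=1)
print(top[0])
```

In a real pipeline the word counts would be replaced by embedding vectors, and the retrieved chunks would be pasted into the prompt that goes to the language model.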


Run It Locally

Requirements

  • Python 3.8 or higher
  • Ollama (for generating embeddings)
  • A Groq API key (for the language model)

Setup

Clone the repo:

git clone https://github.com/Hassan123j/AI-Reasoning-Chatbot.git
cd AI-Reasoning-Chatbot

Create a virtual environment and install the dependencies:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

If you don’t have the requirements.txt file, install manually:

pip install streamlit langchain-groq langchain-community \
langchain-text-splitters langchain-ollama langchain-core \
faiss-cpu python-dotenv

Ollama Setup

Install Ollama and pull the embedding model:

ollama pull all-minilm

Groq API Key

Sign up at groq.com and get your API key. You can either:

  • Put it in a .env file:
  GROQ_API_KEY="your_api_key_here"
  • Or export it in your terminal:
  export GROQ_API_KEY="your_api_key_here"
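
If you're curious how the app might pick the key up on the Python side, here's a rough sketch. The project lists python-dotenv, which would normally do the .env parsing; this version uses only the standard library so it runs anywhere. The helper names load_env_file and get_groq_api_key are hypothetical and not taken from the repo.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY="value" lines from a .env-style file into a dict."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, comments, and anything without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_api_key(env=None, env_file=".env"):
    """Return the key from the environment, falling back to the .env file."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY") or load_env_file(env_file).get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or export it.")
    return key
```

An exported variable wins over the .env file here, and a missing key fails with a clear message instead of an obscure API error later on.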

Add Your PDFs

Make a folder for your documents:

mkdir pdfs

Drop any legal PDFs you want to query into that folder.


Build the Vector Database

This step processes your PDFs into searchable chunks:

python vector_database.py
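
For a feel of what "searchable chunks" means, here's a minimal chunking sketch. The repo most likely relies on langchain-text-splitters for this (it's in the dependency list), so treat the function below as an illustration only. The name split_text and the chunk_size and overlap values are assumptions, not the project's settings.

```python
def split_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters, where each
    chunk starts with the last `overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
print(split_text(sample, chunk_size=10, overlap=3))
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap is there so that a sentence cut at a chunk boundary still shows up whole in at least one chunk. Each chunk is then embedded and stored so it can be looked up at question time.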

Launch the App

Start the Streamlit app:

streamlit run frontend.py

This will open up a local web interface. From there:

  • Upload your PDF
  • Type a question
  • Hit Ask

You’ll get an answer and a quick breakdown of how the AI found it.


File Overview

  • frontend.py: the Streamlit UI
  • rag_pipeline.py: where the AI logic happens
  • vector_database.py: breaks down PDFs and builds embeddings
  • pdfs/: your uploaded documents
  • vectorstore/: saved vector data for retrieval

Want to Customize It?

Everything is modular. You can:

  • Swap out the AI model
  • Change how PDFs are chunked
  • Tweak the prompt/response format

Just dive into the code and experiment.
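
As an example of the kind of tweak this allows, here's a sketch of a prompt template you could adapt. The wording of the template and the build_prompt helper are hypothetical; the actual prompt in rag_pipeline.py may look quite different.

```python
# Hypothetical template: instructs the model to stay within the retrieved text.
PROMPT_TEMPLATE = """You are a legal assistant. Answer using only the context below.
If the context does not contain the answer, say that you do not know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(question, context_chunks, template=PROMPT_TEMPLATE):
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n\n".join(context_chunks)
    return template.format(context=context, question=question)


prompt = build_prompt(
    "When is rent due?",
    ["The tenant shall pay rent on the first day of each month."],
)
print(prompt)
```

Changing the instructions at the top of the template is usually the quickest way to adjust tone or answer format without touching the retrieval code.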


Contributions Welcome

If you’ve got ideas, feedback, bug reports, or want to help improve the project:

  • Fork it
  • Open an issue
  • Submit a PR

Here’s the repo:
https://github.com/Hassan123j/AI-Reasoning-Chatbot

