🔗 Live Demo:
https://pdf-chat-rag-fx5nczbrwczzpou6qyczmj.streamlit.app/
📦 GitHub Repo:
https://github.com/aliabdm/pdf-chat-rag
🤔 The Idea
Ever wished you could talk to your documents instead of endlessly scrolling through pages?
That’s exactly what I built using Retrieval-Augmented Generation (RAG) and modern GenAI tools.
Upload a PDF → ask questions → get accurate, context-aware answers in seconds.
❌ The Problem
We’ve all been there:
50-page research papers
Long contracts
Dense technical docs
CVs in recruitment workflows
Ctrl + F isn’t enough when you need:
Summaries
Cross-section answers
Simple explanations
Context-aware responses
✅ The Solution: PDF Chat with RAG
I built a web app that lets you:
Upload any PDF
Ask questions in natural language
Get answers grounded only in your document
👉 Try it live:
https://pdf-chat-rag-fx5nczbrwczzpou6qyczmj.streamlit.app/
🧱 Tech Stack (Why Each Tool Matters)
🧩 LangChain — The RAG Backbone
LangChain makes RAG production-ready by handling:
Document chunking
Embeddings
Retrieval + generation orchestration
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_text(text)
```
⚡ Groq — Lightning-Fast LLM Inference
Groq uses custom LPU hardware and delivers:
~2s response time
Models like Llama 3.3 70B
Generous free tier
```python
from langchain_groq import ChatGroq

llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",
    temperature=0,
    groq_api_key=api_key
)
```
🔍 FAISS — Vector Similarity Search
When your PDF becomes 100+ chunks, FAISS finds the most relevant ones fast.
```python
from langchain_community.vectorstores import FAISS

vector_store = FAISS.from_texts(chunks, embeddings)
```
🎨 Streamlit — UI in Minutes
Why Streamlit?
No frontend boilerplate
Built-in chat + file upload
Free deployment
```python
import streamlit as st

uploaded_file = st.file_uploader("Upload PDF", type=["pdf"])

if question := st.chat_input("Ask a question"):
    ...
```
🧠 HuggingFace Embeddings
We use all-MiniLM-L6-v2:
Fast
High quality
Runs locally
No API cost
```python
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
```
🔄 How RAG Works (Simple Breakdown)
Phase 1 — Document Processing
Upload PDF
Extract text
Split into chunks
Generate embeddings
Store in FAISS
Phase 2 — Question Answering
Embed the question
Retrieve top 3 relevant chunks
Build context
Send to LLM
Return grounded answer
```python
docs = vector_store.similarity_search(question, k=3)
context = "\n\n".join(doc.page_content for doc in docs)

prompt = f"""
Context:
{context}

Question:
{question}

Answer ONLY based on the context above.
"""
answer = llm.invoke(prompt)
```
🧪 Core RAG Logic (That’s It)
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

def answer_question(question, vector_store, llm):
    # Retrieve the 3 most relevant chunks for the question
    docs = vector_store.similarity_search(question, k=3)
    context = "\n\n".join(doc.page_content for doc in docs)

    prompt = ChatPromptTemplate.from_template("""
    Context: {context}
    Question: {question}
    Provide a detailed answer based on the context.
    """)

    chain = prompt | llm | StrOutputParser()
    return chain.invoke({"context": context, "question": question})
```
🧠 Key Design Decisions
- Chunk overlap (200 chars): keeps sentences from being cut across chunk boundaries
- Temperature = 0: deterministic, grounded answers
- k = 3 chunks: a good speed/accuracy balance for this app
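To make the overlap decision concrete, here is a tiny pure-Python sketch of sliding-window chunking (just the idea, not the LangChain splitter itself): consecutive chunks share `overlap` characters, so a sentence cut at one boundary still appears whole in the next chunk.

```python
def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5) -> list[str]:
    """Split text into fixed-size chunks whose edges overlap."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("RAG grounds answers in retrieved context.")
# Each adjacent pair of chunks shares 5 characters of context
```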
⚠️ Challenges & Fixes
PDF Text Extraction
Some PDFs (especially scanned or image-based ones) return broken or empty text.
✔️ Added validation + clear error messages.
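The fix can be sketched as a small guard (a hypothetical `validate_extracted_text` helper, not the app's exact code) that rejects empty or near-empty extractions with a clear message before they reach the pipeline:

```python
def validate_extracted_text(text: str, min_chars: int = 50) -> str:
    """Raise a clear error when a PDF yields unusable text (e.g. scanned pages)."""
    cleaned = (text or "").strip()
    if len(cleaned) < min_chars:
        raise ValueError(
            "Could not extract readable text from this PDF. "
            "It may be scanned or image-based."
        )
    return cleaned
```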
Context Window Limits
Large docs exceeded limits.
✔️ Limited chunk size + retrieval count.
Answer Quality
Early answers were vague.
✔️ Strong prompt constraints.
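"Strong prompt constraints" here means spelling out the grounding rules in the template. The wording below is illustrative, not the app's exact prompt:

```python
GROUNDED_PROMPT = """You are answering questions about a document.

Context:
{context}

Question: {question}

Rules:
- Answer ONLY from the context above.
- If the context does not contain the answer, say "I don't know."
- Quote the relevant passage when possible.
"""

prompt = GROUNDED_PROMPT.format(
    context="RAG retrieves chunks.",
    question="What does RAG do?",
)
```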
📊 Performance
| Metric | Value |
| --- | --- |
| PDF size | 50 pages |
| Processing time | ~15 s |
| Response time | ~2 s |
| Chunks | 87 |
| Accuracy | ⭐ 8.5 / 10 |
🚀 What’s Next?
- Multi-PDF support
- Conversation memory
- Export chat history
- Word / TXT support
🧑‍💻 Run It Locally
```bash
git clone https://github.com/aliabdm/pdf-chat-rag
cd pdf-chat-rag
pip install -r requirements.txt
streamlit run app.py
```
Deploy on Streamlit Cloud in one click 🚀
🧠 Lessons Learned
- RAG is simpler than it looks
- Speed > model size
- Prompt engineering matters
- Start simple, iterate fast
🔚 Final Thoughts
Modern AI is about orchestration, not reinventing tools.
If this helped you, consider giving the repo a ⭐
🔗 Connect With Me
LinkedIn: https://www.linkedin.com/in/mohammad-ali-abdul-wahed-1533b9171/
GitHub: https://github.com/aliabdm
Dev.to: https://dev.to/maliano63717738
Happy coding 🚀