
Mohammad Ali Abd Alwahed

I Built a PDF Chat App in Under an Hour Using RAG - Here's How You Can Too

🔗 Live Demo:
https://pdf-chat-rag-fx5nczbrwczzpou6qyczmj.streamlit.app/
📦 GitHub Repo:
https://github.com/aliabdm/pdf-chat-rag
🤔 The Idea
Ever wished you could talk to your documents instead of endlessly scrolling through pages?
That’s exactly what I built using Retrieval-Augmented Generation (RAG) and modern GenAI tools.
Upload a PDF → ask questions → get accurate, context-aware answers in seconds.
❌ The Problem
We’ve all been there:

50-page research papers
Long contracts
Dense technical docs
CVs in recruitment workflows

Ctrl + F isn’t enough when you need:

Summaries
Cross-section answers
Simple explanations
Context-aware responses

✅ The Solution: PDF Chat with RAG
I built a web app that lets you:

Upload any PDF
Ask questions in natural language
Get answers grounded only in your document

👉 Try it live:
https://pdf-chat-rag-fx5nczbrwczzpou6qyczmj.streamlit.app/
🧱 Tech Stack (Why Each Tool Matters)
🧩 LangChain — The RAG Backbone
LangChain makes RAG production-ready by handling:

Document chunking
Embeddings
Retrieval + generation orchestration

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)

chunks = text_splitter.split_text(text)
```
⚡ Groq — Lightning-Fast LLM Inference
Groq uses custom LPU hardware and delivers:

~2s response time
Models like Llama 3.3 70B
Generous free tier

```python
from langchain_groq import ChatGroq

llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",
    temperature=0,
    groq_api_key=api_key
)
```
🔍 FAISS — Vector Similarity Search
When your PDF becomes 100+ chunks, FAISS finds the most relevant ones fast.
```python
from langchain_community.vectorstores import FAISS

vector_store = FAISS.from_texts(chunks, embeddings)
```
🎨 Streamlit — UI in Minutes
Why Streamlit?

No frontend boilerplate
Built-in chat + file upload
Free deployment

```python
import streamlit as st

uploaded_file = st.file_uploader("Upload PDF", type=["pdf"])

if question := st.chat_input("Ask a question"):
    pass  # run the RAG pipeline here
```
🧠 HuggingFace Embeddings
We use all-MiniLM-L6-v2:

Fast
High quality
Runs locally
No API cost

```python
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
```
🔄 How RAG Works (Simple Breakdown)
Phase 1 — Document Processing

Upload PDF
Extract text
Split into chunks
Generate embeddings
Store in FAISS
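
To make Phase 1 concrete, here's a minimal end-to-end sketch. I'm assuming pypdf for the extraction step, and the build_vector_store helper is just a name I'm using here; the actual loader and function names in the repo may differ:

```python
from pypdf import PdfReader  # assumption: any PDF text extractor works here
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

def build_vector_store(pdf_file):
    # 1. Extract raw text from every page of the uploaded PDF
    reader = PdfReader(pdf_file)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

    # 2. Split into overlapping chunks so context isn't cut at boundaries
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_text(text)

    # 3. Embed each chunk locally and index it in FAISS
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    return FAISS.from_texts(chunks, embeddings)
```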

Phase 2 — Question Answering

Embed the question
Retrieve top 3 relevant chunks
Build context
Send to LLM
Return grounded answer

```python
docs = vector_store.similarity_search(question, k=3)

context = "\n\n".join([doc.page_content for doc in docs])

prompt = f"""
Context:
{context}

Question:
{question}

Answer ONLY based on the context above.
"""

answer = llm.invoke(prompt)
```
🧪 Core RAG Logic (That’s It)
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

def answer_question(question, vector_store, llm):
    # Retrieve the 3 most relevant chunks for this question
    docs = vector_store.similarity_search(question, k=3)
    context = "\n\n".join([doc.page_content for doc in docs])

    prompt = ChatPromptTemplate.from_template("""
Context: {context}
Question: {question}

Provide a detailed answer based on the context.
""")

    # Prompt -> LLM -> plain string output
    chain = prompt | llm | StrOutputParser()
    return chain.invoke({"context": context, "question": question})
```
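
And here's a rough sketch of how that function could be wired into the Streamlit chat UI from earlier. The glue below reuses names from the snippets above (vector_store, llm, answer_question) and isn't necessarily the exact code in app.py:

```python
import streamlit as st

# Assumes vector_store, llm and answer_question are already defined
# as in the snippets above
if question := st.chat_input("Ask a question about your PDF"):
    st.chat_message("user").write(question)
    answer = answer_question(question, vector_store, llm)
    st.chat_message("assistant").write(answer)
```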

🧠 Key Design Decisions

Chunk overlap (200 chars): keeps context from being cut at chunk boundaries
Temperature = 0: deterministic answers
k = 3 chunks: a good balance of speed and accuracy

⚠️ Challenges & Fixes
PDF Text Extraction
Some PDFs return broken text.
✔️ Added validation + clear error messages.
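
The check itself is simple; something along these lines (a sketch of the idea, not the exact code from the repo):

```python
import streamlit as st

def validate_extracted_text(text: str) -> None:
    # Stop early with a clear message if the PDF yielded no usable text
    # (e.g. scanned or image-only PDFs)
    if not text or not text.strip():
        st.error("Couldn't extract any text from this PDF. It may be scanned or image-based.")
        st.stop()
```
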
Context Window Limits
Large docs exceeded limits.
✔️ Limited chunk size + retrieval count.
Answer Quality
Early answers were vague.
✔️ Strong prompt constraints.

📊 Performance
| Metric | Value |
| --- | --- |
| PDF size | 50 pages |
| Processing time | ~15s |
| Response time | ~2s |
| Chunks | 87 |
| Accuracy | ⭐ 8.5 / 10 |

🚀 What’s Next?

  • Multi-PDF support
  • Conversation memory
  • Export chat history
  • Word / TXT support

🧑‍💻 Run It Locally
```bash
git clone https://github.com/aliabdm/pdf-chat-rag
cd pdf-chat-rag
pip install -r requirements.txt
streamlit run app.py
```

Deploy on Streamlit Cloud in one click 🚀
🧠 Lessons Learned

  • RAG is simpler than it looks
  • Speed > model size
  • Prompt engineering matters
  • Start simple, iterate fast

🔚 Final Thoughts
Modern AI is about orchestration, not reinventing tools.
If this helped you, consider giving the repo a ⭐
🔗 Connect With Me

LinkedIn: https://www.linkedin.com/in/mohammad-ali-abdul-wahed-1533b9171/
GitHub: https://github.com/aliabdm
Dev.to: https://dev.to/maliano63717738

Happy coding 🚀
