I Built a RAG-Based PDF Reader Web App Using Node.js, LangChain, Ollama, and Hugging Face
Turn any PDF into an interactive AI knowledge base using Retrieval-Augmented Generation (RAG).
If you've ever wanted to upload a PDF and chat with it like ChatGPT, this project does exactly that.
I built a RAG-based PDF Reader Web App that allows users to:
- 📄 Upload a PDF file
- 🔍 Extract and process the content
- ✂️ Split the content into chunks
- 🧠 Generate embeddings locally
- 💾 Store them in a vector store
- 🎯 Retrieve relevant sections based on user questions
- 🤖 Generate grounded answers using a local LLM
This project combines traditional web development with modern AI application design, making it a great hands-on example of how RAG works in practice.
💡 The Idea Behind the Project
The goal of this app is simple:
Upload a PDF and ask questions about its content in natural language.
Instead of manually searching through long reports, research papers, notes, books, or documentation, users can just ask:
- "What is the main topic of this document?"
- "Summarize chapter 2"
- "What are the important conclusions?"
- "What technologies are mentioned in the PDF?"
The application finds the most relevant content from the uploaded PDF and uses that context to answer the question.
📦 GitHub Repository
You can find the full source code here:
🔗 GitHub: https://github.com/SumitK25/rag-pdf-webapp
If you like the project, feel free to ⭐ star the repository, fork it, or contribute improvements!
🧩 What is RAG?
RAG stands for Retrieval-Augmented Generation.
It is a technique that improves LLM responses by combining two key steps:
- Retrieval – Search for the most relevant information from your own data
- Generation – Pass that retrieved information to a language model to generate an answer
Instead of relying only on what the model already knows, RAG helps the model answer based on specific external knowledge, such as:
- 📄 PDFs
- 📝 Documents
- 📒 Notes
- 🏢 Internal company data
- 📚 Research papers
- 📖 Support manuals
Why RAG is Useful
Large language models are powerful, but they have some limitations:
- ❌ They may hallucinate
- ❌ They may not know your private data
- ❌ They may not know newly added content
- ❌ They may answer confidently even when wrong
RAG addresses these limitations by providing the model with relevant context at query time.
In this project, that context comes from the uploaded PDF.
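To make "context at query time" concrete, here is a simplified sketch of how retrieved chunks and the user's question can be stitched into a single grounded prompt. This is an illustration only; the app's actual prompt wording in the backend may differ.

```javascript
// Simplified illustration: combine retrieved chunks and the question
// into one prompt string. The real app's prompt wording may differ.
function buildPrompt(chunks, question) {
  const context = chunks.join("\n---\n");
  return (
    "Answer the question using ONLY the context below.\n" +
    "If the answer is not in the context, say you don't know.\n\n" +
    "Context:\n" + context + "\n\n" +
    "Question: " + question + "\nAnswer:"
  );
}

const prompt = buildPrompt(
  ["RAG combines retrieval with generation."],
  "What does RAG combine?"
);
// The model now sees the document excerpt right next to the question.
```

Because the model answers from the supplied excerpt rather than from memory alone, it is far less likely to hallucinate on document-specific questions.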
✨ Project Features
Here are the main features of this application:
- ✅ Upload PDF files from the browser
- ✅ Extract text content from PDF documents
- ✅ Split the document into manageable overlapping chunks
- ✅ Convert chunks into embeddings using a local embedding model
- ✅ Store embeddings in memory for semantic retrieval
- ✅ Ask questions about the uploaded PDF
- ✅ Retrieve the top relevant chunks
- ✅ Generate answers using Ollama
- ✅ Keep short chat history for conversational continuity
- ✅ Clear chat and reset the app when needed
- ✅ Modern and clean chat-style web interface
🛠️ Tech Stack
This project uses the following technologies:
Frontend
| Technology | Purpose |
|---|---|
| HTML | Page structure |
| CSS | Styling and layout |
| Vanilla JavaScript | Interactivity and API calls |
Backend
| Technology | Purpose |
|---|---|
| Node.js | Runtime environment |
| Express.js | Web server framework |
| Multer | File upload handling |
| CORS | Cross-origin support |
| dotenv | Environment variables |
AI / RAG Stack
| Technology | Purpose |
|---|---|
| LangChain | Orchestration framework |
| PDFLoader | PDF text extraction |
| RecursiveCharacterTextSplitter | Document chunking |
| MemoryVectorStore | In-memory vector database |
| HuggingFaceTransformersEmbeddings | Local embedding generation |
| ChatOllama | Local LLM integration |
Local Model Infrastructure
| Model | Role |
|---|---|
| Ollama | Local LLM server |
| llama3.2 | Answer generation |
| Xenova/all-MiniLM-L6-v2 | Embedding generation |
🤔 Why I Built This Project
I wanted to build something practical that demonstrates how AI can work with user-provided documents.
A lot of people hear about RAG in theory, but the best way to understand it is to build a complete project.
This project helped me explore:
- 📄 How to process PDF files
- 🔎 How semantic search works
- 🧠 How embeddings are generated
- 📦 How vector retrieval powers document Q&A
- 🤖 How to integrate local LLMs into a web app
- 🔗 How to connect a frontend chat UI to an AI backend
It's also a useful real-world application because PDFs are everywhere.
🎯 Use Cases
This kind of app can be useful for many scenarios.
🎓 For Students
- Ask questions from notes or textbooks
- Summarize chapters
- Clarify concepts from study material
🔬 For Researchers
- Query research papers quickly
- Identify key findings
- Summarize long documents
💻 For Developers
- Search technical documentation
- Ask questions from API docs
- Understand large manuals
🏢 For Businesses
- Interact with internal PDFs
- Extract insights from reports
- Build internal Q&A assistants
⚙️ How the Application Works
Let's walk through the full architecture and workflow of the application.
🔄 High-Level Flow
1. User uploads a PDF
2. Backend receives the file
3. PDF text is extracted using LangChain
4. Extracted text is split into chunks
5. Chunks are converted into embeddings
6. Embeddings are stored in a vector store
7. User asks a question
8. App retrieves the most relevant chunks
9. Retrieved chunks are used as context in the LLM prompt
10. Ollama generates the answer
11. Answer is shown in the chat UI
🖥️ Frontend
The frontend is designed to be simple, clean, and interactive.
It contains:
- 📌 A heading
- 📎 A PDF upload section
- ✅ Upload status messages
- 💬 A chat interface
- ⌨️ An input box for questions
- 🔘 A button to ask questions
- 🗑️ A clear chat button
📝 Frontend HTML Structure
The UI is built using HTML and styled with CSS.
1. Upload Section
This allows the user to select and upload a PDF.
```html
<div class="upload-section">
  <input type="file" id="pdfFile" accept=".pdf" />
  <button class="btn-upload" onclick="uploadPDF()">
    Upload PDF
  </button>
</div>
```
2. Upload Status
This displays upload progress or error messages.
```html
<div id="uploadStatus"></div>
```
3. Chat Box
This is the main interface for user interaction.
```html
<div class="chat-box">
  <div class="chat-header">
    💬 Chat with your PDF
    <button class="btn-clear" onclick="clearChat()">
      🗑 Clear
    </button>
  </div>
  <div class="messages" id="messages">
    <div class="message ai">
      👋 Upload a PDF and ask me anything about it!
    </div>
  </div>
  <div class="input-area">
    <input
      type="text"
      id="questionInput"
      placeholder="Ask a question about the PDF..."
      onkeydown="if(event.key==='Enter') askQuestion()"
    />
    <button class="btn-ask" id="askBtn" onclick="askQuestion()">
      Ask
    </button>
  </div>
</div>
```
🎨 Frontend CSS Styling
The app uses a modern card-style design.
```css
* {
  box-sizing: border-box;
  margin: 0;
  padding: 0;
}

body {
  font-family: 'Segoe UI', sans-serif;
  background: #f0f2f5;
  display: flex;
  flex-direction: column;
  align-items: center;
  min-height: 100vh;
  padding: 30px 20px;
}

h1 {
  font-size: 26px;
  margin-bottom: 20px;
  color: #1a1a2e;
}

.upload-section {
  background: white;
  padding: 16px 24px;
  border-radius: 12px;
  box-shadow: 0 2px 10px rgba(0,0,0,0.1);
  display: flex;
  align-items: center;
  gap: 12px;
  margin-bottom: 12px;
  width: 100%;
  max-width: 720px;
}

.chat-box {
  background: white;
  border-radius: 12px;
  box-shadow: 0 2px 10px rgba(0,0,0,0.1);
  width: 100%;
  max-width: 720px;
  display: flex;
  flex-direction: column;
  height: 520px;
}

.chat-header {
  padding: 14px 20px;
  background: #4f46e5;
  color: white;
  border-radius: 12px 12px 0 0;
  display: flex;
  justify-content: space-between;
  align-items: center;
  font-size: 15px;
  font-weight: 600;
}

.messages {
  flex: 1;
  overflow-y: auto;
  padding: 20px;
  display: flex;
  flex-direction: column;
  gap: 12px;
}

.message {
  max-width: 82%;
  padding: 10px 14px;
  border-radius: 12px;
  font-size: 14px;
  line-height: 1.6;
  white-space: pre-wrap;
  word-break: break-word;
}

.message.user {
  background: #4f46e5;
  color: white;
  align-self: flex-end;
  border-bottom-right-radius: 3px;
}

.message.ai {
  background: #f1f5f9;
  color: #1a1a2e;
  align-self: flex-start;
  border-bottom-left-radius: 3px;
}

.message.loading {
  background: #f1f5f9;
  color: #94a3b8;
  align-self: flex-start;
  font-style: italic;
}

.input-area {
  display: flex;
  padding: 14px;
  gap: 10px;
  border-top: 1px solid #e2e8f0;
}

.input-area input {
  flex: 1;
  padding: 10px 14px;
  border: 1px solid #cbd5e1;
  border-radius: 8px;
  font-size: 14px;
  outline: none;
}

.input-area input:focus {
  border-color: #4f46e5;
}

.btn-ask {
  background: #4f46e5;
  color: white;
  border: none;
  padding: 10px 22px;
  border-radius: 8px;
  cursor: pointer;
  font-size: 14px;
  font-weight: 600;
}

.btn-ask:hover {
  background: #4338ca;
}

.btn-ask:disabled {
  background: #a5b4fc;
  cursor: not-allowed;
}

.btn-upload {
  background: #4f46e5;
  color: white;
  border: none;
  padding: 10px 20px;
  border-radius: 8px;
  cursor: pointer;
  font-size: 14px;
  font-weight: 600;
}

.btn-upload:hover {
  background: #4338ca;
}

.btn-clear {
  background: rgba(255,255,255,0.2);
  color: white;
  border: none;
  padding: 6px 14px;
  border-radius: 6px;
  cursor: pointer;
  font-size: 13px;
}

.btn-clear:hover {
  background: rgba(255,255,255,0.35);
}
```
CSS Design Highlights
- ✅ Responsive centered layout
- ✅ Upload card with spacing and shadows
- ✅ Scrollable message container
- ✅ Separate styles for user and AI messages
- ✅ Loading state for thinking indicator
- ✅ Disabled state for the ask button
- ✅ Clean indigo color scheme
⚡ Frontend JavaScript Logic
The frontend JavaScript handles:
- 📤 PDF upload
- ✅ Status updates
- 📨 Sending user questions
- 💬 Showing chat messages
- ⏳ Displaying loading states
- 🗑️ Clearing chat
- 📜 Auto-scrolling the chat window
📤 Uploading a PDF
When the user selects a PDF and clicks Upload PDF, the file is sent to the server using `FormData`.
```javascript
async function uploadPDF() {
  const fileInput = document.getElementById("pdfFile");
  const statusDiv = document.getElementById("uploadStatus");

  if (!fileInput.files[0]) {
    statusDiv.style.color = "red";
    statusDiv.textContent = "❌ Please select a PDF.";
    return;
  }

  statusDiv.style.color = "orange";
  statusDiv.textContent = "⏳ Processing PDF...";

  const formData = new FormData();
  formData.append("pdf", fileInput.files[0]);

  try {
    const res = await fetch("/upload", {
      method: "POST",
      body: formData,
    });
    const data = await res.json();

    if (res.ok) {
      statusDiv.style.color = "green";
      statusDiv.textContent = `✅ Ready! ${data.chunks} chunks indexed.`;
      addMessage("ai", "✅ PDF loaded successfully! Ask me anything.");
    } else {
      statusDiv.style.color = "red";
      statusDiv.textContent = "❌ " + data.error;
    }
  } catch (e) {
    statusDiv.style.color = "red";
    statusDiv.textContent = "❌ Upload failed.";
  }
}
```
What this does:
- ✅ Checks if a file is selected
- ✅ Updates UI status with color-coded messages
- ✅ Sends the PDF to `/upload` via `FormData`
- ✅ Handles success and failure responses
- ✅ Informs the user when the PDF is ready
❓ Asking Questions
Once the PDF is processed, the user can ask questions about it.
```javascript
async function askQuestion() {
  const input = document.getElementById("questionInput");
  const askBtn = document.getElementById("askBtn");
  const question = input.value.trim();
  if (!question) return;

  addMessage("user", question);
  input.value = "";
  askBtn.disabled = true;

  const loadingId = addMessage("loading", "🤖 Thinking...");

  try {
    const res = await fetch("/ask", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
    });
    const data = await res.json();
    removeMessage(loadingId);

    if (res.ok) {
      addMessage("ai", data.answer);
    } else {
      addMessage("ai", "❌ " + data.error);
    }
  } catch (e) {
    removeMessage(loadingId);
    addMessage("ai", "❌ Request failed.");
  }

  askBtn.disabled = false;
  scrollToBottom();
}
```
What this does:
- ✅ Reads the user question
- ✅ Displays it in the chat as a user message
- ✅ Sends the question to the backend
- ✅ Shows a "🤖 Thinking..." message while waiting
- ✅ Displays the AI response
- ✅ Handles API errors gracefully
- ✅ Disables the ask button during processing
🛠️ Message Utility Functions
The UI uses helper functions to manage chat messages dynamically.
```javascript
function addMessage(type, text) {
  const box = document.getElementById("messages");
  const div = document.createElement("div");
  const id = "msg_" + Date.now() + Math.random();
  div.id = id;
  div.className = `message ${type}`;
  div.textContent = text;
  box.appendChild(div);
  scrollToBottom();
  return id;
}

function removeMessage(id) {
  const el = document.getElementById(id);
  if (el) el.remove();
}

function scrollToBottom() {
  const box = document.getElementById("messages");
  box.scrollTop = box.scrollHeight;
}
```
What these do:
- `addMessage()` creates a new message bubble in the chat
- `removeMessage()` removes a specific message (used for loading indicators)
- `scrollToBottom()` auto-scrolls the chat to show the latest message
🗑️ Clearing the Chat
The app supports resetting everything with a single click.
```javascript
async function clearChat() {
  document.getElementById("messages").innerHTML =
    '<div class="message ai">🗑️ Chat cleared! Upload a PDF to start.</div>';
  document.getElementById("uploadStatus").textContent = "";
  await fetch("/clear", { method: "POST" });
}
```
This clears:
- 💬 Chat UI messages
- ✅ Upload status text
- 🧠 Backend vector store
- 📝 Backend chat history
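On the server, the `/clear` endpoint that this `fetch` call hits is expected to drop the in-memory state. Here is a minimal sketch of that reset logic; the handler body is my assumption, not the repo's exact code.

```javascript
// Hypothetical reset logic behind POST /clear: drop the indexed PDF
// and forget the conversation so the next upload starts fresh.
let vectorStore = { indexed: true };           // stands in for the real store
let chatHistory = [{ q: "Hi", a: "Hello" }];   // stands in for real history

function clearSession() {
  vectorStore = null;
  chatHistory = [];
}

clearSession();
```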
🔧 Backend
Now let's look at the backend, which handles all the AI and document processing logic.
📦 Backend Dependencies
The server imports the following modules:
```javascript
import "dotenv/config";
import express from "express";
import multer from "multer";
import cors from "cors";
import fs from "fs";
import path from "path";
import { fileURLToPath } from "url";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/huggingface_transformers";
import { ChatOllama } from "@langchain/ollama";
```
Why These Packages Are Used
| Package | Purpose |
|---|---|
| `express` | Server framework |
| `multer` | File upload handling |
| `cors` | Cross-origin request support |
| `fs` | File system operations |
| `dotenv` | Environment variable management |
| `PDFLoader` | Reads PDF content |
| `RecursiveCharacterTextSplitter` | Splits long text into chunks |
| `MemoryVectorStore` | Stores embeddings in memory |
| `HuggingFaceTransformersEmbeddings` | Generates embeddings locally |
| `ChatOllama` | Connects to the local Ollama model |
🚀 Server Initialization
The app starts by setting up Express and middleware.
```javascript
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

const app = express();
app.use(cors());
app.use(express.json());
app.use(express.static("public"));

const upload = multer({ dest: "uploads/" });
```
What this does:
- ✅ Enables CORS for cross-origin requests
- ✅ Parses JSON request bodies
- ✅ Serves static frontend files from the `public` directory
- ✅ Configures Multer to store uploads temporarily in the `uploads/` folder
💾 Application State
Two in-memory variables are used to track the current session:
```javascript
let vectorStore = null;
let chatHistory = [];
```
| Variable | Purpose |
|---|---|
| `vectorStore` | Stores the indexed PDF chunks as embeddings |
| `chatHistory` | Stores recent conversation pairs for context continuity |
Important note: Because these are stored in memory:
- ⚠️ They reset when the server restarts
- ✅ They are suitable for demos and prototypes
- ❌ They are not ideal for production persistence
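How the "short chat history" stays short is easy to sketch. Here is an illustrative helper (my own, not taken from the repo) that keeps only the last few question/answer pairs so the prompt does not grow without bound:

```javascript
// Keep only the most recent N question/answer pairs so the prompt
// stays small while preserving recent conversational context.
function trimHistory(history, maxPairs = 3) {
  return history.slice(-maxPairs);
}

const history = [
  { q: "q1", a: "a1" },
  { q: "q2", a: "a2" },
  { q: "q3", a: "a3" },
  { q: "q4", a: "a4" },
];
const recent = trimHistory(history); // drops the oldest pair, keeps q2 through q4
```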
🧠 Embedding Model Setup
The app uses a local Hugging Face embedding model:
```javascript
const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: "Xenova/all-MiniLM-L6-v2",
});
```
Why This Model?
Xenova/all-MiniLM-L6-v2 is a popular choice because it:
- ✅ Is lightweight and fast
- ✅ Runs locally without an external API
- ✅ Performs well on semantic similarity tasks
- ✅ Is well suited to local RAG prototypes
- ✅ Is well supported in LangChain
Its role is to convert text chunks into numerical vectors (embeddings) that capture the semantic meaning of the text.
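Retrieval then boils down to comparing those vectors. The toy example below uses made-up 3-dimensional vectors (the real model produces 384-dimensional ones) to show how cosine similarity ranks semantically related text higher:

```javascript
// Toy illustration of vector retrieval: vectors pointing in similar
// directions get a cosine similarity near 1. These 3-D vectors are
// invented for the example; real embeddings have 384 dimensions.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const question = [0.9, 0.1, 0.0];
const relatedChunk = [0.8, 0.2, 0.0];
const unrelatedChunk = [0.0, 0.1, 0.9];

// The related chunk scores higher, so it is retrieved first.
const ranksHigher =
  cosineSimilarity(question, relatedChunk) >
  cosineSimilarity(question, unrelatedChunk); // true
```

`MemoryVectorStore` performs essentially this comparison (at much higher dimensionality) when it retrieves the top chunks for a question.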
🤖 LLM Setup with Ollama
For answer generation, the app uses Ollama with the llama3.2 model.
```javascript
const llm = new ChatOllama({
  model: "llama3.2",
  temperature: 0,
  baseUrl: "http://127.0.0.1:11434",
});
```
Why Use Ollama?
Ollama allows you to run open-source LLMs locally.
- ✅ Local execution — no cloud dependency
- ✅ Privacy-friendly — your data stays on your machine
- ✅ No API costs
- ✅ Easy model switching
- ✅ Great for experimentation and learning
Why Is the Temperature 0?
A lower temperature makes the output more deterministic and focused, which is ideal for document Q&A where accuracy matters more than creativity.
📤 Upload Route: POST /upload
This route handles PDF uploading, text extraction, chunking, embedding, and indexing.
```javascript
app.post("/upload", upload.single("pdf"), async (req, res) => {
  try {
    if (!req.file) {
      return res.status(400).json({ error: "No file uploaded" });
    }

    const filePath = req.file.path;
    chatHistory = [];

    console.log("📄 Loading PDF...");
    const loader = new PDFLoader(filePath);
    const docs = await loader.load();
    console.log(` → Loaded ${docs.length} page(s)`);

    console.log("✂️ Splitting text...");
    const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 200,
    });
    const splits = await splitter.splitDocuments(docs);
    console.log(` → Created ${splits.length} chunks`);

    console.log("🧠 Creating LOCAL embeddings...");
    vectorStore = await MemoryVectorStore.fromDocuments(
      splits,
      embeddings
    );

    fs.unlinkSync(filePath);
    console.log("✅ PDF processed successfully!");

    res.json({
      message: "✅ PDF processed successfully!",
      chunks: splits.length,
    });
  } catch (err) {
    console.error("❌ Upload error:", err.message);
    res.status(500).json({ error: "PDF processing failed" });
  }
});
```
🔍 Step-by-Step Breakdown of /upload
Step 1: Validate the Upload
If no file is uploaded, the server returns an error immediately.
```javascript
if (!req.file) {
  return res.status(400).json({ error: "No file uploaded" });
}
```
Step 2: Reset Chat History
Uploading a new PDF starts a fresh document session.
```javascript
chatHistory = [];
```
Step 3: Load the PDF
The PDF is parsed into LangChain document objects using PDFLoader.
```javascript
const loader = new PDFLoader(filePath);
const docs = await loader.load();
```
Each page of the PDF becomes a separate document object containing:
- `pageContent`: the text of the page
- `metadata`: page number and source info
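For intuition, one loaded page roughly looks like the plain object below. The field names follow LangChain's `Document` interface, but the exact metadata keys are an assumption and can vary between loader versions:

```javascript
// Hypothetical example of one page produced by PDFLoader.
const pageDoc = {
  pageContent: "Chapter 1: Introduction to RAG systems...",
  metadata: {
    source: "uploads/abc123",   // temporary Multer file path (invented)
    loc: { pageNumber: 1 },     // key names may differ by version
  },
};
```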
Step 4: Split the Text into Chunks
The content is divided into manageable chunks using RecursiveCharacterTextSplitter.
```javascript
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits = await splitter.splitDocuments(docs);
```
Step 5: Generate Embeddings and Create the Vector Store
Each chunk is embedded and stored for later retrieval.
```javascript
vectorStore = await MemoryVectorStore.fromDocuments(
  splits,
  embeddings
);
```
Step 6: Delete the Uploaded File
The temporary file is removed after processing to save disk space.
```javascript
fs.unlinkSync(filePath);
```
Step 7: Return Success Response
The server sends back the chunk count and a success message.
```javascript
res.json({
  message: "✅ PDF processed successfully!",
  chunks: splits.length,
});
```
✂️ Why Chunking is Important
Chunking is a critical part of any RAG system.
If you pass an entire PDF into an LLM:
- ❌ It may exceed token limits
- ❌ Retrieval becomes inefficient
- ❌ Answers may become noisy or unfocused
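A stripped-down chunker makes the idea concrete. This is a deliberate simplification of what `RecursiveCharacterTextSplitter` does; the real splitter also prefers to break on paragraph and sentence boundaries instead of cutting mid-word:

```javascript
// Minimal fixed-size chunker with overlap. Each chunk starts
// (chunkSize - overlap) characters after the previous one, so
// consecutive chunks share `overlap` characters of context.
function chunkText(text, chunkSize, overlap) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}

// Same parameters as the app: 1000-char chunks with 200-char overlap.
const chunks = chunkText("a".repeat(2500), 1000, 200);
// → 4 chunks: starts at 0, 800, 1600, 2400
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which keeps retrieval from losing context at the seams.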