Sumit Kumar

I Built a RAG-Based PDF Reader Web App Using Node.js, LangChain, Ollama, and Hugging Face

Turn any PDF into an interactive AI knowledge base using Retrieval-Augmented Generation (RAG).

If you've ever wanted to upload a PDF and chat with it like ChatGPT, this project does exactly that.

I built a RAG-based PDF Reader Web App that allows users to:

  • 📄 Upload a PDF file
  • 🔍 Extract and process the content
  • ✂️ Split the content into chunks
  • 🧠 Generate embeddings locally
  • 💾 Store them in a vector store
  • 🎯 Retrieve relevant sections based on user questions
  • 🤖 Generate grounded answers using a local LLM

This project combines traditional web development with modern AI application design, making it a great hands-on example of how RAG works in practice.


💡 The Idea Behind the Project

The goal of this app is simple:

Upload a PDF and ask questions about its content in natural language.

Instead of manually searching through long reports, research papers, notes, books, or documentation, users can just ask:

  • "What is the main topic of this document?"
  • "Summarize chapter 2"
  • "What are the important conclusions?"
  • "What technologies are mentioned in the PDF?"

The application finds the most relevant content from the uploaded PDF and uses that context to answer the question.


📦 GitHub Repository

You can find the full source code here:

🔗 GitHub: https://github.com/SumitK25/rag-pdf-webapp

If you like the project, feel free to ⭐ star the repository, fork it, or contribute improvements!


🧩 What is RAG?

RAG stands for Retrieval-Augmented Generation.

It is a technique that improves LLM responses by combining two key steps:

  1. Retrieval – Search for the most relevant information from your own data
  2. Generation – Pass that retrieved information to a language model to generate an answer
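The two steps can be sketched in plain JavaScript. This is a minimal, self-contained toy: the 3-dimensional vectors are hand-made stand-ins for real embeddings, and the "generation" step is just a comment, since the real app hands the retrieved text to an LLM.

```javascript
// Toy retrieval step: rank document chunks by cosine similarity to a query.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const chunks = [
  { text: "RAG retrieves relevant context.", vector: [0.9, 0.1, 0.0] },
  { text: "CSS styles the chat window.",     vector: [0.0, 0.2, 0.9] },
];

const queryVector = [0.8, 0.2, 0.1]; // pretend this came from an embedding model

// Retrieval: pick the chunk most semantically similar to the query.
const best = chunks
  .map(c => ({ ...c, score: cosineSimilarity(c.vector, queryVector) }))
  .sort((a, b) => b.score - a.score)[0];

// Generation: in the real app, best.text becomes context in the LLM prompt.
console.log(best.text); // → "RAG retrieves relevant context."
```

Real embeddings have hundreds of dimensions, but the ranking logic is the same idea.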

Instead of relying only on what the model already knows, RAG helps the model answer based on specific external knowledge, such as:

  • 📄 PDFs
  • 📝 Documents
  • 📒 Notes
  • 🏢 Internal company data
  • 📚 Research papers
  • 📖 Support manuals

Why RAG is Useful

Large language models are powerful, but they have some limitations:

  • ❌ They may hallucinate
  • ❌ They may not know your private data
  • ❌ They may not know newly added content
  • ❌ They may answer confidently even when wrong

RAG helps solve this by providing the model with relevant context at query time.

In this project, that context comes from the uploaded PDF.


✨ Project Features

Here are the main features of this application:

  • ✅ Upload PDF files from the browser
  • ✅ Extract text content from PDF documents
  • ✅ Split the document into manageable overlapping chunks
  • ✅ Convert chunks into embeddings using a local embedding model
  • ✅ Store embeddings in memory for semantic retrieval
  • ✅ Ask questions about the uploaded PDF
  • ✅ Retrieve the top relevant chunks
  • ✅ Generate answers using Ollama
  • ✅ Keep short chat history for conversational continuity
  • ✅ Clear chat and reset the app when needed
  • ✅ Modern and clean chat-style web interface

🛠️ Tech Stack

This project uses the following technologies.

Frontend

| Technology | Purpose |
| --- | --- |
| HTML | Page structure |
| CSS | Styling and layout |
| Vanilla JavaScript | Interactivity and API calls |

Backend

| Technology | Purpose |
| --- | --- |
| Node.js | Runtime environment |
| Express.js | Web server framework |
| Multer | File upload handling |
| CORS | Cross-origin support |
| dotenv | Environment variables |

AI / RAG Stack

| Technology | Purpose |
| --- | --- |
| LangChain | Orchestration framework |
| PDFLoader | PDF text extraction |
| RecursiveCharacterTextSplitter | Document chunking |
| MemoryVectorStore | In-memory vector database |
| HuggingFaceTransformersEmbeddings | Local embedding generation |
| ChatOllama | Local LLM integration |

Local Model Infrastructure

| Model | Role |
| --- | --- |
| Ollama | Local LLM server |
| llama3.2 | Answer generation |
| Xenova/all-MiniLM-L6-v2 | Embedding generation |

🤔 Why I Built This Project

I wanted to build something practical that demonstrates how AI can work with user-provided documents.

A lot of people hear about RAG in theory, but the best way to understand it is to build a complete project.

This project helped me explore:

  • 📄 How to process PDF files
  • 🔎 How semantic search works
  • 🧠 How embeddings are generated
  • 📦 How vector retrieval powers document Q&A
  • 🤖 How to integrate local LLMs into a web app
  • 🔗 How to connect a frontend chat UI to an AI backend

It's also a useful real-world application because PDFs are everywhere.


🎯 Use Cases

This kind of app can be useful for many scenarios.

🎓 For Students

  • Ask questions from notes or textbooks
  • Summarize chapters
  • Clarify concepts from study material

🔬 For Researchers

  • Query research papers quickly
  • Identify key findings
  • Summarize long documents

💻 For Developers

  • Search technical documentation
  • Ask questions from API docs
  • Understand large manuals

🏢 For Businesses

  • Interact with internal PDFs
  • Extract insights from reports
  • Build internal Q&A assistants

⚙️ How the Application Works

Let's walk through the full architecture and workflow of the application.

🔄 High-Level Flow

1.  User uploads a PDF
2.  Backend receives the file
3.  PDF text is extracted using LangChain
4.  Extracted text is split into chunks
5.  Chunks are converted into embeddings
6.  Embeddings are stored in a vector store
7.  User asks a question
8.  App retrieves the most relevant chunks
9.  Retrieved chunks are used as context in the LLM prompt
10. Ollama generates the answer
11. Answer is shown in the chat UI
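Step 9 — packing retrieved chunks into the LLM prompt — can be sketched in plain JavaScript. The prompt wording below is illustrative, not the exact template used in the repo:

```javascript
// Hypothetical prompt builder: stuffs retrieved chunks into the LLM prompt
// so the model answers from the PDF rather than from its training data.
function buildPrompt(question, retrievedChunks) {
  const context = retrievedChunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join("\n\n");
  return `Answer the question using ONLY the context below.

Context:
${context}

Question: ${question}`;
}

const prompt = buildPrompt("What is RAG?", [
  "RAG combines retrieval with generation.",
  "Retrieved chunks ground the model's answer.",
]);
console.log(prompt);
```

Grounding the prompt this way is what keeps answers tied to the uploaded document.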

🖥️ Frontend

The frontend is designed to be simple, clean, and interactive.

It contains:

  • 📌 A heading
  • 📎 A PDF upload section
  • ✅ Upload status messages
  • 💬 A chat interface
  • ⌨️ An input box for questions
  • 🔘 A button to ask questions
  • 🗑️ A clear chat button

📝 Frontend HTML Structure

The UI is built using HTML and styled with CSS.

1. Upload Section

This allows the user to select and upload a PDF.

<div class="upload-section">
  <input type="file" id="pdfFile" accept=".pdf"/>
  <button class="btn-upload" onclick="uploadPDF()">
    Upload PDF
  </button>
</div>

2. Upload Status

This displays upload progress or error messages.

<div id="uploadStatus"></div>

3. Chat Box

This is the main interface for user interaction.

<div class="chat-box">
  <div class="chat-header">
    💬 Chat with your PDF
    <button class="btn-clear" onclick="clearChat()">
      🗑 Clear
    </button>
  </div>
  <div class="messages" id="messages">
    <div class="message ai">
      👋 Upload a PDF and ask me anything about it!
    </div>
  </div>
  <div class="input-area">
    <input
      type="text"
      id="questionInput"
      placeholder="Ask a question about the PDF..."
      onkeydown="if(event.key==='Enter') askQuestion()"
    />
    <button class="btn-ask" id="askBtn" onclick="askQuestion()">
      Ask
    </button>
  </div>
</div>

🎨 Frontend CSS Styling

The app uses a modern card-style design.

* {
  box-sizing: border-box;
  margin: 0;
  padding: 0;
}

body {
  font-family: 'Segoe UI', sans-serif;
  background: #f0f2f5;
  display: flex;
  flex-direction: column;
  align-items: center;
  min-height: 100vh;
  padding: 30px 20px;
}

h1 {
  font-size: 26px;
  margin-bottom: 20px;
  color: #1a1a2e;
}

.upload-section {
  background: white;
  padding: 16px 24px;
  border-radius: 12px;
  box-shadow: 0 2px 10px rgba(0,0,0,0.1);
  display: flex;
  align-items: center;
  gap: 12px;
  margin-bottom: 12px;
  width: 100%;
  max-width: 720px;
}

.chat-box {
  background: white;
  border-radius: 12px;
  box-shadow: 0 2px 10px rgba(0,0,0,0.1);
  width: 100%;
  max-width: 720px;
  display: flex;
  flex-direction: column;
  height: 520px;
}

.chat-header {
  padding: 14px 20px;
  background: #4f46e5;
  color: white;
  border-radius: 12px 12px 0 0;
  display: flex;
  justify-content: space-between;
  align-items: center;
  font-size: 15px;
  font-weight: 600;
}

.messages {
  flex: 1;
  overflow-y: auto;
  padding: 20px;
  display: flex;
  flex-direction: column;
  gap: 12px;
}

.message {
  max-width: 82%;
  padding: 10px 14px;
  border-radius: 12px;
  font-size: 14px;
  line-height: 1.6;
  white-space: pre-wrap;
  word-break: break-word;
}

.message.user {
  background: #4f46e5;
  color: white;
  align-self: flex-end;
  border-bottom-right-radius: 3px;
}

.message.ai {
  background: #f1f5f9;
  color: #1a1a2e;
  align-self: flex-start;
  border-bottom-left-radius: 3px;
}

.message.loading {
  background: #f1f5f9;
  color: #94a3b8;
  align-self: flex-start;
  font-style: italic;
}

.input-area {
  display: flex;
  padding: 14px;
  gap: 10px;
  border-top: 1px solid #e2e8f0;
}

.input-area input {
  flex: 1;
  padding: 10px 14px;
  border: 1px solid #cbd5e1;
  border-radius: 8px;
  font-size: 14px;
  outline: none;
}

.input-area input:focus {
  border-color: #4f46e5;
}

.btn-ask {
  background: #4f46e5;
  color: white;
  border: none;
  padding: 10px 22px;
  border-radius: 8px;
  cursor: pointer;
  font-size: 14px;
  font-weight: 600;
}

.btn-ask:hover {
  background: #4338ca;
}

.btn-ask:disabled {
  background: #a5b4fc;
  cursor: not-allowed;
}

.btn-upload {
  background: #4f46e5;
  color: white;
  border: none;
  padding: 10px 20px;
  border-radius: 8px;
  cursor: pointer;
  font-size: 14px;
  font-weight: 600;
}

.btn-upload:hover {
  background: #4338ca;
}

.btn-clear {
  background: rgba(255,255,255,0.2);
  color: white;
  border: none;
  padding: 6px 14px;
  border-radius: 6px;
  cursor: pointer;
  font-size: 13px;
}

.btn-clear:hover {
  background: rgba(255,255,255,0.35);
}

CSS Design Highlights

  • ✅ Responsive centered layout
  • ✅ Upload card with spacing and shadows
  • ✅ Scrollable message container
  • ✅ Separate styles for user and AI messages
  • ✅ Loading state for thinking indicator
  • ✅ Disabled state for the ask button
  • ✅ Clean indigo color scheme

⚡ Frontend JavaScript Logic

The frontend JavaScript handles:

  • 📤 PDF upload
  • ✅ Status updates
  • 📨 Sending user questions
  • 💬 Showing chat messages
  • ⏳ Displaying loading states
  • 🗑️ Clearing chat
  • 📜 Auto-scrolling the chat window

📤 Uploading a PDF

When the user selects a PDF and clicks Upload PDF, the file is sent to the server using FormData.

async function uploadPDF() {
  const fileInput = document.getElementById("pdfFile");
  const statusDiv = document.getElementById("uploadStatus");

  if (!fileInput.files[0]) {
    statusDiv.style.color = "red";
    statusDiv.textContent = "❌ Please select a PDF.";
    return;
  }

  statusDiv.style.color = "orange";
  statusDiv.textContent = "⏳ Processing PDF...";

  const formData = new FormData();
  formData.append("pdf", fileInput.files[0]);

  try {
    const res = await fetch("/upload", {
      method: "POST",
      body: formData,
    });
    const data = await res.json();

    if (res.ok) {
      statusDiv.style.color = "green";
      statusDiv.textContent =
        `✅ Ready! ${data.chunks} chunks indexed.`;
      addMessage(
        "ai",
        "✅ PDF loaded successfully! Ask me anything."
      );
    } else {
      statusDiv.style.color = "red";
      statusDiv.textContent = "❌ " + data.error;
    }
  } catch (e) {
    statusDiv.style.color = "red";
    statusDiv.textContent = "❌ Upload failed.";
  }
}

What this does:

  • ✅ Checks if a file is selected
  • ✅ Updates UI status with color-coded messages
  • ✅ Sends the PDF to /upload via FormData
  • ✅ Handles success and failure responses
  • ✅ Informs the user when the PDF is ready

❓ Asking Questions

Once the PDF is processed, the user can ask questions about it.

async function askQuestion() {
  const input = document.getElementById("questionInput");
  const askBtn = document.getElementById("askBtn");
  const question = input.value.trim();
  if (!question) return;

  addMessage("user", question);
  input.value = "";
  askBtn.disabled = true;

  const loadingId = addMessage("loading", "🤖 Thinking...");

  try {
    const res = await fetch("/ask", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
    });
    const data = await res.json();
    removeMessage(loadingId);

    if (res.ok) {
      addMessage("ai", data.answer);
    } else {
      addMessage("ai", "❌ " + data.error);
    }
  } catch (e) {
    removeMessage(loadingId);
    addMessage("ai", "❌ Request failed.");
  }

  askBtn.disabled = false;
  scrollToBottom();
}

What this does:

  • ✅ Reads the user question
  • ✅ Displays it in the chat as a user message
  • ✅ Sends the question to the backend
  • ✅ Shows a "🤖 Thinking..." message while waiting
  • ✅ Displays the AI response
  • ✅ Handles API errors gracefully
  • ✅ Disables the ask button during processing

🛠️ Message Utility Functions

The UI uses helper functions to manage chat messages dynamically.

function addMessage(type, text) {
  const box = document.getElementById("messages");
  const div = document.createElement("div");
  const id = "msg_" + Date.now() + Math.random();
  div.id = id;
  div.className = `message ${type}`;
  div.textContent = text;
  box.appendChild(div);
  scrollToBottom();
  return id;
}

function removeMessage(id) {
  const el = document.getElementById(id);
  if (el) el.remove();
}

function scrollToBottom() {
  const box = document.getElementById("messages");
  box.scrollTop = box.scrollHeight;
}

What these do:

  • addMessage() — Creates a new message bubble in the chat
  • removeMessage() — Removes a specific message (used for loading indicators)
  • scrollToBottom() — Auto-scrolls the chat to show the latest message

🗑️ Clearing the Chat

The app supports resetting everything with a single click.

async function clearChat() {
  document.getElementById("messages").innerHTML =
    '<div class="message ai">🗑️ Chat cleared! Upload a PDF to start.</div>';
  document.getElementById("uploadStatus").textContent = "";
  await fetch("/clear", { method: "POST" });
}

This clears:

  • 💬 Chat UI messages
  • ✅ Upload status text
  • 🧠 Backend vector store
  • 📝 Backend chat history
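The backend `/clear` route itself isn't shown in this walkthrough, but since the frontend calls `fetch("/clear", { method: "POST" })`, a plausible minimal handler (hypothetical — check the repo for the actual implementation) would simply reset the two in-memory state variables:

```javascript
// Hypothetical /clear handler — a sketch of what the backend route likely
// does: drop the indexed PDF and forget the conversation.
let vectorStore = { indexed: true };               // pretend a PDF is loaded
let chatHistory = [{ role: "user", content: "hi" }];

function clearHandler(req, res) {
  vectorStore = null;   // drop the indexed PDF embeddings
  chatHistory = [];     // forget the conversation history
  res.json({ message: "Session cleared" });
}

// Simulate an Express call with a minimal mock response object:
clearHandler({}, { json: (body) => console.log(body.message) });
```

Because all state lives in these two variables, "clearing" the app is just reassigning them.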

🔧 Backend

Now let's look at the backend, which handles all the AI and document processing logic.


📦 Backend Dependencies

The server imports the following modules:

import "dotenv/config";

import express from "express";
import multer from "multer";
import cors from "cors";
import fs from "fs";
import path from "path";
import { fileURLToPath } from "url";

import { PDFLoader }
  from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter }
  from "@langchain/textsplitters";
import { MemoryVectorStore }
  from "langchain/vectorstores/memory";
import { HuggingFaceTransformersEmbeddings }
  from "@langchain/community/embeddings/huggingface_transformers";
import { ChatOllama }
  from "@langchain/ollama";

Why These Packages Are Used

| Package | Purpose |
| --- | --- |
| express | Server framework |
| multer | File upload handling |
| cors | Cross-origin request support |
| fs | File system operations |
| dotenv | Environment variable management |
| PDFLoader | Reads PDF content |
| RecursiveCharacterTextSplitter | Splits long text into chunks |
| MemoryVectorStore | Stores embeddings in memory |
| HuggingFaceTransformersEmbeddings | Generates embeddings locally |
| ChatOllama | Connects to the local Ollama model |

🚀 Server Initialization

The app starts by setting up Express and middleware.

const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

const app = express();
app.use(cors());
app.use(express.json());
app.use(express.static("public"));

const upload = multer({ dest: "uploads/" });

What this does:

  • ✅ Enables CORS for cross-origin requests
  • ✅ Parses JSON request bodies
  • ✅ Serves static frontend files from the public directory
  • ✅ Configures Multer to store uploads temporarily in the uploads/ folder

💾 Application State

Two in-memory variables are used to track the current session:

let vectorStore = null;
let chatHistory = [];

| Variable | Purpose |
| --- | --- |
| vectorStore | Stores the indexed PDF chunks as embeddings |
| chatHistory | Stores recent conversation pairs for context continuity |

Important note: Because these are stored in memory:

  • ⚠️ They reset when the server restarts
  • ✅ They are suitable for demos and prototypes
  • ❌ They are not ideal for production persistence

🧠 Embedding Model Setup

The app uses a local Hugging Face embedding model:

const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: "Xenova/all-MiniLM-L6-v2",
});

Why This Model?

Xenova/all-MiniLM-L6-v2 is popular because it is:

  • ✅ Lightweight and fast
  • ✅ Runs locally without an external API
  • ✅ Good for semantic similarity tasks
  • ✅ Suitable for local RAG prototypes
  • ✅ Well-supported in LangChain

Its role is to convert text chunks into numerical vectors (embeddings) that capture the semantic meaning of the text.


🤖 LLM Setup with Ollama

For answer generation, the app uses Ollama with the llama3.2 model.

const llm = new ChatOllama({
  model: "llama3.2",
  temperature: 0,
  baseUrl: "http://127.0.0.1:11434",
});

Why Use Ollama?

Ollama allows you to run open-source LLMs locally.

  • ✅ Local execution — no cloud dependency
  • ✅ Privacy-friendly — your data stays on your machine
  • ✅ No API costs
  • ✅ Easy model switching
  • ✅ Great for experimentation and learning

Why Is the Temperature 0?

A lower temperature makes the output more deterministic and focused, which is ideal for document Q&A where accuracy matters more than creativity.
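The effect of temperature can be seen with a toy softmax over a handful of token scores — a simplified model of how sampling works, since real LLMs operate over tens of thousands of logits:

```javascript
// Softmax with temperature: lower T sharpens the distribution,
// concentrating probability on the top token (more deterministic output).
function softmax(logits, temperature) {
  const scaled = logits.map(x => x / temperature);
  const max = Math.max(...scaled);              // subtract max for numerical stability
  const exps = scaled.map(x => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

const logits = [2.0, 1.0, 0.5];
console.log(softmax(logits, 1.0)); // probability mass is spread across tokens
console.log(softmax(logits, 0.1)); // almost all mass lands on the top token
```

At temperature 0, implementations typically skip sampling entirely and always pick the highest-scoring token (argmax), which is why answers become reproducible.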


📤 Upload Route: POST /upload

This route handles PDF uploading, text extraction, chunking, embedding, and indexing.

app.post("/upload", upload.single("pdf"), async (req, res) => {
  try {
    if (!req.file) {
      return res.status(400).json({ error: "No file uploaded" });
    }

    const filePath = req.file.path;
    chatHistory = [];

    console.log("📄 Loading PDF...");
    const loader = new PDFLoader(filePath);
    const docs = await loader.load();
    console.log(`   → Loaded ${docs.length} page(s)`);

    console.log("✂️  Splitting text...");
    const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 200,
    });
    const splits = await splitter.splitDocuments(docs);
    console.log(`   → Created ${splits.length} chunks`);

    console.log("🧠 Creating LOCAL embeddings...");
    vectorStore = await MemoryVectorStore.fromDocuments(
      splits,
      embeddings
    );

    fs.unlinkSync(filePath);

    console.log("✅ PDF processed successfully!");
    res.json({
      message: "✅ PDF processed successfully!",
      chunks: splits.length,
    });

  } catch (err) {
    console.error("❌ Upload error:", err.message);
    res.status(500).json({ error: "PDF processing failed" });
  }
});

🔍 Step-by-Step Breakdown of /upload

Step 1: Validate the Upload

If no file is uploaded, the server returns an error immediately.

if (!req.file) {
  return res.status(400).json({ error: "No file uploaded" });
}

Step 2: Reset Chat History

Uploading a new PDF starts a fresh document session.

chatHistory = [];

Step 3: Load the PDF

The PDF is parsed into LangChain document objects using PDFLoader.

const loader = new PDFLoader(filePath);
const docs = await loader.load();

Each page of the PDF becomes a separate document object containing:

  • pageContent — the text of the page
  • metadata — page number and source info

Step 4: Split the Text into Chunks

The content is divided into manageable chunks using RecursiveCharacterTextSplitter.

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const splits = await splitter.splitDocuments(docs);

Step 5: Generate Embeddings and Create the Vector Store

Each chunk is embedded and stored for later retrieval.

vectorStore = await MemoryVectorStore.fromDocuments(
  splits,
  embeddings
);

Step 6: Delete the Uploaded File

The temporary file is removed after processing to save disk space.

fs.unlinkSync(filePath);

Step 7: Return Success Response

The server sends back the chunk count and a success message.

res.json({
  message: "✅ PDF processed successfully!",
  chunks: splits.length,
});

✂️ Why Chunking is Important

Chunking is a critical part of any RAG system.

If you pass an entire PDF into an LLM:

  • ❌ It may exceed token limits
  • ❌ Retrieval becomes inefficient
  • ❌ Answers may become noisy or unfocused
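Overlapping chunks also keep sentences that straddle a boundary retrievable from both sides. Here is a simplified sliding-window version of what the splitter does — the real RecursiveCharacterTextSplitter additionally tries to break on paragraph and sentence boundaries rather than at fixed character offsets:

```javascript
// Simplified character-window chunker with overlap.
// With chunkSize 1000 and chunkOverlap 200 (the app's settings),
// each chunk starts 800 characters after the previous one.
function chunkText(text, chunkSize, chunkOverlap) {
  const chunks = [];
  const stride = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += stride) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

const text = "x".repeat(2500);
const chunks = chunkText(text, 1000, 200);
console.log(chunks.length);             // → 3
console.log(chunks.map(c => c.length)); // → [1000, 1000, 900]
```

Each chunk fits comfortably in the embedding model's input window, and the 200-character overlap means context at a boundary appears in two chunks.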
