A few weeks ago I built StudyLens, an AI study assistant where you upload a PDF and get answers, flashcards, and summaries back. The interesting part wasn’t the UI or the Claude API integration. It was figuring out how to do retrieval entirely in the browser, for free, without hitting an embeddings API.
Here’s how it works and why I made the choices I did.
The Problem With Embeddings
The standard RAG setup goes like this: chunk your documents, embed each chunk using something like OpenAI’s embeddings API, store vectors in a database, then at query time embed the user’s question and find the closest chunks by cosine similarity.
That works great. But for a client-side app with no backend infrastructure, it has a real problem. Every chunk needs an API call to embed. A 20-page PDF might give you 50+ chunks. That’s 50+ API calls just to index one document, before the user has even asked a question. It adds latency, it adds cost, and it requires a server to proxy those calls securely.
For a study tool where the documents are lecture notes and textbooks, I figured there was a simpler way.
TF-IDF Is Actually Fine Here
TF-IDF stands for Term Frequency-Inverse Document Frequency. It’s a classic information retrieval technique from before neural networks were a thing. The idea is simple: a word is important to a document if it appears frequently in that document but rarely across all documents.
For study notes, this works surprisingly well. When a student asks “what is photosynthesis”, the chunks that actually talk about photosynthesis will score high because that word appears a lot in those chunks and not much elsewhere. The vocabulary overlap between questions and notes is high enough that semantic similarity is less important than it would be for, say, a general web search.
So instead of embeddings, StudyLens builds a TF-IDF index entirely in the browser using JavaScript. No API calls, no latency, no cost. The whole index is built the moment the PDF is parsed.
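To make that concrete, here’s a minimal sketch of what an in-browser TF-IDF index with cosine-similarity search can look like. The function names (`buildIndex`, `search`) and the tokenizer are illustrative, not StudyLens’s actual code:

```javascript
// Crude tokenizer: lowercase, keep alphanumeric runs. Illustrative only.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build TF-IDF vectors (term -> weight maps) for each chunk.
function buildIndex(chunks) {
  const docs = chunks.map(tokenize);
  const N = docs.length;
  const df = new Map(); // document frequency per term
  for (const tokens of docs) {
    for (const term of new Set(tokens)) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }
  const vectors = docs.map((tokens) => {
    const tf = new Map();
    for (const t of tokens) tf.set(t, (tf.get(t) || 0) + 1);
    const vec = new Map();
    let norm = 0;
    for (const [t, f] of tf) {
      // tf (as a fraction of the chunk) times idf
      const w = (f / tokens.length) * Math.log(N / df.get(t));
      vec.set(t, w);
      norm += w * w;
    }
    return { vec, norm: Math.sqrt(norm) || 1 };
  });
  return { vectors, df, N };
}

// Score the query against every chunk; return the top-k indices.
function search(index, query, topK = 4) {
  const qTokens = tokenize(query);
  const qtf = new Map();
  for (const t of qTokens) qtf.set(t, (qtf.get(t) || 0) + 1);
  const qvec = new Map();
  let qnorm = 0;
  for (const [t, f] of qtf) {
    const dfT = index.df.get(t);
    if (!dfT) continue; // term never appears in the corpus
    const w = (f / qTokens.length) * Math.log(index.N / dfT);
    qvec.set(t, w);
    qnorm += w * w;
  }
  qnorm = Math.sqrt(qnorm) || 1;
  return index.vectors
    .map(({ vec, norm }, i) => {
      let dot = 0;
      for (const [t, w] of qvec) dot += w * (vec.get(t) || 0);
      return { i, score: dot / (norm * qnorm) };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Everything is plain `Map`s and arithmetic, which is why indexing is effectively instant for a single document.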
How the Pipeline Actually Works
- The PDF is uploaded and parsed using PDF.js
- The extracted text is split into chunks of around 400 words, with an 80-word overlap between chunks so context doesn’t get cut off at boundaries
- A TF-IDF index is built over all the chunks in-browser
- When the user asks a question, the query is scored against every chunk using cosine similarity over the TF-IDF weights
- The top 4 chunks are pulled out and injected into the Claude API prompt as context
- Claude generates an answer grounded in those passages
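The chunking step above (step 2) can be sketched as a sliding word window. The function name is hypothetical; the sizes match the post:

```javascript
// Split extracted text into overlapping word-window chunks.
// Defaults match the post: 400-word chunks, 80-word overlap.
function chunkText(text, chunkSize = 400, overlap = 80) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = chunkSize - overlap; // advance 320 words per window
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // final window hit the end
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some words twice.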
For PDFs that are image-heavy or scanned, like handwritten notes or formatted slides, there’s an OCR fallback. Pages get rendered to a canvas element, converted to base64, and sent to Claude’s vision API for extraction before indexing.
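For the vision step, the request body is the interesting part: the page image (base64 from `canvas.toDataURL("image/png")`, with the `data:image/png;base64,` prefix stripped) goes into an image content block alongside a text instruction. A sketch of building that Messages API body, where the model id and prompt wording are my assumptions, not necessarily what StudyLens sends:

```javascript
// Build a Messages API request body asking Claude to transcribe a page image.
// base64Png: the canvas image data with the data-URL prefix already stripped.
function buildOcrRequest(base64Png) {
  return {
    model: "claude-sonnet-4-5", // assumed model id
    max_tokens: 2048,
    messages: [
      {
        role: "user",
        content: [
          {
            type: "image",
            source: { type: "base64", media_type: "image/png", data: base64Png },
          },
          {
            type: "text", // illustrative prompt wording
            text: "Transcribe all text on this page. Preserve headings and lists.",
          },
        ],
      },
    ],
  };
}
```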
The only backend is a single Vercel Edge Function that proxies requests to the Anthropic API so the API key never touches client code.
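A proxy like that can be very small. A sketch of what the handler might look like (route path and env var name are assumptions; in the real file this function would be the default export of an Edge Function like `api/claude.js`):

```javascript
// Edge Function handler: forward the browser's request body to Anthropic,
// attaching the API key server-side so it never reaches client code.
async function handler(req) {
  if (req.method !== "POST") {
    return new Response("Method not allowed", { status: 405 });
  }
  const body = await req.text();
  const upstream = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": process.env.ANTHROPIC_API_KEY, // assumed env var name
      "anthropic-version": "2023-06-01",
    },
    body,
  });
  // Pass Anthropic's response straight back to the browser
  return new Response(upstream.body, {
    status: upstream.status,
    headers: { "content-type": "application/json" },
  });
}
```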
What Works and What Doesn’t
For dense, text-heavy study material, TF-IDF retrieval holds up well. Lecture notes, textbook chapters, and written summaries all work great.
Where it struggles is with conceptual or paraphrased questions. If a student uploads notes that say “mitochondria produce ATP through oxidative phosphorylation” and then asks “how do cells make energy”, TF-IDF won’t necessarily connect those because the vocabulary doesn’t overlap. Embeddings would handle that easily since they capture semantic meaning rather than just term frequency.
The other limitation is scale. TF-IDF in the browser is fast for a single document but would get slow with large corpora. For multi-document support, you’d want a proper vector store.
What I’d Do Differently
The obvious next step is replacing TF-IDF with embedding-based retrieval. Something like storing chunks in a lightweight vector DB and doing approximate nearest neighbor search at query time would make the retrieval significantly more robust, especially for conceptual questions.
I’d also add multi-document support so you can index an entire course folder rather than one PDF at a time, and spaced repetition for the flashcards so they actually help with long-term retention rather than just cramming.
Try It
The app is live at studylens-theta.vercel.app. No sign-up, no API key needed. Drop in any PDF and it works immediately.
The whole thing is a proof of concept, but it’s a useful one. TF-IDF retrieval in the browser is a genuine alternative to embeddings for constrained environments, and I think it’s underused in the RAG space, where everyone defaults to vector databases from day one.