Atul Tripathi
🛑 Stop Re-Writing RAG Pipelines: My Next.js + Pinecone Architecture ⚡️

Let's be real for a second. ☕️

I have built the "Chat with your PDF" feature for clients about five times in the last two months.

The frontend? Fun. Streaming UI, Tailwind, those fancy typing effects... I love it. 🎨
The backend? Absolute headache. 🤯

Every single time, I found myself staring at VS Code, copying and pasting the same boring boilerplate to handle:

โŒ The PDF Loader: Chunking text without breaking sentences mid-thought.
โŒ The Embeddings: Batching data to OpenAI so I don't hit rate limits.
โŒ The Vector Store: Upserting into Pinecone/Supabase without errors.
โŒ The Context Window: Calculating tokens so the AI doesn't crash.

After the 5th time, I realized I was wasting 40+ hours per project just setting up the "plumbing" before I could actually build the cool stuff.

So, I decided to fix it. Forever. 🛠️

๐Ÿ—๏ธ The Architecture

Here is the stack I finally settled on for a production-ready RAG (Retrieval Augmented Generation) app. It's typed, it scales, and it just works.

  • ▲ Framework: Next.js (App Router)
  • 🦜 Orchestration: LangChain.js
  • 🌲 Vector DB: Pinecone (my go-to for speed)
  • ⚡️ Database/Auth: Supabase
  • 💅 Styling: Tailwind CSS

🧠 The Hard Part: Handling Vectors

The biggest pain point isn't the chat; it's the Ingestion Pipeline. You can't just dump a PDF into ChatGPT. You have to slice and dice it first. 🔪

Here is the logic I standardized:

  1. User uploads file 📂
  2. Server reads buffer 👓
  3. RecursiveCharacterTextSplitter breaks it into chunks 🧩
  4. Upsert to Pinecone with metadata 💾

It looks something like this (simplified for sanity):

import { Pinecone } from "@pinecone-database/pinecone";
import { OpenAIEmbeddings } from "@langchain/openai";
import { PineconeStore } from "@langchain/pinecone";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// 🪄 The logic that usually takes 3 hours to debug
export async function addDocumentsToVectorStore(text: string, fileId: string) {

  // 1. Split the text intelligently
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,
    chunkOverlap: 200,
  });

  const docs = await splitter.createDocuments([text]);

  // 2. Embed and Upsert to the Cloud ☁️
  const embeddings = new OpenAIEmbeddings();

  const pinecone = new Pinecone(); // reads PINECONE_API_KEY from env
  const index = pinecone.index(process.env.PINECONE_INDEX!);

  await PineconeStore.fromDocuments(docs, embeddings, {
    pineconeIndex: index,
    namespace: `file_${fileId}`, // 🔒 Isolate vectors per file
  });

  console.log("Vectors upserted. We are ready to chat. ⚡️");
}
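That gets the vectors in. At query time, the Context Window bullet from earlier comes back: the retrieved chunks still have to fit the model's token budget. A minimal sketch of that greedy packing step — `fitToContextWindow` and the rough 4-characters-per-token estimate are my own illustrative assumptions, not LangChain APIs:

```typescript
interface Chunk { pageContent: string; }

// Rough token estimate: ~4 characters per token for English text.
// (An approximation — swap in a real tokenizer like tiktoken for accuracy.)
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Illustrative helper: greedily pack retrieved chunks (assumed sorted by
// relevance) into the prompt until the token budget runs out, so the
// model call never overflows the context window.
export function fitToContextWindow(chunks: Chunk[], maxTokens: number): Chunk[] {
  const selected: Chunk[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const cost = estimateTokens(chunk.pageContent);
    if (used + cost > maxTokens) break;
    selected.push(chunk);
    used += cost;
  }
  return selected;
}
```

The greedy loop stays the same whatever tokenizer you use; only `estimateTokens` changes.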

📦 So, I Productized It.

I got tired of setting this up from scratch.

I took my personal repo, cleaned it up, added a polished UI, and turned it into a reusable Starter Kit.

What's inside the box?

✅ Pre-configured LangChain Setup

✅ Built-in Vector Ingestion (PDF/TXT/MD)

✅ Streaming Chat Components (ChatGPT style)

✅ Rate Limiting (Save your API credits!)
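On the rate-limiting point, the simplest shape is an in-memory sliding window. A minimal sketch — this `SlidingWindowLimiter` class is my own illustration, not the kit's actual implementation, and multi-instance deployments usually reach for Redis or Upstash instead:

```typescript
// Illustrative in-memory sliding-window rate limiter. Fine for a single
// Next.js instance; use a shared store (Redis/Upstash) across instances.
export class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(
    private limit: number,    // max requests per window
    private windowMs: number, // window length in milliseconds
  ) {}

  // Returns true if the request is allowed. `now` is injectable for tests.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // over the limit — caller should return a 429
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

Call `limiter.allow(userId)` at the top of the chat API route and bail out with a 429 when it comes back false — that alone stops one user from draining your OpenAI credits.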

🧪 The Experiment (Smoke Test)

I'm running a little experiment this weekend.

Instead of charging the full launch price ($149), I set up a $9 Early Bird Deposit.

Why $9? It acts as a "Skin in the Game" filter. If the pain of building RAG pipelines isn't worth the price of a coffee ☕️, then I know this isn't worth building further.

But if you want to save 40 hours of work?

👉 Grab the RAG Starter Kit (Early Access)

(Even if you don't buy it, feel free to roast my landing page in the comments! I'm building this in public and feedback is gold. 🥇)

Happy Coding! 🚀
