🚨 This is Part 3 of the “Building an AI Assistant with Ollama and Next.js” series.
👉 Check out Part 1 here
👉 Check out Part 2 here
🤖 Introduction
In the previous parts, we covered how to set up an AI assistant locally using Ollama, Next.js, and different package integrations. In this part, we’re diving deeper into building a Knowledge-Based AI assistant using RAG (Retrieval-Augmented Generation) with LangChain, Ollama, and Pinecone.
We’ll walk through how to:
- Load and preprocess documents
- Split and embed them into vector space
- Store the embeddings in Pinecone
- Query these vectors for smart retrieval
🔧 Tools Used
- Next.js
- TailwindCSS
- Cursor IDE
- Ollama
- LangChain
- Pinecone Vector Database
- PDF-Parse, Mammoth.js for document reading
📘 What is RAG?
RAG stands for Retrieval-Augmented Generation. It’s a hybrid AI approach that improves response accuracy by combining:
- Retrieval: Searches for relevant documents or chunks from a knowledge base.
- Generation: Uses a language model (like Gemma or LLaMA) to generate natural responses based on the retrieved content.
🔁 Flow Summary
- Load files (PDF, DOCX, TXT)
- Split them into readable chunks
- Embed those chunks into vector representations
- Store them in Pinecone
- Query Pinecone using user input and generate context-aware answers
You can read more on this on the LangChain documentation: https://js.langchain.com/docs/tutorials/rag/
🧩 Key Packages and Docs for further reading
Package | Use | Docs |
---|---|---|
langchain | Framework for chaining LLMs with tools | Docs |
@pinecone-database/pinecone | Pinecone client | Docs |
@langchain/pinecone | LangChain-Pinecone integration | Docs |
@langchain/community/embeddings/ollama | Ollama embeddings for LangChain | Docs |
pdf-parse, mammoth | For loading and reading PDFs, DOCX, and TXT | pdf-parse, mammoth |
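To follow along, install the dependencies used in this part (the exact package list may differ slightly depending on your LangChain version; this is what the snippets below assume):
npm install langchain @langchain/core @langchain/community @langchain/pinecone @pinecone-database/pinecone pdf-parse mammoth ai ollama-ai-provider react-markdown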
🧰 Tool Setup Overview
🔧 1. Setting Up Pinecone
- Create an account on Pinecone
- Create an Index with the following settings:
  - Name: e.g., database_name
  - Vector Type: Dense
  - Dimension: 1024 (must match mxbai-embed-large)
  - Metric: Cosine
  - Environment: us-east-1-aws
You can select one of the existing embedding models offered in the setup options; I chose the custom setting so that it aligns with the model I'm using in this project, i.e. mxbai-embed-large.
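If you prefer to create the index from code rather than the dashboard, the Pinecone client can do that too. Here is a minimal sketch, assuming a recent @pinecone-database/pinecone SDK and a serverless index on AWS us-east-1 (adjust to your plan and region):

import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Create a dense, cosine-metric index whose dimension matches mxbai-embed-large (1024)
await pinecone.createIndex({
  name: 'database_name', // the index name you chose above
  dimension: 1024,
  metric: 'cosine',
  spec: {
    serverless: { cloud: 'aws', region: 'us-east-1' },
  },
});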
🛠 2. Configure .env
Add these to your .env.local:
PINECONE_API_KEY=your-api-key
PINECONE_INDEX_NAME=database_name
PINECONE_ENVIRONMENT=us-east-1-aws
OLLAMA_MODEL=gemma3:1b
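Optionally, you can fail fast when one of these variables is missing instead of relying on non-null assertions later. This is a hypothetical helper, not part of the original setup:

// Hypothetical helper: throws at startup if a required environment variable is missing
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing environment variable: ${name}`);
  return value;
}

const pineconeApiKey = requireEnv('PINECONE_API_KEY');
const pineconeIndexName = requireEnv('PINECONE_INDEX_NAME');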
🚀 3. Launch Ollama and Embedding Model
Make sure Ollama is installed, then run the model in your terminal (you can use any LLM of your choice):
ollama run gemma3:1b
Install the embedding model with:
ollama pull mxbai-embed-large
LangChain will reference this model locally through Ollama via:
new OllamaEmbeddings({
model: 'mxbai-embed-large',
baseUrl: 'http://localhost:11434'
});
Note: you can check out more chat models at https://js.langchain.com/docs/integrations/chat/ and https://ollama.com/search. You can also explore other embedding models at https://js.langchain.com/docs/integrations/text_embedding/
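As a quick sanity check, you can embed a test string and confirm that the vector length matches the 1024-dimension index you configured earlier. A small illustrative snippet, assuming Ollama is running locally with mxbai-embed-large pulled:

import { OllamaEmbeddings } from '@langchain/community/embeddings/ollama';

const embeddings = new OllamaEmbeddings({
  model: 'mxbai-embed-large',
  baseUrl: 'http://localhost:11434',
});

// embedQuery returns a number[]; its length must equal the Pinecone index dimension
const vector = await embeddings.embedQuery('Hello, RAG!');
console.log(vector.length); // expected: 1024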
🧪 How It Works – Step by Step
Here is a breakdown of what we're trying to achieve, followed by the code snippets to use.
Step 1: Upload and Process Document
- User uploads .pdf, .docx, or .txt.
- We load the file using LangChain document loaders.
- The text is split into chunks using RecursiveCharacterTextSplitter.
- Chunks are returned as an array of LangChain Document objects.
Step 2: Embed and Store in Pinecone
- Chunks are embedded via OllamaEmbeddings using mxbai-embed-large.
- Vectors are stored in the Pinecone vector index under a namespace.
Step 3: Query for Context
- When a user types a question, we run a vector similarity search.
- Relevant chunks are retrieved from Pinecone.
- Chunks are combined into a context block.
- The context is injected into the prompt as a system message for the LLM.
utils/documentProcessing.ts
import { OllamaEmbeddings } from '@langchain/community/embeddings/ollama';
import { Document } from '@langchain/core/documents';
import { PineconeStore } from '@langchain/pinecone';
import { Pinecone } from '@pinecone-database/pinecone';
import { DocxLoader } from 'langchain/document_loaders/fs/docx';
import { PDFLoader } from 'langchain/document_loaders/fs/pdf';
import { TextLoader } from 'langchain/document_loaders/fs/text';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
// Shared clients: Pinecone connection, local Ollama embeddings, and the chunking strategy
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const embeddings = new OllamaEmbeddings({ model: 'mxbai-embed-large', baseUrl: 'http://localhost:11434' });
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
// Load a PDF, DOCX, or TXT file and split it into overlapping chunks
export async function processDocument(file: File | Blob, fileName: string): Promise<Document[]> {
let documents: Document[];
if (fileName.endsWith('.pdf')) documents = await new PDFLoader(file).load();
else if (fileName.endsWith('.docx')) documents = await new DocxLoader(file).load();
else if (fileName.endsWith('.txt')) documents = await new TextLoader(file).load();
else throw new Error('Unsupported file type');
return await textSplitter.splitDocuments(documents);
}
// Embed the chunks with Ollama and upsert the vectors into the Pinecone index
export async function storeDocuments(documents: Document[]): Promise<void> {
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX_NAME!);
await PineconeStore.fromDocuments(documents, embeddings, {
pineconeIndex,
maxConcurrency: 5,
namespace: 'your_namespace', //optional
});
}
// Return the stored chunks most similar to the user's query
export async function queryDocuments(query: string): Promise<Document[]> {
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX_NAME!);
const vectorStore = await PineconeStore.fromExistingIndex(embeddings, {
pineconeIndex,
maxConcurrency: 5,
namespace: 'your_namespace', //optional
});
return await vectorStore.similaritySearch(query, 4);
}
api/chat/upload/route.ts
import { processDocument, storeDocuments } from '@/utils/documentProcessing';
import { NextResponse } from 'next/server';
export async function POST(req: Request) {
const formData = await req.formData();
const file = formData.get('file') as File;
if (!file) return NextResponse.json({ error: 'No file provided' }, { status: 400 });
// Split the uploaded file into chunks, embed them, and store the vectors in Pinecone
const documents = await processDocument(file, file.name);
await storeDocuments(documents);
return NextResponse.json({
message: 'Document processed and stored successfully',
fileName: file.name,
documentCount: documents.length
});
}
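Once the dev server is running, you can quickly verify the upload endpoint by posting a file to it (assuming the default port 3000 and a local file named sample.pdf; both are placeholders):
curl -F "file=@./sample.pdf" http://localhost:3000/api/chat/upload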
api/chat/route.ts
import { queryDocuments } from '@/utils/documentProcessing';
import { Message, streamText } from 'ai';
import { NextRequest } from 'next/server';
import { createOllama } from 'ollama-ai-provider';
const ollama = createOllama();
const MODEL_NAME = process.env.OLLAMA_MODEL || 'gemma3:1b';
export async function POST(req: NextRequest) {
const { messages } = await req.json();
const lastMessage = messages[messages.length - 1];
// Retrieve the chunks most relevant to the latest user message and merge them into one context block
const relevantDocs = await queryDocuments(lastMessage.content);
const context = relevantDocs.map((doc) => doc.pageContent).join('\n\n');
const systemMessage: Message = {
id: 'system',
role: 'system',
content: `You are a helpful AI assistant with access to a knowledge base.
Use the following context to answer the user's questions:\n\n${context}`,
};
const promptMessages = [systemMessage, ...messages];
const result = await streamText({
model: ollama(MODEL_NAME),
messages: promptMessages
});
return result.toDataStreamResponse();
}
For the UI, here are the code snippets:
ChatInput.tsx
'use client'
interface ChatInputProps {
input: string;
handleInputChange: (e: React.ChangeEvent<HTMLTextAreaElement>) => void;
handleSubmit: (e: React.FormEvent<HTMLFormElement>) => void;
isLoading: boolean;
}
export default function ChatInput({ input, handleInputChange, handleSubmit, isLoading }: ChatInputProps) {
return (
<form onSubmit={handleSubmit} className="flex gap-4">
<textarea
value={input}
onChange={handleInputChange}
placeholder="Ask a question about the documents..."
className="flex-1 p-4 border border-gray-200 dark:border-gray-700 rounded-xl
bg-white dark:bg-gray-800
placeholder-gray-400 dark:placeholder-gray-500
focus:outline-none focus:ring-2 focus:ring-blue-500 dark:focus:ring-blue-400
resize-none min-h-[50px] max-h-32
text-gray-700 dark:text-gray-200"
rows={1}
required
disabled={isLoading}
/>
<button
type="submit"
disabled={isLoading}
className={`px-6 py-2 rounded-xl font-medium transition-all duration-200
${isLoading
? 'bg-gray-100 dark:bg-gray-700 text-gray-400 dark:text-gray-500 cursor-not-allowed'
: 'bg-blue-500 hover:bg-blue-600 active:bg-blue-700 text-white shadow-sm hover:shadow'
}`}
>
{isLoading ? (
<span className="flex items-center gap-2">
<svg className="animate-spin h-4 w-4" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" fill="none"/>
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"/>
</svg>
Processing
</span>
) : 'Send'}
</button>
</form>
);
}
ChatMessage.tsx
'use client'
import { Message } from 'ai';
import ReactMarkdown from 'react-markdown';
interface ChatMessageProps {
message: Message;
}
export default function ChatMessage({ message }: ChatMessageProps) {
return (
<div
className={`flex items-start gap-4 p-6 rounded-2xl shadow-sm transition-colors ${
message.role === 'assistant'
? 'bg-white dark:bg-gray-800 border border-gray-100 dark:border-gray-700'
: 'bg-blue-50 dark:bg-blue-900/30 border border-blue-100 dark:border-blue-800'
}`}
>
<div className={`w-8 h-8 rounded-full flex items-center justify-center flex-shrink-0 ${
message.role === 'assistant'
? 'bg-purple-100 text-purple-600 dark:bg-purple-900 dark:text-purple-300'
: 'bg-blue-100 text-blue-600 dark:bg-blue-900 dark:text-blue-300'
}`}>
{message.role === 'assistant' ? '🤖' : '👤'}
</div>
<div className="flex-1 min-w-0">
<div className="font-medium text-sm mb-2 text-gray-700 dark:text-gray-300">
{message.role === 'assistant' ? 'AI Assistant' : 'You'}
</div>
<div className="prose dark:prose-invert prose-sm max-w-none">
<ReactMarkdown>{message.content}</ReactMarkdown>
</div>
</div>
</div>
);
}
FileUpload.tsx
"use client"
import React, { useState } from 'react';
export default function FileUpload() {
const [isUploading, setIsUploading] = useState(false);
const [message, setMessage] = useState('');
const [error, setError] = useState('');
const handleFileUpload = async (e: React.ChangeEvent<HTMLInputElement>) => {
const file = e.target.files?.[0];
if (!file) return;
// Reset states
setMessage('');
setError('');
setIsUploading(true);
try {
const formData = new FormData();
formData.append('file', file);
const response = await fetch('/api/chat/upload', {
method: 'POST',
body: formData,
});
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Error uploading file');
}
setMessage(`Successfully uploaded ${file.name}`);
} catch (err) {
setError(err instanceof Error ? err.message : 'Error uploading file');
} finally {
setIsUploading(false);
}
};
return (
<div className="mb-6">
<div className="flex flex-col sm:flex-row items-center gap-4">
<label
className={`flex items-center gap-2 px-6 py-3 rounded-xl border-2 border-dashed
transition-all duration-200 cursor-pointer
${isUploading
? 'border-gray-300 bg-gray-50 dark:border-gray-700 dark:bg-gray-800/50'
: 'border-blue-300 hover:border-blue-400 hover:bg-blue-50 dark:border-blue-700 dark:hover:border-blue-600 dark:hover:bg-blue-900/30'
}`}
>
<svg
className={`w-5 h-5 ${isUploading ? 'text-gray-400' : 'text-blue-500'}`}
fill="none"
stroke="currentColor"
viewBox="0 0 24 24"
>
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-8l-4-4m0 0L8 8m4-4v12" />
</svg>
<span className={`font-medium ${isUploading ? 'text-gray-400' : 'text-blue-500'}`}>
{isUploading ? 'Uploading...' : 'Upload Document'}
</span>
<input
type="file"
className="hidden"
accept=".pdf,.docx"
onChange={handleFileUpload}
disabled={isUploading}
/>
</label>
<span className="text-sm text-gray-500 dark:text-gray-400 flex items-center gap-2">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
Supported: PDF, DOCX
</span>
</div>
{message && (
<div className="mt-4 p-4 bg-green-50 dark:bg-green-900/30 rounded-xl border border-green-100 dark:border-green-800">
<p className="text-sm text-green-600 dark:text-green-400 flex items-center gap-2">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M5 13l4 4L19 7" />
</svg>
{message}
</p>
</div>
)}
{error && (
<div className="mt-4 p-4 bg-red-50 dark:bg-red-900/30 rounded-xl border border-red-100 dark:border-red-800">
<p className="text-sm text-red-600 dark:text-red-400 flex items-center gap-2">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M12 8v4m0 4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
{error}
</p>
</div>
)}
</div>
);
}
ChatPage.tsx
"use client"
import { useChat } from 'ai/react';
import ChatInput from './ChatInput';
import ChatMessage from './ChatMessage';
import FileUpload from './FileUpload';
export default function ChatPage() {
const { input, messages, handleInputChange, handleSubmit, isLoading } = useChat({
api: '/api/chat',
onError: (error) => {
console.error('Chat error:', error);
alert('Error: ' + error.message);
}
});
return (
<div className="flex flex-col h-screen bg-gray-50 dark:bg-gray-900">
<div className="flex-1 max-w-5xl mx-auto w-full p-4 md:p-6 lg:p-8">
<div className="flex-1 overflow-y-auto mb-4 space-y-6">
<h1 className="text-3xl font-bold text-gray-900 dark:text-white text-center mb-8">
RAG-Powered Knowledge Base Chat
</h1>
<div className="bg-white dark:bg-gray-800 rounded-xl shadow-lg p-6">
<FileUpload />
</div>
<div className="space-y-6">
{messages.map((message) => (
<ChatMessage key={message.id} message={message} />
))}
</div>
</div>
<div className="sticky bottom-0 bg-white dark:bg-gray-800 rounded-xl shadow-lg p-4">
<ChatInput
input={input}
handleInputChange={handleInputChange}
handleSubmit={handleSubmit}
isLoading={isLoading}
/>
</div>
</div>
</div>
);
}
Voilà! You're ready to run your code:
npm run dev
Click the Upload Document button to upload the document you want to store. Once the upload is successful, the new records will appear in your Pinecone dashboard.
With the document loaded, you can ask your AI assistant questions about its content and get accurate, context-aware responses.
Happy Coding 😎! Feel free to share your experience and feedback too. Cheers!