🚨 This is Part 3 of the “Building an AI Assistant with Ollama and Next.js” series.
👉 Check out Part 1 here
👉 Check out Part 2 here
🤖 Introduction
In the previous parts, we covered how to set up an AI assistant locally using Ollama, Next.js, and different package integrations. In this part, we’re diving deeper into building a Knowledge-Based AI assistant using RAG (Retrieval-Augmented Generation) with LangChain, Ollama, and Pinecone.
We’ll walk through how to:
- Load and preprocess documents
- Split and embed them into vector space
- Store the embeddings in Pinecone
- Query these vectors for smart retrieval
🔧 Tools Used
- Next.js
- TailwindCSS
- Cursor IDE
- Ollama
- LangChain
- Pinecone Vector Database
- PDF-Parse, Mammoth.js for document reading
📘 What is RAG?
RAG stands for Retrieval-Augmented Generation. It’s a hybrid AI approach that improves response accuracy by combining:
- Retrieval: Searches for relevant documents or chunks from a knowledge base.
- Generation: Uses a language model (like Gemma or LLaMA) to generate natural responses based on the retrieved content.
🔁 Flow Summary
- Load files (PDF, DOCX, TXT)
- Split them into readable chunks
- Embed those chunks into vector representations
- Store them in Pinecone
- Query Pinecone using user input and generate context-aware answers
You can read more on this on the LangChain documentation: https://js.langchain.com/docs/tutorials/rag/
🧩 Key Packages and Docs for further reading
Package | Use | Docs |
---|---|---|
langchain | Framework for chaining LLMs with tools | Docs |
@pinecone-database/pinecone | Pinecone client | Docs |
@langchain/pinecone | LangChain-Pinecone integration | Docs |
@langchain/community/embeddings/ollama | Ollama embeddings for LangChain | Docs |
pdf-parse, mammoth | For loading and reading PDFs, DOCX, and TXT | pdf-parse, mammoth |
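To follow along, install the dependencies used in this part (the exact package list may differ slightly depending on your LangChain version; this is what the snippets below assume):
npm install langchain @langchain/core @langchain/community @langchain/pinecone @pinecone-database/pinecone pdf-parse mammoth ai ollama-ai-provider react-markdown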
🧰 Tool Setup Overview
🔧 1. Setting Up Pinecone
- Create an account on Pinecone
- Create an Index with the following settings:
  - Name: e.g., database_name
  - Vector Type: Dense
  - Dimension: 1024 (must match mxbai-embed-large)
  - Metric: Cosine
  - Environment: us-east-1-aws
You can select one of the existing embedding models offered in the setup options; I chose the custom setting so that it aligns with the model I'm using in this project, i.e. mxbai-embed-large.
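If you prefer to create the index from code rather than the dashboard, the Pinecone client can do that too. Here is a minimal sketch, assuming a recent @pinecone-database/pinecone SDK and a serverless index on AWS us-east-1 (adjust to your plan and region):

import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Create a dense, cosine-metric index whose dimension matches mxbai-embed-large (1024)
await pinecone.createIndex({
  name: 'database_name', // the index name you chose above
  dimension: 1024,
  metric: 'cosine',
  spec: {
    serverless: { cloud: 'aws', region: 'us-east-1' },
  },
});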
🛠 2. Configure .env
Add these to your .env.local:
PINECONE_API_KEY=your-api-key
PINECONE_INDEX_NAME=database_name
PINECONE_ENVIRONMENT=us-east-1-aws
OLLAMA_MODEL=gemma3:1b
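Optionally, you can fail fast when one of these variables is missing instead of relying on non-null assertions later. This is a hypothetical helper, not part of the original setup:

// Hypothetical helper: throws at startup if a required environment variable is missing
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing environment variable: ${name}`);
  return value;
}

const pineconeApiKey = requireEnv('PINECONE_API_KEY');
const pineconeIndexName = requireEnv('PINECONE_INDEX_NAME');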
🚀 3. Launch Ollama and Embedding Model
Make sure Ollama is installed, then run the model in your terminal (you can use any LLM of your choice):
ollama run gemma3:1b
Install the embedding model with:
ollama pull mxbai-embed-large
LangChain will reference this model locally through Ollama via:
new OllamaEmbeddings({
model: 'mxbai-embed-large',
baseUrl: 'http://localhost:11434'
});
Note: you can check out more chat models at https://js.langchain.com/docs/integrations/chat/ and https://ollama.com/search. You can also explore other embedding models at https://js.langchain.com/docs/integrations/text_embedding/
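As a quick sanity check, you can embed a test string and confirm that the vector length matches the 1024-dimension index you configured earlier. A small illustrative snippet, assuming Ollama is running locally with mxbai-embed-large pulled:

import { OllamaEmbeddings } from '@langchain/community/embeddings/ollama';

const embeddings = new OllamaEmbeddings({
  model: 'mxbai-embed-large',
  baseUrl: 'http://localhost:11434',
});

// embedQuery returns a number[]; its length must equal the Pinecone index dimension
const vector = await embeddings.embedQuery('Hello, RAG!');
console.log(vector.length); // expected: 1024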
🧪 How It Works – Step by Step
Here is a breakdown of what we're trying to achieve, followed by the code snippets to use.
Step 1: Upload and Process Document
- User uploads .pdf, .docx, or .txt.
- We load the file using LangChain document loaders.
- The text is split into chunks using RecursiveCharacterTextSplitter.
- Chunks are returned as an array of LangChain Document objects.
Step 2: Embed and Store in Pinecone
- Chunks are embedded via OllamaEmbeddings using mxbai-embed-large.
- Vectors are stored in the Pinecone vector index under a namespace.
Step 3: Query for Context
- When a user types a question, we run a vector similarity search.
- Relevant chunks are retrieved from Pinecone.
- Chunks are combined into a context block.
- The context is injected into the prompt as a system message for the LLM.
utils/documentProcessing.ts
import { OllamaEmbeddings } from '@langchain/community/embeddings/ollama';
import { Document } from '@langchain/core/documents';
import { PineconeStore } from '@langchain/pinecone';
import { Pinecone } from '@pinecone-database/pinecone';
import { DocxLoader } from 'langchain/document_loaders/fs/docx';
import { PDFLoader } from 'langchain/document_loaders/fs/pdf';
import { TextLoader } from 'langchain/document_loaders/fs/text';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
// Shared clients: Pinecone connection, local Ollama embeddings, and the chunking strategy
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const embeddings = new OllamaEmbeddings({ model: 'mxbai-embed-large', baseUrl: 'http://localhost:11434' });
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
// Load a PDF, DOCX, or TXT file and split it into overlapping chunks
export async function processDocument(file: File | Blob, fileName: string): Promise<Document[]> {
let documents: Document[];
if (fileName.endsWith('.pdf')) documents = await new PDFLoader(file).load();
else if (fileName.endsWith('.docx')) documents = await new DocxLoader(file).load();
else if (fileName.endsWith('.txt')) documents = await new TextLoader(file).load();
else throw new Error('Unsupported file type');
return await textSplitter.splitDocuments(documents);
}
// Embed the chunks with Ollama and upsert the vectors into the Pinecone index
export async function storeDocuments(documents: Document[]): Promise<void> {
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX_NAME!);
await PineconeStore.fromDocuments(documents, embeddings, {
pineconeIndex,
maxConcurrency: 5,
namespace: 'your_namespace', //optional
});
}
// Return the stored chunks most similar to the user's query
export async function queryDocuments(query: string): Promise<Document[]> {
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX_NAME!);
const vectorStore = await PineconeStore.fromExistingIndex(embeddings, {
pineconeIndex,
maxConcurrency: 5,
namespace: 'your_namespace', //optional
});
return await vectorStore.similaritySearch(query, 4);
}
api/chat/upload/route.ts
import { processDocument, storeDocuments } from '@/utils/documentProcessing';
import { NextResponse } from 'next/server';
export async function POST(req: Request) {
const formData = await req.formData();
const file = formData.get('file') as File;
if (!file) return NextResponse.json({ error: 'No file provided' }, { status: 400 });
// Split the uploaded file into chunks, embed them, and store the vectors in Pinecone
const documents = await processDocument(file, file.name);
await storeDocuments(documents);
return NextResponse.json({
message: 'Document processed and stored successfully',
fileName: file.name,
documentCount: documents.length
});
}
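Once the dev server is running, you can quickly verify the upload endpoint by posting a file to it (assuming the default port 3000 and a local file named sample.pdf; both are placeholders):
curl -F "file=@./sample.pdf" http://localhost:3000/api/chat/upload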
api/chat/route.ts
import { queryDocuments } from '@/utils/documentProcessing';
import { Message, streamText } from 'ai';
import { NextRequest } from 'next/server';
import { createOllama } from 'ollama-ai-provider';
const ollama = createOllama();
const MODEL_NAME = process.env.OLLAMA_MODEL || 'gemma3:1b';
export async function POST(req: NextRequest) {
const { messages } = await req.json();
const lastMessage = messages[messages.length - 1];
// Retrieve the chunks most relevant to the latest user message and merge them into one context block
const relevantDocs = await queryDocuments(lastMessage.content);
const context = relevantDocs.map((doc) => doc.pageContent).join('\n\n');
const systemMessage: Message = {
id: 'system',
role: 'system',
content: `You are a helpful AI assistant with access to a knowledge base.
Use the following context to answer the user's questions:\n\n${context}`,
};
const promptMessages = [systemMessage, ...messages];
const result = await streamText({
model: ollama(MODEL_NAME),
messages: promptMessages
});
return result.toDataStreamResponse();
}
For the UI, here are the code snippets:
ChatInput.tsx
'use client'
interface ChatInputProps {
input: string;
handleInputChange: (e: React.ChangeEvent<HTMLTextAreaElement>) => void;
handleSubmit: (e: React.FormEvent<HTMLFormElement>) => void;
isLoading: boolean;
}
export default function ChatInput({ input, handleInputChange, handleSubmit, isLoading }: ChatInputProps) {
return (
<form onSubmit={handleSubmit} className="flex gap-4">
<textarea
value={input}
onChange={handleInputChange}
placeholder="Ask a question about the documents..."
className="flex-1 p-4 border border-gray-200 dark:border-gray-700 rounded-xl
bg-white dark:bg-gray-800
placeholder-gray-400 dark:placeholder-gray-500
focus:outline-none focus:ring-2 focus:ring-blue-500 dark:focus:ring-blue-400
resize-none min-h-[50px] max-h-32
text-gray-700 dark:text-gray-200"
rows={1}
required
disabled={isLoading}
/>
<button
type="submit"
disabled={isLoading}
className={`px-6 py-2 rounded-xl font-medium transition-all duration-200
${isLoading
? 'bg-gray-100 dark:bg-gray-700 text-gray-400 dark:text-gray-500 cursor-not-allowed'
: 'bg-blue-500 hover:bg-blue-600 active:bg-blue-700 text-white shadow-sm hover:shadow'
}`}
>
{isLoading ? (
<span className="flex items-center gap-2">
<svg className="animate-spin h-4 w-4" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" fill="none"/>
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"/>
</svg>
Processing
</span>
) : 'Send'}
</button>
</form>
);
}
ChatMessage.tsx
'use client'
import { Message } from 'ai';
import ReactMarkdown from 'react-markdown';
interface ChatMessageProps {
message: Message;
}
export default function ChatMessage({ message }: ChatMessageProps) {
return (
<div
className={`flex items-start gap-4 p-6 rounded-2xl shadow-sm transition-colors ${
message.role === 'assistant'
? 'bg-white dark:bg-gray-800 border border-gray-100 dark:border-gray-700'
: 'bg-blue-50 dark:bg-blue-900/30 border border-blue-100 dark:border-blue-800'
}`}
>
<div className={`w-8 h-8 rounded-full flex items-center justify-center flex-shrink-0 ${
message.role === 'assistant'
? 'bg-purple-100 text-purple-600 dark:bg-purple-900 dark:text-purple-300'
: 'bg-blue-100 text-blue-600 dark:bg-blue-900 dark:text-blue-300'
}`}>
{message.role === 'assistant' ? '🤖' : '👤'}
</div>
<div className="flex-1 min-w-0">
<div className="font-medium text-sm mb-2 text-gray-700 dark:text-gray-300">
{message.role === 'assistant' ? 'AI Assistant' : 'You'}
</div>
<div className="prose dark:prose-invert prose-sm max-w-none">
<ReactMarkdown>{message.content}</ReactMarkdown>
</div>
</div>
</div>
);
}
FileUpload.tsx
"use client"
import React, { useState } from 'react';
export default function FileUpload() {
const [isUploading, setIsUploading] = useState(false);
const [message, setMessage] = useState('');
const [error, setError] = useState('');
const handleFileUpload = async (e: React.ChangeEvent<HTMLInputElement>) => {
const file = e.target.files?.[0];
if (!file) return;
// Reset states
setMessage('');
setError('');
setIsUploading(true);
try {
const formData = new FormData();
formData.append('file', file);
const response = await fetch('/api/chat/upload', {
method: 'POST',
body: formData,
});
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Error uploading file');
}
setMessage(`Successfully uploaded ${file.name}`);
} catch (err) {
setError(err instanceof Error ? err.message : 'Error uploading file');
} finally {
setIsUploading(false);
}
};
return (
<div className="mb-6">
<div className="flex flex-col sm:flex-row items-center gap-4">
<label
className={`flex items-center gap-2 px-6 py-3 rounded-xl border-2 border-dashed
transition-all duration-200 cursor-pointer
${isUploading
? 'border-gray-300 bg-gray-50 dark:border-gray-700 dark:bg-gray-800/50'
: 'border-blue-300 hover:border-blue-400 hover:bg-blue-50 dark:border-blue-700 dark:hover:border-blue-600 dark:hover:bg-blue-900/30'
}`}
>
<svg
className={`w-5 h-5 ${isUploading ? 'text-gray-400' : 'text-blue-500'}`}
fill="none"
stroke="currentColor"
viewBox="0 0 24 24"
>
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-8l-4-4m0 0L8 8m4-4v12" />
</svg>
<span className={`font-medium ${isUploading ? 'text-gray-400' : 'text-blue-500'}`}>
{isUploading ? 'Uploading...' : 'Upload Document'}
</span>
<input
type="file"
className="hidden"
accept=".pdf,.docx"
onChange={handleFileUpload}
disabled={isUploading}
/>
</label>
<span className="text-sm text-gray-500 dark:text-gray-400 flex items-center gap-2">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
Supported: PDF, DOCX
</span>
</div>
{message && (
<div className="mt-4 p-4 bg-green-50 dark:bg-green-900/30 rounded-xl border border-green-100 dark:border-green-800">
<p className="text-sm text-green-600 dark:text-green-400 flex items-center gap-2">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M5 13l4 4L19 7" />
</svg>
{message}
</p>
</div>
)}
{error && (
<div className="mt-4 p-4 bg-red-50 dark:bg-red-900/30 rounded-xl border border-red-100 dark:border-red-800">
<p className="text-sm text-red-600 dark:text-red-400 flex items-center gap-2">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M12 8v4m0 4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
{error}
</p>
</div>
)}
</div>
);
}
ChatPage.tsx
"use client"
import { useChat } from 'ai/react';
import ChatInput from './ChatInput';
import ChatMessage from './ChatMessage';
import FileUpload from './FileUpload';
export default function ChatPage() {
const { input, messages, handleInputChange, handleSubmit, isLoading } = useChat({
api: '/api/chat',
onError: (error) => {
console.error('Chat error:', error);
alert('Error: ' + error.message);
}
});
return (
<div className="flex flex-col h-screen bg-gray-50 dark:bg-gray-900">
<div className="flex-1 max-w-5xl mx-auto w-full p-4 md:p-6 lg:p-8">
<div className="flex-1 overflow-y-auto mb-4 space-y-6">
<h1 className="text-3xl font-bold text-gray-900 dark:text-white text-center mb-8">
RAG-Powered Knowledge Base Chat
</h1>
<div className="bg-white dark:bg-gray-800 rounded-xl shadow-lg p-6">
<FileUpload />
</div>
<div className="space-y-6">
{messages.map((message) => (
<ChatMessage key={message.id} message={message} />
))}
</div>
</div>
<div className="sticky bottom-0 bg-white dark:bg-gray-800 rounded-xl shadow-lg p-4">
<ChatInput
input={input}
handleInputChange={handleInputChange}
handleSubmit={handleSubmit}
isLoading={isLoading}
/>
</div>
</div>
</div>
);
}
Voilà! You're ready to run your code:
npm run dev
Click the Upload Document button to upload the document you want to store. Once the upload is successful, the new records will appear in your Pinecone dashboard.
With the document loaded, you can ask your AI assistant questions about its content and get accurate, context-aware responses.
Happy Coding 😎! Feel free to share your experience and feedback too. Cheers!