TL;DR
Learn how to build an intelligent product knowledge base using KaibanJS and SimpleRAGRetrieve that can answer customer questions using Retrieval-Augmented Generation (RAG). We'll create AI agents that search, analyze, and recommend products based on semantic understanding, all in JavaScript!
Live Demo Code: See full example
Why This Matters
Ever struggled to build a search system that actually understands what users mean, rather than just matching keywords? RAG (Retrieval-Augmented Generation) is the game-changer, and now you can implement it in JavaScript with minimal code.
What We're Building
A product support system where AI agents can:
- Search through product catalogs using semantic understanding
- Answer specific questions about products
- Compare and recommend products based on customer needs
- Access real product specifications and availability
All powered by vector search and LLMs, no complex backend required!
Meet KaibanJS & SimpleRAGRetrieve
What is KaibanJS?
KaibanJS is a JavaScript framework for building AI agent teams. Think of it as a way to create collaborative AI agents that can use tools, work together, and solve complex tasks.
What is SimpleRAGRetrieve?
SimpleRAGRetrieve from @kaibanjs/tools is a specialized RAG tool designed specifically for scenarios where you have pre-indexed data in a vector store. This is different from tools that handle both indexing and retrieval - SimpleRAGRetrieve focuses purely on the retrieval side, assuming your vector store is already populated.
Key benefits:
- Pre-configured RAG pipeline - Just plug in your pre-indexed vector store
- Customizable retrieval - Control how many documents to fetch, search type, and scoring
- LangChain.js compatibility - Use any embeddings or vector stores from the LangChain ecosystem
- Production-ready - Works with Pinecone, Supabase, or in-memory stores
Note: In this tutorial, we're using mocked data and showing the full indexing process with RAGToolkit for demonstration purposes. This helps you understand the complete RAG pipeline from data to retrieval. In real-world scenarios, you'd typically have your vector store already indexed and simply connect SimpleRAGRetrieve to it.
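To make that concrete, here's the retrieval-only pattern in miniature (a sketch; existingVectorStore and existingEmbeddings stand in for whatever pre-indexed store and embeddings you already have):
import { SimpleRAGRetrieve } from '@kaibanjs/tools';

// Connect the retrieval tool to a vector store that was indexed elsewhere
const retrieveTool = new SimpleRAGRetrieve({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  vectorStore: existingVectorStore, // assumed: already populated
  embeddings: existingEmbeddings // assumed: same embeddings used at indexing time
});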
The Architecture
Here's what we're building:
┌──────────────────────────────────────────────────────┐
│ Customer Query: "Need laptop for video editing"      │
└───────────────────────────┬──────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────┐
│ Product Specialist Agent (with RAG tool)             │
│                                                      │
│   Task 1: Search Knowledge Base                      │
│   Task 2: Provide Recommendation                     │
└───────────────────────────┬──────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────┐
│ Vector Store (8 Tech Products)                       │
│   • Embeddings with OpenAI                           │
│   • Semantic search with metadata                    │
└──────────────────────────────────────────────────────┘
Understanding the Two Phases
This example demonstrates both phases of a RAG system:
Phase 1: Indexing (Setup - Usually done once)
- Transform raw data into documents
- Split documents into chunks with RAGToolkit
- Generate embeddings
- Store in the vector database
Phase 2: Retrieval (Runtime - Every query)
- User asks a question
- SimpleRAGRetrieve converts the question to an embedding
- Searches the vector store for similar chunks
- LLM generates an answer from the retrieved context
Key Point: In production, Phase 1 might happen in a separate ETL pipeline or background job. Your application using SimpleRAGRetrieve only does Phase 2, connecting to an already-indexed vector store.
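In code, the split looks roughly like this (a sketch that uses the names defined in the steps below; in production the two parts would typically live in different services):
// Phase 1: Indexing - run once, e.g. in a setup script or ETL job
await ragToolkit.addDocuments(documents);

// Phase 2: Retrieval - run for every query; this is all SimpleRAGRetrieve does
const knowledgeBaseTool = new SimpleRAGRetrieve({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  vectorStore: sharedVectorStore,
  embeddings: sharedEmbeddings
});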
Step 1: Setup & Installation
First, install the necessary packages:
npm install kaibanjs @kaibanjs/tools @langchain/openai langchain
You'll need an OpenAI API key for embeddings and LLM capabilities.
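One simple way to provide the key is a .env file loaded with dotenv (any approach that sets process.env.OPENAI_API_KEY works; dotenv is just one option):
// Load environment variables from a local .env file (requires: npm install dotenv)
import 'dotenv/config';

if (!process.env.OPENAI_API_KEY) {
  throw new Error('Missing OPENAI_API_KEY environment variable');
}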
Step 2: Prepare Your Data
Demo Note: For this tutorial, we're using mocked data to demonstrate the complete RAG pipeline from start to finish. This lets you see how data flows from raw information β indexing β retrieval. In real applications, you'd typically connect SimpleRAGRetrieve to an already-indexed vector store (Pinecone, Supabase, etc.) without needing the indexing code.
Let's create a product catalog. Each product has rich metadata for better retrieval:
const sampleData = [
{
id: 1,
name: 'UltraBook Pro 15',
category: 'Laptop',
content:
'The UltraBook Pro 15 is a premium laptop featuring a 15.6-inch 4K display, Intel i9 processor, 32GB RAM, and 1TB NVMe SSD. Perfect for professional work, video editing, and gaming. Battery life up to 12 hours.',
price: 2499,
specs: ['Intel i9', '32GB RAM', '1TB SSD', '4K Display'],
inStock: true
},
{
id: 2,
name: 'SmartPhone X12',
category: 'Smartphone',
content:
'The SmartPhone X12 features a 6.7-inch OLED display, triple camera system with 108MP main sensor, 5G connectivity, and 5000mAh battery.',
price: 999,
specs: ['6.7" OLED', '108MP Camera', '5G', '5000mAh'],
inStock: true
}
// ... more products
];
Step 3: Initialize Vector Store with RAGToolkit
One of KaibanJS's superpowers is the RAGToolkit - a utility that simplifies the indexing process for documents and vector stores:
import { OpenAIEmbeddings } from '@langchain/openai';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { RAGToolkit } from '@kaibanjs/tools';
// Create shared embeddings (works with ANY LangChain embeddings!)
const sharedEmbeddings = new OpenAIEmbeddings({
apiKey: process.env.OPENAI_API_KEY
});
// Create vector store
const sharedVectorStore = new MemoryVectorStore(sharedEmbeddings);
// RAGToolkit makes it easy to add and index documents
const ragToolkit = new RAGToolkit({
embeddings: sharedEmbeddings,
vectorStore: sharedVectorStore,
chunkOptions: {
chunkSize: 500, // Size of text chunks for indexing
chunkOverlap: 100 // Overlap to maintain context during indexing
},
env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY }
});
Important: The chunkOptions here configure how documents are split during the indexing process. This is different from retrieval options - we're setting up how the data gets stored in the vector database.
Pro Tip: You can swap MemoryVectorStore with Pinecone, Supabase, Chroma, or any other LangChain-compatible vector store. That's the power of compatibility!
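For example, swapping in Supabase is mostly a matter of constructing a different store (a sketch based on the LangChain.js Supabase integration; the table and query names below are common defaults and your setup may differ):
import { createClient } from '@supabase/supabase-js';
import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase';

// Reuse the same embeddings; only the storage backend changes
const supabaseClient = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_PRIVATE_KEY
);

const supabaseVectorStore = new SupabaseVectorStore(sharedEmbeddings, {
  client: supabaseClient,
  tableName: 'documents',
  queryName: 'match_documents'
});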
Step 4: Load Documents into Vector Store
const initializeVectorStore = async () => {
// Transform products into documents with metadata
const documents = sampleData.map(item => ({
source: item.content,
type: 'string',
metadata: {
id: item.id,
name: item.name,
category: item.category,
price: item.price,
specs: item.specs,
inStock: item.inStock,
// Combined text for better semantic search
fullText: `${item.name} ${item.category} ${
item.content
} ${item.specs.join(' ')}`
}
}));
await ragToolkit.addDocuments(documents);
console.log('Vector store initialized with product knowledge base');
};
// Call this when your app starts
await initializeVectorStore();
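Before wiring the store to an agent, you can sanity-check the index with a direct similarity search (a quick sketch; it assumes RAGToolkit carries each document's metadata through to the stored chunks):
// Query the vector store directly to confirm indexing worked
const hits = await sharedVectorStore.similaritySearch('laptop for video editing', 2);
console.log(hits.map(doc => doc.metadata.name));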
Step 5: Create the SimpleRAGRetrieve Tool
Now comes the magic - configure the RAG retrieval tool:
import { SimpleRAGRetrieve } from '@kaibanjs/tools';
const productKnowledgeBaseTool = new SimpleRAGRetrieve({
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
vectorStore: sharedVectorStore, // Our pre-indexed vector store
embeddings: sharedEmbeddings, // Same embeddings used for indexing
retrieverOptions: {
k: 4, // Retrieve top 4 most relevant documents
searchType: 'similarity' // Can also use 'mmr' for diversity
}
});
Understanding the Retrieval Configuration
retrieverOptions - How documents are retrieved from the vector store
- k: Number of documents to retrieve (more = more context, but slower)
- searchType:
  - 'similarity': Find most similar documents (best for factual queries)
  - 'mmr': Maximal Marginal Relevance (adds diversity, good for comparisons)
- scoreThreshold: Minimum similarity score (0-1, filters low-quality matches)
- filter: Metadata filters (e.g., { category: 'Laptop', inStock: true })
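For example, a comparison-oriented configuration might combine several of these options (a sketch - tune the values against your own data and vector store, since filter support varies by store):
const comparisonTool = new SimpleRAGRetrieve({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  vectorStore: sharedVectorStore,
  embeddings: sharedEmbeddings,
  retrieverOptions: {
    k: 6, // a bit more context for side-by-side answers
    searchType: 'mmr', // diversity helps when comparing products
    scoreThreshold: 0.7, // drop weak matches
    filter: { category: 'Laptop', inStock: true }
  }
});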
Step 6: Create the AI Agent
import { Agent, Task, Team } from 'kaibanjs';
const productSpecialist = new Agent({
name: 'Product Specialist',
role: 'Technology Product Expert',
goal: 'Help customers find the right products by searching our knowledge base',
background:
'Expert in technology products with deep knowledge of specs and use cases',
tools: [productKnowledgeBaseTool] // Give agent the RAG tool
});
Step 7: Define Tasks
Create a workflow with multiple tasks:
// Task 1: Search for relevant products
const searchProductTask = new Task({
description: `Search our product knowledge base to answer: {customerQuery}
Focus on finding accurate product information including specifications, features, prices, and availability.`,
expectedOutput:
'Detailed product information that directly addresses the customer query',
agent: productSpecialist
});
// Task 2: Provide recommendations
const recommendationTask = new Task({
description: `Based on the product information found, provide a helpful recommendation.
Customer's question: {customerQuery}
If comparing products, highlight key differences. If seeking recommendations, suggest the best option.`,
expectedOutput:
'A clear recommendation that helps the customer make an informed decision',
agent: productSpecialist
});
Step 8: Assemble the Team
const team = new Team({
name: 'Product Support Team',
agents: [productSpecialist],
tasks: [searchProductTask, recommendationTask],
inputs: {
customerQuery:
'I need a laptop for video editing and gaming. What do you recommend?'
},
env: {
OPENAI_API_KEY: process.env.OPENAI_API_KEY
}
});
// Start the team!
await team.start();
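Rather than discarding the resolved value, you can capture and inspect it - the exact shape of the output depends on your KaibanJS version, so log it once before building on it:
// Capture the workflow output instead of discarding it
const output = await team.start();
console.log(JSON.stringify(output, null, 2));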
Advanced Configurations
Using Pinecone for Production
For production systems, use a managed vector database like Pinecone:
import { PineconeStore } from '@langchain/pinecone';
import { Pinecone } from '@pinecone-database/pinecone';
const pinecone = new Pinecone({
apiKey: process.env.PINECONE_API_KEY
});
const pineconeIndex = pinecone.Index('products-index');
const vectorStore = await PineconeStore.fromExistingIndex(sharedEmbeddings, {
pineconeIndex
});
// Use this vector store with SimpleRAGRetrieve
const tool = new SimpleRAGRetrieve({
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
vectorStore: vectorStore,
embeddings: sharedEmbeddings,
retrieverOptions: {
k: 4,
searchType: 'mmr', // More diverse results
scoreThreshold: 0.7 // Only high-quality matches
}
});
Real-World Scenario: In production, you'd typically have your Pinecone index already populated with data (maybe thousands or millions of documents). Your application would just connect to it and start retrieving - no indexing code needed in your retrieval service!
// In a real app, you just connect to existing index
const existingVectorStore = await PineconeStore.fromExistingIndex(embeddings, {
pineconeIndex
});
// SimpleRAGRetrieve queries the pre-indexed data
const retriever = new SimpleRAGRetrieve({
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
vectorStore: existingVectorStore,
embeddings: embeddings
});
Custom Embeddings
Swap OpenAI embeddings for other providers:
// Cohere embeddings
import { CohereEmbeddings } from '@langchain/cohere';
const embeddings = new CohereEmbeddings({
apiKey: process.env.COHERE_API_KEY,
model: 'embed-english-v3.0'
});
// HuggingFace embeddings
import { HuggingFaceInferenceEmbeddings } from '@langchain/community/embeddings/hf';
const embeddings = new HuggingFaceInferenceEmbeddings({
apiKey: process.env.HUGGINGFACE_API_KEY,
model: 'sentence-transformers/all-MiniLM-L6-v2'
});
Why This Approach Rocks
1. LangChain.js Compatibility
SimpleRAGRetrieve works with the entire LangChain.js ecosystem:
- Embeddings: OpenAI, Cohere, HuggingFace, Anthropic
- Vector Stores: Pinecone, Supabase, Chroma, Qdrant, Weaviate
- LLMs: Any LangChain-compatible model
2. Flexible Chunking Strategies (via RAGToolkit)
Control how your documents are split during indexing with RAGToolkit:
// For technical documentation (more context per chunk)
const ragToolkit = new RAGToolkit({
embeddings,
vectorStore,
chunkOptions: { chunkSize: 1000, chunkOverlap: 200 }
});
// For short product descriptions (smaller, precise chunks)
const ragToolkit = new RAGToolkit({
embeddings,
vectorStore,
chunkOptions: { chunkSize: 300, chunkOverlap: 50 }
});
// For legal documents (maintain maximum context)
const ragToolkit = new RAGToolkit({
embeddings,
vectorStore,
chunkOptions: { chunkSize: 1500, chunkOverlap: 300 }
});
3. Configurable Retrieval
Fine-tune how documents are retrieved:
retrieverOptions: {
k: 10, // Get more options
searchType: 'mmr', // Diverse results
scoreThreshold: 0.8, // High quality only
filter: { // Metadata filtering
category: 'Laptop',
inStock: true,
price: { $lte: 2000 }
}
}
4. Multi-Agent Workflows
Combine multiple agents with different RAG tools:
const searchAgent = new Agent({ tools: [productRAG] });
const pricingAgent = new Agent({ tools: [pricingRAG] });
const reviewAgent = new Agent({ tools: [reviewRAG] });
// Agents collaborate on tasks!
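A fuller version of that sketch, assuming productRAG and pricingRAG are SimpleRAGRetrieve instances built the same way as productKnowledgeBaseTool, each pointing at its own indexed store:
const searchAgent = new Agent({
  name: 'Product Searcher',
  role: 'Product Catalog Expert',
  goal: 'Find products that match customer requirements',
  background: 'Knows the full product catalog and specifications',
  tools: [productRAG]
});

const pricingAgent = new Agent({
  name: 'Pricing Analyst',
  role: 'Pricing Expert',
  goal: 'Answer pricing and availability questions',
  background: 'Tracks prices, discounts, and stock levels',
  tools: [pricingRAG]
});

const extendedTeam = new Team({
  name: 'Extended Support Team',
  agents: [searchAgent, pricingAgent],
  // In practice, define tasks whose agent field points at each of these agents
  tasks: [searchProductTask, recommendationTask],
  env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY }
});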
Real-World Use Cases
This pattern works for:
- Documentation Search - Help users find answers in docs
- E-commerce - Product recommendations and comparisons
- Educational Platforms - Course recommendations based on learning goals
- Customer Support - Answer questions from knowledge bases
- Healthcare - Search medical literature and guidelines
- Legal - Find relevant cases and regulations
Performance Tips
- Start with MemoryVectorStore for development, migrate to Pinecone/Supabase for production
- Adjust k based on your use case: 3-5 for specific answers, 10+ for comprehensive research
- Use scoreThreshold to filter out irrelevant results
- Metadata filtering is faster than semantic search - combine both!
- Monitor token usage - larger chunks = more tokens sent to the LLM
- Separate indexing from retrieval: index once (with RAGToolkit), retrieve many times (with SimpleRAGRetrieve)
The SimpleRAGRetrieve Philosophy
Separation of Concerns:
- Indexing Tools (like RAGToolkit): Handle document processing, chunking, embedding generation, and storage
- Retrieval Tools (like SimpleRAGRetrieve): Focus solely on searching and retrieving from pre-indexed data
This separation means:
- Your retrieval service stays lightweight and fast
- You can update the index independently without changing retrieval logic
- Multiple applications can share the same indexed knowledge base
- Easier to scale - indexing and retrieval can run on different infrastructure
When to use SimpleRAGRetrieve:
- You have an existing Pinecone/Supabase/Chroma index
- Your data is already embedded and stored
- You need to add RAG capabilities to AI agents
- You want a production-ready retrieval solution
When you might need a different tool:
- You need to handle both indexing and retrieval in one tool (use SimpleRAG instead - see the sketch below)
- You're doing one-off queries with fresh content each time
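For comparison, here's a rough sketch of the combined approach with SimpleRAG, assuming it accepts your raw text through a content option and handles chunking, embedding, and retrieval internally:
import { SimpleRAG } from '@kaibanjs/tools';

// One tool that indexes the provided text and retrieves from it at query time
// (content option assumed - check the @kaibanjs/tools docs for the exact API)
const combinedRagTool = new SimpleRAG({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  content: sampleData.map(item => item.content).join('\n\n')
});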
What's Next?
- Explore SimpleRAGRetrieve docs
- KaibanJS Documentation
- Join the community
- More KaibanJS tools
- Complete Code Example
Conclusion
Building intelligent RAG systems doesn't have to be complex. With KaibanJS and SimpleRAGRetrieve, you get:
- Pre-configured RAG pipeline
- LangChain.js compatibility for any provider
- Flexible configuration options
- Multi-agent collaboration capabilities
The full code example shows how to build a production-ready product knowledge base in less than 200 lines of code. That's the power of the right abstractions!
Try it out and let me know what you build! Drop a comment with your use case below.
Tags: #javascript #ai #rag #langchain #agents #kaibanjs #openai #vectorsearch #semanticsearch
Found this helpful? Follow me for more AI agent tutorials and JavaScript tips!