TL;DR
Learn how to build an intelligent product knowledge base using KaibanJS and SimpleRAGRetrieve that can answer customer questions using Retrieval-Augmented Generation (RAG). We'll create AI agents that search, analyze, and recommend products based on semantic understanding, all in JavaScript!
Live Demo Code: See full example
Why This Matters
Ever struggled to build a search system that actually understands what users mean, rather than just matching keywords? RAG (Retrieval-Augmented Generation) is the game-changer, and now you can implement it in JavaScript with minimal code.
What We're Building
A product support system where AI agents can:
- Search through product catalogs using semantic understanding
- Answer specific questions about products
- Compare and recommend products based on customer needs
- Access real product specifications and availability
All powered by vector search and LLMs, no complex backend required!
Meet KaibanJS & SimpleRAGRetrieve
What is KaibanJS?
KaibanJS is a JavaScript framework for building AI agent teams. Think of it as a way to create collaborative AI agents that can use tools, work together, and solve complex tasks.
What is SimpleRAGRetrieve?
SimpleRAGRetrieve from @kaibanjs/tools is a specialized RAG tool designed specifically for scenarios where you have pre-indexed data in a vector store. This is different from tools that handle both indexing and retrieval - SimpleRAGRetrieve focuses purely on the retrieval side, assuming your vector store is already populated.
Key benefits:
- Pre-configured RAG pipeline - Just plug in your pre-indexed vector store
- Customizable retrieval - Control how many documents to fetch, search type, and scoring
- LangChain.js compatibility - Use any embeddings or vector stores from the LangChain ecosystem
- Production-ready - Works with Pinecone, Supabase, or in-memory stores
Note: In this tutorial, we're using mocked data and showing the full indexing process with RAGToolkit for demonstration purposes. This helps you understand the complete RAG pipeline from data to retrieval. In real-world scenarios, you'd typically have your vector store already indexed and simply connect SimpleRAGRetrieve to it.
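To make that concrete, here's the retrieval-only pattern in miniature (a sketch; existingVectorStore and existingEmbeddings stand in for whatever pre-indexed store and embeddings you already have):
import { SimpleRAGRetrieve } from '@kaibanjs/tools';

// Connect the retrieval tool to a vector store that was indexed elsewhere
const retrieveTool = new SimpleRAGRetrieve({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  vectorStore: existingVectorStore, // assumed: already populated
  embeddings: existingEmbeddings // assumed: same embeddings used at indexing time
});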
The Architecture
Here's what we're building:
┌──────────────────────────────────────────────────────┐
│ Customer Query: "Need laptop for video editing"      │
└───────────────────────────┬──────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────┐
│ Product Specialist Agent (with RAG tool)             │
│                                                      │
│   Task 1: Search Knowledge Base                      │
│   Task 2: Provide Recommendation                     │
└───────────────────────────┬──────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────┐
│ Vector Store (8 Tech Products)                       │
│   • Embeddings with OpenAI                           │
│   • Semantic search with metadata                    │
└──────────────────────────────────────────────────────┘
Understanding the Two Phases
This example demonstrates both phases of a RAG system:
Phase 1: Indexing (Setup - Usually done once)
- Transform raw data into documents
- Split documents into chunks with RAGToolkit
- Generate embeddings
- Store in the vector database
Phase 2: Retrieval (Runtime - Every query)
- User asks a question
- SimpleRAGRetrieve converts the question to an embedding
- Searches the vector store for similar chunks
- LLM generates an answer from the retrieved context
Key Point: In production, Phase 1 might happen in a separate ETL pipeline or background job. Your application using SimpleRAGRetrieve only does Phase 2, connecting to an already-indexed vector store.
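In code, the split looks roughly like this (a sketch that uses the names defined in the steps below; in production the two parts would typically live in different services):
// Phase 1: Indexing - run once, e.g. in a setup script or ETL job
await ragToolkit.addDocuments(documents);

// Phase 2: Retrieval - run for every query; this is all SimpleRAGRetrieve does
const knowledgeBaseTool = new SimpleRAGRetrieve({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  vectorStore: sharedVectorStore,
  embeddings: sharedEmbeddings
});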
Step 1: Setup & Installation
First, install the necessary packages:
npm install kaibanjs @kaibanjs/tools @langchain/openai langchain
You'll need an OpenAI API key for embeddings and LLM capabilities.
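One simple way to provide the key is a .env file loaded with dotenv (any approach that sets process.env.OPENAI_API_KEY works; dotenv is just one option):
// Load environment variables from a local .env file (requires: npm install dotenv)
import 'dotenv/config';

if (!process.env.OPENAI_API_KEY) {
  throw new Error('Missing OPENAI_API_KEY environment variable');
}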
Step 2: Prepare Your Data
Demo Note: For this tutorial, we're using mocked data to demonstrate the complete RAG pipeline from start to finish. This lets you see how data flows from raw information β indexing β retrieval. In real applications, you'd typically connect SimpleRAGRetrieve to an already-indexed vector store (Pinecone, Supabase, etc.) without needing the indexing code.
Let's create a product catalog. Each product has rich metadata for better retrieval:
const sampleData = [
{
id: 1,
name: 'UltraBook Pro 15',
category: 'Laptop',
content:
'The UltraBook Pro 15 is a premium laptop featuring a 15.6-inch 4K display, Intel i9 processor, 32GB RAM, and 1TB NVMe SSD. Perfect for professional work, video editing, and gaming. Battery life up to 12 hours.',
price: 2499,
specs: ['Intel i9', '32GB RAM', '1TB SSD', '4K Display'],
inStock: true
},
{
id: 2,
name: 'SmartPhone X12',
category: 'Smartphone',
content:
'The SmartPhone X12 features a 6.7-inch OLED display, triple camera system with 108MP main sensor, 5G connectivity, and 5000mAh battery.',
price: 999,
specs: ['6.7" OLED', '108MP Camera', '5G', '5000mAh'],
inStock: true
}
// ... more products
];
Step 3: Initialize Vector Store with RAGToolkit
One of KaibanJS's superpowers is the RAGToolkit - a utility that simplifies the indexing process for documents and vector stores:
import { OpenAIEmbeddings } from '@langchain/openai';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { RAGToolkit } from '@kaibanjs/tools';
// Create shared embeddings (works with ANY LangChain embeddings!)
const sharedEmbeddings = new OpenAIEmbeddings({
apiKey: process.env.OPENAI_API_KEY
});
// Create vector store
const sharedVectorStore = new MemoryVectorStore(sharedEmbeddings);
// RAGToolkit makes it easy to add and index documents
const ragToolkit = new RAGToolkit({
embeddings: sharedEmbeddings,
vectorStore: sharedVectorStore,
chunkOptions: {
chunkSize: 500, // Size of text chunks for indexing
chunkOverlap: 100 // Overlap to maintain context during indexing
},
env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY }
});
Important: The chunkOptions here configure how documents are split during the indexing process. This is different from retrieval options - we're setting up how the data gets stored in the vector database.
Pro Tip: You can swap MemoryVectorStore with Pinecone, Supabase, Chroma, or any other LangChain-compatible vector store. That's the power of compatibility!
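For example, swapping in Supabase is mostly a matter of constructing a different store (a sketch based on the LangChain.js Supabase integration; the table and query names below are common defaults and your setup may differ):
import { createClient } from '@supabase/supabase-js';
import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase';

// Reuse the same embeddings; only the storage backend changes
const supabaseClient = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_PRIVATE_KEY
);

const supabaseVectorStore = new SupabaseVectorStore(sharedEmbeddings, {
  client: supabaseClient,
  tableName: 'documents',
  queryName: 'match_documents'
});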
Step 4: Load Documents into Vector Store
const initializeVectorStore = async () => {
// Transform products into documents with metadata
const documents = sampleData.map(item => ({
source: item.content,
type: 'string',
metadata: {
id: item.id,
name: item.name,
category: item.category,
price: item.price,
specs: item.specs,
inStock: item.inStock,
// Combined text for better semantic search
fullText: `${item.name} ${item.category} ${
item.content
} ${item.specs.join(' ')}`
}
}));
await ragToolkit.addDocuments(documents);
console.log('Vector store initialized with product knowledge base');
};
// Call this when your app starts
await initializeVectorStore();
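Before wiring the store to an agent, you can sanity-check the index with a direct similarity search (a quick sketch; it assumes RAGToolkit carries each document's metadata through to the stored chunks):
// Query the vector store directly to confirm indexing worked
const hits = await sharedVectorStore.similaritySearch('laptop for video editing', 2);
console.log(hits.map(doc => doc.metadata.name));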
Step 5: Create the SimpleRAGRetrieve Tool
Now comes the magic - configure the RAG retrieval tool:
import { SimpleRAGRetrieve } from '@kaibanjs/tools';
const productKnowledgeBaseTool = new SimpleRAGRetrieve({
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
vectorStore: sharedVectorStore, // Our pre-indexed vector store
embeddings: sharedEmbeddings, // Same embeddings used for indexing
retrieverOptions: {
k: 4, // Retrieve top 4 most relevant documents
searchType: 'similarity' // Can also use 'mmr' for diversity
}
});
Understanding the Retrieval Configuration
retrieverOptions - How documents are retrieved from the vector store
- k: Number of documents to retrieve (more = more context, but slower)
- searchType:
  - 'similarity': Find most similar documents (best for factual queries)
  - 'mmr': Maximal Marginal Relevance (adds diversity, good for comparisons)
- scoreThreshold: Minimum similarity score (0-1, filters low-quality matches)
- filter: Metadata filters (e.g., { category: 'Laptop', inStock: true })
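For example, a comparison-oriented configuration might combine several of these options (a sketch - tune the values against your own data and vector store, since filter support varies by store):
const comparisonTool = new SimpleRAGRetrieve({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  vectorStore: sharedVectorStore,
  embeddings: sharedEmbeddings,
  retrieverOptions: {
    k: 6, // a bit more context for side-by-side answers
    searchType: 'mmr', // diversity helps when comparing products
    scoreThreshold: 0.7, // drop weak matches
    filter: { category: 'Laptop', inStock: true }
  }
});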
Step 6: Create the AI Agent
import { Agent, Task, Team } from 'kaibanjs';
const productSpecialist = new Agent({
name: 'Product Specialist',
role: 'Technology Product Expert',
goal: 'Help customers find the right products by searching our knowledge base',
background:
'Expert in technology products with deep knowledge of specs and use cases',
tools: [productKnowledgeBaseTool] // Give agent the RAG tool
});
Step 7: Define Tasks
Create a workflow with multiple tasks:
// Task 1: Search for relevant products
const searchProductTask = new Task({
description: `Search our product knowledge base to answer: {customerQuery}
Focus on finding accurate product information including specifications, features, prices, and availability.`,
expectedOutput:
'Detailed product information that directly addresses the customer query',
agent: productSpecialist
});
// Task 2: Provide recommendations
const recommendationTask = new Task({
description: `Based on the product information found, provide a helpful recommendation.
Customer's question: {customerQuery}
If comparing products, highlight key differences. If seeking recommendations, suggest the best option.`,
expectedOutput:
'A clear recommendation that helps the customer make an informed decision',
agent: productSpecialist
});
Step 8: Assemble the Team
const team = new Team({
name: 'Product Support Team',
agents: [productSpecialist],
tasks: [searchProductTask, recommendationTask],
inputs: {
customerQuery:
'I need a laptop for video editing and gaming. What do you recommend?'
},
env: {
OPENAI_API_KEY: process.env.OPENAI_API_KEY
}
});
// Start the team!
await team.start();
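Rather than discarding the resolved value, you can capture and inspect it - the exact shape of the output depends on your KaibanJS version, so log it once before building on it:
// Capture the workflow output instead of discarding it
const output = await team.start();
console.log(JSON.stringify(output, null, 2));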
Advanced Configurations
Using Pinecone for Production
For production systems, use a managed vector database like Pinecone:
import { PineconeStore } from '@langchain/pinecone';
import { Pinecone } from '@pinecone-database/pinecone';
const pinecone = new Pinecone({
apiKey: process.env.PINECONE_API_KEY
});
const pineconeIndex = pinecone.Index('products-index');
const vectorStore = await PineconeStore.fromExistingIndex(sharedEmbeddings, {
pineconeIndex
});
// Use this vector store with SimpleRAGRetrieve
const tool = new SimpleRAGRetrieve({
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
vectorStore: vectorStore,
embeddings: sharedEmbeddings,
retrieverOptions: {
k: 4,
searchType: 'mmr', // More diverse results
scoreThreshold: 0.7 // Only high-quality matches
}
});
Real-World Scenario: In production, you'd typically have your Pinecone index already populated with data (maybe thousands or millions of documents). Your application would just connect to it and start retrieving - no indexing code needed in your retrieval service!
// In a real app, you just connect to existing index
const existingVectorStore = await PineconeStore.fromExistingIndex(embeddings, {
pineconeIndex
});
// SimpleRAGRetrieve queries the pre-indexed data
const retriever = new SimpleRAGRetrieve({
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
vectorStore: existingVectorStore,
embeddings: embeddings
});
Custom Embeddings
Swap OpenAI embeddings for other providers:
// Cohere embeddings
import { CohereEmbeddings } from '@langchain/cohere';
const embeddings = new CohereEmbeddings({
apiKey: process.env.COHERE_API_KEY,
model: 'embed-english-v3.0'
});
// HuggingFace embeddings
import { HuggingFaceInferenceEmbeddings } from '@langchain/community/embeddings/hf';
const embeddings = new HuggingFaceInferenceEmbeddings({
apiKey: process.env.HUGGINGFACE_API_KEY,
model: 'sentence-transformers/all-MiniLM-L6-v2'
});
Why This Approach Rocks
1. LangChain.js Compatibility
SimpleRAGRetrieve works with the entire LangChain.js ecosystem:
- Embeddings: OpenAI, Cohere, HuggingFace, Anthropic
- Vector Stores: Pinecone, Supabase, Chroma, Qdrant, Weaviate
- LLMs: Any LangChain-compatible model
2. Flexible Chunking Strategies (via RAGToolkit)
Control how your documents are split during indexing with RAGToolkit:
// For technical documentation (more context per chunk)
const ragToolkit = new RAGToolkit({
embeddings,
vectorStore,
chunkOptions: { chunkSize: 1000, chunkOverlap: 200 }
});
// For short product descriptions (smaller, precise chunks)
const ragToolkit = new RAGToolkit({
embeddings,
vectorStore,
chunkOptions: { chunkSize: 300, chunkOverlap: 50 }
});
// For legal documents (maintain maximum context)
const ragToolkit = new RAGToolkit({
embeddings,
vectorStore,
chunkOptions: { chunkSize: 1500, chunkOverlap: 300 }
});
3. Configurable Retrieval
Fine-tune how documents are retrieved:
retrieverOptions: {
k: 10, // Get more options
searchType: 'mmr', // Diverse results
scoreThreshold: 0.8, // High quality only
filter: { // Metadata filtering
category: 'Laptop',
inStock: true,
price: { $lte: 2000 }
}
}
4. Multi-Agent Workflows
Combine multiple agents with different RAG tools:
const searchAgent = new Agent({ tools: [productRAG] });
const pricingAgent = new Agent({ tools: [pricingRAG] });
const reviewAgent = new Agent({ tools: [reviewRAG] });
// Agents collaborate on tasks!
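A fuller version of that sketch, assuming productRAG and pricingRAG are SimpleRAGRetrieve instances built the same way as productKnowledgeBaseTool, each pointing at its own indexed store:
const searchAgent = new Agent({
  name: 'Product Searcher',
  role: 'Product Catalog Expert',
  goal: 'Find products that match customer requirements',
  background: 'Knows the full product catalog and specifications',
  tools: [productRAG]
});

const pricingAgent = new Agent({
  name: 'Pricing Analyst',
  role: 'Pricing Expert',
  goal: 'Answer pricing and availability questions',
  background: 'Tracks prices, discounts, and stock levels',
  tools: [pricingRAG]
});

const extendedTeam = new Team({
  name: 'Extended Support Team',
  agents: [searchAgent, pricingAgent],
  // In practice, define tasks whose agent field points at each of these agents
  tasks: [searchProductTask, recommendationTask],
  env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY }
});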
Real-World Use Cases
This pattern works for:
- Documentation Search - Help users find answers in docs
- E-commerce - Product recommendations and comparisons
- Educational Platforms - Course recommendations based on learning goals
- Customer Support - Answer questions from knowledge bases
- Healthcare - Search medical literature and guidelines
- Legal - Find relevant cases and regulations
Performance Tips
- Start with MemoryVectorStore for development, migrate to Pinecone/Supabase for production
- Adjust k based on your use case: 3-5 for specific answers, 10+ for comprehensive research
- Use scoreThreshold to filter out irrelevant results
- Metadata filtering is faster than semantic search - combine both!
- Monitor token usage - larger chunks = more tokens sent to the LLM
- Separate indexing from retrieval: index once (with RAGToolkit), retrieve many times (with SimpleRAGRetrieve)
The SimpleRAGRetrieve Philosophy
Separation of Concerns:
- Indexing Tools (like RAGToolkit): Handle document processing, chunking, embedding generation, and storage
- Retrieval Tools (like SimpleRAGRetrieve): Focus solely on searching and retrieving from pre-indexed data
This separation means:
- Your retrieval service stays lightweight and fast
- You can update the index independently without changing retrieval logic
- Multiple applications can share the same indexed knowledge base
- Easier to scale - indexing and retrieval can run on different infrastructure
When to use SimpleRAGRetrieve:
- You have an existing Pinecone/Supabase/Chroma index
- Your data is already embedded and stored
- You need to add RAG capabilities to AI agents
- You want a production-ready retrieval solution
When you might need a different tool:
- You need to handle both indexing and retrieval in one tool (use SimpleRAG instead - see the sketch below)
- You're doing one-off queries with fresh content each time
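For comparison, here's a rough sketch of the combined approach with SimpleRAG, assuming it accepts your raw text through a content option and handles chunking, embedding, and retrieval internally:
import { SimpleRAG } from '@kaibanjs/tools';

// One tool that indexes the provided text and retrieves from it at query time
// (content option assumed - check the @kaibanjs/tools docs for the exact API)
const combinedRagTool = new SimpleRAG({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  content: sampleData.map(item => item.content).join('\n\n')
});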
What's Next?
- Explore SimpleRAGRetrieve docs
- KaibanJS Documentation
- Join the community
- More KaibanJS tools
- Complete Code Example
Conclusion
Building intelligent RAG systems doesn't have to be complex. With KaibanJS and SimpleRAGRetrieve, you get:
- Pre-configured RAG pipeline
- LangChain.js compatibility for any provider
- Flexible configuration options
- Multi-agent collaboration capabilities
The full code example shows how to build a production-ready product knowledge base in less than 200 lines of code. That's the power of the right abstractions!
Try it out and let me know what you build! Drop a comment with your use case below.
Tags: #javascript #ai #rag #langchain #agents #kaibanjs #openai #vectorsearch #semanticsearch
Found this helpful? Follow me for more AI agent tutorials and JavaScript tips!