Picture this (even though you'd rather not): You're frantically digging through your notes at 2 AM before a big presentation, typing variations of "quarterly revenue projections methodology" into your company's search bar. Nothing. You try "Q4 forecast approach." Still nothing. Meanwhile, the document you need literally contains the phrase "how we calculate expected earnings for the final quarter."
Welcome to the keyword search hellscape we've all accepted as normal.
Here's the uncomfortable truth: Most of the world's knowledge isn't on the public web—it's hiding behind login walls, scattered across internal wikis, personal notes, and company docs that Google can't even see or index, let alone search intelligently. We're living in an age where I can ask ChatGPT to explain quantum mechanics in the style of a pirate, but I can't find last month's meeting notes without remembering the exact phrase someone used.
Something's fundamentally broken here.
The "lost knowledge" problem (and why it's getting worse)
Let's do some math. The average knowledge worker deals with:
- Roughly 2,500+ documents per year across different platforms
- An average of 9+ different apps for work (thanks, SaaS sprawl)
- Search success rates that hover around 30% for internal systems
When you compare this to Google's ~85% success rate for web searches, the gap becomes painfully obvious. We've accepted that finding information in our own systems should be harder than finding information on the entire internet. That's... weird?
The real cost isn't just time—it's the compound effect of lost institutional knowledge. Every failed search is a small fracture in your team's collective intelligence.
How contextual search actually works (without the marketing fluff)
Traditional search thinks like this: "Find documents containing these exact words"
Contextual search thinks like this: "Find documents that mean what I'm asking about"
The magic happens through vector embeddings—mathematical representations that capture semantic meaning. Think of them as coordinates in a high-dimensional space where similar concepts cluster together, even if they use completely different words.
Here's the process that makes it work: documents are split into chunks, each chunk is converted into an embedding and stored, and at query time your question is embedded too and matched against those stored vectors by similarity. The best-matching chunks are then handed to an LLM, which generates an answer grounded in your own content.
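To make that concrete, here's a deliberately simplified sketch of the retrieval step. This is not Progress RAG's internals, and embedText is a placeholder for whatever embedding model your service uses; the point is just that relevance becomes a similarity score between vectors rather than an exact word match.

```javascript
// Toy illustration of semantic retrieval: chunks and the query become
// vectors, and relevance is "which chunk vectors sit closest to the query".
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// embedText(text) -> Promise<number[]> is assumed to be provided by
// whatever embedding model you plug in.
async function semanticSearch(query, chunks, embedText, topK = 5) {
  const queryVector = await embedText(query);
  const scored = await Promise.all(
    chunks.map(async (chunk) => ({
      chunk,
      score: cosineSimilarity(queryVector, await embedText(chunk.text)),
    }))
  );
  // Highest similarity first: "expected earnings for the final quarter"
  // can now match a query about "Q4 revenue projections".
  return scored.sort((a, b) => b.score - a.score).slice(0, topK);
}
```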
The chunking strategy that actually matters
Here's where most implementations fail: the chunking strategy. You can't just throw entire documents at an embedding model—they have token limits and context windows to consider.
Too small chunks (< 150 tokens): You lose context. A chunk might reference "the admin password" without mentioning it's specifically for the testing environment.
Too large chunks (> 800 tokens): Everything becomes relevant to everything. Your search for "API rate limits" returns chunks about database performance because they happened to be in the same document section.
The sweet spot: Progress RAG handles this automatically, optimizing chunk sizes between 300-600 tokens with intelligent overlap to preserve context while maintaining search precision.
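If you were rolling your own, a naive sliding-window chunker looks something like the sketch below. It splits on whitespace rather than real tokenizer tokens, and the default sizes are just picked from the ranges above; with Progress RAG you don't write this at all. It's here only to show what overlap buys you.

```javascript
// Naive sliding-window chunker: ~450 "tokens" per chunk with ~50 of overlap,
// so a sentence about "the admin password" keeps the surrounding context
// that says it belongs to the testing environment.
function chunkDocument(text, chunkSize = 450, overlap = 50) {
  const words = text.split(/\s+/); // crude stand-in for a real tokenizer
  const chunks = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    const slice = words.slice(start, start + chunkSize);
    if (slice.length === 0) break;
    chunks.push(slice.join(' '));
    if (start + chunkSize >= words.length) break; // last window reached the end
  }
  return chunks;
}
```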
Results from a working implementation
I recently built EduBox, a contextual search system for students to query their lecture notes and academic materials using Progress/Kendo RAG.
The technical implementation used:
- Frontend: Next.js + TypeScript with real-time sync via Convex
- RAG System: Progress Agentic RAG for document processing, knowledge base creation, and semantic search
- LLM Integration: Multi-agent orchestration with tool calling via CopilotKit
- File Processing: Automatic indexing of PDFs, docs, images, and even chat logs
Yes, it's slower than keyword search. Vector similarity calculations aren't instantaneous. But when the alternative is spending 15 minutes digging through folders only to come up empty-handed, a 1.4-second query feels pretty snappy.
The qualitative change was even more significant: Students stopped asking "Where did I put that thing about eigenvalues?" and started asking "What did the professor explain about eigenvalues in relation to machine learning applications?"
The developer's reality: Progress RAG makes it straightforward
Building contextual search used to be complex—vector databases, embedding models, chunking strategies. Progress RAG abstracts away the complexity while keeping the power:
```javascript
// Progress RAG implementation in EduBox
class NucliaRAGService {
  constructor(apiKey, syncUrl) {
    this.apiKey = apiKey;
    this.syncUrl = syncUrl;
    this.defaultKB = process.env.NUCLIA_DEFAULT_KB;
  }

  // Create or get user's knowledge base
  async createKnowledgeBase(userId, userData) {
    const response = await fetch(`${this.syncUrl}/api/sync`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${this.apiKey}`
      },
      body: JSON.stringify({
        userId,
        userContext: userData,
        action: 'create_kb'
      })
    });
    return await response.json();
  }

  // Upload and process documents
  async uploadDocument(userId, file, metadata) {
    const formData = new FormData();
    formData.append('file', file);
    formData.append('userId', userId);
    formData.append('metadata', JSON.stringify(metadata));

    const response = await fetch(`${this.syncUrl}/api/upload`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`
      },
      body: formData
    });
    return await response.json();
  }

  // Semantic search and answer generation
  async queryKnowledgeBase(userId, query, options = {}) {
    const response = await fetch(`${this.syncUrl}/api/query`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${this.apiKey}`
      },
      body: JSON.stringify({
        userId,
        query,
        maxResults: options.maxResults || 5,
        // ?? so an explicit false isn't silently overridden
        includeContext: options.includeContext ?? true,
        generateAnswer: options.generateAnswer ?? true
      })
    });

    const result = await response.json();
    return {
      answer: result.generative_answer,
      sources: result.resources.map(resource => ({
        filename: resource.title,
        confidence: resource.score,
        snippet: resource.text.slice(0, 200),
        url: resource.origin?.url
      })),
      confidence: result.min_score
    };
  }

  // Sync user context for personalized RAG
  async syncUserContext(userId, userContext) {
    return await fetch(`${this.syncUrl}/api/sync`, {
      method: 'PUT',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${this.apiKey}`
      },
      body: JSON.stringify({
        userId,
        userContext,
        action: 'sync_context'
      })
    });
  }
}

// Usage in React component
const useContextualSearch = () => {
  const ragService = new NucliaRAGService(
    process.env.NEXT_PUBLIC_NUCLIA_API_KEY,
    process.env.NEXT_PUBLIC_NUCLIA_SYNC_URL
  );

  const search = async (query) => {
    const userId = await getCurrentUserId();
    const results = await ragService.queryKnowledgeBase(userId, query, {
      maxResults: 5,
      includeContext: true,
      generateAnswer: true
    });
    return results;
  };

  const uploadFile = async (file, metadata = {}) => {
    const userId = await getCurrentUserId();
    return await ragService.uploadDocument(userId, file, {
      ...metadata,
      uploadedAt: new Date().toISOString(),
      filename: file.name,
      type: file.type
    });
  };

  return { search, uploadFile };
};
```
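For completeness, here's roughly how a component might consume that hook. The component itself is illustrative (it isn't EduBox's actual UI), but the result shape it renders (answer plus sources with filenames, snippets, and confidence scores) matches what queryKnowledgeBase returns above.

```javascript
import { useState } from 'react';

// Illustrative search panel built on the useContextualSearch hook above.
// It shows the generated answer alongside the source snippets and scores,
// so users can verify where an answer came from.
function SearchPanel() {
  const { search } = useContextualSearch();
  const [query, setQuery] = useState('');
  const [result, setResult] = useState(null);

  const handleSearch = async () => {
    setResult(await search(query));
  };

  return (
    <div>
      <input value={query} onChange={(e) => setQuery(e.target.value)} />
      <button onClick={handleSearch}>Ask</button>
      {result && (
        <div>
          <p>{result.answer}</p>
          <ul>
            {result.sources.map((source) => (
              <li key={source.filename}>
                {source.filename} (score: {source.confidence}): {source.snippet}
              </li>
            ))}
          </ul>
        </div>
      )}
    </div>
  );
}
```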
What Progress RAG handles for you:
- Automatic document chunking with optimal strategies
- Vector embedding generation and storage using multiple models
- Knowledge base isolation per user with secure access
- Real-time document indexing and processing
- Source attribution and confidence scoring
- Multi-format support (PDFs, docs, images, text, code files)
- Advanced filtering and metadata handling
The beauty is in what you don't have to manage: embedding models, vector databases, chunking algorithms, or scaling concerns. Progress RAG provides a RAG-as-a-Service platform that handles the complexity while giving you full customization over retrieval strategies and LLM integration.
The privacy advantage (your competitive moat)
Google dominates the public web, but your competitive advantages live in private repositories. Your processes, architectural decisions, lessons learned from production incidents, and institutional knowledge that took years to accumulate—none of that is searchable by your competitors.
When you implement contextual search internally, you're not just solving a productivity problem. You're creating an unfair advantage. Your team can instantly access collective knowledge that would take months for a competitor to rebuild.
New hires can ask "How do we handle race conditions in the payment processor?" and get answers synthesized from:
- The original architectural decision document
- A post-mortem from when the bug was discovered
- Code comments from the fix
- A Slack thread discussing edge cases
This isn't just about finding documents—it's about preserving and leveraging institutional memory.
Beyond the obvious: Where contextual search gets interesting
Code archaeology: "What were the original assumptions behind this module?" Feed your git history, PR discussions, and architecture docs into your knowledge base, and suddenly your codebase has a memory that explains why, not just what.
Decision lineage: "Why did we choose microservices over a monolith?" Instead of tribal knowledge, you get a trail connecting the original requirements, trade-off analyses, implementation challenges, and retrospective insights.
Academic research: Students can upload lecture notes, research papers, and assignment materials, then ask complex questions like "How does the concept of eigenvalues from linear algebra relate to the principal component analysis we covered in machine learning?" Progress RAG finds connections across different courses and materials.
Onboarding acceleration: New engineers asking "How does our deployment process work?" get answers that weave together infrastructure docs, runbooks, and real war stories from production incidents.
The implementation reality: What I learned building EduBox
In building EduBox with Progress RAG, I discovered some crucial patterns:
User context isolation: Each user gets their own knowledge base hash in Progress RAG. This isn't just about privacy—it's about relevance. My calculus notes shouldn't pollute your search for computer science concepts.
Incremental processing: Progress RAG indexes documents immediately upon upload. Students expect to ask questions about notes they just uploaded, not wait for overnight batch processing.
Multi-modal support: The system handles PDFs, Word docs, images with text, code files, and even chat logs. Progress RAG automatically adapts its processing pipeline based on content type.
Source attribution that builds trust: Every answer includes specific source snippets with confidence scores. Students can verify the AI's reasoning and dive deeper into the original materials when needed.
Cost efficiency: With Progress RAG's pricing model, processing 1000 documents costs roughly $2-5 depending on size and complexity. For a student or small team, that's negligible compared to the productivity gains.
Why this matters more than another productivity tool
We're in a weird transitional moment. We have AI that can reason about complex problems, but most organizations can't even find their own documentation effectively. Contextual search isn't just about better search—it's about unlocking the collective intelligence that already exists in your organization.
Every company is sitting on a goldmine of knowledge that's effectively inaccessible. Meeting transcripts with crucial decisions, design documents explaining architectural choices, customer feedback that shaped product direction—it's all there, but locked away by the limitations of keyword search.
The teams that figure this out first will have a compound advantage. Better decisions because they can access historical context. Faster onboarding because new team members can learn from everything that came before. Less repeated work because they can find previous solutions.
Google will always own the public web. But for your docs, your code, your institutional knowledge—contextual search wins. And it keeps that competitive advantage exactly where it should be: behind your login wall.
I built EduBox as a working example of this concept—students can upload lecture notes, assignments, and academic materials, then ask natural language questions to get contextual answers with source citations. The implementation details show how Progress RAG simplified what could have been a complex architecture into clean, maintainable code.
What's the biggest knowledge discovery problem in your organization? I'm curious about other patterns people are encountering as more teams adopt contextual search.