Prasanna

Unifying Enterprise Knowledge Search with MindsDB

Hacktoberfest: Maintainer Spotlight

TL;DR: We built a production-ready, multi-source RAG system in just a few days by leveraging MindsDB's Knowledge Bases. What would have required 5+ separate services now runs on declarative SQL queries. Meet Area51: unified search across Zendesk, Jira, and Confluence with three intelligent agent interfaces.

The Problem: Three Islands of Information

Modern enterprises battle knowledge fragmentation every day. Critical information is scattered across distinct platforms:

  • Zendesk: Customer support tickets, resolutions, and interaction history
  • Jira: Engineering issues, bug reports, and project tracking
  • Confluence: Technical documentation, SOPs, and internal guides

This fragmentation creates real operational costs. Support agents waste time switching between systems, trying to piece together context that exists but isn't discoverable. An engineer investigating a bug can't easily see customer impact. Documentation teams operate blind to actual support needs.

This results in slower resolutions, duplicated effort, and knowledge that never reaches the people who need it.


The Solution: One Query, Complete Context

We built Area51 - a unified semantic search system that connects all three platforms through MindsDB Knowledge Bases. Instead of building separate ETL pipelines, custom vector databases, and orchestration layers, MindsDB handles everything declaratively.

The system automatically syncs all three platforms, provides semantic search that understands relationships and context, and stays current without manual intervention. Most importantly, it offers distinct agent interfaces for different workflows:

  1. LangGraph Agent - Conversational, multi-turn reasoning for complex interactions
  2. MindsDB SQL Agent - High-performance queries that run the agent directly from the MindsDB UI
  3. MCP Server - Integration with AI assistants like Cursor and Claude Desktop

What MindsDB Replaced

What We Needed | Traditional Approach | With MindsDB
Data Integration | Custom ETL scripts per source | CREATE DATASOURCE statements
Embeddings | Separate service + API calls | Automatic chunking + embedding
Vector Storage | Pinecone/Weaviate/Supabase | Built-in pgvector integration
Search Engine | Custom semantic search code | Hybrid SQL queries with MATCHES
ETL Orchestration | Airflow/Kubernetes CronJobs | CREATE JOB scheduled queries
Evaluation | Custom metrics framework | EVALUATE KB with built-in metrics

Architecture Deep Dive

Area51 integrates Zendesk, Jira, and Confluence through MindsDB Knowledge Bases using hybrid semantic search. Here's how it works:

Data Integration Layer

Three MindsDB datasources connect directly to external APIs:

-- One statement per integration, zero API code
CREATE DATASOURCE zendesk_datasource WITH ENGINE = 'zendesk';
CREATE DATASOURCE jira_datasource WITH ENGINE = 'jira';
CREATE DATASOURCE confluence_datasource WITH ENGINE = 'confluence';


Each datasource handles authentication, pagination, and rate limiting automatically. No custom ETL code required.

Knowledge Base Layer

We created three separate knowledge bases, each using Azure OpenAI embeddings:

  1. Confluence - For the company's official documentation
  2. Jira - For engineering/technical tickets
  3. Zendesk - For customer tickets

MindsDB automatically:

  • Chunks documents intelligently
  • Generates embeddings via Azure OpenAI
  • Stores vectors in pgvector
  • Creates hybrid search indexes
  • Handles versioning and metadata
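
The post does not describe how MindsDB chunks documents internally. As a rough illustration of what a chunking step does before embedding, here is a minimal sketch that splits text into fixed-size character windows with overlap. The function name `chunk_text` and its default sizes are hypothetical and are not MindsDB's actual strategy.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Illustrative only: real chunkers usually respect sentence or
    section boundaries instead of cutting at arbitrary characters.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated storage.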

Automation Layer: Zero-Ops ETL

Traditional approach: Airflow DAGs, Kubernetes CronJobs, custom monitoring dashboards.

MindsDB approach:

CREATE JOB jira_refresh_job (
    insert into jira_kb
    select * from jira_datasource.issues;
) EVERY hour;


Translation: "Keep the knowledge base updated hourly, with incremental refreshes, forever."

We created one job per knowledge base. The schedules differ from the hourly example above:

  • Zendesk KB: Every 30 minutes
  • Jira KB: Every 30 minutes
  • Confluence KB: Every day

Zero DevOps overhead. MindsDB handles scheduled execution, incremental updates, failure retries, and status tracking.
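
A setup script could generate the three job statements from a single schedule table rather than hand-writing each one. The sketch below follows the CREATE JOB shape shown above. The schedules come from the list in this post, but the Zendesk and Confluence source table names (`tickets`, `pages`) and the job names other than `jira_refresh_job` are assumptions, since only `jira_datasource.issues` appears in the article.

```python
# Schedules are taken from the post. Source tables for Zendesk and
# Confluence are hypothetical placeholders; adjust to your datasource.
REFRESH_JOBS = {
    "zendesk": {"table": "zendesk_datasource.tickets", "every": "30 minutes"},
    "jira": {"table": "jira_datasource.issues", "every": "30 minutes"},
    "confluence": {"table": "confluence_datasource.pages", "every": "1 day"},
}


def render_refresh_job(source: str, table: str, every: str) -> str:
    """Render one CREATE JOB statement that refreshes <source>_kb."""
    return (
        f"CREATE JOB {source}_refresh_job (\n"
        f"    INSERT INTO {source}_kb\n"
        f"    SELECT * FROM {table}\n"
        f") EVERY {every};"
    )


# Build all three statements; a setup script would then execute them.
statements = [
    render_refresh_job(source, cfg["table"], cfg["every"])
    for source, cfg in REFRESH_JOBS.items()
]
```

Keeping the schedules in one dictionary makes it easy to change a refresh interval without touching the SQL template.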


The Three Agent Interfaces

1. LangGraph Conversational Agent

The LangGraph agent provides a visual development interface and conversational interaction through an intelligent routing system.

Starting the Agent:

uv run langgraph dev


The server starts on http://127.0.0.1:2024 and automatically opens LangGraph Studio in your browser, showing a graph visualization of the workflow.

Graph Architecture:

The support_agent graph uses intelligent routing:

  1. Router - Analyzes queries and determines which systems to search
  2. Search Nodes - Three parallel operations:
    • search_jira - Search Jira issues
    • search_zendesk - Search Zendesk tickets
    • search_confluence - Search Confluence documentation
  3. Aggregate Results - Synthesizes findings into unified responses
  4. Error Handling - Manages failures gracefully
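
The flow above (route, search in parallel, aggregate, handle errors) can be sketched in plain Python. This is not the project's LangGraph code: the search functions are stubs, and the keyword-based `ROUTING_HINTS` are a hypothetical stand-in for whatever logic the real router uses.

```python
from concurrent.futures import ThreadPoolExecutor


# Stub searchers standing in for real knowledge-base queries.
def search_jira(query: str) -> list[str]:
    return [f"jira result for {query!r}"]


def search_zendesk(query: str) -> list[str]:
    return [f"zendesk result for {query!r}"]


def search_confluence(query: str) -> list[str]:
    return [f"confluence result for {query!r}"]


SEARCHERS = {
    "jira": search_jira,
    "zendesk": search_zendesk,
    "confluence": search_confluence,
}

# Hypothetical keyword hints; a real router would more likely ask an LLM.
ROUTING_HINTS = {
    "jira": ("bug", "issue", "engineering", "working on"),
    "zendesk": ("ticket", "customer", "support"),
    "confluence": ("doc", "guide", "runbook", "how to"),
}


def route(query: str) -> list[str]:
    """Pick the systems whose hints appear in the query; default to all."""
    q = query.lower()
    chosen = [name for name, hints in ROUTING_HINTS.items()
              if any(hint in q for hint in hints)]
    return chosen or list(SEARCHERS)


def run_search(name: str, query: str):
    """Run one searcher, capturing failures instead of raising."""
    try:
        return name, SEARCHERS[name](query), None
    except Exception as exc:
        return name, [], str(exc)


def answer(query: str) -> dict:
    """Route the query, search the chosen systems in parallel, aggregate."""
    targets = route(query)
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        outcomes = list(pool.map(lambda name: run_search(name, query), targets))
    return {
        "results": {n: hits for n, hits, err in outcomes if err is None},
        "errors": {n: err for n, hits, err in outcomes if err is not None},
    }
```

Capturing each failure per source means one unavailable system degrades the answer instead of failing the whole request.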

Example Interaction:

Query: "Is the team working on something related to 2FA, and are there any support tickets on the same topic?"

Agent Response:

2. MindsDB SQL Agent

The MindsDB SQL agent, created automatically during setup, provides high-performance direct queries.

Accessing via MindsDB Studio:

  1. Open MindsDB Studio at http://localhost:47334
  2. Navigate to the SQL Editor
  3. Query using natural language:
SELECT answer
FROM support_agent
WHERE question = 'Anything that the engineering team is working on related to 2fa on jira?';


The agent understands questions semantically and returns relevant results from all knowledge bases, automatically routing to Jira, Zendesk, or Confluence as needed.

Programmatic Access:

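import mindsdb_sdk

# Connect to the local MindsDB instance
server = mindsdb_sdk.connect('http://localhost:47334')
kb = server.knowledge_bases.get('zendesk_kb')

# Execute semantic search: find() builds the query and fetch() runs it,
# returning the matches as a pandas DataFrame
results = kb.find('payment errors', limit=10).fetch()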


3. MCP Server for AI Assistants

The MCP (Model Context Protocol) server enables AI assistants like Cursor and Claude Desktop to search across all knowledge bases through natural conversation.

Starting the Server:

uv run fastmcp run server.py --transport=sse


The server starts on http://127.0.0.1:5000/sse with the name "Support Agent Assistant".

Configure Cursor:

Add to Cursor settings:

{
  "mcpServers": {
    "area51-support-agent": {
      "command": "uv",
      "args": ["run", "fastmcp", "run", "server.py", "--transport=sse"],
      "env": {
        "MINDSDB_URL": "http://localhost:47334"
      }
    }
  }
}


Once configured, ask Cursor: "Search for authentication errors in Zendesk" and it will use the MCP server to query your knowledge bases.


Hybrid Search: The Best of Both Worlds

MindsDB's MATCHES operator unlocks semantic search, while traditional SQL enables structured filtering.

Semantic Search

Find relevant documents using natural language understanding:

SELECT *
FROM confluence_kb
WHERE content = 'docker'
ORDER BY relevance DESC
LIMIT 5;

The semantic search understands that "docker" relates to "open source hosting" without requiring exact keyword matches.

Structured Filtering

Query specific metadata fields for precise results:

SELECT *
FROM jira_kb
WHERE status = 'In Progress';

Hybrid Search

Combine semantic understanding with structured metadata filters:

SELECT *
FROM zendesk_kb
WHERE content = '2fa'
  AND hybrid_search = true
  AND hybrid_search_alpha = 0.6
LIMIT 10;

This query searches for 2FA-related content with MindsDB's hybrid search enabled, which combines semantic similarity with keyword matching. The hybrid_search_alpha parameter sets the balance between the two.
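
To build intuition for what an alpha parameter does, here is a minimal sketch of a linear blend between a semantic score and a keyword score. It assumes alpha weights the semantic side and (1 - alpha) the keyword side. MindsDB's internal scoring is not described in this post and may differ, and the function names and sample scores are hypothetical.

```python
def hybrid_score(semantic: float, keyword: float, alpha: float = 0.6) -> float:
    """Blend two relevance scores; alpha=1 is purely semantic, alpha=0 purely keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


def rank(candidates, alpha: float = 0.6):
    """Sort (doc_id, semantic, keyword) tuples by blended score, best first."""
    scored = [(doc_id, hybrid_score(sem, kw, alpha))
              for doc_id, sem, kw in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up scores: "a" is semantically close, "b" matches the keyword exactly.
candidates = [("a", 0.9, 0.1), ("b", 0.5, 0.9)]
ranked = rank(candidates, alpha=0.6)
```

With these sample scores, the exact keyword match "b" ranks first at alpha 0.6, while raising alpha to 1.0 puts the semantically closer "a" on top. Short identifiers such as "2fa" are the kind of query where the keyword side can matter.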


Real-World Impact

Use Case 1: Distinguishing Feature Requests from Documentation Gaps

A customer asks how to bulk export transaction data via the API. The agent queries: "bulk export transaction data API"

  • Confluence: No documentation found
  • Jira: Finds "FEAT-445: Implement bulk data export endpoint" in development
  • Zendesk: Finds 3 similar tickets from past month

Result: Support can confidently respond that this is a planned feature with expected delivery next release, rather than creating a feature request.

Use Case 2: Finding Relevant Documentation to Resolve Tickets

Customer reports: "Getting 401 errors when calling the payments endpoint"

The agent queries: "authentication failures payments API 401 errors"

  • Confluence: Returns "API Authentication Troubleshooting Guide" with dedicated section on 401 errors
  • Zendesk: Finds 15 resolved tickets with similar issues
  • Pattern Analysis: Most common resolution is using sandbox keys in production

Result: Issue typically resolved in 5 minutes with existing documentation.

Use Case 3: Leveraging Historical Context for Faster Resolution

Customer: "Dashboard widgets loading very slowly since yesterday"

The agent queries: "dashboard performance slow loading widgets"

  • Zendesk: Shows 8 tickets in past 24 hours, plus historical spike 3 months ago
  • Jira: Finds current investigation (DASH-892) and past issue (DASH-756) with similar root cause
  • Confluence: Returns monitoring runbook and database maintenance procedures

Result: Engineering team checks database indexes first, finds the issue in 15 minutes instead of hours of investigation. Historical pattern recognition prevents reinventing the wheel.


Performance Metrics

Area51 achieves strong performance across all three knowledge bases with sub-second query times and high relevance.

Knowledge Base | Hit Rate | Recall Score | Avg Query Time
Confluence KB | 94.0% | 0.940 | 0.15s

Metrics are generated using MindsDB's built-in evaluation framework, which calculates hit rates, cumulative recall, and mean reciprocal rank. The system auto-generates test questions from knowledge base content for continuous quality monitoring.

Evaluation Command

EVALUATE KNOWLEDGE_BASE confluence_kb
USING 
    test_table = 'pgvector_datasource.confluence_test_table',
    version = 'doc_id',
    generate_data = false,
    evaluate = true;


Returns comprehensive metrics including Hit@1, Hit@5, Hit@10, MRR, and recall scores.
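
For readers unfamiliar with these metrics, the sketch below computes Hit@k and mean reciprocal rank (MRR) from ranked results, using their standard definitions. It assumes one relevant document per test question. It is an independent illustration, not MindsDB's evaluation code, and the document ids are made up.

```python
def hit_at_k(ranked_ids: list[str], relevant_id: str, k: int) -> float:
    """1.0 if the relevant document appears in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


def reciprocal_rank(ranked_ids: list[str], relevant_id: str) -> float:
    """1/position of the relevant document (1-based), or 0.0 if absent."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / position
    return 0.0


def evaluate(runs, ks=(1, 5, 10)) -> dict:
    """Average Hit@k and MRR over (ranked_ids, relevant_id) pairs."""
    if not runs:
        raise ValueError("need at least one test question")
    n = len(runs)
    metrics = {
        f"hit@{k}": sum(hit_at_k(ranked, rel, k) for ranked, rel in runs) / n
        for k in ks
    }
    metrics["mrr"] = sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / n
    return metrics


# Three made-up test questions: found at rank 1, found at rank 2, not found.
runs = [
    (["d1", "d2", "d3"], "d1"),
    (["d4", "d5", "d6"], "d5"),
    (["d7", "d8"], "d9"),
]
metrics = evaluate(runs)
```

Hit@k only records whether the right document shows up in the top k, while MRR also rewards placing it higher in the list.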


Deployment

Our entire system runs in a single Docker Compose file:

services:
  mindsdb:
    build: .
    ports:
      - "47334:47334"
  pgvector:
    image: pgvector/pgvector:pg16
    ports:
      - "5432:5432"


Setup script (setup.py) handles:

  1. Starting containers
  2. Creating datasources
  3. Creating knowledge bases
  4. Inserting initial data
  5. Creating agents
  6. Scheduling refresh jobs
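
Between step 1 and step 2, a setup script generally needs to wait until the containers accept connections. The contents of setup.py are not shown in this post, so the helper below is a hypothetical sketch of one way to do that using only the standard library. The name `wait_for_port` and its defaults are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not ready yet (refused or timed out); pause before retrying.
            time.sleep(interval)
    return False
```

With the Compose file above, a script might call `wait_for_port("localhost", 47334)` for MindsDB and `wait_for_port("localhost", 5432)` for pgvector before creating datasources. An open port only shows the process is listening, so a real script may still want to retry its first statement.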

Technical Stack

Layer | Technology | Why
LLM | Azure OpenAI (GPT-4o-mini) | Cost-effective embeddings + reasoning
Orchestration | LangGraph | Multi-turn agent flows
Vector Storage | pgvector (via MindsDB) | Postgres-native, no extra infra
Data Layer | MindsDB Knowledge Bases | SQL-native, zero ETL
Connectors | MindsDB integrations | Jira, Zendesk, Confluence
Agent Framework | MindsDB SQL Agent | Direct hybrid query interface
MCP Wrapper | FastMCP | Claude Desktop integration
Hosting | Docker Compose | Single docker compose up deployment

Getting Started

Prerequisites:

  • Docker and Docker Compose
  • Python 3.10+
  • Azure OpenAI API key
  • Zendesk/Jira/Confluence API credentials

Installation:

# Clone the repository
git clone <repo-url>
cd area51

# Set up Python environment
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API credentials

# Run setup script
python setup.py --mode setup


Acknowledgments

Special thanks to @mindsdbteam for building a platform that makes AI-powered data applications accessible to every developer.

Built for Hacktoberfest 2025 by Prasanna Saravanan, Vignesh, and Varsha. Powered by MindsDB Knowledge Bases.

GitHub Repository: https://github.com/prasnna-saravanan/area-51

Made with ❤️ during Hacktoberfest 2025
