Authors: Alejandro Ponce de León Chávez, Nigel Brown, and Pankaj Telang
AI: the final frontier. These are the voyages of the Enterprise. Our continuing mission: to explore strange new worlds; to seek out new applications and new technologies; to boldly go where not everyone has gone before!
Bottom line up front: We built a production-ready solution that makes enterprise Google Drive, GitHub, and Discord knowledge instantly available to AI agents using Model Context Protocol (MCP) servers and deployed using ToolHive's Kubernetes operator. Instead of hunting through documents for hours, your AI agents can now find and synthesize information from your organization's scattered knowledge in seconds.
The Enterprise Knowledge Nebula
Most enterprises struggle to chart a course through a galaxy of scattered knowledge. Essential data is often marooned across far-flung systems - Google Docs, internal wikis, Slack channels, Discord servers, and GitHub repositories - forming a nebula of information silos.
Picture this familiar scenario: You're working on a critical project deadline and need to find the latest marketing assets, product roadmap, or expense policy buried somewhere in your company's Google Drive. You know it exists, but where? Sound familiar?
This fragmentation leads to adverse business impacts:
- Inefficiency: Employees waste valuable time searching for, collating, and synthesizing information from multiple sources
- Information decay: Important knowledge remains inaccessible to employees who need it, leading to duplicated efforts and missed opportunities
- Reduced productivity: The difficulty in accessing relevant information hinders collaboration and decision-making

All the data you need is there, if only you had time to read it... or if someone could read it for you... or something...
We could use AI for that!
Why MCP Servers Are the Missing Piece
Large language models excel at understanding and reasoning about information, but they're blind to your proprietary enterprise knowledge. That's where Model Context Protocol (MCP) servers come in - they're the bridge that connects AI agents to your internal systems.
For example, when you ask Claude about your company's expense policy, an MCP server can fetch the relevant document from Google Drive and provide that context to the AI model.
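Under the hood, an MCP tool call is just JSON-RPC. A request from an AI client might look something like this (the tool name here is purely illustrative):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_knowledge",
    "arguments": { "query": "What is our expense policy?" }
  }
}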
The challenge? Deploying and managing MCP servers in enterprise environments requires solving for security, scalability, and reliability - exactly what ToolHive and Kubernetes excel at.
Our Journey: From Concept to Production
We started thinking about building a tool to connect our dots, but then discovered that one already exists: Onyx - an end-to-end open source solution for enterprise knowledge management.
We decided to explore. We connected Onyx to multiple sources including Google Drive, GitHub repositories, and Discord channels, let it read and semantically index the text, and then unleashed AI agents on it. Here's how we built our enterprise-ready knowledge retrieval system.
The Architecture
Our solution combines four key technologies:
- Onyx: Extracts and indexes content from multiple sources, including Google Drive, GitHub repositories, Discord channels, and others, using vector embeddings
- ToolHive Kubernetes Operator: Our ready-to-use operator that deploys and manages MCP servers securely at scale
- Knowledge MCP server: Acts as a secure bridge between AI clients and the Onyx knowledge base
- LibreChat: A flexible, open-source UI for AI interactions that integrates seamlessly with MCP servers
Implementation Journey
Step 1: Deploy Onyx
We started with Onyx's Kubernetes deployment. Key points from our experience:
- We copied the container images to our own cloud registry for better control
- The default configuration gives you a full-size cluster — you can scale this back for smaller deployments
- GPUs would help performance but aren't strictly necessary for getting started
- Pay careful attention to authentication setup (more on this below)
Onyx comes with a built-in chat interface that might be all you need. However, as an enterprise that needs to integrate with other agents, apps, and domains while ensuring proper access control, we wanted a different approach.
Step 2: Create the MCP Server
MCP proved to be the ideal marshalling point. We created a custom MCP server for Onyx.
The server is fairly simple - essentially a passthrough for calls to Onyx with some tailored prompts and authentication handling.
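For a flavor of what that looks like, here's a minimal sketch using the official MCP Python SDK. The tool name, the /search path, and the raw-passthrough response are assumptions for illustration; our real server adds the tailored prompts and auth handling:

# Minimal sketch of a passthrough Knowledge MCP server.
# Assumes the official MCP Python SDK and httpx; the /search endpoint
# stands in for the custom search API we added to Onyx.
import os

import httpx
from mcp.server.fastmcp import FastMCP

# Injected via the MCPServer resource (see step 3)
ONYX_URL = os.environ.get("ONYX_URL", "http://onyx-api-service.onyx:8080")

mcp = FastMCP("knowledge-mcp-server")

@mcp.tool()
def search_knowledge(query: str) -> str:
    """Search the enterprise knowledge base and return relevant passages."""
    resp = httpx.post(f"{ONYX_URL}/search", json={"query": query}, timeout=30.0)
    resp.raise_for_status()
    return resp.text  # passthrough: let the model read the raw results

if __name__ == "__main__":
    # Matches the transport declared in the MCPServer resource
    mcp.run(transport="streamable-http")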
Step 3: Deploy with ToolHive
This is where ToolHive's Kubernetes operator shines. Instead of manually configuring containers and networking, you define your MCP server as a Kubernetes custom resource:
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: knowledge-mcp-server
  namespace: toolhive-system
spec:
  image: xxxx.com/knowledge-mcp-server:latest
  transport: streamable-http
  port: 8000
  targetPort: 8000
  env:
    - name: ONYX_URL
      value: "http://onyx-api-service.onyx:8080"
  permissionProfile:
    type: builtin
    name: network
  resources:
    limits:
      cpu: "100m"
      memory: "128Mi"
    requests:
      cpu: "50m"
      memory: "64Mi"
  oidcConfig:
    type: inline
    inline:
      issuer: <IDP URL>
      audience: <AUDIENCE FOR TOKEN>
      jwksUrl: <URL TO FETCH JWKS>
      jwksAllowPrivateIP: false
ToolHive gives us a layer of control and authentication over the MCP server. Connections are protected by OAuth, so we know exactly who is calling in.
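You can see that protection from the command line. A rough sketch - the in-cluster service URL and the /mcp path are assumptions, so adjust for your deployment:

# Without a valid bearer token from the configured IdP, the ToolHive
# proxy rejects the request with a 401 before it reaches the MCP server.
curl -i -X POST "http://mcp-knowledge-mcp-server-proxy.toolhive-system:8000/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl","version":"0.0"}}}'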
Step 4: Set Up LibreChat
To make this available to our team on a permanent basis, we deployed LibreChat - a fantastic, flexible open-source AI chat interface. This gives us a production-ready UI that integrates seamlessly with our MCP server.
- Note this slight issue if you try this at home
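For reference, wiring LibreChat up to the server is only a few lines in librechat.yaml. This is a sketch under our assumptions - the transport type and service URL depend on your LibreChat version and on how ToolHive exposes the server in your cluster:

# librechat.yaml (excerpt) - sketch only
mcpServers:
  knowledge:
    type: streamable-http
    url: "http://mcp-knowledge-mcp-server-proxy.toolhive-system:8000/mcp"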
ToolHive: Making MCP Servers Enterprise-Ready
The game-changer in our architecture is ToolHive's Kubernetes operator. Here's why it matters:
One-command deployment: Apply the YAML configuration with a simple kubectl command and ToolHive handles pod creation, service discovery, security policies, and monitoring automatically:
kubectl apply -f toolhive-deployment.yaml
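Once applied, you can check the operator's work with standard kubectl (generic commands; the status output varies by version):

# Confirm the custom resource was reconciled and the pods are running
kubectl get mcpserver knowledge-mcp-server -n toolhive-system
kubectl get pods -n toolhive-system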
Security by default: Every MCP server runs in an isolated container with minimal permissions. The operator automatically creates:
- Dedicated ServiceAccount per MCP server with least-privilege access
- Network policies that restrict communication
- Secure secret management for OAuth credentials
- RBAC configurations for multi-tenant deployments
Enterprise scale: The operator supports multi-namespace deployments, allowing different teams to manage their own MCP servers while maintaining security boundaries.
Real-World Results: From Hours to Seconds
The transformation is immediate. Instead of employees spending hours hunting through Google Drive folders, GitHub issues, or Discord messages, they can ask natural language questions and get answers with source citations in seconds.
Example interactions:
Although not essential, it helps to create a custom agent in LibreChat, which saves settings like the model, prompt, and tools for later reuse.

Then we can ask about some of the enterprise data we've given it. It does a good job of joining the dots for us.
Lessons Learned: The Good, The Bad, and The AI
Is it good?
Well, yes. And no. (It is AI after all!)
The impressive parts:
- Simple retrieval across massive document collections
- Excellent at synthesizing information from multiple sources
- Natural language queries that actually work
- Source citations that let you verify information
The challenges:
- Sometimes it struggles with dates and can make things up
- You need to be very careful with permissions
- Authorization policies between different tools need careful consideration
Security Considerations: Boldly Safely Going
Take care with permissions - if you follow the default instructions, you might expose all your documents to everyone. This probably isn't what you want.

The default approach suggested by Onyx requires domain-wide delegation for the full Google Workspace. It is a big, scary ask: Onyx impersonates each user in the domain and fetches their documents. Had we granted that, it would have allowed access to all documents in our domain, including sensitive documents with PII. Furthermore, indexed document fragments end up in the vector database, which may or may not be secured to a high enough standard, and the authorization boundary is fragmented in the process.
While we could have taken a less restrictive approach with broader service account permissions, we prioritized security and explicit document access setup. We used a normal account with OAuth access and standard permissions. For now, the account has the same access as a typical employee to our internally public documents - the knowledge that teams want to be discoverable and have explicitly shared for broad internal access. This gives us confidence that every piece of information in our vector database belongs there intentionally, and that access to the database doesn't create an escalation of privilege for its users.
Still on our roadmap:
- Pass the OIDC token from LibreChat through the MCP server to Onyx for proper authorization
- Make email address verification spoof-proof
- Implement fine-grained access controls
- Open-source the Knowledge MCP server
Getting Started: Deploy Your Own Knowledge MCP Server
Want to try this yourself? Here's the path we recommend:
- Set up the ToolHive Operator:
# Install the operator CRDs
helm upgrade -i toolhive-operator-crds \
oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds
# Deploy the operator
helm upgrade -i toolhive-operator \
oci://ghcr.io/stacklok/toolhive/toolhive-operator \
-n toolhive-system --create-namespace
- Deploy Onyx: Follow the Kubernetes deployment guide and configure your Google Drive connectors.
- Create your MCP server: Use your favorite programming language. We went with the Python SDK. (Note: we implemented a search API in Onyx which is called by the MCP server - see the sketch after this list.)
- Deploy with ToolHive: Apply your MCPServer resource and watch ToolHive handle the rest.
- Connect LibreChat: Deploy LibreChat and configure it to use your new MCP server.
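About that search API: here's a rough, hypothetical sketch of its shape. Onyx's internals are more involved, and the endpoint, models, and retrieval call below are stand-ins:

# Hypothetical sketch of a small search API added to Onyx for the MCP
# server to call; run_retrieval stands in for Onyx's real pipeline.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class SearchRequest(BaseModel):
    query: str
    num_results: int = 5

class SearchHit(BaseModel):
    document: str
    snippet: str
    source_url: str

def run_retrieval(query: str, limit: int) -> list[SearchHit]:
    # Stand-in: the real version runs the query through Onyx's embedding
    # and vector-search pipeline and returns top passages with citations.
    return []

@app.post("/search")
def search(req: SearchRequest) -> list[SearchHit]:
    return run_retrieval(req.query, limit=req.num_results)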
Try ToolHive Yourself
Ready to break down your knowledge silos?
- Explore ToolHive: Check out the ToolHive documentation and try the Kubernetes operator quickstart
- Join the community: Connect with other MCP developers in our Discord
The tools exist today to make enterprise knowledge universally accessible to AI agents. The question isn't whether to build this capability - it's how quickly you can deploy it.
You might just Klingon to it! (Ach, this metaphor - she cannae take any more, captain!)
What knowledge silos is your organization struggling with? Let us know in the comments how you're thinking about connecting AI agents to your internal systems.