DEV Community: Pankaj Telang

A Mind Meld for the Modern Enterprise: Breaking Down Knowledge Silos with MCP

Pankaj Telang — Mon, 04 Aug 2025 02:36:54 +0000

Authors: Alejandro Ponce de León Chávez, Nigel Brown, and Pankaj Telang

AI: the final frontier. These are the voyages of the Enterprise. Our continuing mission: to explore strange new worlds; to seek out new applications and new technologies; to boldly go where not everyone has gone before!

Bottom line up front: We built a production-ready solution that makes enterprise knowledge in Google Drive, GitHub, and Discord instantly available to AI agents through Model Context Protocol (MCP) servers deployed with ToolHive's Kubernetes operator. Instead of hunting through documents for hours, your AI agents can now find and synthesize information from your organization's scattered knowledge in seconds.

The Enterprise Knowledge Nebula

Most enterprises struggle to chart a course through a galaxy of scattered knowledge. Essential data is often marooned across far-flung systems - Google Docs, internal wikis, Slack channels, Discord servers, and GitHub repositories - forming a nebula of information silos.

Picture this scenario: You're working on a critical project deadline and need to find the latest marketing assets, product roadmap, or expense policy buried somewhere in your company's Google Drive. You know it exists, but where? Sound familiar?

This fragmentation leads to adverse business impact:

Inefficiency: Employees waste valuable time searching for, collating, and synthesizing information from multiple sources
Information decay: Important knowledge remains inaccessible to employees who need it, leading to duplicated efforts and missed opportunities
Reduced productivity: The difficulty in accessing relevant information hinders collaboration and decision-making

All the data you need is there, if only you had time to read it... or if someone could read it for you... or something...

We could use AI for that!

Why MCP Servers Are the Missing Piece

Large language models excel at understanding and reasoning about information, but they're blind to your proprietary enterprise knowledge. That's where Model Context Protocol (MCP) servers come in - they're the bridge that connects AI agents to your internal systems.

For example, when you ask Claude about your company's expense policy, an MCP server can fetch the relevant document from Google Drive and provide that context to the AI model.

The challenge? Deploying and managing MCP servers in enterprise environments requires solving for security, scalability, and reliability - exactly what ToolHive and Kubernetes excel at.

Our Journey: From Concept to Production

We started thinking about building a tool to connect our dots, but then discovered that one already exists: Onyx - an end-to-end open source solution for enterprise knowledge management.

We decided to explore. We connected Onyx to multiple sources including Google Drive, GitHub repositories, and Discord channels, let it read and semantically index the text, and then unleashed AI agents on it. Here's how we built our enterprise-ready knowledge retrieval system.

The Architecture

Our solution combines four key technologies:

Onyx: Extracts and indexes content from multiple sources, including Google Drive, GitHub repositories, Discord channels, and others, using vector embeddings
ToolHive Kubernetes Operator: Our ready-to-use operator that deploys and manages MCP servers securely at scale
Knowledge MCP server: Acts as a secure bridge between AI clients and the Onyx knowledge base
LibreChat: A flexible, open-source UI for AI interactions that integrates seamlessly with MCP servers
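
To make "indexes content using vector embeddings" a little more concrete, here is a toy sketch of similarity search. It is not how Onyx is implemented: simple word-count vectors stand in for learned embeddings, and the document ids and texts are invented for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical documents from different sources (ids and text are made up).
DOCS = {
    "drive/expense-policy": "expense policy for travel and meal reimbursement",
    "github/roadmap-issue": "product roadmap milestones for the next release",
    "discord/launch-thread": "marketing assets for the product launch",
}
INDEX = {doc_id: embed(text) for doc_id, text in DOCS.items()}

def search(query: str, top_k: int = 1) -> list:
    """Return the ids of the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda doc_id: cosine(q, INDEX[doc_id]), reverse=True)
    return ranked[:top_k]

print(search("what is the travel expense policy"))
```

A real deployment replaces `embed` with a learned embedding model and the dictionary with a vector database, but the ranking idea is the same: the query and the documents are compared in one vector space, regardless of which system the text came from.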

Implementation Journey

Step 1: Deploy Onyx

We started with Onyx's Kubernetes deployment. Key points from our experience:

We copied images to our local cloud repository for better control
The default configuration gives you a full-size cluster — you can scale this back for smaller deployments
GPUs would help performance but aren't strictly necessary for getting started
Pay careful attention to authentication setup (more on this below)

Onyx comes with a built-in chat interface that might be all you need. However, as an enterprise that needs to integrate with other agents, apps, and domains while ensuring proper access control, we wanted a different approach.

Step 2: Create the MCP Server

MCP proved to be the ideal marshalling point. We created a custom MCP server for Onyx.

The server is fairly simple - essentially a passthrough for calls to Onyx with some tailored prompts and authentication handling.
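
As a rough illustration of what such a passthrough might look like, the sketch below shows the shape of a single search tool. It uses only the standard library, so the MCP SDK wiring is left out. The endpoint path, payload fields, and response shape are hypothetical placeholders, not Onyx's actual API.

```python
import json
import os
import urllib.request
from typing import Callable

# ONYX_URL is the variable name used in our deployment; the path, payload
# fields, and response shape below are hypothetical placeholders.
ONYX_URL = os.environ.get("ONYX_URL", "http://onyx-api-service.onyx:8080")
SEARCH_PATH = "/hypothetical/search"

def http_post(url: str, payload: dict, token: str) -> dict:
    """POST JSON with a bearer token and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)

def search_knowledge(
    query: str,
    token: str,
    post: Callable[[str, dict, str], dict] = http_post,
) -> str:
    """Forward the query to the knowledge base and format hits for the model."""
    result = post(ONYX_URL + SEARCH_PATH, {"query": query}, token)
    hits = result.get("documents", [])
    if not hits:
        return "No matching documents were found."
    lines = [
        f"- {hit['title']}: {hit['snippet']} (source: {hit['link']})"
        for hit in hits
    ]
    # The "tailored prompt" part: tell the model how to use the results.
    return "Answer using only these sources and cite them:\n" + "\n".join(lines)
```

Passing the transport in as the `post` argument keeps the function easy to test without a running Onyx instance.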

Step 3: Deploy with ToolHive

This is where ToolHive's Kubernetes operator shines. Instead of manually configuring containers and networking, you define your MCP server as a Kubernetes custom resource:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: knowledge-mcp-server
  namespace: toolhive-system
spec:
  image: xxxx.com/knowledge-mcp-server:latest
  transport: streamable-http
  port: 8000
  targetPort: 8000
  env:
    - name: ONYX_URL
      value: "http://onyx-api-service.onyx:8080"
  permissionProfile:
    type: builtin
    name: network
  resources:
    limits:
      cpu: "100m"
      memory: "128Mi"
    requests:
      cpu: "50m"
      memory: "64Mi"
  oidcConfig:
    type: inline
    inline:
      issuer: <IDP URL>
      audience: <AUDIENCE FOR TOKEN>
      jwksUrl: <URL TO FETCH JWKS>
      jwksAllowPrivateIP: false

ToolHive gives us a layer of control and authentication over the MCP server. Connections are protected by OAuth, so we know exactly who is calling in.
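
The issuer and audience fields in the oidcConfig above describe which tokens are accepted. The sketch below illustrates only that claim matching, using the standard library. It is not ToolHive's implementation, it deliberately skips signature verification against the JWKS (which a real validator must do first), and the issuer, audience, and subject values are invented.

```python
import base64
import json
import time
from typing import Optional

def decode_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

def claims_acceptable(
    claims: dict, issuer: str, audience: str, now: Optional[float] = None
) -> bool:
    """Check issuer, audience, and expiry (to be run after signature checks)."""
    now = time.time() if now is None else now
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return (
        claims.get("iss") == issuer
        and audience in audiences
        and claims.get("exp", 0) > now
    )

def make_unsigned_token(claims: dict) -> str:
    """Build a demo token with a placeholder signature (illustration only)."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"{b64({'alg': 'none'})}.{b64(claims)}.signature"

token = make_unsigned_token(
    {
        "iss": "https://idp.example.com",
        "aud": "knowledge-mcp",
        "exp": 2_000_000_000,
        "sub": "alice",
    }
)
claims = decode_claims(token)
accepted = claims_acceptable(
    claims, "https://idp.example.com", "knowledge-mcp", now=1_700_000_000
)
print(claims["sub"], accepted)
```

A token from a different issuer, for a different audience, or past its expiry fails the check, which is what lets the server tie each call to a known identity.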

Step 4: Set Up LibreChat

To make this available to our team on a permanent basis, we deployed LibreChat - a fantastic, flexible open-source AI chat interface. This gives us a production-ready UI that integrates seamlessly with our MCP server.

Note this slight issue if you try this at home

ToolHive: Making MCP Servers Enterprise-Ready

The game-changer in our architecture is ToolHive's Kubernetes operator. Here's why it matters:

One-command deployment: Apply the YAML configuration with a simple kubectl command and ToolHive handles pod creation, service discovery, security policies, and monitoring automatically:

kubectl apply -f toolhive-deployment.yaml

Security by default: Every MCP server runs in an isolated container with minimal permissions. The operator automatically creates:

Dedicated ServiceAccount per MCP with least-privilege access
Network policies that restrict communication
Secure secret management for OAuth credentials
RBAC configurations for multi-tenant deployments

Enterprise scale: The operator supports multi-namespace deployments, allowing different teams to manage their own MCP servers while maintaining security boundaries.

Real-World Results: From Hours to Seconds

The transformation is immediate. Instead of employees spending hours hunting through Google Drive folders, GitHub issues, or Discord messages, they can ask natural language questions and get answers with source citations in seconds.

Example interactions:

Although not essential, it helps to create a custom agent in LibreChat. This saves settings such as the model, prompt, and tools for later use.

Then we can go through and ask about some of the enterprise data we’ve given it.

It does a good job, joining the dots for us.

Lessons Learned: The Good, The Bad, and The AI

Is it good?

Well, yes. And no. (It is AI after all!)

The impressive parts:

Simple retrieval across massive document collections
Excellent at synthesizing information from multiple sources
Natural language queries that actually work
Source citations that let you verify information

The challenges:

Sometimes it struggles with dates and can make things up
You need to be very careful with permissions
Authorization policies between different tools need careful consideration

Security Considerations: Boldly Safely Going

Take care with permissions - If you follow default instructions, you might expose all your documents to everyone. This probably isn't what you want.

The default approach suggested by Onyx requires domain-wide delegation access for the full Google Drive Workspace. It is a big, scary ask. Onyx will impersonate each user in the domain and fetch their documents. Had we granted that, it would have allowed access to all documents in our domain, including sensitive documents with PII. Furthermore, indexed document fragments will end up in the vector database which may or may not be secured to high enough standards. The authorization boundary is also fragmented in this process.

While it would be possible to take a less restrictive approach with broader service account permissions, we prioritized security and explicit document access setup. We used a normal account with OAuth access and standard permissions. For now, the account has the same access as a typical employee to our internally public documents, which represent the knowledge that teams want to be discoverable and accessible and have explicitly shared for broad internal access. This gives us confidence that every piece of information in our vector database belongs there intentionally, and that access to the database doesn’t create an escalation of privilege for users of the database.

Still on our roadmap:

Pass the OIDC token from LibreChat through the MCP to Onyx for proper authorization
Make email address verification spoof-proof
Implement fine-grained access controls
Open-source the Knowledge MCP server

Getting Started: Deploy Your Own Knowledge MCP Server

Want to try this yourself? Here's the path we recommend:

Set up the ToolHive Operator

# Install the operator CRDs
helm upgrade -i toolhive-operator-crds \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

# Deploy the operator  
helm upgrade -i toolhive-operator \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  -n toolhive-system --create-namespace

Deploy Onyx: Follow the Kubernetes deployment guide and configure your Google Drive connectors.
Create your MCP server: Use your favorite programming language. We went with the Python SDK.

Note: We implemented a search API in Onyx which is called by the MCP server.

Deploy with ToolHive: Apply your MCPServer resource and watch ToolHive handle the rest.
Connect LibreChat: Deploy LibreChat and configure it to use your new MCP server.

Try ToolHive Yourself

Ready to break down your knowledge silos?

Explore ToolHive: Check out the ToolHive documentation and try the Kubernetes operator quickstart
Join the community: Connect with other MCP developers in our Discord

The tools exist today to make enterprise knowledge universally accessible to AI agents. The question isn't whether to build this capability - it's how quickly you can deploy it.

You might just Klingon to it! (Ach!, this metaphor - she cannae take any more, captain!)

What knowledge silos is your organization struggling with? Let us know in the comments how you're thinking about connecting AI agents to your internal systems.

Exploring SmolAgents

Pankaj Telang — Fri, 02 May 2025 19:30:24 +0000

Part 2 of Demystifying AI Agents

In Part 1 of this series, I covered the conceptual foundations of AI agents. In this part, I introduce SmolAgents, a lightweight agent framework, and demonstrate its use by implementing the stock trading scenario from Part 1.

SmolAgents Framework

SmolAgents is a lightweight AI agent framework developed by Hugging Face. It is designed to help developers build and deploy agents with minimal code.

To observe or act on the environment, the LLM (large language model) powering an agent can request tool calls in two ways:

By generating JSON, which the agent then parses, or
By generating executable Python code, which the agent executes using a Python interpreter.

Research shows that the second method is more flexible and modular. SmolAgents supports both approaches; in this blog, I’ll focus on an implementation that uses executable Python code.
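
The difference between the two styles can be sketched without any framework. The tool, its prices, and the JSON shape below are illustrative assumptions, and the bare exec is for demonstration only; a real framework should sandbox generated code rather than run it directly.

```python
import json

def get_stock_price(ticker: str) -> float:
    """Hypothetical tool with a hardcoded price table."""
    return {"NVDA": 160.0, "MSFT": 225.0}[ticker]

TOOLS = {"get_stock_price": get_stock_price}

# Style 1: the LLM emits JSON naming one tool and its arguments.
json_action = '{"tool": "get_stock_price", "arguments": {"ticker": "NVDA"}}'
call = json.loads(json_action)
json_result = TOOLS[call["tool"]](**call["arguments"])

# Style 2: the LLM emits Python, which can chain several tool calls and
# do arithmetic on the results within a single action.
code_action = """
prices = [get_stock_price(t) for t in ("NVDA", "MSFT")]
result = max(prices) - min(prices)
"""
namespace = dict(TOOLS)  # expose only the tools to the generated code
exec(code_action, namespace)  # demo only: never exec untrusted code unsandboxed

print(json_result, namespace["result"])
```

With JSON, comparing two prices would take two tool calls plus a separate calculation step; with code, it is one action.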

Stock Trading Agent

Let’s briefly recap the stock trading agent introduced in Part 1. This agent performs user-requested tasks using:

An LLM as the planning and reasoning engine
A set of tools for:
- Looking up stock tickers
- Fetching current holdings
- Getting stock prices
- Selling stocks

For example, a user might issue this instruction:

Check the stock price of Nvidia. If it is above $150, sell 80% of the stock that I hold.

Part 1 walked through an agent execution for the above user instruction in terms of the think-act-observe loop. Now, let's see how to implement it using SmolAgents.

SmolAgent Implementation of the Stock Trading Agent

We begin by importing the required components from the smolagents library:

from smolagents import CodeAgent, OpenAIServerModel, tool

Next, we implement the four tools needed by our stock trading agent. These tools use hardcoded values (for demonstration purposes); in a real-world scenario, they would interface with external APIs.

@tool
def lookup_ticker(name: str) -> str:
    """
    Looks up the stock ticker symbol for a given company name.

    Args:
        name: The company name, for example "Nvidia".
    """
    if name.lower() == "nvidia":
        return "NVDA"
    else:
        return "Unknown"

@tool
def get_my_stock_holdings() -> dict:
    """
    Returns the user's current stock holdings.
    """
    return {
        "NVDA": 100,
        "MSFT": 50,
        "TSLA": 20,
    }

@tool
def get_stock_price(ticker: str) -> float:
    """
    Returns the current price of the specified stock ticker.

    Args:
        ticker: The stock ticker symbol, for example "NVDA".
    """
    if ticker == "NVDA":
        return 293.46
    elif ticker == "MSFT":
        return 225.23
    else:
        raise ValueError(f"Unknown stock symbol: {ticker}")

@tool
def sell_stock(ticker: str, quantity: int) -> str:
    """
    Sells a specified quantity of a given stock.

    Args:
        ticker: The stock ticker symbol to sell.
        quantity: The number of shares to sell.
    """
    print(f"Sold {quantity} shares of {ticker}")
    return "Success"

Now we create the stock trading agent using the CodeAgent class:

model = OpenAIServerModel(
    model_id="gpt-4o",
    api_key="<OPENAI_API_KEY>"
)

stock_agent = CodeAgent(
    tools=[lookup_ticker, get_my_stock_holdings, get_stock_price, sell_stock],
    model=model,
    planning_interval=3,
)

Note that we're using CodeAgent, which relies on executable Python code for tool invocation.

To run the agent:

stock_agent.run(
    "Check the stock price of Nvidia. If it is above $150, sell 80 percent of the stock that I hold."
)

Results

Here’s what happens when the agent executes. Observe how these align with each of the think-act-observe iterations I covered in Part 1.

The agent analyzes the instruction and creates a plan.
It calls the lookup_ticker tool to find Nvidia’s stock symbol.
It calls get_stock_price to check Nvidia’s current price.
It retrieves the user’s stock holdings via get_my_stock_holdings.
Since Nvidia's stock price is above $150, it sells 80% of the holdings.
Finally, it confirms the stock sale to the user.

A few things to note in the above executions:

All tool calls are expressed as Python code, which lets the agent combine several calls and intermediate computations in one step.
Because the arithmetic is carried out by the Python interpreter rather than by the LLM, the 80% calculation is deterministic. LLMs can sometimes make errors in mathematical calculations, whereas Python evaluates the same expression the same way every time.
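
For illustration, the code a CodeAgent writes for the conditional sale might resemble the following. This is a hand-written approximation, not captured agent output, and the stub tools repeat the hardcoded values from this post so the snippet runs on its own.

```python
def get_my_stock_holdings() -> dict:
    """Stub mirroring the hardcoded holdings used in this post."""
    return {"NVDA": 100, "MSFT": 50, "TSLA": 20}

def get_stock_price(ticker: str) -> float:
    """Stub mirroring the hardcoded NVDA price used in this post."""
    return {"NVDA": 293.46}[ticker]

sold = []  # records sales so the effect of the generated code is visible

def sell_stock(ticker: str, quantity: int) -> str:
    """Stub that records the sale instead of calling a broker."""
    sold.append((ticker, quantity))
    return "Success"

# The kind of code the LLM might generate for the user's instruction.
status = "Not sold"
price = get_stock_price("NVDA")
if price > 150:
    shares_held = get_my_stock_holdings()["NVDA"]
    shares_to_sell = shares_held * 80 // 100  # 80% using integer arithmetic
    status = sell_stock("NVDA", shares_to_sell)

print(status, sold)
```

The comparison and the percentage are both computed by the interpreter, so the LLM only has to write the expression, not evaluate it.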

Conclusion

In this second part of the Demystifying AI Agents series, we demonstrated how to implement a simple stock trading agent using the SmolAgents framework. This hands-on example showcases how lightweight frameworks like SmolAgents can help developers quickly build and experiment with autonomous agents using LLMs. In the next part, we’ll explore how CodeGate can be used in conjunction with SmolAgents for security and privacy.

References:

Artificial Intelligence: A Modern Approach, Russell & Norvig, Chapter 2 https://people.eecs.berkeley.edu/~russell/aima1e/chapter02.pdf
Smolagents: https://huggingface.co/blog/smolagents
Codegate: https://codegate.ai/

Understanding AI Agents

Pankaj Telang — Tue, 04 Mar 2025 19:15:57 +0000

Part 1 of Demystifying AI Agents

If you're a developer or engineer navigating the rapidly evolving landscape of AI, you're in the right place. I’ve spent a while exploring how AI can enhance software development, security, and automation. My goal with this blog series is to demystify AI agents—what they are, how they work, and how you can build secure and reliable ones.

This series will break down key concepts around AI agents, walk through real-world use cases, and provide practical guidance on implementing them effectively. Whether you're experimenting with autonomous systems or looking to integrate agents into your workflow, I hope this blog will help you understand not just how agents function but also why they behave the way they do.

What You’ll Learn in This Series

What is an AI agent? A foundational look at agents, the think-act-observe loop, and how they interact with tools.
Exploring SmolAgents – a lightweight framework for building AI agents.
Securing AI agents – how CodeGate can protect AI-driven workflows.

What is an AI agent?
An agent is an entity that can observe and act on the environment in which it is situated. Given a user-provided goal (i.e., instructions), the agent first constructs a plan (i.e., a sequence of steps) to achieve the goal. Then the agent executes the plan, optionally calling tools from a predefined set.

A tool is an arbitrary function in a programming language with a well-defined interface. For example, a tool might be a function that performs a calculation, queries a database, or calls an API.

Conceptually, the agent executes several iterations of a loop consisting of: thinking, acting and observing. This is also referred to as the think-act-observe loop.

Think: Given a user instruction, the agent thinks and constructs a plan to satisfy the user’s request. As the agent executes the plan, the agent reasons with the current state to decide which action to perform next.
Act: The agent executes the action it decided on while thinking by calling the available tools. A tool might read or update some information, or it might carry out a physical action in the real world.
Observe: The agent observes the results from its action. These results are the output generated by the tool used for the action. The agent uses these results to further think about its next action.
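
The loop can be sketched in a few lines of Python. Here a hand-written think function stands in for the LLM, and the single tool and its return value are invented for illustration.

```python
from typing import Optional

def lookup_ticker(name: str) -> str:
    """Hypothetical tool: map a company name to a ticker symbol."""
    return {"nvidia": "NVDA"}.get(name.lower(), "Unknown")

TOOLS = {"lookup_ticker": lookup_ticker}

def think(goal: str, observations: list) -> Optional[tuple]:
    """Stand-in for the LLM: choose the next (tool, args), or None when done."""
    if not observations:
        return ("lookup_ticker", {"name": goal})
    return None  # one observation is enough to satisfy this toy goal

def run_agent(goal: str, max_steps: int = 5) -> list:
    """Repeat think -> act -> observe until think() decides to stop."""
    observations = []
    for _ in range(max_steps):
        action = think(goal, observations)        # Think
        if action is None:
            break
        tool_name, args = action
        result = TOOLS[tool_name](**args)         # Act
        observations.append((tool_name, result))  # Observe
    return observations

print(run_agent("Nvidia"))
```

The max_steps bound is a design choice worth keeping in any real agent: it guarantees the loop ends even if the reasoning engine never decides it is done.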

Fig 1 illustrates a general schema of an AI agent.

As a naive example, a stock trading agent can observe the price of a specific stock in a stock market and perform the action of either purchasing or selling that stock based on the user instructions.

In the context of LLMs, an agent is a program that employs an LLM as a planning and reasoning engine to achieve a user-specified goal. For observing and acting, the agent employs the available tools. However, the decision to use a tool is made by the LLM. I will clarify this notion of an agent using the example of stock trading.

Suppose the stock trading agent from the above example has access to these tools:

# Lookup the stock ticker of a company
lookup_ticker(company_name: str)

# Lookup my stock holdings
get_my_stock_holdings()

# Get the price of a given stock ticker
get_stock_price(stock_ticker: str)

# Sell the stock of given ticker and given quantity
sell_stock(stock_ticker: str, quantity: int)

Now, assume a user provides these instructions to the agent:

Check the stock price of Nvidia. If it is above $150, sell 80% of the stock that I hold.

When the agent starts execution, it employs the LLM to construct a plan, that is, a sequence of steps. The LLM's reasoning process might look like the following.

I need to create a plan for this problem. First, I need to check the price of Nvidia stock. I have access to the get_stock_price tool. But the tool needs a stock ticker. I need to first look up the ticker symbol of Nvidia. I will use the lookup_ticker tool. 
Step 1: Call lookup_ticker to get ticker for Nvidia.

Now I can look up Nvidia’s stock price using the get_stock_price tool. 
Step 2: Call get_stock_price to get the price using ticker. 

If the price is above $150, I need to sell 80% of the stock. I need to first get the total number of shares. I will use the get_my_stock_holdings tool. 
Step 3: Call get_my_stock_holdings to get the number of Nvidia shares.

I need to calculate 80% of the Nvidia shares. I will use the calculator tool. 
Step 4: Call calculate to get 80% of the shares.

I need to sell 80% of the shares. I can use the sell_stock tool.
Step 5: Call sell_stock to sell 80% of the shares.

At the end of the reasoning process, the LLM has generated a plan consisting of Steps 1–5.

Now, the agent can execute this plan by repeatedly running the think-act-observe loop. Here is what the agent execution might look like.

Step 1

Think: I now have a plan, and I need to execute Step 1.
Act: Call lookup_ticker to get ticker for Nvidia.
Observe: The tool returns the stock ticker as: NVDA.

Step 2

Think: Now I have the ticker symbol NVDA of Nvidia. I can execute Step 2.
Act: Call get_stock_price to get the price using ticker. 
Observe: The tool returns the stock price as: $160.

Step 3

Think: Since $160 is above $150, I need to sell 80% of the shares. I need to execute Step 3. 
Act: Call get_my_stock_holdings.
Observe: The tool returns the stock holding: {“nvidia”: 100, “msft”: 50, “tsla”: 10}.

Step 4

Think: I know that the user has 100 units of Nvidia. I need to execute Step 4.
Act: Call calculate to get 80% of the shares.
Observe: The tool returns: 80.

Step 5

Think: I know I need to sell 80 shares. I need to execute Step 5.
Act: Call sell_stock to sell 80% of the shares.
Observe: The tool returns success.

Done

Think: User’s request is successfully completed. I am done.
Act: Indicate completion of user’s request.

Conclusion: This blog introduces the notion of an AI agent. It presents the think-act-observe loop, which is central to how an agent executes. Through a simple stock trading scenario, I outline the planning and execution steps of an agent. I hope you found this blog useful. In the next blog, I will review smolagents, a recently introduced framework from Hugging Face for developing such agents. I will also describe how smolagents can be secured using CodeGate, a recently introduced framework for securing AI agents and LLMs.

References:

Artificial Intelligence: A Modern Approach, Russell & Norvig, Chapter 2 https://people.eecs.berkeley.edu/~russell/aima1e/chapter02.pdf
Smolagents: https://huggingface.co/blog/smolagents
Codegate: https://codegate.ai/