<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Stacklok</title>
    <description>The latest articles on DEV Community by Stacklok (@stacklok).</description>
    <link>https://dev.to/stacklok</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F9162%2Fff58e982-dcd1-478b-93be-574a981873f6.png</url>
      <title>DEV Community: Stacklok</title>
      <link>https://dev.to/stacklok</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stacklok"/>
    <language>en</language>
    <item>
      <title>Cut token waste across your entire team with the MCP Optimizer</title>
      <dc:creator>Alejandro Ponce de León</dc:creator>
      <pubDate>Wed, 11 Mar 2026 17:14:51 +0000</pubDate>
      <link>https://dev.to/stacklok/cut-token-waste-across-your-entire-team-with-the-mcp-optimizer-7e</link>
      <guid>https://dev.to/stacklok/cut-token-waste-across-your-entire-team-with-the-mcp-optimizer-7e</guid>
      <description>&lt;p&gt;You already cut your own token bill. Now imagine doing that for every member on your team, without them lifting a finger.&lt;/p&gt;

&lt;p&gt;Here's what you'll learn in this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why per-person Optimizer setups don't scale, and what to do instead
&lt;/li&gt;
&lt;li&gt;How Stacklok's &lt;a href="https://stacklok.com/blog/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows/" rel="noopener noreferrer"&gt;Virtual MCP Server (vMCP)&lt;/a&gt; delivers team-wide token savings from a single deployment
&lt;/li&gt;
&lt;li&gt;How AI agents benefit automatically, with no per-agent configuration required
&lt;/li&gt;
&lt;li&gt;How to deploy the Optimizer in Kubernetes in two steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8v141h8mzwkhsxd2b5x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8v141h8mzwkhsxd2b5x.png" alt=" " width="800" height="505"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The MCP Optimizer dynamically finds and exposes the right tools to clients only when needed, via a unified vMCP Gateway endpoint.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem at scale
&lt;/h2&gt;

&lt;p&gt;If you read &lt;a href="https://stacklok.com/blog/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer/" rel="noopener noreferrer"&gt;Cut Token Waste from Your AI Workflow with the ToolHive MCP Optimizer&lt;/a&gt;, you know the local Optimizer works great — download it, run it, and watch your token bill drop by 60-85% per request in our benchmarks. But individual setups aren't enterprise setups. You can't ask every team member to install an embedding model, tune search parameters, and keep the whole thing running alongside their other tools. And you can't ask your platform team to verify that each of those setups is configured correctly and stays that way. You need a solution that everyone benefits from the moment they connect.&lt;/p&gt;

&lt;p&gt;Configuration drift is the first headache. One person runs a different embedding model than another. Someone tweaked the hybrid search ratio three weeks ago and forgot to tell anyone. Someone else doesn't even know the Optimizer needs configuring and wonders why their token bill is 3x everyone else's. Meanwhile, each machine burns CPU and memory running its own embedding inference — resources that could be doing literally anything else.&lt;/p&gt;

&lt;p&gt;AI agents amplify both the problem and the payoff. Agents that fan out across multiple MCP servers stuff the full tool catalog into the context window on every invocation. When an agent connects to five or six MCP servers, that catalog grows quickly. The token bill climbs, inference slows, and the LLM starts picking the wrong tools because it's drowning in descriptions.&lt;/p&gt;

&lt;p&gt;Multiply that by hundreds of agent runs a day. Without a centralized Optimizer, you'd have to manually wire it up for each agent and each server combination.&lt;/p&gt;
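&lt;p&gt;A quick back-of-envelope calculation makes the scale concrete. The numbers below are illustrative assumptions (the 70% reduction is a midpoint of the benchmarked 60-85% range), not measurements:&lt;/p&gt;

```python
# Back-of-envelope sketch with made-up, team-scale assumptions.
catalog_tokens = 20_000      # full tool catalog stuffed into each request
reduction = 0.70             # assumed midpoint of the 60-85% benchmark range
requests_per_day = 500       # agent runs across the whole team, per day

tokens_saved_per_day = int(catalog_tokens * reduction * requests_per_day)
print(tokens_saved_per_day)  # 7,000,000 tokens per day at these assumptions
```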

&lt;p&gt;What you actually want — for users and AI agents alike — is to configure it once, in one place, and have everyone benefit automatically. That's exactly what Stacklok now delivers through the vMCP and Operator.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Optimizer works
&lt;/h2&gt;

&lt;p&gt;The core idea is simple. Instead of sending your AI agent the full list of every tool from every MCP server (which can easily run to hundreds of descriptions), the Optimizer collapses them into two meta-tools: &lt;code&gt;find_tool&lt;/code&gt; and &lt;code&gt;call_tool&lt;/code&gt;. Here's how a request flows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Your agent receives a prompt&lt;/strong&gt; that requires tool use.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It calls &lt;code&gt;find_tool&lt;/code&gt;&lt;/strong&gt; with a natural language description of what it needs.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Optimizer runs hybrid search&lt;/strong&gt; (semantic and keyword) against all registered tools.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Only the relevant tools come back&lt;/strong&gt; — typically 8 instead of 200+.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The agent calls &lt;code&gt;call_tool&lt;/code&gt;&lt;/strong&gt; to invoke the one it needs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your agent never sees the full tool catalog. It discovers tools on demand, pays only for the descriptions it actually needs, and the LLM stays focused on fewer, more relevant options.&lt;/p&gt;
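&lt;p&gt;The flow above can be sketched in a few lines of Python. This is a self-contained stand-in, not the Optimizer's implementation: real deployments use hybrid semantic-plus-keyword search, while this toy version ranks tools by keyword overlap:&lt;/p&gt;

```python
# Toy stand-in for the find_tool / call_tool meta-tool flow.
# Keyword overlap replaces the Optimizer's hybrid search.
TOOL_INDEX = {
    "github_create_issue": "create a new issue in a github repository",
    "github_merge_pr": "merge an open pull request in a github repository",
    "slack_post_message": "post a message to a slack channel",
}

def find_tool(query: str, limit: int = 2) -> list[str]:
    """Return tool names ranked by keyword overlap with the query."""
    words = set(query.lower().split())
    scored = [
        (len(words.intersection(desc.split())), name)
        for name, desc in TOOL_INDEX.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:limit] if score > 0]

def call_tool(name: str, arguments: dict) -> str:
    """Dispatch to the selected backend tool (stubbed here)."""
    return f"called {name} with {arguments}"

# The agent only ever sees these two meta-tools:
matches = find_tool("create a new issue in a github repository")
result = call_tool(matches[0], {"title": "Fix login bug"})
```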

&lt;p&gt;For a deeper dive into the mechanics and benchmarks, see &lt;a href="https://stacklok.com/blog/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer/" rel="noopener noreferrer"&gt;the original Optimizer blog post&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  All the power of vMCP, now with cost savings
&lt;/h2&gt;

&lt;p&gt;If you're already running Stacklok in Kubernetes, you're likely using vMCP, a unified gateway that aggregates multiple MCP servers behind a single endpoint. vMCP gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified gateway&lt;/strong&gt;. One endpoint for all your MCP servers. Onboarding a new team member means sharing one URL, not configuring five connections.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication and authorization&lt;/strong&gt;. Centralized auth for incoming clients (OIDC, anonymous, etc.) and outgoing connections, so you can enforce access policies without modifying each MCP server.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aggregation and conflict resolution&lt;/strong&gt;. Automatic prefixing, priority ordering, or manual overrides when tool names collide across MCP servers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Optimizer adds one more layer on top:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Token optimization&lt;/strong&gt;. Every tool behind the gateway gets indexed. Clients see only &lt;code&gt;find_tool&lt;/code&gt; and &lt;code&gt;call_tool&lt;/code&gt; instead of the full catalog.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The savings are real. The &lt;a href="https://stacklok.com/blog/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer/" rel="noopener noreferrer"&gt;original Optimizer blog post&lt;/a&gt; walks through the benchmarks in detail, showing 60-85% token reductions per request. In a &lt;a href="https://stacklok.com/blog/stackloks-mcp-optimizer-vs-anthropics-tool-search-tool-a-head-to-head-comparison/" rel="noopener noreferrer"&gt;head-to-head comparison with Anthropic's tool search tool&lt;/a&gt;, the Optimizer matched or exceeded a first-party solution.&lt;/p&gt;

&lt;p&gt;Token savings aren't the only benefit. Fewer tool descriptions mean less noise for the LLM to wade through, which means better tool selection and fewer hallucinated tool calls. You're saving tokens and getting better results.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to deploy the Optimizer in Kubernetes
&lt;/h2&gt;

&lt;p&gt;The Kubernetes setup is deliberately minimal. You need two things: an &lt;code&gt;EmbeddingServer&lt;/code&gt; and a reference to it from your &lt;code&gt;VirtualMCPServer&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Deploy an EmbeddingServer
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;EmbeddingServer&lt;/code&gt; Custom Resource Definition (CRD) manages a shared embedding model for the whole team. With sensible defaults baked in, the minimal configuration is just this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;toolhive.stacklok.dev/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;EmbeddingServer&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;optimizer-embedding&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The operator defaults to &lt;code&gt;BAAI/bge-small-en-v1.5&lt;/code&gt; as the model and runs the &lt;a href="https://github.com/huggingface/text-embeddings-inference" rel="noopener noreferrer"&gt;HuggingFace Text Embeddings Inference&lt;/a&gt; server. You can increase the replica count via &lt;code&gt;spec.replicas&lt;/code&gt; to match your team's throughput needs. One shared instance serves every vMCP in the namespace. For all available configuration options, see the &lt;a href="https://docs.stacklok.com/toolhive/guides-vmcp/optimizer" rel="noopener noreferrer"&gt;Optimizer docs&lt;/a&gt;.&lt;/p&gt;
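&lt;p&gt;For example, a scaled-out variant of the manifest above might look like this (a sketch using the &lt;code&gt;spec.replicas&lt;/code&gt; field just mentioned; check the Optimizer docs for the full schema):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: toolhive.stacklok.dev/v1alpha1
kind: EmbeddingServer
metadata:
  name: optimizer-embedding
spec:
  replicas: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;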

&lt;h3&gt;
  
  
  Step 2: Reference it from your VirtualMCPServer
&lt;/h3&gt;

&lt;p&gt;Add a single field to your existing &lt;code&gt;VirtualMCPServer&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;embeddingServerRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;optimizer-embedding&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the only change. When the operator sees &lt;code&gt;embeddingServerRef&lt;/code&gt; without an explicit &lt;code&gt;optimizer&lt;/code&gt; config block, it auto-populates the optimizer with sensible defaults and resolves the embedding server URL automatically. You don't need any manual wiring.&lt;/p&gt;
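&lt;p&gt;Put together, the reference sits directly in the &lt;code&gt;VirtualMCPServer&lt;/code&gt; spec. The sketch below assumes the &lt;code&gt;apiVersion&lt;/code&gt; and &lt;code&gt;kind&lt;/code&gt; mirror the &lt;code&gt;EmbeddingServer&lt;/code&gt; example; see the quickstart example in the resources section for a complete manifest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: team-gateway
spec:
  # ...your existing backend and aggregation configuration...
  embeddingServerRef:
    name: optimizer-embedding
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;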

&lt;p&gt;For finer control — tuning search parameters, timeouts, and more — see the &lt;a href="https://docs.stacklok.com/toolhive/guides-vmcp/optimizer" rel="noopener noreferrer"&gt;Optimizer docs&lt;/a&gt; for the full reference.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cost savings add up
&lt;/h2&gt;

&lt;p&gt;The per-request savings are compelling on their own, but they compound quickly across a team: every team member, every request, every day. At typical API pricing, those savings add up fast. Fewer tokens also mean faster responses for everyone in your organization.&lt;/p&gt;

&lt;p&gt;Beyond the raw savings, the Kubernetes approach gives you operational advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitOps-friendly&lt;/strong&gt;. &lt;code&gt;EmbeddingServer&lt;/code&gt; and &lt;code&gt;VirtualMCPServer&lt;/code&gt; configurations live in Git, get reviewed in PRs, and deploy through your existing CI/CD pipeline. That gives you full change history and rollback for compliance requirements.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One shared embedding server&lt;/strong&gt;. Instead of every machine running a local embedding model, one instance serves the whole team. Less resource waste, consistent behavior.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero end-user setup&lt;/strong&gt;. Users point their MCP client at the vMCP endpoint. The Optimizer is transparent; they don't need to know it's there.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Centralized security boundary&lt;/strong&gt;. All tool discovery flows through one place, giving you a single point to audit and control which tools your team can access.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;Here's everything referenced above and some extra resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Optimizer docs:&lt;/strong&gt; &lt;a href="https://docs.stacklok.com/toolhive/guides-vmcp/optimizer" rel="noopener noreferrer"&gt;Configuration guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vMCP blog post:&lt;/strong&gt; &lt;a href="https://stacklok.com/blog/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows/" rel="noopener noreferrer"&gt;Introducing Virtual MCP Server: a unified gateway for multi-MCP workflows&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vMCP docs:&lt;/strong&gt; &lt;a href="https://docs.stacklok.com/toolhive/guides-vmcp" rel="noopener noreferrer"&gt;Virtual MCP Server configuration guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quickstart example:&lt;/strong&gt; &lt;a href="https://github.com/stacklok/toolhive/blob/main/examples/operator/virtual-mcps/vmcp_optimizer_quickstart.yaml" rel="noopener noreferrer"&gt;vmcp_optimizer_quickstart.yaml&lt;/a&gt;: deploys several MCP backends with a fully auto-configured optimizer
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All options example:&lt;/strong&gt; &lt;a href="https://github.com/stacklok/toolhive/blob/main/examples/operator/virtual-mcps/vmcp_optimizer_all_options.yaml" rel="noopener noreferrer"&gt;vmcp_optimizer_all_options.yaml&lt;/a&gt;: every tuning knob exposed
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Original Optimizer blog:&lt;/strong&gt; &lt;a href="https://stacklok.com/blog/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer/" rel="noopener noreferrer"&gt;Cut Token Waste from Your AI Workflow&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ToolHive GitHub:&lt;/strong&gt; &lt;a href="https://github.com/stacklok/toolhive" rel="noopener noreferrer"&gt;github.com/stacklok/toolhive&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Want to see what Stacklok can do for your organization? &lt;a href="https://stacklok.com/contact" rel="noopener noreferrer"&gt;Book a demo&lt;/a&gt; or get started right away with &lt;a href="https://github.com/stacklok/toolhive" rel="noopener noreferrer"&gt;ToolHive&lt;/a&gt;, our open source project. Join the conversation and engage directly with our team on &lt;a href="https://discord.gg/stacklok" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>stacklok</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Build your first enterprise MCP server with GitHub Copilot</title>
      <dc:creator>Alejandro Ponce de León</dc:creator>
      <pubDate>Mon, 02 Feb 2026 08:53:54 +0000</pubDate>
      <link>https://dev.to/stacklok/build-your-first-enterprise-mcp-server-with-github-copilot-4gll</link>
      <guid>https://dev.to/stacklok/build-your-first-enterprise-mcp-server-with-github-copilot-4gll</guid>
      <description>&lt;p&gt;&lt;em&gt;Ever wondered how to bridge the gap between your company's private knowledge and AI assistants? You're about to vibecode your way there.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What all the fuss with MCP is about
&lt;/h2&gt;

&lt;p&gt;Back in November 2022, the world changed when OpenAI launched ChatGPT. It wasn't the first Large Language Model (LLM), but it was the most capable at the time, and most importantly, it was available for everyone to explore. To make a small analogy: it got to the moon first. LLMs sparked everyone's imagination and forever changed the way we work. Maybe that's a little far-fetched, but they definitely boosted productivity across many areas.&lt;/p&gt;

&lt;p&gt;Yet LLMs weren't (and still aren't) all-mighty. They've been trained on vast amounts of internet content, but they have two critical limitations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;They're not trained on private content.&lt;/strong&gt; No company wikis, internal docs, or how-tos.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;They have a knowledge cutoff.&lt;/strong&gt; Their training stops at a fixed date, usually months in the past.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So if you ask ChatGPT something like "How was feature X designed in product Y, and how can I integrate it with my new feature Z?", it will have no idea what you're talking about. First, it almost certainly never saw the implementation details, since they fall under an organization's private content. Even if it had, the model is frozen in time; it doesn’t know what’s changed in the world since its training cutoff.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP to the rescue
&lt;/h2&gt;

&lt;p&gt;Fortunately, both problems can be solved with &lt;a href="https://huggingface.co/docs/smolagents/en/tutorials/tools" rel="noopener noreferrer"&gt;&lt;strong&gt;tools&lt;/strong&gt;&lt;/a&gt;. Tools empower LLMs with capabilities beyond their training. To solve the two issues above, we can create tools that tell the LLM: "When you're asked about product X at company Y, use tool Z to get the most up-to-date information." That tool might, for example, search an internal knowledge base.&lt;/p&gt;
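&lt;p&gt;Conceptually, a tool is just a well-described function that the client executes on the model's behalf. Here's a toy sketch of that idea; the names, data, and dispatch mechanism are illustrative, not the MCP wire format:&lt;/p&gt;

```python
# Toy illustration of tool calling: the model sees only the
# description; the client executes the matching function.
# Names and schema here are illustrative, not the MCP wire format.

def search_wiki(query: str) -> str:
    """Stand-in for a lookup against a private knowledge base."""
    docs = {"feature X design": "Feature X uses an event-driven pipeline."}
    return docs.get(query, "No results found.")

TOOLS = {
    "search_wiki": {
        "description": (
            "When asked about product internals at company Y, "
            "use this tool to fetch up-to-date private documentation."
        ),
        "function": search_wiki,
    }
}

# The client dispatches the model's tool call:
tool_call = {"name": "search_wiki", "arguments": {"query": "feature X design"}}
answer = TOOLS[tool_call["name"]]["function"](**tool_call["arguments"])
```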

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener noreferrer"&gt;&lt;strong&gt;MCP (Model Context Protocol)&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;has quickly become the standard for tool calling.&lt;/strong&gt; Modern AI systems have two essential parts: the client (VS Code, Cursor, ChatGPT, Claude Code, etc.) and the model itself. Tools live on the client side. When the model doesn't know something, it calls a tool that the client executes. Originally introduced by Anthropic, MCP’s open design and community adoption have made it the clear industry standard, now supported by &lt;a href="https://techcrunch.com/2025/03/26/openai-adopts-rival-anthropics-standard-for-connecting-ai-models-to-data/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://techcrunch.com/2025/04/09/google-says-itll-embrace-anthropics-standard-for-connecting-ai-models-to-data/" rel="noopener noreferrer"&gt;Google&lt;/a&gt;, Microsoft, and others. That means you can write an MCP server once and use it with your favorite clients.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building your first MCP, the AI scrappy way
&lt;/h2&gt;

&lt;p&gt;Let's say your boss just tasked you with connecting your AI assistants to the corporate Confluence wiki. This is a perfect use case for MCP; you need to expose enterprise knowledge to AI tools in a standardized way.&lt;/p&gt;

&lt;p&gt;For this tutorial, we'll assume you already have a querying system in place, whether that's a Retrieval Augmented Generation (RAG) pipeline, a search API, or another knowledge retrieval mechanism. Our job is to wrap that existing system with an MCP server so AI assistants can access it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our approach: vibecoding
&lt;/h3&gt;

&lt;p&gt;We're going to build this MCP server using what &lt;a href="https://x.com/karpathy/status/1886192184808149383?lang=en" rel="noopener noreferrer"&gt;Andrej Karpathy&lt;/a&gt; half-jokingly dubbed "&lt;strong&gt;vibecoding&lt;/strong&gt;": letting LLMs write most, if not all, of the code. The term spread like wildfire because, well, it works surprisingly well for certain tasks. It's not a silver bullet, but it's perfect for handling boilerplate code and getting something functional quickly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ingredients
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.13+
&lt;/li&gt;
&lt;li&gt;VS Code with Copilot
&lt;/li&gt;
&lt;li&gt;uv for package management&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Copilot?
&lt;/h3&gt;

&lt;p&gt;While tools like Cursor, Windsurf, Codex, and Claude Code have gained wide popularity for their deep AI integration, GitHub Copilot remains the most widely available option for enterprise developers. It’s often already included in Microsoft or GitHub contracts, so it can be deployed without extra approvals. We’ll use Copilot here because it’s what most teams already have, and it gets the job done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Implementation
&lt;/h3&gt;

&lt;h4&gt;
  
  
  The initial prompt
&lt;/h4&gt;

&lt;p&gt;Getting started with AI-assisted development is all about setting clear expectations. Here's the first prompt I used to kick off the project. Being specific about tooling and goals helps guide the AI toward the implementation you actually want. After this initial prompt, we should have the scaffolding of the project and most of the implementation ready.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;This is a new project called enterprise-mcp. It is a Python project using 3.13 or greater. The project is meant to be an MCP server that will access enterprise knowledge and make it available to LLMs. The project should:
- Use uv as package manager
- For adding packages use `uv add &amp;lt;package_name&amp;gt;`
- All configuration should be centralized in pyproject.toml file
- Use uv dependency groups when adding development dependencies like pytest, e.g. `uv add pytest --dev` or `uv add --group dev pytest`
- I would also like a Taskfile to centralize running commands, like `task format`, `task test`, or `task typecheck`
- Use `ruff` for linting and formatting
- Use `ty` for typechecking https://docs.astral.sh/ty/
- Use `async` functions wherever possible and `asyncio.gather` when parallelizing multiple tasks
- Use the official Python MCP SDK: https://github.com/modelcontextprotocol/python-sdk
- For now, make a single tool called search_enterprise_knowledge. Make sure the tool has appropriate descriptions that are descriptive enough for LLM usage
- Make the implementation with tests. I don't care so much about unit tests but about testing the overall functionality of the application
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Where the AI got confused
&lt;/h4&gt;

&lt;p&gt;Even with a detailed prompt, the first pass required some corrections. Still, we had a working implementation after the first prompt, which is impressive in itself. Two main issues emerged, both likely related to the AI's knowledge cutoff:&lt;/p&gt;

&lt;h5&gt;
  
  
  1. Misunderstanding the MCP SDK
&lt;/h5&gt;

&lt;p&gt;Instead of using the official Python SDK, Copilot attempted to semi-reimplement the MCP protocol from scratch, creating custom &lt;code&gt;list_tools&lt;/code&gt; and &lt;code&gt;call_tool&lt;/code&gt; endpoints. Since the MCP SDK is fairly recent, it wasn't in the training data, and crucially, the AI didn't check the documentation before implementing.&lt;/p&gt;

&lt;h5&gt;
  
  
  2. Using Mypy instead of Ty
&lt;/h5&gt;

&lt;p&gt;Similar story here. The AI defaulted to the more established Mypy rather than looking up the newer Ty package I'd specified.&lt;/p&gt;

&lt;h4&gt;
  
  
  Manual refinements
&lt;/h4&gt;

&lt;p&gt;Beyond fixing the AI's mistakes, I made some personal preference edits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structure of pyproject.toml.&lt;/strong&gt; To this day, no coding assistant nails my pyproject.toml preferences on the first try (it may well be a me problem rather than an AI problem). I referenced configurations from past projects I liked and adapted them here.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Taskfile.yml adjustments.&lt;/strong&gt; Same deal with the Taskfile.yml. That said, the AI got me 80-90% of the way there, which is pretty remarkable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Iterating with prompt #2
&lt;/h4&gt;

&lt;p&gt;After the initial implementation and manual edits, a few minor improvements remained. Rather than handle them manually, I asked Copilot to finish the job, since it would take far less time than doing it myself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I have made some changes in my server.py to correctly use the Python SDK. I want you to:
1. Transform my server to a streamable HTTP server.
2. Add a comprehensive docstring for my handle_search method so that it's usable by LLMs whenever enterprise knowledge is needed.
Check the documentation of the Python SDK to know how to correctly transform the server to streamable HTTP: https://github.com/modelcontextprotocol/python-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Closing the loop
&lt;/h4&gt;

&lt;p&gt;The final step is updating project memory: the context file that helps future AI sessions (and human developers) understand your project quickly. This typically lives in &lt;code&gt;AGENTS.md&lt;/code&gt; or &lt;code&gt;CLAUDE.md&lt;/code&gt; at the project root, and most coding assistants recognize either. It's a good place to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Document the project structure, so the agent knows where to implement a new feature or fix a bug
&lt;/li&gt;
&lt;li&gt;Outline the project's best practices
&lt;/li&gt;
&lt;li&gt;Give instructions that can be repeated across runs, e.g., always run unit tests along with code linting
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Perfect, 3 final tasks after some manual modifications:
1. Make sure my commands `task format`, `task typecheck` and `task test` work and return without errors
2. Update the file AGENTS.md with relevant context information for coding agents. Take into account the best practices signaled at the beginning, like centralizing everything in pyproject.toml and using `task ..` commands to run relevant project commands. The code formatting and tests commands should be used every time a coding task is finished. Read the repo again for any other relevant information
3. Finally, update a README.md with a summary of the project and the development process
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key lessons
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Tools are not API endpoints
&lt;/h4&gt;

&lt;p&gt;This is crucial to understand when building MCP servers: an MCP tool is fundamentally different from an API endpoint, even though it's tempting to map them one-to-one.&lt;/p&gt;

&lt;p&gt;API endpoints are designed as small, atomic, reusable operations. They're the building blocks you compose together: one endpoint to fetch user data, another to update preferences, another to send notifications. Each is focused and modular, meant to serve multiple use cases across your application.&lt;/p&gt;

&lt;p&gt;MCP tools, by contrast, are meant to accomplish complete deterministic workflows or actions. Think of an API as giving you a toolbox of small buttons, each doing one thing, that you wire together. An MCP tool is a single big button that says "do the thing." It handles an entire task from start to finish.&lt;/p&gt;

&lt;p&gt;For example, instead of separate tools for "search documents," "filter by date," and "format results," you'd create one &lt;code&gt;search_enterprise_knowledge&lt;/code&gt; tool that handles the full workflow of finding, filtering, and returning relevant information in one shot.&lt;/p&gt;
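&lt;p&gt;Here's what that composite workflow might look like as plain Python, with stand-in data and a hypothetical date filter; the MCP plumbing is omitted for clarity:&lt;/p&gt;

```python
from datetime import date

# Hypothetical sketch: one tool that finds, filters, and formats
# in a single call, instead of three separate fine-grained tools.
DOCS = [
    {"title": "API Auth Guide", "updated": date(2024, 10, 20), "body": "OAuth 2.0 flow"},
    {"title": "Legacy SSO Notes", "updated": date(2019, 3, 1), "body": "SAML setup"},
]

def search_enterprise_knowledge(query: str, newer_than: date) -> str:
    """Find, filter, and format relevant documents in one shot."""
    hits = [
        d for d in DOCS
        if query.lower() in d["body"].lower() and d["updated"] > newer_than
    ]
    return "\n".join(f"{d['title']}: {d['body']}" for d in hits)

result = search_enterprise_knowledge("oauth", date(2024, 1, 1))
```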

&lt;h4&gt;
  
  
  You're still accountable
&lt;/h4&gt;

&lt;p&gt;Whatever code the AI produces, you own it. If it breaks in production, you can't blame Copilot or Claude. Humans remain accountable for the code we ship.&lt;/p&gt;

&lt;p&gt;This means you should always review what gets generated. Not necessarily line-by-line, but at minimum: understand what it does, verify it follows your standards, and run it through your normal quality checks. A quick sanity check is never wasted time, especially when you're the one who'll be called at 2am to fix it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing the MCP server
&lt;/h2&gt;

&lt;p&gt;For this first iteration, it's best to take variables like coding assistants and client configuration files out of the equation. The easiest way to do that is with the &lt;a href="https://modelcontextprotocol.io/docs/tools/inspector" rel="noopener noreferrer"&gt;&lt;strong&gt;MCP Inspector&lt;/strong&gt;&lt;/a&gt;, a tool from the MCP project for inspecting an MCP server and querying it directly. To run the inspector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @modelcontextprotocol/inspector 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example response
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduk1l72xrtqu7m7bvy28.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduk1l72xrtqu7m7bvy28.png" alt="The MCP Inspector connected to the enterprise knowledge MCP server" width="800" height="346"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Result 1
**Title:** API Documentation - Authentication
**Content:**
# API Authentication Guide
## Overview
Our REST API uses OAuth 2.0 for authentication.
## Getting Started
1. Register your application
2. Obtain client credentials
3. Request access token
4. Include token in requests

## Example

curl -H "Authorization: Bearer YOUR_TOKEN" \
  https://api.company.com/v1/users
Access tokens expire after 1 hour.


**Metadata:**
- author: API Team
- created: 2024-02-01
- last_updated: 2024-10-20
- tags: ['api', 'authentication', 'oauth', 'documentation']
- source: confluence
**URL:** https://company.atlassian.net/wiki/spaces/API/pages/987654321
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;We've successfully built a working MCP server using Copilot and vibecoding, ready to access enterprise knowledge through a standardized protocol!&lt;/p&gt;

&lt;p&gt;By letting GitHub Copilot handle most of the boilerplate code, we created a functional Python MCP server with proper tooling, testing, and documentation, all while maintaining code quality and best practices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full code repository:&lt;/strong&gt; &lt;a href="https://github.com/aponcedeleonch/enterprise-mcp" rel="noopener noreferrer"&gt;https://github.com/aponcedeleonch/enterprise-mcp&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the next blog post, we're taking this further by introducing &lt;a href="https://docs.stacklok.com/toolhive" rel="noopener noreferrer"&gt;&lt;strong&gt;ToolHive&lt;/strong&gt;&lt;/a&gt;, a powerful platform that makes deploying and managing MCP servers effortless. ToolHive offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Instant deployment&lt;/strong&gt; using Docker containers or source packages (Python, TypeScript, or Go)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secure by default&lt;/strong&gt; with isolated containers, customizable permissions, and encrypted secrets management
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seamless integration&lt;/strong&gt; with GitHub Copilot, Cursor, and other popular AI clients
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise-ready features,&lt;/strong&gt; including OAuth-based authorization and Kubernetes deployment via the ToolHive Kubernetes Operator
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A curated registry&lt;/strong&gt; of verified MCP servers you can discover and run immediately, or create your own custom registry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stay tuned to learn how to evolve our enterprise MCP server from a prototype into a production-ready service!&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>githubcopilot</category>
      <category>ai</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>Introducing mcp-tef - Testing Your MCP Tool Descriptions Before They Cause Problems</title>
      <dc:creator>Nigel Brown</dc:creator>
      <pubDate>Tue, 16 Dec 2025 20:08:11 +0000</pubDate>
      <link>https://dev.to/stacklok/introducing-mcp-tef-testing-your-mcp-tool-descriptions-before-they-cause-problems-fan</link>
      <guid>https://dev.to/stacklok/introducing-mcp-tef-testing-your-mcp-tool-descriptions-before-they-cause-problems-fan</guid>
      <description>&lt;h2&gt;
  
  
  Introducing mcp-tef - Testing Your MCP Tool Descriptions Before They Cause Problems
&lt;/h2&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;When you build MCP tools, vague or overlapping descriptions cause LLMs to select the wrong tools—or no tools at all. Testing in production frustrates users and damages trust. &lt;strong&gt;mcp-tef&lt;/strong&gt; is an open-source tool evaluation system that lets you test tool descriptions systematically before deployment, catching problems early with real LLM testing, similarity detection, and quality analysis.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Tool Description Failures in Production
&lt;/h2&gt;

&lt;p&gt;When you write an MCP tool, you provide a name and description. The LLM reads this description and decides whether to use your tool based on user prompts. But here's what goes wrong:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vague descriptions confuse LLMs.&lt;/strong&gt; A tool called &lt;code&gt;search&lt;/code&gt; with description "Search for things" gives the LLM no information about what can be searched, how to search it, or what it returns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Overlapping descriptions cause conflicts.&lt;/strong&gt; You might have your own &lt;code&gt;create_issue&lt;/code&gt; tool, but then add a third-party GitHub MCP server that also has &lt;code&gt;create_issue&lt;/code&gt;. The LLM sees two tools with identical names doing similar things and can't determine which to select.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The result:&lt;/strong&gt; The LLM either picks the wrong tool entirely or becomes so confused that it picks no tool at all. Users get frustrated, trust erodes, and you're debugging in production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It gets worse with mixed environments.&lt;/strong&gt; The MCP ecosystem is growing fast. You're mixing custom tools with third-party MCP servers, and maybe multiple third-party servers together. Each has its own set of tools, and they all need to play nicely together. Without systematic testing, conflicts and confusion multiply.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters: The Cost of Getting It Wrong
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Testing in production is expensive.&lt;/strong&gt; By the time you realize your tool descriptions are broken, you've already frustrated users. You're fixing problems reactively instead of preventing them proactively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manual testing doesn't scale.&lt;/strong&gt; How do you know if your fix actually works? How do you know if two descriptions are too similar? How do you test that the LLM will actually pick the right tool when a user asks a real question? You can't manually test every possible prompt against every combination of tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; Test tool descriptions systematically before deployment, with real LLM testing and actionable feedback.&lt;/p&gt;




&lt;h2&gt;
  
  
  How mcp-tef Solves This
&lt;/h2&gt;

&lt;p&gt;mcp-tef is an open source (Apache 2.0 licensed) tool evaluation system that helps you create correct, non-clashing tool descriptions from the start. It provides three core capabilities:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Tool evaluation
&lt;/h3&gt;

&lt;p&gt;Create test cases with real user prompts (queries), and mcp-tef tests whether the LLM picks the right tool. It provides metrics (precision, recall, F1 scores), validates parameter extraction, and analyzes confidence. If the LLM is highly confident but wrong, that's a "misleading" description that needs immediate attention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a test case&lt;/span&gt;
mtef test-case create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://localhost:8000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"GitHub repository search"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Find repositories related to MCP tools"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--expected-server&lt;/span&gt; &lt;span class="s2"&gt;"http://localhost:8080/github/mcp"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--expected-tool&lt;/span&gt; &lt;span class="s2"&gt;"search_repositories"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--servers&lt;/span&gt; &lt;span class="s2"&gt;"http://localhost:8080/github/mcp:streamable-http"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--insecure&lt;/span&gt;

✓ Test &lt;span class="k"&gt;case&lt;/span&gt; created successfully
ID: d2fcb4bf-8334-4339-a0a8-c1ead2deeea6

&lt;span class="c"&gt;# Run the test&lt;/span&gt;
mtef test-run execute d2fcb4bf-8334-4339-a0a8-c1ead2deeea6 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://localhost:8000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-provider&lt;/span&gt; openrouter &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-name&lt;/span&gt; anthropic/claude-3.5-sonnet &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--api-key&lt;/span&gt; sk-or-v1-... &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--insecure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;✓ Test run completed successfully
Status: completed
Classification: TP &lt;span class="o"&gt;(&lt;/span&gt;True Positive&lt;span class="o"&gt;)&lt;/span&gt;
Tool Match: Correct
Confidence: high &lt;span class="o"&gt;(&lt;/span&gt;robust description&lt;span class="o"&gt;)&lt;/span&gt;
Param Score: 10.0/10
Execution: 9,295 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
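&lt;p&gt;The precision, recall, and F1 numbers mcp-tef reports follow the standard definitions over test-run classifications (TP, FP, FN). A minimal Python sketch of the arithmetic, not mcp-tef's actual implementation:&lt;/p&gt;

```python
from collections import Counter

def selection_metrics(classifications):
    """Compute precision, recall, and F1 from test-run classifications:
    "TP" = right tool picked, "FP" = wrong tool picked,
    "FN" = no tool picked when one was expected."""
    counts = Counter(classifications)
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Nine runs picked the right tool, one picked a wrong tool,
# and two picked no tool at all.
p, r, f1 = selection_metrics(["TP"] * 9 + ["FP"] + ["FN"] * 2)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```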



&lt;h3&gt;
  
  
  2. Similarity detection
&lt;/h3&gt;

&lt;p&gt;Uses embeddings to find tools with similar descriptions. Generates similarity matrices showing which tools overlap, and flags high-similarity pairs (e.g., 0.87 similarity) that might confuse the LLM. Provides specific recommendations for differentiation, including revised descriptions you can use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mtef similarity analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://localhost:8000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--server-urls&lt;/span&gt; &lt;span class="s2"&gt;"http://localhost:8080/fetch/mcp:streamable-http,http://localhost:8080/toolhive-doc-mcp/mcp:streamable-http,http://localhost:8080/mcp-optimizer/mcp:streamable-http,http://localhost:8080/github/mcp:streamable-http"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--threshold&lt;/span&gt; 0.85 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--insecure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;✓ Analysis &lt;span class="nb"&gt;complete&lt;/span&gt;: 18 pairs flagged above 0.85 threshold
Analyzed 55 tools across 4 servers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
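&lt;p&gt;Conceptually, similarity detection reduces to pairwise cosine similarity between description embeddings, flagging pairs above the threshold. A toy sketch with bag-of-words vectors standing in for real learned embeddings (which would score near-synonyms far more accurately):&lt;/p&gt;

```python
import math
from itertools import combinations

def embed(text):
    """Toy bag-of-words 'embedding'; real systems use learned models."""
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def flag_similar(tools, threshold=0.85):
    """Return tool-name pairs whose description similarity meets the threshold."""
    vecs = {name: embed(desc) for name, desc in tools.items()}
    return [(a, b) for a, b in combinations(vecs, 2)
            if cosine(vecs[a], vecs[b]) >= threshold]

tools = {
    "search": "search for documents by keywords",
    "find_files": "search for files by name patterns",
    "create_issue": "create a new issue in the tracker",
}
print(flag_similar(tools, threshold=0.5))
```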



&lt;h3&gt;
  
  
  3. Tool quality analysis
&lt;/h3&gt;

&lt;p&gt;Scores tool descriptions on clarity, completeness, and conciseness (1-10 scale). Tells you what's missing, what's vague, and what could be improved. Provides suggested improved descriptions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;mtef tool-quality &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://localhost:8000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--server-urls&lt;/span&gt; &lt;span class="s2"&gt;"http://localhost:8080/toolhive-doc-mcp/mcp"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-provider&lt;/span&gt; openrouter &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-name&lt;/span&gt; anthropic/claude-3.5-sonnet &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--insecure&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--timeout&lt;/span&gt; 120
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ℹ Using mcp-tef at https://localhost:8000

Tool Quality Evaluation Results
&lt;span class="o"&gt;============================================================&lt;/span&gt;

┏━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Tool Name  ┃ Clarity ┃ Completeness ┃ Conciseness ┃
┡━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ query_docs │  7/10   │     6/10     │    9/10     │
│ get_chunk  │  6/10   │     4/10     │    8/10     │
└────────────┴─────────┴──────────────┴─────────────┘

✓ Evaluated 2 tool&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note on transport support:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supported&lt;/strong&gt;: mcp-tef connects to MCP servers using the Streamable HTTP or SSE (deprecated) transports.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not supported&lt;/strong&gt;: mcp-tef does not support stdio servers directly, but you can use stdio-based MCP servers with &lt;a href="https://docs.stacklok.com/toolhive/guides-mcp/run-mcp-servers" rel="noopener noreferrer"&gt;ToolHive&lt;/a&gt;, which runs stdio servers and exposes them via a Streamable HTTP endpoint.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Using mcp-tef: CLI and HTTP API
&lt;/h2&gt;

&lt;p&gt;All the examples in this post use the &lt;code&gt;mtef&lt;/code&gt; CLI tool, but every operation can also be performed directly via HTTP API calls. The mcp-tef server exposes a REST API with OpenAPI documentation, so you can integrate it into your own workflows, CI/CD pipelines, or applications. The server provides interactive API documentation at &lt;code&gt;/docs&lt;/code&gt; and an OpenAPI specification at &lt;code&gt;/openapi.json&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Both approaches provide the same functionality—choose the one that fits your workflow.&lt;/p&gt;
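&lt;p&gt;Because the server publishes its spec at &lt;code&gt;/openapi.json&lt;/code&gt;, you can discover its operations programmatically. A sketch that walks an OpenAPI document; the fragment below is illustrative (the paths shown are hypothetical), and in practice you would fetch the real spec from your mcp-tef server:&lt;/p&gt;

```python
import json

# Illustrative OpenAPI fragment; fetch the real spec from your
# mcp-tef server's /openapi.json endpoint (these paths are made up).
spec_json = """
{
  "openapi": "3.1.0",
  "paths": {
    "/test-cases": {
      "post": {"summary": "Create a test case"},
      "get": {"summary": "List test cases"}
    },
    "/test-runs": {
      "post": {"summary": "Execute a test run"}
    }
  }
}
"""

def list_operations(spec):
    """Yield (METHOD, path, summary) for every operation in the spec."""
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            yield method.upper(), path, op.get("summary", "")

for method, path, summary in list_operations(json.loads(spec_json)):
    print(f"{method:4} {path:12} {summary}")
```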




&lt;h2&gt;
  
  
  Where You Use It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Your own MCP servers:&lt;/strong&gt; Test descriptions before deployment. Create test cases for common user prompts, run them through mcp-tef, iterate on descriptions until tests pass.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Third-party MCP servers:&lt;/strong&gt; Evaluate tools before integrating. Test server tools in isolation, see how well they perform, make informed decisions about which servers to use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mixed environments:&lt;/strong&gt; Before mixing multiple servers together, run similarity detection. See which tools conflict, use mcp-tef's recommendations to understand how to differentiate them—maybe you'll need &lt;a href="https://dev.to/stacklok/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows-17ee"&gt;vMCP's prefixing&lt;/a&gt;, or maybe you can improve descriptions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Continuous testing:&lt;/strong&gt; As you add new tools or update descriptions, keep testing. Make mcp-tef part of your CI/CD pipeline. Catch problems before they reach users.&lt;/p&gt;
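&lt;p&gt;In CI, the gate can be as simple as failing the build when any test run is not a true positive. A hedged sketch that assumes you have exported run results as JSON; the result shape here mirrors the CLI output above but is an assumption, so adapt it to however you collect results:&lt;/p&gt;

```python
import json

def gate(results):
    """Return runs that should fail the build (anything not classified TP)."""
    return [r for r in results if r.get("classification") != "TP"]

# Assumed export shape for collected mcp-tef results.
results = json.loads("""[
  {"name": "GitHub repository search", "classification": "TP"},
  {"name": "Document search", "classification": "FP"}
]""")

failing = gate(results)
for r in failing:
    print(f"FAIL: {r['name']} classified as {r['classification']}")
print("gate:", "FAIL" if failing else "PASS")
```

&lt;p&gt;In a pipeline step you would exit non-zero whenever &lt;code&gt;failing&lt;/code&gt; is non-empty, so regressions block the merge.&lt;/p&gt;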

&lt;p&gt;&lt;strong&gt;LLM comparison and migration:&lt;/strong&gt; Validate that different models (e.g., Anthropic Claude vs. Ollama Llama) correctly select tools using the same test cases. Compare performance across providers to ensure tool descriptions work consistently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Example
&lt;/h2&gt;

&lt;p&gt;You're building a document management MCP server with a tool called &lt;code&gt;search&lt;/code&gt; and description: "Search for documents."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;mcp-tef flags it:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clarity: 3/10
&lt;/li&gt;
&lt;li&gt;Missing: what can you search? Content? Filenames? Metadata? What does it return?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;You improve it to:&lt;/strong&gt; "Search document CONTENT using keywords and boolean operators. Supports PDF, TXT, DOCX, and MD files. Returns ranked results with highlighted excerpts and relevance scores."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You test it:&lt;/strong&gt; Create a test case, run it, LLM correctly selects your tool. Great!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But then you add a third-party file system MCP server&lt;/strong&gt; with &lt;code&gt;find_files&lt;/code&gt;: "Find files by searching with patterns."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Similarity detection catches it:&lt;/strong&gt; 0.87 similarity. Recommendation: "Emphasize that &lt;code&gt;search&lt;/code&gt; searches CONTENT, while &lt;code&gt;find_files&lt;/code&gt; searches FILENAMES."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You differentiate them clearly:&lt;/strong&gt; Now the LLM can distinguish between searching document content and finding files by name. If the third-party server doesn't update their description, you can still use &lt;a href="https://dev.to/stacklok/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows-17ee"&gt;vMCP&lt;/a&gt; to prefix them, but now the descriptions are also clear, so the LLM makes better choices.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;mcp-tef is open source and works with several providers: Anthropic, OpenAI, OpenRouter, and Ollama.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.13+
&lt;/li&gt;
&lt;li&gt;uv package manager (&lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;https://docs.astral.sh/uv/&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Optional&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ollama — for local LLM testing (no API keys needed)
&lt;/li&gt;
&lt;li&gt;Docker — if deploying via the CLI (mtef deploy)
&lt;/li&gt;
&lt;li&gt;API keys — for cloud LLM providers (e.g., OpenRouter) if not using Ollama&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Install:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv tool &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="s2"&gt;"mcp-tef-cli@git+https://github.com/StacklokLabs/mcp-tef.git#subdirectory=cli"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Deploy:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mtef deploy &lt;span class="nt"&gt;--health-check&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Test your tools:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using the examples above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check quality&lt;/span&gt;
mtef tool-quality ...

&lt;span class="c"&gt;# Create test case&lt;/span&gt;
mtef test-case create &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"My first test"&lt;/span&gt; &lt;span class="nt"&gt;--query&lt;/span&gt; ...

&lt;span class="c"&gt;# Run test&lt;/span&gt;
mtef test-run execute &amp;lt;test-case-id&amp;gt; ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The whole process takes just a few minutes. You'll immediately see if your descriptions work or if they need improvement.&lt;/p&gt;




&lt;h2&gt;
  
  
  How mcp-tef Works with vMCP and MCP Optimizer
&lt;/h2&gt;

&lt;p&gt;These tools are designed to work together, each solving different parts of the MCP ecosystem challenge:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;mcp-tef&lt;/strong&gt; helps you write better tool descriptions from the start. It tests whether descriptions are clear, complete, and differentiated. When descriptions are good, LLMs make better tool selection decisions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/stacklok/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows-17ee"&gt;&lt;strong&gt;vMCP (Virtual MCP Server)&lt;/strong&gt;&lt;/a&gt; provides a unified gateway for multiple MCP servers, handling tool name conflicts through intelligent prefixing and routing. When you've tested your descriptions with mcp-tef, vMCP's prefixing works even better—the LLM can distinguish tools not just by name, but by their clear, well-differentiated descriptions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/stacklok/stackloks-mcp-optimizer-vs-anthropics-tool-search-tool-a-head-to-head-comparison-2f32"&gt;&lt;strong&gt;MCP Optimizer&lt;/strong&gt;&lt;/a&gt; intelligently routes requests to the right tools across your MCP ecosystem. With well-tested descriptions from mcp-tef, Optimizer has better information to work with, requiring fewer manual overrides and making smarter routing decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The workflow:&lt;/strong&gt; Use mcp-tef to test and improve your tool descriptions. Deploy with vMCP to handle multi-server coordination. Let MCP Optimizer route requests intelligently. Good descriptions make all these solutions work better together, creating a more reliable and maintainable system.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Verdict
&lt;/h2&gt;

&lt;p&gt;mcp-tef helps you write better tool descriptions systematically, with real LLM testing and actionable feedback. But great descriptions work even better when combined with the right infrastructure tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key takeaway:&lt;/strong&gt; Test your tool descriptions before deploying. Good descriptions lead to better tool selection, which leads to happier users. And when you combine well-tested descriptions with tools like &lt;a href="https://dev.to/stacklok/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows-17ee"&gt;vMCP&lt;/a&gt; and &lt;a href="https://dev.to/stacklok/stackloks-mcp-optimizer-vs-anthropics-tool-search-tool-a-head-to-head-comparison-2f32"&gt;MCP Optimizer&lt;/a&gt;, you get a robust, maintainable MCP ecosystem that works reliably at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Points Summary
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The problem&lt;/strong&gt;: Vague or overlapping tool descriptions confuse LLMs, leading to incorrect tool selection.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters&lt;/strong&gt;: Testing in production frustrates users; prevention is better than reactive fixes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The solution&lt;/strong&gt;: mcp-tef provides systematic testing with tool evaluation, similarity detection, and quality analysis.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Where to use it&lt;/strong&gt;: Your own servers, third-party servers, mixed environments, continuous testing, LLM comparison.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The goal&lt;/strong&gt;: Create descriptions that are correct and don't clash, making your entire MCP ecosystem work better.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Working together&lt;/strong&gt;: mcp-tef, vMCP, and MCP Optimizer complement each other. Good descriptions make infrastructure tools work even better.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Want to join in the MCP fun? Visit &lt;a href="https://toolhive.dev/" rel="noopener noreferrer"&gt;toolhive.dev&lt;/a&gt; and join the &lt;a href="https://discord.gg/stacklok" rel="noopener noreferrer"&gt;ToolHive community on Discord&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>toolhive</category>
    </item>
    <item>
      <title>Introducing Virtual MCP Server: Unified Gateway for Multi-MCP Workflows</title>
      <dc:creator>Dan Barr</dc:creator>
      <pubDate>Thu, 11 Dec 2025 15:59:12 +0000</pubDate>
      <link>https://dev.to/stacklok/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows-17ee</link>
      <guid>https://dev.to/stacklok/introducing-virtual-mcp-server-unified-gateway-for-multi-mcp-workflows-17ee</guid>
      <description>&lt;p&gt;If you're working with AI coding assistants like GitHub Copilot or Claude, you've probably encountered MCP (Model Context Protocol) servers. They're powerful, connecting your AI to GitHub, Jira, Slack, cloud providers, and more. But here's the problem: each connection requires separate configuration, authentication, and maintenance.&lt;/p&gt;

&lt;p&gt;Managing MCP server connections gets messy fast. That’s why we built the &lt;strong&gt;Virtual MCP Server (vMCP)&lt;/strong&gt; in ToolHive: it aggregates multiple MCP servers into a single unified endpoint.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem: connection overload
&lt;/h2&gt;

&lt;p&gt;Picture this: you're an engineer on a platform team. Your AI assistant needs access to GitHub for code, Jira for tickets, Slack for notifications, PagerDuty for incidents, Datadog for metrics, AWS for infrastructure, Confluence for docs, and your internal knowledge base. That's 8 separate MCP server connections, each exposing 10-20+ tools. Now your AI's context window is filling up with 80+ tool descriptions, burning tokens and degrading performance as the LLM struggles to select the right tools from an overwhelming list.&lt;/p&gt;

&lt;p&gt;Each MCP server connection requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Individual configuration in your AI client
&lt;/li&gt;
&lt;li&gt;Separate authentication credentials
&lt;/li&gt;
&lt;li&gt;Manual coordination when tasks span multiple systems
&lt;/li&gt;
&lt;li&gt;Repeated parameter entry (same repo, same channel, same database)
&lt;/li&gt;
&lt;li&gt;Tool filtering to avoid context bloat and wasted tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Want to investigate a production incident? You're manually running commands across 4 different systems and piecing together the results yourself. Deploying an app? You're orchestrating a sequence of operations: merge PR, wait for CI, get approval, deploy, notify team. It's tedious, error-prone, and not reusable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The solution: aggregate everything
&lt;/h2&gt;

&lt;p&gt;vMCP transforms those 8 connections into one. You configure a single MCP endpoint that aggregates all your backend servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before vMCP:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"github"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"jira"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slack"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pagerduty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"datadog"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"aws"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confluence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"docs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;With vMCP:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"company-tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://vmcp.company.com/mcp"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One connection. One authentication flow. All your tools available.&lt;/p&gt;

&lt;p&gt;And here’s the key: &lt;strong&gt;you can run as many vMCP instances as you need&lt;/strong&gt;. Your frontend team connects to one vMCP with their specific tools. Your platform team connects to another with infrastructure access. Each vMCP aggregates exactly the backends that each team needs, with appropriate security policies and permissions.&lt;/p&gt;

&lt;p&gt;This matters for two reasons: security (no more giving everyone access to everything) and efficiency (fewer tools means smaller context windows, which means lower token costs and better AI performance).&lt;/p&gt;

&lt;h2&gt;
  
  
  What vMCP does
&lt;/h2&gt;

&lt;p&gt;vMCP is part of the ToolHive Kubernetes Operator. It acts as an intelligent aggregation layer that sits between your AI client and your backend MCP servers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k3iv7ipy29yk4cnywjp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k3iv7ipy29yk4cnywjp.png" alt="Diagram of the basic vMCP architecture" width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Multi-server aggregation with tool filtering
&lt;/h3&gt;

&lt;p&gt;All MCP tools appear through a single endpoint, &lt;strong&gt;but you cherry-pick exactly which tools to expose&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Example: An engineer on the ToolHive team gets a single vMCP connection with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub’s &lt;code&gt;search_code&lt;/code&gt; tool (scoped to the &lt;code&gt;stacklok/toolhive&lt;/code&gt; repo only)
&lt;/li&gt;
&lt;li&gt;The ToolHive docs MCP server
&lt;/li&gt;
&lt;li&gt;An internal docs server hooked up to Google Drive and filtered to ToolHive design docs
&lt;/li&gt;
&lt;li&gt;Slack (only the &lt;code&gt;#toolhive-team&lt;/code&gt; channel)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No irrelevant tools cluttering the LLM's context. No wasted tokens on unused tool descriptions. Just the tools needed for their work, making it easier for the AI to select the right tool every time.&lt;/p&gt;

&lt;p&gt;When multiple MCP servers have tools with the same name (both GitHub and Jira have &lt;code&gt;create_issue&lt;/code&gt;), vMCP automatically prefixes them: &lt;code&gt;github_create_issue&lt;/code&gt; and &lt;code&gt;jira_create_issue&lt;/code&gt;. You can customize these names however you want.&lt;/p&gt;
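As a rough illustration, per-tool exposure and renaming could be expressed declaratively on the VirtualMCPServer resource. The field names below (`toolOverrides`, `exposeAs`, `groupRef`) are illustrative assumptions, not the exact schema; check the ToolHive documentation for the real spec.

```yaml
# Hypothetical sketch -- field names are assumptions, not the exact schema.
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: frontend-vmcp
spec:
  groupRef: frontend-tools      # assumed: the MCPGroup to aggregate
  toolOverrides:                # assumed: per-tool exposure and renaming
    - server: github
      tool: create_issue
      exposeAs: gh_new_issue
    - server: jira
      tool: create_issue
      exposeAs: jira_new_issue
```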

&lt;h3&gt;
  
  
  2. Declarative multi-system workflows
&lt;/h3&gt;

&lt;p&gt;Real tasks often require coordinating across multiple systems. vMCP lets you define deterministic workflows that execute in parallel with conditionals, error handling, and approval gates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: Incident investigation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of manually jumping between 4 different systems, copy/pasting data, and aggregating the results, a single “composite tool” could:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;→ Query logs from logging system
→ Fetch metrics from monitoring platform  
→ Pull traces from tracing service
→ Check infrastructure status from cloud provider
→ Combine everything into a report
→ Create Jira ticket with findings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;vMCP executes all queries in parallel, automatically aggregates the data, and creates the ticket. Define the workflow once, use it for every incident.&lt;/p&gt;
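A declarative definition of that incident workflow might look something like the sketch below. The step and field names are illustrative assumptions rather than the actual vMCP workflow schema; the point is the shape: independent steps fan out in parallel, and the ticket-creation step waits on all of them.

```yaml
# Hypothetical sketch of a composite tool -- field names are assumptions.
name: investigate_incident
description: Gather incident context and file a ticket with the findings
steps:
  - id: logs                        # the first four steps have no
    tool: logging_query_logs        # dependencies, so they can run
  - id: metrics                     # in parallel
    tool: monitoring_fetch_metrics
  - id: traces
    tool: tracing_pull_traces
  - id: infra
    tool: cloud_check_status
  - id: report
    tool: jira_create_issue         # runs only after the queries complete
    dependsOn: [logs, metrics, traces, infra]
```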

&lt;p&gt;&lt;strong&gt;Example: App deployment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A typical deployment workflow handled end-to-end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;→ Merge pull request in GitHub
→ Wait for CI tests to pass
→ Request human approval (using MCP elicitation)
→ Deploy (only if approved)
→ Notify team in Slack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Pre-configured defaults and guardrails
&lt;/h3&gt;

&lt;p&gt;Stop typing the same parameters repeatedly. Configure defaults once in vMCP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt; Every GitHub query requires specifying &lt;code&gt;repo: stacklok/toolhive&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt; The repo is pre-configured. Engineers never specify it, and they can't accidentally query the wrong one.&lt;/p&gt;

&lt;p&gt;This isn’t just convenience, it’s about deterministic behavior and security. By pre-configuring parameters, you ensure tools behave consistently, and users can only access resources you’ve explicitly exposed. No more accidental queries to the wrong repo, Slack channels, databases, cloud regions, or anything else you reference repeatedly.&lt;/p&gt;
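Conceptually, a pinned parameter is just a default the caller can no longer override. A sketch of what that configuration could look like (the `toolDefaults` field name is an assumption for illustration):

```yaml
# Hypothetical sketch -- the field names are assumptions.
toolDefaults:
  - server: github
    tool: search_code
    parameters:
      repo: stacklok/toolhive   # pinned; callers never see or change it
```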

&lt;h3&gt;
  
  
  4. Tool customization and security policies
&lt;/h3&gt;

&lt;p&gt;Third-party MCP servers often expose generic, unrestricted tools. vMCP lets you wrap and restrict them without modifying upstream servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security policy enforcement:&lt;/strong&gt; Restrict a website fetch tool to internal domains only (&lt;code&gt;*.company.com&lt;/code&gt;), validate URLs before calling the backend, and provide clear error messages for violations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simplified interfaces:&lt;/strong&gt; That AWS EC2 tool with 20+ parameters? Create a wrapper that only exposes the 3 parameters your frontend team actually needs, with safe defaults for everything else.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Centralized authentication
&lt;/h3&gt;

&lt;p&gt;vMCP implements a two-boundary authentication model with a complete audit trail. Your AI client authenticates once to vMCP using the OAuth 2.1 methods defined in the official MCP spec. vMCP handles authorization to each backend independently based on its requirements.&lt;/p&gt;

&lt;p&gt;When it’s time to revoke access, disable the user in your identity provider, and all backend access is revoked instantly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-world benefits
&lt;/h2&gt;

&lt;p&gt;Let's look at the incident investigation example with concrete numbers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without vMCP:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4 sequential manual commands
&lt;/li&gt;
&lt;li&gt;2-3 minutes per command
&lt;/li&gt;
&lt;li&gt;5-10 minutes aggregating and formatting
&lt;/li&gt;
&lt;li&gt;15-20 minutes total per incident
&lt;/li&gt;
&lt;li&gt;Results vary by engineer
&lt;/li&gt;
&lt;li&gt;Process isn't documented or reusable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;With vMCP:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One command triggers the workflow
&lt;/li&gt;
&lt;li&gt;Parallel execution: 30 seconds
&lt;/li&gt;
&lt;li&gt;Automatic aggregation and formatting
&lt;/li&gt;
&lt;li&gt;Consistent results every time
&lt;/li&gt;
&lt;li&gt;Workflow is documented as code
&lt;/li&gt;
&lt;li&gt;Any team member can use it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a team handling 20 incidents per week, that's 5-6 hours saved. More importantly, the response is faster, more consistent, and doesn't require senior engineers to handle routine investigations.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;vMCP runs in Kubernetes alongside your backend MCP servers. You define three types of resources:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCPGroup:&lt;/strong&gt; Organizes backend servers logically (e.g., "platform-tools")&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCPServer:&lt;/strong&gt; Individual backend MCP servers (GitHub, Jira, etc.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VirtualMCPServer:&lt;/strong&gt; The aggregation layer that combines servers from a group&lt;/p&gt;

&lt;p&gt;The ToolHive operator discovers backends, resolves tool name conflicts, applies security policies, and exposes everything through a single endpoint. Your AI client connects to vMCP just like any other MCP server.&lt;/p&gt;

&lt;p&gt;Since each VirtualMCPServer is a separate Kubernetes resource, you can deploy as many as needed. One per team, one per environment, or organized however makes sense for your security model.&lt;/p&gt;
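Put together, a minimal manifest set might look like the following sketch. The three kinds come from the text above; the spec fields shown (the image, `groupRef`) are assumptions for illustration, so consult the quickstart for the exact schema.

```yaml
# Sketch only -- spec fields are illustrative assumptions.
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPGroup
metadata:
  name: platform-tools
---
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: github
spec:
  image: ghcr.io/github/github-mcp-server   # assumed image reference
  groupRef: platform-tools                  # assumed field
---
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: platform-vmcp
spec:
  groupRef: platform-tools    # aggregate every server in the group
```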

&lt;p&gt;For a working example, check out the &lt;a href="https://docs.stacklok.com/toolhive/tutorials/quickstart-vmcp" rel="noopener noreferrer"&gt;quickstart tutorial&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use vMCP
&lt;/h2&gt;

&lt;p&gt;vMCP makes sense when you're managing multiple MCP servers (typically 5+), curating a subset of MCP tools for specific teams and workflows, or need tasks that coordinate across systems. It's especially valuable for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams requiring centralized authentication and authorization
&lt;/li&gt;
&lt;li&gt;Workflows that should be reusable across the entire team
&lt;/li&gt;
&lt;li&gt;Security policies that need centralized enforcement
&lt;/li&gt;
&lt;li&gt;Reducing onboarding complexity for new engineers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're using a single MCP server for simple one-step operations, you probably don't need vMCP. It's built for managing complexity at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;vMCP is available now as part of ToolHive. To try it out:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install the ToolHive Kubernetes Operator
&lt;/li&gt;
&lt;li&gt;Follow the &lt;a href="https://docs.stacklok.com/toolhive/tutorials/quickstart-vmcp" rel="noopener noreferrer"&gt;vMCP quickstart&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Connect your AI client to the aggregated endpoint&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We'd love to hear how you're using vMCP. What workflows are you building? Which MCP servers are you aggregating? Join the &lt;a href="https://discord.gg/stacklok" rel="noopener noreferrer"&gt;ToolHive community on Discord&lt;/a&gt; and let us know.&lt;/p&gt;

&lt;p&gt;Looking to leverage vMCP within your enterprise organization? &lt;a href="https://calendly.com/stacklok/30min" rel="noopener noreferrer"&gt;Book a demo with us&lt;/a&gt;.  &lt;/p&gt;




&lt;p&gt;&lt;em&gt;ToolHive is an open-source MCP platform focused on security and enterprise operationalization. Learn more at &lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;toolhive.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>toolhive</category>
    </item>
    <item>
      <title>Stacklok's MCP Optimizer vs Anthropic's Tool Search Tool: A Head-to-Head Comparison</title>
      <dc:creator>Alejandro Ponce de León</dc:creator>
      <pubDate>Wed, 10 Dec 2025 15:36:56 +0000</pubDate>
      <link>https://dev.to/stacklok/stackloks-mcp-optimizer-vs-anthropics-tool-search-tool-a-head-to-head-comparison-2f32</link>
      <guid>https://dev.to/stacklok/stackloks-mcp-optimizer-vs-anthropics-tool-search-tool-a-head-to-head-comparison-2f32</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;TL;DR&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Both solutions tackle the critical problem of token bloat from excessive tool definitions. However, our testing with 2,792 tools reveals a stark performance gap: &lt;strong&gt;Stacklok MCP Optimizer achieves 94% accuracy&lt;/strong&gt; in selecting the right tools, while &lt;strong&gt;Anthropic's Tool Search Tool achieves only 34% accuracy&lt;/strong&gt;. If you're building production AI agents that need reliable tool selection without breaking the bank on tokens, these numbers matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Problem Both Are Solving&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When you connect AI agents to multiple Model Context Protocol (MCP) servers, tool definitions quickly consume massive portions of your context window, often before your actual conversation even begins. &lt;/p&gt;

&lt;p&gt;The reality? Most queries only need a handful of these tools. Loading all of them wastes tokens (read: money) and degrades model performance as the tool count grows.&lt;/p&gt;

&lt;p&gt;Both &lt;a href="https://dev.to/stacklok/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer-3oo6"&gt;Stacklok MCP Optimizer&lt;/a&gt; (launched October 28, 2025) and &lt;a href="https://www.anthropic.com/engineering/advanced-tool-use" rel="noopener noreferrer"&gt;Anthropic's Tool Search Tool&lt;/a&gt; (launched November 20, 2025 as part of their advanced tool use beta) address this by loading a single search tool that finds and loads only the necessary tools on demand.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why This Matters: Real Benefits and Trade-offs&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Upside&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Token savings are substantial.&lt;/strong&gt; We've observed up to 80% reductions in input tokens. In their internal testing, Anthropic reports their approach preserves 191,300 tokens of context compared to loading all tools upfront, an 85% reduction. In rate-limited enterprise environments, this translates directly to &lt;a href="https://docs.stacklok.com/toolhive/tutorials/mcp-optimizer" rel="noopener noreferrer"&gt;cost savings and faster response times.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improved model performance.&lt;/strong&gt; Reducing token overhead doesn't just save money, it can improve model accuracy. Anthropic's internal testing showed substantial improvements with Tool Search Tool enabled: Opus 4 jumped from 49% to 74%, and Opus 4.5 improved from 79.5% to 88.1% on MCP evaluations. However, it's important to note that Anthropic's experiments and datasets are not publicly available, making direct comparisons challenging.&lt;/p&gt;

&lt;p&gt;Our own testing with MCP Optimizer across different model tiers revealed an interesting pattern: while state-of-the-art models like Claude Sonnet 4 maintained strong performance when benchmarking tool selection accuracy (94.6% → 93.4%), mid-tier and smaller models showed significant improvements. Gemini 2.5 Flash increased from 83.2% to 92.4%, and the gpt-oss-20B model nearly doubled its accuracy from 38% to 69.4%. This suggests that efficient tool loading particularly benefits models with tighter context constraints, making MCP Optimizer valuable across different deployment scenarios, from resource-constrained edge deployments to cost-optimized production systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Downside&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Risk of tool retrieval failure.&lt;/strong&gt; The benefits above assume the search tool successfully finds the right tool. But what happens when it doesn't? If the search misses, your task fails or produces unexpected behavior. While the agent can retry searches, this introduces latency and still consumes tokens. The critical question becomes: &lt;em&gt;How often does the search actually work in practice?&lt;/em&gt; This is precisely what our head-to-head comparison measures.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;How Each Approach Works&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Both solutions introduce a lightweight search tool, but their algorithms differ significantly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stacklok MCP Optimizer&lt;/strong&gt;: Combines semantic search with BM25 for hybrid tool discovery
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic Tool Search Tool&lt;/strong&gt;: Offers two variants, BM25-only or regex-based pattern matching&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The algorithmic difference has profound implications for real-world performance, as our testing reveals.&lt;/p&gt;
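To make the algorithmic difference concrete, here is a toy hybrid-retrieval sketch: a BM25 keyword score blended with a "semantic" similarity score, ranking tools by the combined result. This is not MCP Optimizer's actual implementation; real systems use embedding models for the semantic half, while a bag-of-words cosine stands in here so the example stays self-contained.

```python
# Toy hybrid retrieval: blend BM25 keyword scoring with a stand-in
# semantic score (bag-of-words cosine) and rank tools by the result.
import math
from collections import Counter

TOOLS = {
    "github_create_pull_request": "open a pull request between two branches",
    "slack_channels_list": "list all channels in a slack workspace",
    "jira_create_issue": "create a new issue in a jira project",
}

def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of `doc` for `query` against `corpus`."""
    q_terms, d_terms = query.split(), doc.split()
    avgdl = sum(len(d.split()) for d in corpus) / len(corpus)
    score = 0.0
    for t in set(q_terms):
        tf = d_terms.count(t)
        df = sum(1 for d in corpus if t in d.split())
        idf = math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1)
        norm = tf + k1 * (1 - b + b * len(d_terms) / avgdl)
        score += idf * tf * (k1 + 1) / norm
    return score

def cosine(query, doc):
    """Bag-of-words cosine similarity (embedding stand-in)."""
    q, d = Counter(query.split()), Counter(doc.split())
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def hybrid_search(query, tools, alpha=0.5):
    """Rank tool names by a weighted blend of semantic and BM25 scores."""
    corpus = list(tools.values())
    scored = {name: alpha * cosine(query, desc)
                    + (1 - alpha) * bm25_score(query, desc, corpus)
              for name, desc in tools.items()}
    return sorted(scored, key=scored.get, reverse=True)

print(hybrid_search("create a pull request from feature-branch to main", TOOLS)[0])
# → github_create_pull_request
```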

&lt;h2&gt;
  
  
  &lt;strong&gt;The Head-to-Head Comparison&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We conducted a comprehensive evaluation to answer the question: &lt;em&gt;Which approach is more effective?&lt;/em&gt; (&lt;a href="https://github.com/StacklokLabs/mcp-optimizer/pull/148" rel="noopener noreferrer"&gt;Source code and full results&lt;/a&gt;)&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Test Methodology&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Loaded 2,792 tools from various MCP servers using the &lt;a href="https://github.com/xfey/MCP-Zero?tab=readme-ov-file#dataset-mcp-tools" rel="noopener noreferrer"&gt;MCP-tools dataset&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;For each tool, generated a synthetic query using an LLM that would naturally require that specific tool

&lt;ul&gt;
&lt;li&gt;Example: For GitHub's &lt;code&gt;create_pull_request&lt;/code&gt; tool → Generated query: "Create a pull request from feature-branch to main branch in the octocat/Hello-World repository on GitHub"
&lt;/li&gt;
&lt;li&gt;Example: Slack's &lt;code&gt;channels_list&lt;/code&gt; tool → Generated query: "Show me all channels in my Slack workspace"
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Used Claude Sonnet 4.5 to test whether each approach could correctly search and select the original tool that generated the query

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval Accuracy&lt;/strong&gt;: Does the correct tool appear anywhere in the search results returned by the search tool?
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selection Accuracy&lt;/strong&gt;: Is the correct tool actually selected by the model for use?
&lt;/li&gt;
&lt;li&gt;This direct mapping lets us objectively measure retrieval accuracy: we know the ground truth for every query. In the examples above, the correct tools would be GitHub's &lt;code&gt;create_pull_request&lt;/code&gt; and Slack's &lt;code&gt;channels_list&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
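The two metrics from the methodology are straightforward to compute once each test case records the ground-truth tool, the search results, and the model's pick. A minimal sketch (the case data below is made up for illustration):

```python
# Retrieval accuracy: does the ground-truth tool appear anywhere in the
# search results? Selection accuracy: did the model actually pick it?
def retrieval_accuracy(cases):
    hits = sum(1 for c in cases if c["expected"] in c["retrieved"])
    return hits / len(cases)

def selection_accuracy(cases):
    hits = sum(1 for c in cases if c["selected"] == c["expected"])
    return hits / len(cases)

cases = [
    {"expected": "github_create_pull_request",
     "retrieved": ["github_create_pull_request", "jira_create_issue"],
     "selected": "github_create_pull_request"},
    {"expected": "slack_channels_list",
     "retrieved": ["slack_post_message"],   # retrieval miss, so the
     "selected": "slack_post_message"},     # model cannot select correctly
]

print(retrieval_accuracy(cases), selection_accuracy(cases))  # → 0.5 0.5
```

Note how the second case shows why retrieval accuracy bounds selection accuracy: if the correct tool never appears in the results, no model can select it.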

&lt;h3&gt;
  
  
  &lt;strong&gt;Results&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnshjekx84tc9k5yps2a6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnshjekx84tc9k5yps2a6.png" alt="Accuracy comparison chart showing MCP Optimizer at 93.95% selection accuracy and 98.03% retrieval accuracy versus Tool Search Tool at 33.70%/47.85% (BM25) and 30.01%/39.00% (regex)" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The stark difference in selection accuracy between approaches primarily reflects retrieval effectiveness rather than model performance. Since all approaches used the same model (Claude Sonnet 4.5) for tool selection, the 94% vs 34% accuracy gap stems from MCP Optimizer's superior retrieval accuracy (98% vs 48%). Put simply: if the correct tool doesn't appear in the search results, even the best model cannot select it. MCP Optimizer's hybrid semantic + BM25 search successfully surfaces the correct tool in 98% of cases, giving the model the opportunity to make the right selection. In contrast, Tool Search Tool's lower retrieval rates mean the model often never sees the correct tool among its options.&lt;/p&gt;

&lt;p&gt;These results align with independent testing from other organizations. &lt;a href="https://blog.arcade.dev/anthropic-tool-search-4000-tools-test" rel="noopener noreferrer"&gt;Arcade reported&lt;/a&gt; that Anthropic's Tool Search achieved only 56% retrieval accuracy with regex and 64% with BM25 across 4,027 tools.&lt;/p&gt;

&lt;h4&gt;
  
  
  Runtime Performance Characteristics
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Average execution time&lt;/th&gt;
&lt;th&gt;Average tools retrieved&lt;/th&gt;
&lt;th&gt;Average input tokens*&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP Optimizer&lt;/td&gt;
&lt;td&gt;5.75 seconds&lt;/td&gt;
&lt;td&gt;5.2&lt;/td&gt;
&lt;td&gt;3296&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool Search Tool (BM25)&lt;/td&gt;
&lt;td&gt;12.05 seconds&lt;/td&gt;
&lt;td&gt;5.0&lt;/td&gt;
&lt;td&gt;2823&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool Search Tool (regex)&lt;/td&gt;
&lt;td&gt;13.55 seconds&lt;/td&gt;
&lt;td&gt;5.2&lt;/td&gt;
&lt;td&gt;3679&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;* &lt;em&gt;Average Input Tokens: The total number of tokens sent to the model per request, including system prompt, tool definitions, and user query.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Beyond accuracy, the operational characteristics of each approach reveal important trade-offs. Tool Search Tool (BM25) achieves the lowest token consumption at 2,823 tokens per request, which likely stems from retrieving slightly fewer tools on average (5.0 vs 5.2). However, MCP Optimizer's token count of 3,296 still represents substantial savings compared to attempting to load all 2,792 tools upfront, which would require 206,073 tokens and cause an error due to context window limitations.&lt;/p&gt;
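The savings claim is easy to verify from the numbers above:

```python
# Token savings of MCP Optimizer's per-request footprint versus
# loading all 2,792 tool definitions upfront (figures from the table
# and paragraph above).
all_tools_tokens = 206_073   # all tools loaded upfront
optimizer_tokens = 3_296     # MCP Optimizer average per request
savings = 1 - optimizer_tokens / all_tools_tokens
print(f"{savings:.1%}")  # → 98.4%
```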

&lt;p&gt;The execution time differences are noteworthy: MCP Optimizer completes searches in 5.75 seconds on average, while Tool Search Tool takes 12.05 (BM25) and 13.55 seconds (regex). However, this comparison requires context. MCP Optimizer was executed locally in our test environment, while Tool Search Tool operates as an internal Anthropic service with unknown infrastructure requirements and potential network latency.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Means
&lt;/h3&gt;

&lt;p&gt;The numbers tell a clear story: &lt;strong&gt;MCP Optimizer consistently finds the correct tool 94% of the time&lt;/strong&gt;, while Tool Search Tool's accuracy hovers around 30-34% in environments with thousands of tools. For production systems where reliability and performance matter, this gap is significant.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Verdict&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Anthropic's Tool Search Tool correctly identifies a real problem facing production AI deployments. The concept of on-demand tool loading is sound, and the token savings are genuine. However, &lt;strong&gt;the current implementation isn't production-ready&lt;/strong&gt; for environments with large tool catalogs. Limited to Claude Sonnet 4.5 and Opus 4.5, it remains a proprietary solution exclusive to Anthropic's ecosystem.&lt;/p&gt;

&lt;p&gt;MCP Optimizer, on the other hand, delivers on the promise: reliable tool selection (94% accuracy) combined with significant token savings. Built into the ToolHive runtime as a free and open-source solution, it seamlessly integrates with all major &lt;a href="https://docs.stacklok.com/toolhive/reference/client-compatibility" rel="noopener noreferrer"&gt;AI clients&lt;/a&gt; including Claude Code, GitHub Copilot, Cursor, and others, providing vendor flexibility and broader compatibility across different AI platforms. For teams building AI agents that need to work consistently across hundreds or thousands of tools, this performance difference and deployment flexibility are critical.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Looking Forward&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The future of AI agents depends on solving context window constraints without sacrificing reliability.  For that future to arrive, we need tool selection systems that work reliably. MCP Optimizer proves that hybrid semantic + keyword search can deliver both token efficiency and production-grade accuracy. As Anthropic's Tool Search Tool matures beyond beta, we hope to see similar reliability gains.&lt;/p&gt;

&lt;p&gt;For now, if you're deploying AI agents in production and need dependable tool selection across extensive tool catalogs, the data points to MCP Optimizer as the more reliable choice.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Interested in learning more about MCP Optimizer? Check out the &lt;a href="https://docs.stacklok.com/toolhive/tutorials/mcp-optimizer" rel="noopener noreferrer"&gt;ToolHive documentation&lt;/a&gt; or visit &lt;a href="https://stacklok.com/" rel="noopener noreferrer"&gt;stacklok.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>anthropic</category>
      <category>stacklok</category>
    </item>
    <item>
      <title>Deploying an Okta-Authenticated BigQuery MCP Server on Kubernetes with ToolHive</title>
      <dc:creator>Yolanda Robla Mota</dc:creator>
      <pubDate>Wed, 19 Nov 2025 09:56:11 +0000</pubDate>
      <link>https://dev.to/stacklok/deploying-an-okta-authenticated-bigquery-mcp-server-on-kubernetes-with-toolhive-cf5</link>
      <guid>https://dev.to/stacklok/deploying-an-okta-authenticated-bigquery-mcp-server-on-kubernetes-with-toolhive-cf5</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/stacklok/how-to-use-okta-to-remotely-authenticate-to-your-bigquery-mcp-server-5a35"&gt;previous article&lt;/a&gt;, I showed how to connect Okta authentication to a BigQuery MCP server running locally. The objective was to build a workflow that was secure (with user-level attribution and least privilege roles), short-lived, and that would save you the pain of managing Google service-account keys. That setup worked perfectly for local development, but it wasn’t something I’d confidently hand off to production.&lt;br&gt;
This time, we’ll take that local prototype and transform it into a production-ready, cloud-native deployment running on Kubernetes, secured by Okta, and managed end-to-end by the &lt;strong&gt;ToolHive Operator&lt;/strong&gt;. We’ll even make it accessible remotely through &lt;strong&gt;ngrok&lt;/strong&gt;, so you can connect to it from anywhere using VS Code.&lt;/p&gt;
&lt;h2&gt;
  
  
  Setting the Stage
&lt;/h2&gt;

&lt;p&gt;Before diving in, let’s make sure we have the right pieces in place. You’ll need a Kubernetes cluster (I’ll be using &lt;em&gt;kind&lt;/em&gt; for simplicity), along with &lt;em&gt;kubectl&lt;/em&gt; and &lt;em&gt;helm&lt;/em&gt;. You’ll also need an Okta account with an authorization server configured, and a Google Cloud project with BigQuery enabled.&lt;br&gt;
If you haven’t already, set up &lt;strong&gt;Workload Identity Federation&lt;/strong&gt; in your Google Cloud project. That’s what allows Google Cloud to trust Okta tokens and issue temporary credentials for BigQuery access.&lt;br&gt;
Finally, install the &lt;strong&gt;ToolHive CLI&lt;/strong&gt; (&lt;em&gt;thv&lt;/em&gt;) and sign up for an &lt;strong&gt;ngrok&lt;/strong&gt; account — we’ll use both to expose your service later on.&lt;/p&gt;
&lt;h2&gt;
  
  
  Deploying the ToolHive Operator
&lt;/h2&gt;

&lt;p&gt;Let’s start by getting the ToolHive Operator running in our cluster. The operator is what manages the lifecycle of MCP servers — it handles the pods, proxies, authentication, and updates automatically.&lt;br&gt;
I’m using &lt;em&gt;kind&lt;/em&gt; to create a local cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind create cluster --name toolhive
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, install the ToolHive CRDs and the operator itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install toolhive-operator-crds \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

helm upgrade --install toolhive-operator \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  --namespace toolhive-system --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A quick check confirms the operator is running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n toolhive-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;toolhive-operator-7875c8c5cd-xxxxx   1/1     Running   0   30s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With that, our cluster is ready to start managing MCP servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Storing the Okta Secret
&lt;/h2&gt;

&lt;p&gt;The next step is to give ToolHive access to your Okta client secret. This allows the proxy to validate incoming tokens. Instead of hardcoding secrets, Kubernetes encourages us to store them in a dedicated Secret resource.&lt;br&gt;
Here’s the YAML to create one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Secret
metadata:
  name: okta-client-secret
  namespace: default
type: Opaque
stringData:
  client-secret: &amp;lt;YOUR_OKTA_CLIENT_SECRET&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save that as &lt;em&gt;00-okta-client-secret.yaml&lt;/em&gt; and apply it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f 00-okta-client-secret.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Up Token Exchange
&lt;/h2&gt;

&lt;p&gt;To allow Okta to exchange its tokens for Google Cloud credentials, we’ll define an &lt;em&gt;MCPExternalAuthConfig&lt;/em&gt; resource. This tells ToolHive how to talk to Google’s Security Token Service (STS) and request access tokens for BigQuery.&lt;br&gt;
Here’s the config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPExternalAuthConfig
metadata:
  name: bigquery-token-exchange
  namespace: default
spec:
  type: tokenExchange
  tokenExchange:
    tokenUrl: https://sts.googleapis.com/v1/token
    audience: //iam.googleapis.com/projects/&amp;lt;YOUR_PROJECT_NUMBER&amp;gt;/locations/global/workloadIdentityPools/okta-pool/providers/okta-provider
    subjectTokenType: id_token
    scopes:
      - https://www.googleapis.com/auth/bigquery
      - https://www.googleapis.com/auth/cloud-platform
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f 01-external-auth-config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This configuration acts as a bridge between Okta and Google Cloud, handling the secure exchange behind the scenes.&lt;/p&gt;
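Under the hood, this is an RFC 8693 token exchange against Google's STS endpoint. The sketch below builds the request payload without sending it; the field names follow RFC 8693 and Google's STS API, the audience mirrors the MCPExternalAuthConfig above, and OKTA_ID_TOKEN stands in for a real Okta-issued ID token.

```python
# Sketch of the token exchange the proxy performs: an Okta ID token is
# traded for a short-lived Google access token via the STS endpoint.
STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"

def build_sts_request(okta_id_token, project_number):
    """Build the RFC 8693 token-exchange form payload for Google STS."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": (
            f"//iam.googleapis.com/projects/{project_number}"
            "/locations/global/workloadIdentityPools/okta-pool"
            "/providers/okta-provider"
        ),
        "subject_token": okta_id_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:id_token",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "scope": "https://www.googleapis.com/auth/bigquery",
    }

payload = build_sts_request("OKTA_ID_TOKEN", "123456789")
print(payload["grant_type"])
```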

&lt;h2&gt;
  
  
  Deploying the BigQuery MCP Server
&lt;/h2&gt;

&lt;p&gt;Now we can create the MCP server that will connect VS Code to BigQuery. This configuration ties together the image, authentication, and proxy.&lt;br&gt;
The MCP server needs a public endpoint to use as its resource URL. For that, we can use a service like ngrok: configure a domain in the &lt;a href="https://dashboard.ngrok.com/domains" rel="noopener noreferrer"&gt;ngrok dashboard&lt;/a&gt;, or note your automatically generated “dev domain” if you’re on a free account. Set that domain in the custom resource, along with the other placeholder values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: database-toolbox-bigquery
  namespace: default
spec:
  image: us-central1-docker.pkg.dev/database-toolbox/toolbox/toolbox:0.19.1
  env:
    - name: BIGQUERY_PROJECT
      value: &amp;lt;YOUR_GCP_PROJECT_ID&amp;gt;
    - name: BIGQUERY_USE_CLIENT_OAUTH
      value: "true"

  args:
    - --prebuilt
    - bigquery
    - --address
    - 0.0.0.0

  transport: streamable-http
  proxyPort: 8000
  mcpPort: 5000

  oidcConfig:
    type: inline
    resourceUrl: https://&amp;lt;YOUR_NGROK_DOMAIN&amp;gt;.ngrok-free.app/mcp   # Replace with your ngrok URL
    inline:
      issuer: https://&amp;lt;YOUR_OKTA_DOMAIN&amp;gt;.okta.com/oauth2/&amp;lt;YOUR_AUTH_SERVER_ID&amp;gt;
      audience: //iam.googleapis.com/projects/&amp;lt;YOUR_PROJECT_NUMBER&amp;gt;/locations/global/workloadIdentityPools/okta-pool/providers/okta-provider
      clientId: &amp;lt;YOUR_OKTA_CLIENT_ID&amp;gt;
      clientSecretRef:
        name: okta-client-secret
        key: client-secret

  externalAuthConfigRef:
    name: bigquery-token-exchange

  resources:
    limits:
      cpu: "1"
      memory: "512Mi"
    requests:
      cpu: "100m"
      memory: "128Mi"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f 02-mcp-server-bigquery.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kubernetes will create two pods: one running the MCP server, and another running the ToolHive proxy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exposing the Service Publicly
&lt;/h2&gt;

&lt;p&gt;Once the MCP server is running, we can expose it publicly so it’s reachable by authentication endpoints and clients. We’ll forward the service locally, create a tunnel through ngrok using ToolHive’s built-in support, and grab that domain before proceeding.&lt;br&gt;
Start by forwarding the proxy service locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl port-forward -n default svc/database-toolbox-bigquery-proxy-svc 8000:8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes the MCP proxy accessible at &lt;a href="http://127.0.0.1:8000" rel="noopener noreferrer"&gt;http://127.0.0.1:8000&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, use the ToolHive CLI to open a secure tunnel with ngrok:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv proxy tunnel http://127.0.0.1:8000 tunnel \
  --tunnel-provider ngrok \
  --provider-args '{"auth-token": "&amp;lt;YOUR_NGROK_AUTH_TOKEN&amp;gt;", "url": "https://&amp;lt;YOUR_NGROK_DOMAIN&amp;gt;.ngrok-free.app"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ToolHive will create the tunnel and print a line like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✔ Tunnel created
Public URL: https://&amp;lt;YOUR_NGROK_DOMAIN&amp;gt;.ngrok-free.app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want more background on this tunneling feature, the ToolHive team has a nice write-up: &lt;a href="https://dev.to/stacklok/exposing-a-kubernetes-hosted-mcp-server-with-toolhive-ngrok-with-basic-auth-23kn"&gt;Exposing a Kubernetes-Hosted MCP Server with ToolHive + ngrok (with Basic Auth)&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Verifying the Deployment
&lt;/h2&gt;

&lt;p&gt;After a few moments, confirm everything’s running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n default -l toolhive-name=database-toolbox-bigquery
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see two pods in the “Running” state — one for the server, one for the proxy.&lt;br&gt;
If you’d like to peek under the hood, tail the proxy logs to see the authentication and token exchange process in action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs -n default -l app.kubernetes.io/instance=database-toolbox-bigquery-proxy --tail=50
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see debug lines referencing token validation and the STS endpoint.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connect from VS Code
&lt;/h2&gt;

&lt;p&gt;Once your MCP server is running, secured, and exposed via your public ngrok URL (for example: &lt;em&gt;&lt;a href="https://abc123.ngrok-free.app/mcp" rel="noopener noreferrer"&gt;https://abc123.ngrok-free.app/mcp&lt;/a&gt;&lt;/em&gt;), you’ll use VS Code’s MCP support to connect.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Open VS Code. Make sure you have the MCP / Copilot Chat extension installed and enabled.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open the Command Palette (&lt;em&gt;Ctrl+Shift+P or ⌘+Shift+P&lt;/em&gt;) and run “&lt;strong&gt;MCP: Add Server&lt;/strong&gt;” (or you can open the &lt;em&gt;mcp.json&lt;/em&gt; configuration manually).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When prompted, enter a JSON configuration like this:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "servers": {
    "toolbox": {
      "url": "https://&amp;lt;YOUR_NGROK_DOMAIN&amp;gt;.ngrok-free.app/mcp",
      "type": "http"
    }
  },
  "inputs": []
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The "type": "http" indicates you’re connecting over HTTP transport.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;After saving this config, VS Code will attempt to connect to the MCP server. During this process it will prompt you to enter the &lt;strong&gt;Client ID&lt;/strong&gt; and the &lt;strong&gt;Client Secret&lt;/strong&gt; from your Okta app.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;These credentials allow VS Code to authenticate and authorize with the server according to the MCP/OIDC handshake.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once the authentication completes, the server will appear in your MCP server list. You can open the Chat view, select the MCP tools (e.g., &lt;em&gt;query_bigquery, list_datasets&lt;/em&gt;, etc.), and issue queries or commands as needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Try a test query to confirm everything is working:&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fppvjr181vjwpzub099tn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fppvjr181vjwpzub099tn.png" alt="BigQuery with VSCode" width="512" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;We’ve come a long way from a local Okta-authenticated server to a fully managed, cloud-ready Kubernetes deployment. Now you have a &lt;strong&gt;secure, scalable, and remote-accessible&lt;/strong&gt; BigQuery MCP server managed entirely by ToolHive.&lt;br&gt;
This setup combines Okta’s identity management, Google Cloud’s token exchange, and Kubernetes automation into a single cohesive workflow. The result is a developer-friendly environment that’s easy to scale and safe to expose beyond your local machine.&lt;br&gt;
If you’re interested in exploring further, join the &lt;a href="https://discord.gg/stacklok" rel="noopener noreferrer"&gt;ToolHive Discord community&lt;/a&gt; to share what you’ve built. The possibilities with ToolHive, Okta, and Kubernetes together are just getting started.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>security</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>How to use Okta to remotely authenticate to your BigQuery MCP Server</title>
      <dc:creator>Yolanda Robla Mota</dc:creator>
      <pubDate>Thu, 06 Nov 2025 12:07:50 +0000</pubDate>
      <link>https://dev.to/stacklok/how-to-use-okta-to-remotely-authenticate-to-your-bigquery-mcp-server-5a35</link>
      <guid>https://dev.to/stacklok/how-to-use-okta-to-remotely-authenticate-to-your-bigquery-mcp-server-5a35</guid>
      <description>&lt;p&gt;This article builds on our &lt;a href="https://dev.to/stacklok/beyond-api-keys-token-exchange-identity-federation-mcp-servers-5dm8"&gt;previous post&lt;/a&gt;, where we explored the high-level architecture of token exchange, identity federation, and how to run MCP servers in a secure and IdP-agnostic way. Now we shift into the &lt;strong&gt;hands-on phase&lt;/strong&gt;: how to use ToolHive to enable an MCP server to query Google BigQuery for users authenticated via Okta. While we use Okta and Google Cloud as the example stack, this flow is adaptable to any IdP and any cloud provider with a compatible STS / federation service.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scenario overview
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;You run an MCP server that receives requests from users who are authenticated via Okta.&lt;/li&gt;
&lt;li&gt;The MCP server must execute queries in Google Cloud BigQuery.&lt;/li&gt;
&lt;li&gt;You don’t want to manage Google service-account keys, embed JSON credentials in config, or lose per-user audit.&lt;/li&gt;
&lt;li&gt;You want: user-level attribution, least-privilege roles, secure, short-lived access, and federation between Okta and Google Cloud.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this example, we’re implementing the IdP federation approach described as scenario “B” in the previous blog post. The diagram below shows how ToolHive, Okta, and Google Cloud interact in this flow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0xcpaqqelbznbpgu9gw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0xcpaqqelbznbpgu9gw.png" alt="IDP federation diagram" width="512" height="102"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you start, make sure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Okta admin access&lt;/strong&gt;: You’ll need permissions to create an OIDC app and an authorization server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Google Cloud project&lt;/strong&gt;: With BigQuery enabled and permissions to create a Workforce Identity Pool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ToolHive CLI&lt;/strong&gt;: &lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;download it from toolhive.dev&lt;/a&gt; and confirm it’s in your system path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Container runtime&lt;/strong&gt;: Docker, Podman, or Rancher Desktop are supported.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An MCP client&lt;/strong&gt; such as Claude Code (or any other client supporting the MCP protocol).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Detailed configuration steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Configure Okta as Identity Provider
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In the Okta Admin Console, navigate to &lt;strong&gt;Applications → Applications&lt;/strong&gt; and click &lt;strong&gt;Create App Integration&lt;/strong&gt;. See &lt;a href="https://help.okta.com/en-us/content/topics/apps/apps_app_integration_wizard_oidc.htm" rel="noopener noreferrer"&gt;https://help.okta.com/en-us/content/topics/apps/apps_app_integration_wizard_oidc.htm
&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Choose &lt;strong&gt;OIDC – OpenID Connect&lt;/strong&gt; and then &lt;strong&gt;Web Application&lt;/strong&gt; for the app type.&lt;/li&gt;
&lt;li&gt;Configure the &lt;strong&gt;sign-in redirect URI&lt;/strong&gt; to &lt;a href="http://localhost:8666/callback" rel="noopener noreferrer"&gt;http://localhost:8666/callback&lt;/a&gt; (this is the callback needed for the MCP server that we will run later using ToolHive).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;IMPORTANT: Note the client ID and client secret; you’ll need them in later steps.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtr0jgbe98fiexu72szt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtr0jgbe98fiexu72szt.png" alt="Okta client" width="800" height="880"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Create an Authorization Server in Okta
&lt;/h3&gt;

&lt;p&gt;Your OIDC app issues tokens via an Authorization Server. For the Workforce Federation and token exchange, you need one configured correctly.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In the Okta Admin Console, navigate to &lt;strong&gt;Security → API → Authorization Servers&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Click Add Authorization Server.&lt;/li&gt;
&lt;li&gt;Name: &lt;strong&gt;BigQuery MCP Server&lt;/strong&gt; (or any descriptive name)&lt;/li&gt;
&lt;li&gt;Audience: set this to match the audience expected by your MCP server configuration (for example, &lt;strong&gt;mcpserver&lt;/strong&gt;).&lt;/li&gt;
&lt;li&gt;Click Save.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Configure an additional &lt;strong&gt;gcp.access&lt;/strong&gt; scope:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffnc2ioab9kfw90o0sr8o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffnc2ioab9kfw90o0sr8o.png" alt="Okta scopes" width="800" height="677"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;And the access policies for the types of tokens to generate, including Token Exchange:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7f25dmq25vyxdku6815l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7f25dmq25vyxdku6815l.png" alt="Okta rules" width="800" height="1452"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With this setup, Okta will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Issue standards-compliant OIDC tokens to your MCP server through ToolHive.&lt;/li&gt;
&lt;li&gt;Include the claims Google Cloud expects during the token exchange.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;IMPORTANT: Note the issuer URL for the Authorization Server; you’ll need it in the next steps.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Create Workforce Identity Pool in Google Cloud
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;In the Google Cloud console, create a &lt;strong&gt;Workforce Identity Pool&lt;/strong&gt; and a matching provider, using the Issuer URL you noted in the previous step:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzqku8asoiz3nmawsfyc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzqku8asoiz3nmawsfyc.png" alt="Workforce identity pool" width="800" height="717"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Define custom audiences. The Okta client ID needs to be passed as an audience, so start by copying the default audience. Then select &lt;strong&gt;Allowed audiences&lt;/strong&gt;, add the default value, and include your Okta client ID as well.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F366pso8eb4y1lckuthdu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F366pso8eb4y1lckuthdu.png" alt="Allowed audiences" width="800" height="708"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Configure permissions for the Okta user so they can read BigQuery data. Repeat this for each user you want to map:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud projects add-iam-policy-binding &amp;lt;PROJECT_NAME&amp;gt; \
--member="principalSet://iam.googleapis.com/projects/&amp;lt;PROJECT_ID&amp;gt;/locations/global/workloadIdentityPools/okta-pool/attribute.email/&amp;lt;MAPPED_OKTA_EMAIL&amp;gt;" \
--role="roles/bigquery.dataViewer"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Deploy MCP server + proxy with remote authentication via ToolHive
&lt;/h3&gt;

&lt;p&gt;In this step, we bring together the MCP server and the remote authentication/federation flow. Using ToolHive, we’ll run the server and wrap it with a proxy that handles user authentication with Okta and token exchange into Google Cloud.&lt;/p&gt;

&lt;p&gt;Start by creating a group. ToolHive automatically manages clients registered to your default group, adding or removing MCP servers as you run them. Since this server will sit behind an authenticated proxy, we don’t want that auto-configuration behavior, so we’ll create a separate group for it instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv group create toolbox-group
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then start the open source &lt;a href="https://github.com/googleapis/genai-toolbox" rel="noopener noreferrer"&gt;MCP Toolbox for Databases&lt;/a&gt; server using the ToolHive CLI. ToolHive automatically pulls the server image using metadata from the ToolHive registry. You can view details about the image with &lt;code&gt;thv registry info database-toolbox&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv run --group toolbox-group database-toolbox \
--env BIGQUERY_PROJECT=&amp;lt;YOUR_PROJECT_ID&amp;gt; \
--env BIGQUERY_USE_CLIENT_OAUTH=true \
--proxy-port 6000 \
-- --prebuilt bigquery --address 0.0.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what each parameter does:&lt;br&gt;
&lt;strong&gt;--group toolbox-group&lt;/strong&gt;: Name of the ToolHive group that the MCP server belongs to&lt;br&gt;
&lt;strong&gt;database-toolbox&lt;/strong&gt;: The MCP server image from the ToolHive registry&lt;br&gt;
&lt;strong&gt;--env BIGQUERY_PROJECT&lt;/strong&gt;: Your Google Cloud project ID containing BigQuery resources&lt;br&gt;
&lt;strong&gt;--env BIGQUERY_USE_CLIENT_OAUTH=true&lt;/strong&gt;: Use the OAuth flow instead of static service account credentials&lt;br&gt;
&lt;strong&gt;--proxy-port&lt;/strong&gt;: Port exposed on your host for the containerized MCP server&lt;br&gt;
&lt;strong&gt;--&lt;/strong&gt;: CLI arguments passed into the MCP server&lt;br&gt;
&lt;strong&gt;--prebuilt bigquery&lt;/strong&gt;: Use the prebuilt configuration for BigQuery&lt;br&gt;
&lt;strong&gt;--address 0.0.0.0&lt;/strong&gt;: Bind the server to all network interfaces so the proxy can reach it&lt;/p&gt;

&lt;p&gt;ToolHive spins up the MCP server container and HTTP proxy process, ready to handle BigQuery queries using the MCP protocol. Using &lt;a href="http://toolhive.dev" rel="noopener noreferrer"&gt;ToolHive&lt;/a&gt; ensures the server is containerized, isolated, and managed securely — avoiding the “run-it-manually” friction.&lt;/p&gt;

&lt;p&gt;Next, the &lt;code&gt;thv proxy&lt;/code&gt; command starts a proxy process that sits in front of the MCP server and handles all incoming requests. It prompts you to sign in with Okta, exchanges your Okta token for a Google Cloud access token, and then forwards your request to the MCP server using that token.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv proxy \
  --target-uri http://127.0.0.1:6000 \
  --remote-auth-client-id &amp;lt;OKTA_CLIENT_ID&amp;gt; \
  --remote-auth-client-secret &amp;lt;OKTA_CLIENT_SECRET&amp;gt; \
  --remote-auth okta \
  --remote-auth-issuer &amp;lt;AUTHORIZATION_SERVER_URL&amp;gt; \
  --remote-auth-callback-port 8666 \
  --remote-auth-scopes 'openid,profile,email,gcp.access' \
  --port 62614 \
  --token-exchange-url https://sts.googleapis.com/v1/token \
  --token-exchange-scopes 'https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/cloud-platform' \
  --token-exchange-audience //iam.googleapis.com/projects/&amp;lt;GOOGLE_PROJECT_NUMBER&amp;gt;/locations/global/workloadIdentityPools/okta-pool/providers/okta-provider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what each flag does:&lt;br&gt;
&lt;strong&gt;--target-uri&lt;/strong&gt;: Points to the MCP server’s proxy port (from the previous step)&lt;br&gt;
&lt;strong&gt;--remote-auth-client-id&lt;/strong&gt;: Client ID of your Okta app (from step 1)&lt;br&gt;
&lt;strong&gt;--remote-auth-client-secret&lt;/strong&gt;: Client secret of your Okta app (from step 1)&lt;br&gt;
&lt;strong&gt;--remote-auth okta&lt;/strong&gt;: Specifies the remote auth provider&lt;br&gt;
&lt;strong&gt;--remote-auth-issuer&lt;/strong&gt;: URL of the Okta authorization server’s issuer (from step 2)&lt;br&gt;
&lt;strong&gt;--remote-auth-callback-port&lt;/strong&gt;: Local port used for the OAuth callback (must match the callback URL used in step 1)&lt;br&gt;
&lt;strong&gt;--remote-auth-scopes&lt;/strong&gt;: Scopes requested from Okta during authentication&lt;br&gt;
&lt;strong&gt;--port&lt;/strong&gt;: Port the ToolHive proxy exposes to clients&lt;br&gt;
&lt;strong&gt;--token-exchange-url&lt;/strong&gt;: Google STS endpoint for exchanging tokens&lt;br&gt;
&lt;strong&gt;--token-exchange-scopes&lt;/strong&gt;: Google Cloud scopes required to access BigQuery and related APIs&lt;br&gt;
&lt;strong&gt;--token-exchange-audience&lt;/strong&gt;: Google Workload Identity Pool audience for Okta federation&lt;/p&gt;

&lt;p&gt;When your browser opens, sign in with Okta. The proxy uses your Okta credentials to generate ID tokens, exchange them for valid Google tokens with the right scopes, and then continues the request automatically.&lt;/p&gt;
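&lt;p&gt;Under the hood, that last hop is an &lt;a href="https://datatracker.ietf.org/doc/html/rfc8693" rel="noopener noreferrer"&gt;RFC 8693&lt;/a&gt; token exchange against the Google STS endpoint. The sketch below builds the kind of form body such a request carries, mirroring the &lt;code&gt;--token-exchange-*&lt;/code&gt; flags above. The field names come from RFC 8693, but the exact token-type URNs ToolHive sends are an assumption here, so treat this as an illustration rather than ToolHive’s literal request:&lt;/p&gt;

```python
# Illustrative sketch of an RFC 8693 token-exchange request body, mirroring
# the --token-exchange-* flags above. The ToolHive proxy builds and sends
# this for you; the token-type URNs shown are assumptions for illustration.
STS_URL = "https://sts.googleapis.com/v1/token"  # --token-exchange-url

def build_token_exchange_body(subject_token, audience, scopes):
    """Assemble the form fields for an RFC 8693 token-exchange POST."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,  # the Okta-issued token
        "subject_token_type": "urn:ietf:params:oauth:token-type:id_token",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,            # --token-exchange-audience
        "scope": " ".join(scopes),       # --token-exchange-scopes
    }

body = build_token_exchange_body(
    "<OKTA_TOKEN>",
    "//iam.googleapis.com/projects/<GOOGLE_PROJECT_NUMBER>/locations/global"
    "/workloadIdentityPools/okta-pool/providers/okta-provider",
    ["https://www.googleapis.com/auth/bigquery",
     "https://www.googleapis.com/auth/cloud-platform"],
)
```

&lt;p&gt;The STS response contains a short-lived Google access token, which the proxy attaches to the request it forwards to the MCP server.&lt;/p&gt;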
&lt;h3&gt;
  
  
  Step 5: Run the MCP server with Claude or another client
&lt;/h3&gt;

&lt;p&gt;Let’s use Claude Code as an example. Because ToolHive doesn’t automatically manage client configurations for proxied MCP servers, you’ll need to add it manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Add the authenticated ToolHive proxy
claude mcp add --scope user --transport http database-toolbox http://127.0.0.1:62614/mcp

# Run Claude Code
claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Toolbox MCP server uses the token provided by the ToolHive proxy and passes it to Google Cloud, giving you access to the resources available to your account.&lt;/p&gt;

&lt;p&gt;Any other MCP-compatible client can connect the same way. Just point it to the ToolHive proxy endpoint.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2lcbdunl8b64c8skcip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2lcbdunl8b64c8skcip.png" alt="Claude and MCP" width="800" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this architecture is powerful
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple for clients&lt;/strong&gt;: Apps connect to the ToolHive proxy just like any other MCP server endpoint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secure authentication flow&lt;/strong&gt;: The proxy makes you log in through Okta, so every request carries a verified user identity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Federated access to Google Cloud&lt;/strong&gt;: Instead of embedding service account keys in your server, the proxy handles a token exchange so Google recognizes your identity through the workforce identity provider.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Least-privilege and auditable&lt;/strong&gt;: BigQuery jobs run under your federated Okta identity, so logs show “&lt;a href="mailto:user@domain.com"&gt;user@domain.com&lt;/a&gt; ran a BigQuery job” rather than “service-account X”.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separation of concerns&lt;/strong&gt;: The MCP server (Toolbox) focuses on data tools and queries, while the proxy handles auth, token exchange, and routing. It’s a cleaner, safer architecture.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Of course, it’s easy to get started with ToolHive, since it’s free and open source. I encourage you to visit &lt;a href="https://toolhive.dev/" rel="noopener noreferrer"&gt;toolhive.dev&lt;/a&gt;, where you can download the project and explore our docs.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Using Token Exchange with ToolHive and Okta for MCP Server to GraphQL Authentication</title>
      <dc:creator>Yolanda Robla Mota</dc:creator>
      <pubDate>Tue, 04 Nov 2025 16:37:21 +0000</pubDate>
      <link>https://dev.to/stacklok/using-token-exchange-with-toolhive-and-okta-for-mcp-server-to-graphql-authentication-3ehi</link>
      <guid>https://dev.to/stacklok/using-token-exchange-with-toolhive-and-okta-for-mcp-server-to-graphql-authentication-3ehi</guid>
      <description>&lt;p&gt;This article builds on our &lt;a href="https://dev.to/stacklok/beyond-api-keys-token-exchange-identity-federation-mcp-servers-5dm8"&gt;previous post&lt;/a&gt;, where we introduced the core concepts of token exchange and its role in secure authentication. Here, we delve into a practical application, demonstrating how to leverage Okta and ToolHive to facilitate token exchange for authenticating an MCP server with a GraphQL API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment
&lt;/h2&gt;

&lt;p&gt;This demo mimics a (hopefully!) real-world example where we run an API service and want to expose it through an MCP server. The back end API requires a token with &lt;em&gt;aud=backend&lt;/em&gt; and &lt;em&gt;scopes=[backend-api:read]&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;"Aud" (audience) in a token specifies the intended recipient of the token, indicating which service or application is meant to consume it. "Scopes" define the specific permissions or access rights granted by the token, detailing what actions the token holder is authorized to perform. Only tokens having the expected audience and the expected scopes authorize the caller to use the service.&lt;/p&gt;

&lt;p&gt;We don’t want to expose the back end service directly to the AI client, but only through the MCP server. We also want to maintain a clean audit trail showing us who accessed what.&lt;/p&gt;

&lt;p&gt;The MCP server requires a token with &lt;em&gt;aud=mcpserver&lt;/em&gt; and &lt;em&gt;scopes=mcp:tools:call&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Both the API service and the MCP server are part of the same Okta realm, but we’ll use different Authorization Servers to ensure that the token the MCP server receives and the token the back end receives use different audiences.&lt;/p&gt;

&lt;p&gt;We’ll simulate the whole flow as a developer connecting to this setup by adding the MCP server to VSCode and calling the tools it provides.&lt;/p&gt;

&lt;p&gt;It should be noted that in this example, we’ll be using an &lt;a href="https://www.apollographql.com/docs" rel="noopener noreferrer"&gt;Apollo&lt;/a&gt;-based GraphQL service as the backend API service and the existing &lt;a href="https://www.apollographql.com/docs/apollo-mcp-server" rel="noopener noreferrer"&gt;Apollo MCP server&lt;/a&gt;, but the same setup applies to any kind of API services as long as they both use OAuth tokens from the same realm as the authentication mechanism. &lt;/p&gt;

&lt;p&gt;In order to follow along, you can clone the Apollo GraphQL service from &lt;a href="https://github.com/StacklokLabs/apollo-mcp-auth-demo" rel="noopener noreferrer"&gt;a demo repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Okta setup
&lt;/h2&gt;

&lt;p&gt;I’ve used the Okta integrator setup to prepare this demo, so the instructions cover the whole setup from the ground up, including creating the Authorization Servers. In a real-world environment, some of these steps may be unnecessary or need adjustment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Authorization Servers
&lt;/h3&gt;

&lt;p&gt;To logically separate the MCP server from the back end API service, we’ll configure two Okta Authorization Servers: one for the MCP server and client, and the other for the back end server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyrkj0nwo8nmto911fag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyrkj0nwo8nmto911fag.png" alt="Okta authorization servers" width="512" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create the Authorization Servers, then add the following scopes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On the mcpserver AS: &lt;em&gt;mcp:tools:call&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;On the backend AS: &lt;em&gt;backend-api:read&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Trust between authorization servers
&lt;/h4&gt;

&lt;p&gt;To enable token exchange between the two authorization servers (the one that issues tokens for access to the MCP server and the one that issues tokens for accessing the back end), we need to establish trust between them.&lt;/p&gt;

&lt;p&gt;Go to the back end AS and, on its Settings tab, add the mcpserver AS as a trusted server:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wycf3a7pyhzfjua0719.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wycf3a7pyhzfjua0719.png" alt="Okta trusted server" width="419" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Applications
&lt;/h3&gt;

&lt;p&gt;We’ll set up two Applications:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;em&gt;VSCode client&lt;/em&gt; to authenticate to the MCP server. We create a client directly to avoid Dynamic Client Registration. This will be an OIDC application with a client ID and a secret. It is important to match the redirect URIs that VSCode uses, so set them to &lt;a href="http://127.0.0.1:33418" rel="noopener noreferrer"&gt;http://127.0.0.1:33418&lt;/a&gt; and &lt;a href="https://vscode.dev/redirect" rel="noopener noreferrer"&gt;https://vscode.dev/redirect&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;A &lt;em&gt;toolhive client&lt;/em&gt; that will perform the Token Exchange. This is an API Services app type in Okta lingo. To create it:

&lt;ul&gt;
&lt;li&gt;Go to &lt;strong&gt;Applications → Create App Integration&lt;/strong&gt; and select &lt;strong&gt;API Services&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Name your application&lt;/li&gt;
&lt;li&gt;On the application’s General Settings page, uncheck “Require Demonstrating Proof of Possession”, as this is not yet supported by ToolHive&lt;/li&gt;
&lt;li&gt;Check the Token Exchange grant&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfkajllk1la4o93y0vve.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfkajllk1la4o93y0vve.png" alt="Token exchange grant" width="478" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Policies
&lt;/h3&gt;

&lt;p&gt;In order for applications to authenticate, we need to include them in policies, otherwise Okta will not issue tokens to the clients. We’ll define two policies: One that allows the MCP Client (VSCode) to request tokens with &lt;em&gt;mcp:tools:call&lt;/em&gt; and another one that allows the token exchange by the ToolHive process.&lt;/p&gt;

&lt;h4&gt;
  
  
  MCP client to MCP server
&lt;/h4&gt;

&lt;p&gt;This policy is to be defined on the mcpserver AS side. Select “Add New Access Policy”, then “Assign to the following Clients” and select the VSCode client. When the policy is created, click “Add Rule” in the policy, and in the “And the following scopes” section add both the “OpenID Connect” scopes and the &lt;em&gt;mcp:tools:call&lt;/em&gt; scope.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22mxe559jdbd901o637c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22mxe559jdbd901o637c.png" alt="Scopes" width="512" height="326"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  MCP server token exchange
&lt;/h4&gt;

&lt;p&gt;This policy is to be defined on the back end AS side. Select “Add New Access Policy”, then “Assign to the following Clients” and select the ToolHive client. When adding the rule, don’t forget to unroll “Advanced” under the “If Grant Type Is” section and add Token Exchange. Add “&lt;em&gt;backend-api:read&lt;/em&gt;” to the scopes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fed28ypvi5nioio3fpu5z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fed28ypvi5nioio3fpu5z.png" alt="Scopes" width="512" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8uarrmw0krmln9zc3vi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8uarrmw0krmln9zc3vi.png" alt="Token exchange" width="512" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Running the GraphQL server
&lt;/h3&gt;

&lt;p&gt;Let’s clone our server locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/StacklokLabs/apollo-mcp-auth-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let’s configure the IDP settings in the .env file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp .env.example .env
vim .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using my Okta integrator account, the .env file looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Okta Configuration
# Your Okta domain (e.g., dev-123456.okta.com)
OKTA_DOMAIN=integrator-3683736.okta.com

# Your Okta issuer URL (authorization server)
# For default authorization server: https://your-domain.okta.com/oauth2/default
# For custom authorization server: https://your-domain.okta.com/oauth2/{authServerId}
OKTA_ISSUER=https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697

# JWT Validation Configuration
# Expected audience in JWT tokens (space-separated if multiple)
OKTA_AUDIENCE=backend
# Required scopes in JWT tokens (space-separated)
REQUIRED_SCOPES=backend-api:read

# Authentication Configuration
# Set to 'true' to require valid tokens for all requests (recommended)
# Set to 'false' to disable authentication requirement (for testing)
REQUIRE_AUTH=true

# Server Configuration
PORT=4000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we’re ready to start the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install
npm start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running ToolHive
&lt;/h3&gt;

&lt;p&gt;In our testing, we’re using the already existing Apollo MCP server with no modifications - all the heavy lifting is done by ToolHive. The Apollo MCP server is merely configured to accept the downstream authentication token in the &lt;em&gt;Authorization: Bearer&lt;/em&gt; HTTP header and forward it to the external API.&lt;br&gt;
The MCP server configuration can be found in the &lt;a href="https://github.com/StacklokLabs/apollo-mcp-auth-demo/blob/main/mcp-server-data/apollo-mcp-config.yaml" rel="noopener noreferrer"&gt;mcp-server-data directory&lt;/a&gt; in the demo repository.&lt;/p&gt;

&lt;p&gt;Because the unmodified MCP server also validates the incoming tokens, we need to set the &lt;em&gt;transport.auth.servers&lt;/em&gt; attribute in the config file to the &lt;em&gt;back end&lt;/em&gt; Authorization server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vim mcp-server-data/apollo-mcp-config.yaml

...
transport:
  type: sse
  port: 8000
  auth:
    servers:
      - https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can run the server with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv run \
--debug \
--foreground \
--transport streamable-http \
--name apollo \
--target-port 8000 \
--proxy-port 8000 \
--volume $(pwd)/mcp-server-data/apollo-mcp-config.yaml:/config.yaml \
--volume $(pwd)/mcp-server-data:/data \
--oidc-audience mcpserver \
--resource-url http://localhost:8000/mcp \
--oidc-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
--oidc-jwks-url https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys \
--token-exchange-audience backend \
--token-exchange-client-id 0oawdgw7krVBSwzIx697 \
--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
--token-exchange-scopes backend-api:read \
--token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token \
apollo-mcp-server -- /config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s unpack the parameters:&lt;br&gt;
--oidc-audience mcpserver - When the OIDC token from VSCode arrives at ToolHive, ToolHive checks that the token’s aud claim matches this value and rejects the connection otherwise&lt;/p&gt;

&lt;p&gt;--resource-url &lt;a href="http://localhost:8000/mcp" rel="noopener noreferrer"&gt;http://localhost:8000/mcp&lt;/a&gt; - Setting the resource explicitly helps VSCode discover the proper Protected Resource Metadata endpoint as per the MCP specification, and in effect points VSCode at the Okta instance. It is typically not needed in e.g. Kubernetes environments, where the service name can be used&lt;/p&gt;

&lt;p&gt;--oidc-issuer &lt;a href="https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697" rel="noopener noreferrer"&gt;https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697&lt;/a&gt; - This is the issuer of the mcpserver Authorization Server (see the first screenshot of the document)&lt;/p&gt;

&lt;p&gt;--oidc-jwks-url &lt;a href="https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys" rel="noopener noreferrer"&gt;https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys&lt;/a&gt; - The JWKS endpoint of the mcpserver Authorization Server&lt;/p&gt;

&lt;p&gt;--token-exchange-audience 'backend' - We want ToolHive to take the incoming tokens and exchange them for tokens with audience of “backend”&lt;/p&gt;

&lt;p&gt;--token-exchange-client-id 0oawdgw7krVBSwzIx697 - The client ID of the “ToolHive client”, the one the token exchange policy is assigned to&lt;/p&gt;

&lt;p&gt;--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 - The client secret of the ToolHive client. Outside demos, please use the --token-exchange-client-secret-file switch instead, or the TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET environment variable&lt;/p&gt;

&lt;p&gt;--token-exchange-scopes 'backend-api:read' - The scopes we request for the external token. Must match what’s in the policy&lt;/p&gt;

&lt;p&gt;--token-exchange-url &lt;a href="https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token" rel="noopener noreferrer"&gt;https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token&lt;/a&gt; - The token endpoint of the back end Authorization Server&lt;/p&gt;
&lt;p&gt;Note that the example above uses &lt;em&gt;thv run&lt;/em&gt;, but it’s equally possible to use token exchange from &lt;em&gt;thv proxy&lt;/em&gt;, which can then also provide authentication to the MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv proxy demo-mcp-server \
    --target-uri http://localhost:8091 \
    --port 3000 \
    --remote-auth \
    --remote-auth-client-id 0oawdhc2mlgHOwNvW697 \
    --remote-auth-client-secret Ag0Zj6ALuxxqascP6KJ-CA4uCRcOLmIKtQeR_o3ClGgxMxx0zcgZYYtg-TmHF6U- \
    --remote-auth-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
    --remote-auth-scopes 'mcp:tools:call,openid,email' \
    --token-exchange-audience 'backend' \
    --token-exchange-client-id 0oawdgw7krVBSwzIx697 \
    --token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
    --token-exchange-scopes 'backend-api:read' \
    --token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Authentication from VSCode and putting it all together
&lt;/h3&gt;

&lt;p&gt;Once the server is running, it should automatically appear in the list of configured MCP servers in VSCode. Clicking Start will prompt authentication against Okta; the first time, you’ll also be prompted to enter the client ID and secret. Once Okta authenticates the user, VSCode receives the token and uses it to authenticate to the MCP server (ToolHive), which exchanges it for a token that enables calling the back end API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqacyukmvg7hdiuh9sbi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqacyukmvg7hdiuh9sbi.png" alt="VSCode" width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Past the initial setup on the IDP side, authentication and authorization to the MCP server fronted by ToolHive, and by extension to the back end service, is seamless. It allows partitioned access to the back end services and provides a cleaner audit trail.&lt;/p&gt;

&lt;p&gt;As the last step, we can invoke one of the MCP tools to verify the setup end-to-end:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0aakq5mtogxz8bdcq0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0aakq5mtogxz8bdcq0d.png" alt="MCP tools" width="800" height="811"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As seen in the screenshot above, the GetCountry tool of the Apollo server was called and returned a reply! If we check the logs of the API server we ran earlier, we also see details of the token that was validated:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4frsoeoauh9o04zydipb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4frsoeoauh9o04zydipb.png" alt="Tool usage" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This token has a different audience than the one passed to ToolHive: if you recall the thv run parameters, the &lt;em&gt;--oidc-audience mcpserver&lt;/em&gt; argument specified that incoming tokens must set the &lt;em&gt;aud&lt;/em&gt; claim to &lt;em&gt;mcpserver&lt;/em&gt;, while the token that arrived at the back end API has the audience &lt;em&gt;backend&lt;/em&gt;. Looking closely at the issuer, we also see that this token was issued by the back end Authorization Server, while the tokens used to authenticate to ToolHive were issued by the mcpserver Authorization Server. This shows that the token exchange works correctly. In the next section, we’ll illustrate, for completeness’ sake, what the tokens look like exactly and how the whole flow works.&lt;/p&gt;
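&lt;p&gt;You can verify this yourself by decoding the payload segment of each token. A small Python sketch (inspection only - it deliberately skips signature verification, which real services must never do):&lt;/p&gt;

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode a JWT's payload segment WITHOUT verifying the signature.
    Inspection only; real services must verify signatures."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Hand-built unsigned token mimicking the exchanged token's claims:
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(
    b'{"iss":"https://idp.example.com/oauth2/default","aud":"backend"}'
).rstrip(b"=").decode()
claims = jwt_claims(header + "." + payload + ".")
print(claims["aud"])  # backend
```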

&lt;h2&gt;
  
  
  The token exchange under the hood
&lt;/h2&gt;

&lt;p&gt;The flow is described in the Mermaid diagram below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89fvfxlk28z6kfz5dohy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89fvfxlk28z6kfz5dohy.png" alt="Diagram" width="512" height="452"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The client authenticates to ToolHive, which exposes the interface and endpoints as the &lt;a href="https://modelcontextprotocol.io/docs/tutorials/security/authorization" rel="noopener noreferrer"&gt;MCP standard describes&lt;/a&gt;. The ToolHive authentication middleware verifies that the token was issued by the expected IDP and has the expected audience. After authentication, the token is passed to the Token Exchange middleware, which contacts the IDP and exchanges the token meant for the MCP server for a token meant for the external service.&lt;/p&gt;
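&lt;p&gt;Concretely, the exchange is a standard OAuth 2.0 Token Exchange (RFC 8693) request against the back end Authorization Server’s token endpoint. The sketch below only builds the form body such a request carries - the parameter names come from RFC 8693, the values are the demo placeholders, and nothing is actually sent:&lt;/p&gt;

```python
from urllib.parse import urlencode

# Build the RFC 8693 token exchange form body that would be POSTed to the
# back end Authorization Server's token endpoint. Sketch only: values are
# the demo placeholders, nothing is sent, and the client would authenticate
# separately (e.g. HTTP Basic auth with its client ID and secret).
def build_exchange_body(subject_token: str) -> str:
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,  # the token the MCP client presented
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": "backend",           # matches --token-exchange-audience
        "scope": "backend-api:read",     # matches --token-exchange-scopes
    })

body = build_exchange_body("incoming-mcpserver-token")
print(body)
```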

&lt;p&gt;The token issued to the client might look like this (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "iss": https://idp.example.com/oauth2/default",
    "aud": "mcp-server",
    "scp": [
        "backend-mcp:tools:call",
        "backend-mcp:tools:list",
    ],
    "sub": "user@example.com",
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While the exchanged token would have different scopes and a different audience, allowing the MCP server to authenticate to the back end service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "iss": https://idp.example.com/oauth2/default",
    "aud": "backend-server",
    "scp": [
        "backend-api:read",
    ],
    "sub": "user@example.com",
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This exchanged token is then injected into the &lt;em&gt;Authorization: Bearer&lt;/em&gt; HTTP header and passed on to the actual MCP server running under ToolHive, which can then use it to call the back end service.&lt;/p&gt;
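&lt;p&gt;The injection step itself is plain header rewriting - conceptually something like the following sketch (not ToolHive’s actual code):&lt;/p&gt;

```python
# Conceptual sketch of the injection step (illustrative, not ToolHive's code):
# the proxy copies the inbound headers and replaces Authorization with the
# exchanged, backend-audience token before forwarding the request.
def forward_headers(inbound: dict, exchanged_token: str) -> dict:
    outbound = dict(inbound)
    outbound["Authorization"] = "Bearer " + exchanged_token
    return outbound

headers = forward_headers(
    {"Authorization": "Bearer mcpserver-token", "Content-Type": "application/json"},
    "backend-token",
)
print(headers["Authorization"])  # Bearer backend-token
```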

&lt;h2&gt;
  
  
  Summary and benefits
&lt;/h2&gt;

&lt;p&gt;By leveraging token exchange, ToolHive enables MCP servers to authenticate to third-party APIs in a &lt;strong&gt;secure, efficient, and tenant-aware&lt;/strong&gt; way. MCP servers receive properly scoped, short-lived access tokens instead of embedding long-lived secrets or bespoke authentication logic. Each API call made upstream can be attributed to the &lt;strong&gt;individual user identity&lt;/strong&gt; rather than a generic service account, making audit trails clearer and more meaningful.&lt;/p&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/docs/tutorials/security/authorization" rel="noopener noreferrer"&gt;https://modelcontextprotocol.io/docs/tutorials/security/authorization&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.okta.com/docs/guides/set-up-token-exchange/main/" rel="noopener noreferrer"&gt;https://developer.okta.com/docs/guides/set-up-token-exchange/main/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Using Token Exchange with ToolHive and Okta for MCP Server to GraphQL Authentication</title>
      <dc:creator>Yolanda Robla Mota</dc:creator>
      <pubDate>Tue, 04 Nov 2025 16:37:21 +0000</pubDate>
      <link>https://dev.to/stacklok/using-token-exchange-with-toolhive-and-okta-for-mcp-server-to-graphql-authentication-12in</link>
      <guid>https://dev.to/stacklok/using-token-exchange-with-toolhive-and-okta-for-mcp-server-to-graphql-authentication-12in</guid>
      <description>&lt;p&gt;This article builds on our &lt;a href="https://dev.to/stacklok/beyond-api-keys-token-exchange-identity-federation-mcp-servers-5dm8"&gt;previous post&lt;/a&gt;, where we introduced the core concepts of token exchange and its role in secure authentication. Here, we delve into a practical application, demonstrating how to leverage Okta and ToolHive to facilitate token exchange for authenticating an MCP server with a GraphQL API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment
&lt;/h2&gt;

&lt;p&gt;This demo mimics a (hopefully!) real world example where we run an API service and we want to expose it with an MCP server. The back end API requires a token with &lt;em&gt;aud=backend&lt;/em&gt; and &lt;em&gt;scopes=[backend-api:read]&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;"Aud" (audience) in a token specifies the intended recipient of the token, indicating which service or application is meant to consume it. "Scopes" define the specific permissions or access rights granted by the token, detailing what actions the token holder is authorized to perform. Only tokens having the expected audience and the expected scopes authorize the caller to use the service.&lt;/p&gt;

&lt;p&gt;We don’t want to expose the back end service directly to the AI client, but only through the MCP server. We also want to maintain a clean audit trail showing us who accessed what.&lt;/p&gt;

&lt;p&gt;The MCP server requires a token with &lt;em&gt;aud=mcpserver&lt;/em&gt; and &lt;em&gt;scopes=mcp:tools:call&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Both the API service and the MCP server are part of the same Okta realm, but we’ll use different Authorization Servers to ensure that the token the MCP server receives and the token the back end receives use different audiences.&lt;/p&gt;

&lt;p&gt;We’ll simulate the whole flow as a developer connecting to this setup by adding the MCP server to VSCode and calling the tools it provides.&lt;/p&gt;

&lt;p&gt;It should be noted that in this example, we’ll be using an &lt;a href="https://www.apollographql.com/docs" rel="noopener noreferrer"&gt;Apollo&lt;/a&gt;-based GraphQL service as the backend API service and the existing &lt;a href="https://www.apollographql.com/docs/apollo-mcp-server" rel="noopener noreferrer"&gt;Apollo MCP server&lt;/a&gt;, but the same setup applies to any kind of API services as long as they both use OAuth tokens from the same realm as the authentication mechanism. &lt;/p&gt;

&lt;p&gt;In order to follow along, you can clone the Apollo GraphQL service from &lt;a href="https://github.com/StacklokLabs/apollo-mcp-auth-demo" rel="noopener noreferrer"&gt;a demo repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Okta setup
&lt;/h2&gt;

&lt;p&gt;I’ve used the Okta integrator setup to prepare this demo, and therefore the instructions cover the whole setup from the ground up, including creating the Authorization Servers. In a real-world environment, parts of this are likely already in place or will need to be adjusted.&lt;/p&gt;

&lt;h3&gt;
  
  
  Authorization Servers
&lt;/h3&gt;

&lt;p&gt;To logically separate the MCP server from the back end API service, we’ll configure two Okta Authorization servers - one for the MCP server and client and the other for the backend server. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyrkj0nwo8nmto911fag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyrkj0nwo8nmto911fag.png" alt="Okta authorization servers" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create the Authorization Servers and then the following scopes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mcpserver AS &lt;em&gt;mcp:tools:call&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;backend AS &lt;em&gt;backend-api:read&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Trust between authorization servers
&lt;/h4&gt;

&lt;p&gt;In order to enable token exchange between two authorization servers - the one that issues tokens for access to the MCP server and the one that issues tokens for accessing the back end, we need to establish trust between the two.&lt;/p&gt;

&lt;p&gt;Go to the back end AS and down at the settings tab, add the mcpserver AS as trusted:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wycf3a7pyhzfjua0719.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wycf3a7pyhzfjua0719.png" alt="Okta trusted server" width="419" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Applications
&lt;/h3&gt;

&lt;p&gt;We’ll set up two Applications:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;em&gt;VSCode client&lt;/em&gt; to authenticate to the MCP server. We create a client directly to avoid Dynamic Client registration. This will be an OIDC application with a client ID and a secret. It is important to match the Redirect URIs that VSCode uses. Set the Redirect URIs to &lt;a href="http://127.0.0.1:33418" rel="noopener noreferrer"&gt;http://127.0.0.1:33418&lt;/a&gt; and &lt;a href="https://vscode.dev/redirect" rel="noopener noreferrer"&gt;https://vscode.dev/redirect&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;em&gt;toolhive client&lt;/em&gt; that will perform the Token Exchange. This is an API Services type in Okta lingo. To create the application, go to Applications -&amp;gt; Create App Integration and select API Services&lt;/li&gt;
&lt;li&gt;Name your application&lt;/li&gt;
&lt;li&gt;On the application page, navigate to the General Settings tab and uncheck the “Require Demonstrating Proof of Possession” option, as this is not yet supported by ToolHive&lt;/li&gt;
&lt;li&gt;Check the Token Exchange grant&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfkajllk1la4o93y0vve.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfkajllk1la4o93y0vve.png" alt="Token exchange grant" width="478" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Policies
&lt;/h3&gt;

&lt;p&gt;For applications to authenticate, we need to include them in policies; otherwise, Okta will not issue tokens to the clients. We’ll define two policies: one that allows the MCP client (VSCode) to request tokens with &lt;em&gt;mcp:tools:call&lt;/em&gt;, and another that allows the token exchange by the ToolHive process.&lt;/p&gt;

&lt;h4&gt;
  
  
  MCP client to MCP server
&lt;/h4&gt;

&lt;p&gt;This policy is to be defined on the mcpserver AS side. Select “Add New Access Policy”, then “Assign to the following Clients” and select the VSCode client. When the policy is created, click “Add Rule” in the policy, and in the “And the following scopes” section add both the “OpenID Connect” scopes and the &lt;em&gt;mcp:tools:call&lt;/em&gt; scope.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22mxe559jdbd901o637c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22mxe559jdbd901o637c.png" alt="Scopes" width="512" height="326"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  MCP server token exchange
&lt;/h4&gt;

&lt;p&gt;This policy is to be defined on the back end AS side. Select “Add New Access Policy”, then “Assign to the following Clients” and select the ToolHive client. When adding the rule, don’t forget to unroll “Advanced” under the “If Grant Type Is” section and add Token Exchange. Add “&lt;em&gt;backend-api:read&lt;/em&gt;” to the scopes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fed28ypvi5nioio3fpu5z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fed28ypvi5nioio3fpu5z.png" alt="Scopes" width="512" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8uarrmw0krmln9zc3vi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8uarrmw0krmln9zc3vi.png" alt="Token exchange" width="512" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Running the GraphQL server
&lt;/h3&gt;

&lt;p&gt;Let’s clone our server locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/StacklokLabs/apollo-mcp-auth-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let’s configure the IDP settings in the .env file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp .env.example .env
vim .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using my Okta integrator account, the .env file looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Okta Configuration
# Your Okta domain (e.g., dev-123456.okta.com)
OKTA_DOMAIN=integrator-3683736.okta.com

# Your Okta issuer URL (authorization server)
# For default authorization server: https://your-domain.okta.com/oauth2/default
# For custom authorization server: https://your-domain.okta.com/oauth2/{authServerId}
OKTA_ISSUER=https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697

# JWT Validation Configuration
# Expected audience in JWT tokens (space-separated if multiple)
OKTA_AUDIENCE=backend
# Required scopes in JWT tokens (space-separated)
REQUIRED_SCOPES=backend-api:read

# Authentication Configuration
# Set to 'true' to require valid tokens for all requests (recommended)
# Set to 'false' to disable authentication requirement (for testing)
REQUIRE_AUTH=true

# Server Configuration
PORT=4000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we’re ready to start the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install
npm start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running ToolHive
&lt;/h3&gt;

&lt;p&gt;In our testing, we’re using the already existing Apollo MCP server with no modifications - all the heavy lifting is done by ToolHive. The Apollo MCP server is merely configured to accept the downstream authentication token in the &lt;em&gt;Authorization: Bearer&lt;/em&gt; HTTP header and forward it to the external API.&lt;br&gt;
The MCP server configuration can be found in the &lt;a href="https://github.com/StacklokLabs/apollo-mcp-auth-demo/blob/main/mcp-server-data/apollo-mcp-config.yaml" rel="noopener noreferrer"&gt;mcp-server-data directory&lt;/a&gt; in the demo repository.&lt;/p&gt;

&lt;p&gt;Because the unmodified MCP server also validates the incoming tokens, we need to set the &lt;em&gt;transport.auth.servers&lt;/em&gt; attribute in the config file to the &lt;em&gt;back end&lt;/em&gt; Authorization server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vim mcp-server-data/apollo-mcp-config.yaml

...
transport:
  type: sse
  port: 8000
  auth:
    servers:
      - https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can run the server with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv run \
--debug \
--foreground \
--transport streamable-http \
--name apollo \
--target-port 8000 \
--proxy-port 8000 \
--volume $(pwd)/mcp-server-data/apollo-mcp-config.yaml:/config.yaml \
--volume $(pwd)/mcp-server-data:/data \
--oidc-audience mcpserver \
--resource-url http://localhost:8000/mcp \
--oidc-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
--oidc-jwks-url https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys \
--token-exchange-audience backend \
--token-exchange-client-id 0oawdgw7krVBSwzIx697 \
--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
--token-exchange-scopes backend-api:read \
--token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token \
apollo-mcp-server -- /config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s unpack the parameters:&lt;br&gt;
--oidc-audience mcpserver - When the OIDC token from VSCode arrives at ToolHive, ToolHive checks that the token’s aud claim matches this value and rejects the connection otherwise&lt;/p&gt;

&lt;p&gt;--resource-url &lt;a href="http://localhost:8000/mcp" rel="noopener noreferrer"&gt;http://localhost:8000/mcp&lt;/a&gt; - Setting the resource explicitly helps VSCode discover the proper Protected Resource Metadata endpoint as per the MCP specification, and in effect points VSCode at the Okta instance. It is typically not needed in e.g. Kubernetes environments, where the service name can be used&lt;/p&gt;

&lt;p&gt;--oidc-issuer &lt;a href="https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697" rel="noopener noreferrer"&gt;https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697&lt;/a&gt; - This is the issuer of the mcpserver Authorization Server (see the first screenshot of the document)&lt;/p&gt;

&lt;p&gt;--oidc-jwks-url &lt;a href="https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys" rel="noopener noreferrer"&gt;https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697/v1/keys&lt;/a&gt; - The JWKS endpoint of the mcpserver Authorization Server&lt;/p&gt;

&lt;p&gt;--token-exchange-audience 'backend' - We want ToolHive to take the incoming tokens and exchange them for tokens with audience of “backend”&lt;/p&gt;

&lt;p&gt;--token-exchange-client-id 0oawdgw7krVBSwzIx697 - The client ID of the “ToolHive client”, the one the token exchange policy is assigned to&lt;/p&gt;

&lt;p&gt;--token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 - The client secret of the ToolHive client. Outside demos, please use the --token-exchange-client-secret-file switch instead, or the TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET environment variable&lt;/p&gt;

&lt;p&gt;--token-exchange-scopes 'backend-api:read' - The scopes we request for the external token. Must match what’s in the policy&lt;/p&gt;

&lt;p&gt;--token-exchange-url &lt;a href="https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token" rel="noopener noreferrer"&gt;https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token&lt;/a&gt; - The token endpoint of the back end Authorization Server&lt;/p&gt;
&lt;p&gt;Note that the example above uses &lt;em&gt;thv run&lt;/em&gt;, but it’s equally possible to use token exchange with &lt;em&gt;thv proxy&lt;/em&gt;, which can then also provide authentication to the MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;thv proxy demo-mcp-server \
    --target-uri http://localhost:8091 \
    --port 3000 \
    --remote-auth \
    --remote-auth-client-id 0oawdhc2mlgHOwNvW697 \
    --remote-auth-client-secret Ag0Zj6ALuxxqascP6KJ-CA4uCRcOLmIKtQeR_o3ClGgxMxx0zcgZYYtg-TmHF6U- \
    --remote-auth-issuer https://integrator-3683736.okta.com/oauth2/ausw8f1ut6X0WMjZN697 \
    --remote-auth-scopes 'mcp:tools:call,openid,email' \
    --token-exchange-audience 'backend' \
    --token-exchange-client-id 0oawdgw7krVBSwzIx697 \
    --token-exchange-client-secret O2zqVb-evhKgfBOD-PRVDs5HFyCXAnRZAwxAtQOH9oGt72aBrLBiwEVlyyTengj9 \
    --token-exchange-scopes 'backend-api:read' \
    --token-exchange-url https://integrator-3683736.okta.com/oauth2/auswdh3wurjeJ62La697/v1/token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Authentication from VSCode and putting it all together
&lt;/h3&gt;

&lt;p&gt;Once the server is running, it should automatically appear in the list of configured MCP servers in VSCode. Clicking Start will prompt you to authenticate against Okta; the first time, you’ll also be prompted to enter the client ID and secret. Once Okta authenticates you, VSCode receives the token and uses it to authenticate to the MCP server (ToolHive), which exchanges it for a token that enables calling the back end API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqacyukmvg7hdiuh9sbi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqacyukmvg7hdiuh9sbi.png" alt="VSCode" width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Past the initial setup on the IdP side, authentication and authorization to the MCP server fronted by ToolHive, and by extension to the back end service, is seamless. It allows partitioned access to the back end services and provides a cleaner audit trail.&lt;/p&gt;

&lt;p&gt;As the last step, we can invoke one of the MCP tools to verify the setup end-to-end:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0aakq5mtogxz8bdcq0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0aakq5mtogxz8bdcq0d.png" alt="MCP tools" width="800" height="811"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As seen in the screenshot above, the GetCountry tool of the Apollo server was called and returned a reply! If we check the logs of the API server we ran earlier, we also see details of the token that was validated:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4frsoeoauh9o04zydipb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4frsoeoauh9o04zydipb.png" alt="Tool usage" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This token has a different audience than the one passed to ToolHive. If you recall the thv run parameters, the &lt;em&gt;--oidc-audience&lt;/em&gt; mcpserver argument specified that incoming tokens must set the &lt;em&gt;aud&lt;/em&gt; claim to &lt;em&gt;mcpserver&lt;/em&gt;, while the token that arrived at the back end API has audience &lt;em&gt;backend&lt;/em&gt;. Looking closely at the issuer, we also see that this token was issued by the back end Authorization Server, while the tokens used to authenticate to ToolHive were issued by the mcpserver Authorization Server. This shows that the token exchange works correctly. In the next section, we’ll illustrate for completeness’ sake exactly how the tokens look and how the whole flow works.&lt;/p&gt;

&lt;h2&gt;
  
  
  The token exchange under the hood
&lt;/h2&gt;

&lt;p&gt;The flow is described in the Mermaid diagram below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89fvfxlk28z6kfz5dohy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89fvfxlk28z6kfz5dohy.png" alt="Diagram" width="512" height="452"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The client authenticates to ToolHive, which exposes the interface and endpoints that the &lt;a href="https://modelcontextprotocol.io/docs/tutorials/security/authorization" rel="noopener noreferrer"&gt;MCP standard describes&lt;/a&gt;. The ToolHive authentication middleware verifies that the token was issued by the expected IdP and has the expected audience. After authentication, the token is passed to the token exchange middleware, which contacts the IdP and exchanges the token meant for the MCP server for a token meant for the external service.&lt;/p&gt;
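&lt;p&gt;To make the exchange step concrete, here is a minimal Python sketch of the RFC 8693 request body that a token exchange middleware would send to the token endpoint. The token value is a placeholder, and the audience and scope simply mirror the flags shown earlier; this is an illustration, not ToolHive’s actual implementation.&lt;/p&gt;

```python
# Illustrative sketch of an RFC 8693 token exchange request body.
# The subject token, audience, and scopes are placeholders mirroring
# the thv run flags above, not values from a real deployment.
from urllib.parse import urlencode

def build_exchange_request(subject_token: str) -> dict:
    """Form parameters for exchanging an MCP-server token for a backend token."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": "backend",        # matches --token-exchange-audience
        "scope": "backend-api:read",  # matches --token-exchange-scopes
    }

# A real client would POST this form-encoded body to the back end
# Authorization Server's /v1/token endpoint, authenticating with the
# ToolHive client ID and secret.
body = urlencode(build_exchange_request("eyJ..."))
print(body)
```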

&lt;p&gt;The token issued to the client might look like this (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "mcp-server",
    "scp": [
        "backend-mcp:tools:call",
        "backend-mcp:tools:list"
    ],
    "sub": "user@example.com"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While the exchanged token would have different scopes and a different audience, allowing the MCP server to authenticate to the back end service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "backend-server",
    "scp": [
        "backend-api:read"
    ],
    "sub": "user@example.com"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This exchanged token is then injected into the &lt;em&gt;Authorization: Bearer&lt;/em&gt; HTTP header and passed on to the actual MCP server running under ToolHive. The MCP server can then use the token to call the back end service.&lt;/p&gt;
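&lt;p&gt;The header injection itself amounts to something like this small Python helper (the function name is ours, purely illustrative):&lt;/p&gt;

```python
# Minimal sketch of attaching the exchanged token to the request that is
# forwarded to the MCP server. The helper name is illustrative, not part
# of ToolHive's codebase.
def with_exchanged_token(headers: dict, exchanged_token: str) -> dict:
    out = dict(headers)  # copy so the caller's headers are not mutated
    out["Authorization"] = f"Bearer {exchanged_token}"
    return out

forwarded = with_exchanged_token({"Accept": "application/json"}, "eyJ...")
print(forwarded["Authorization"])
```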

&lt;h2&gt;
  
  
  Summary and benefits
&lt;/h2&gt;

&lt;p&gt;By leveraging token exchange, ToolHive enables MCP servers to authenticate to third-party APIs in a &lt;strong&gt;secure, efficient, and tenant-aware&lt;/strong&gt; way. MCP servers receive properly scoped, short-lived access tokens instead of embedding long-lived secrets or bespoke authentication logic. Each API call made upstream can be attributed to the &lt;strong&gt;individual user identity&lt;/strong&gt; rather than a generic service account, making audit trails clearer and more meaningful.&lt;/p&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/docs/tutorials/security/authorization" rel="noopener noreferrer"&gt;https://modelcontextprotocol.io/docs/tutorials/security/authorization&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.okta.com/docs/guides/set-up-token-exchange/main/" rel="noopener noreferrer"&gt;https://developer.okta.com/docs/guides/set-up-token-exchange/main/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Beyond API Keys: Token Exchange, Identity Federation &amp; MCP Servers</title>
      <dc:creator>Yolanda Robla Mota</dc:creator>
      <pubDate>Thu, 30 Oct 2025 11:04:03 +0000</pubDate>
      <link>https://dev.to/stacklok/beyond-api-keys-token-exchange-identity-federation-mcp-servers-5dm8</link>
      <guid>https://dev.to/stacklok/beyond-api-keys-token-exchange-identity-federation-mcp-servers-5dm8</guid>
      <description>&lt;p&gt;Modern backend systems—especially in the era of AI agents, MCP servers, and multi-cloud architectures—are evolving far beyond static credentials and monolithic identity models. In this post we explore the architecture of token exchange, identity federation, and how a system like &lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;ToolHive&lt;/a&gt; enables secure deployment of MCP servers in this world.&lt;/p&gt;

&lt;h2&gt;
  
  
  The legacy problem: static credentials
&lt;/h2&gt;

&lt;p&gt;The MCP authorization specification focuses on how to authorize access to the MCP server itself. It doesn't specify how an MCP server should authenticate with the server it's connecting to. This leaves MCP server creators without clear guidance.&lt;/p&gt;

&lt;p&gt;In many deployments of MCP (Model Context Protocol) servers and tooling services today, developers still default to patterns like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A service-account JSON key or a long-lived API key embedded in configuration.&lt;/li&gt;
&lt;li&gt;All calls executed under a single “shared identity” with elevated permissions.&lt;/li&gt;
&lt;li&gt;If the key is compromised, the impact spans many users or tenants; rotating or tracking the key is operationally heavy.&lt;/li&gt;
&lt;li&gt;Least-privilege is often compromised because the shared identity needs broad access to avoid blocking tool invocation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach doesn’t align with how modern identity systems, federated services and cloud tools are designed. It’s less secure, harder to govern, and doesn’t scale across users or multi‐tenant environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step up: Short-lived tokens via an IdP
&lt;/h2&gt;

&lt;p&gt;A much better pattern emerges when you shift to short-lived tokens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A user (or service) authenticates via an Identity Provider (IdP) — for example, Okta or Azure AD.&lt;/li&gt;
&lt;li&gt;They receive a short-lived token (OIDC ID token or OAuth access token) that's scoped to their identity and minimal permissions.&lt;/li&gt;
&lt;li&gt;This token is used to authenticate to the MCP server (with the help of ToolHive), which validates it and establishes the user's identity.&lt;/li&gt;
&lt;li&gt;ToolHive then acquires a separate token for the downstream backend API—either through token exchange (if using the same IdP) or federation (if crossing identity domains).&lt;/li&gt;
&lt;li&gt;Your MCP server receives this backend-scoped token and uses it when calling downstream services or tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because tokens are scoped, time-limited, and mapped to a specific user context, you get better auditability, enforce least-privilege, and eliminate static credentials. Next, we’ll show you how to ensure that your MCP server always has the right credentials for its backend API without embedding secrets or handling complex auth flows.&lt;/p&gt;
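&lt;p&gt;The validation step in this flow boils down to checking the token’s claims against expected values. Here is a simplified Python sketch; a real gateway must also verify the token’s signature against the IdP’s JWKS, which is omitted here:&lt;/p&gt;

```python
# Simplified sketch of the claim checks a gateway performs on an incoming
# token before exchanging it: issuer, audience, and expiry. Signature
# verification against the IdP's JWKS is deliberately omitted.
import time

def validate_claims(claims: dict, expected_iss: str, expected_aud: str) -> bool:
    if claims.get("iss") != expected_iss:
        return False  # token came from the wrong issuer
    if claims.get("aud") != expected_aud:
        return False  # token was minted for a different audience
    if claims.get("exp", 0) <= time.time():
        return False  # token has expired
    return True

claims = {
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "mcp-server",
    "exp": time.time() + 300,
    "sub": "user@example.com",
}
print(validate_claims(claims, "https://idp.example.com/oauth2/default", "mcp-server"))
```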

&lt;h2&gt;
  
  
  Token Exchange &amp;amp; Federation: crossing trust-boundaries
&lt;/h2&gt;

&lt;p&gt;Token exchange refers to the process where one security token (issued by one identity domain) is presented to a “Security Token Service” (STS) or similar endpoint, and in return you receive a new token valid for another domain, audience, or scope.&lt;br&gt;
The standard for this is &lt;a href="https://www.rfc-editor.org/rfc/rfc8693.html" rel="noopener noreferrer"&gt;RFC 8693&lt;/a&gt; (OAuth 2.0 Token Exchange), which lets you request a new token via the &lt;em&gt;urn:ietf:params:oauth:grant-type:token-exchange&lt;/em&gt; grant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use-cases for token exchange include:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A token issued by your internal IdP being exchanged for a token valid for a cloud provider’s API.&lt;/li&gt;
&lt;li&gt;A token from one IdP being reused to obtain tokens in another trust domain without forcing the user to log in again.&lt;/li&gt;
&lt;li&gt;A service acting on behalf of a user, exchanging its own token for one with narrower scopes or different audiences.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Two common scenarios
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A) The downstream service uses the same IdP as the MCP server&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this case your identity provider (IdP) issues tokens for both the MCP server and the downstream resources. No cross-domain trust is needed.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User authenticates via IdP → obtains a token for the MCP server.&lt;/li&gt;
&lt;li&gt;ToolHive validates the token and performs access control checks.&lt;/li&gt;
&lt;li&gt;ToolHive exchanges that token with the same IdP for a new token with the downstream service's audience and scopes.&lt;/li&gt;
&lt;li&gt;MCP server receives this exchanged token and uses it to call the downstream service.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup is simpler and has fewer moving parts, since the exchange happens within the same IdP ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80v5zykok3bry9locp99.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80v5zykok3bry9locp99.png" alt="Token exchange with single IDP" width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The token issued to the client might look like this (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
   "iss": "https://idp.example.com/oauth2/default",
   "aud": "mcp-server",
   "scp": [
     "backend-mcp:tools:call",
     "backend-mcp:tools:list"
   ],
   "sub": "user@example.com"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While the exchanged token would have different scopes and a different audience, allowing the MCP server to authenticate to the back end service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "iss": "https://idp.example.com/oauth2/default",
    "aud": "backend-server",
    "scp": [
        "backend-api:read"
    ],
    "sub": "user@example.com"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;B) The downstream service uses a different IdP and you rely on federation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here you have two distinct identity/trust domains: one used by the MCP server (or its IdP) and another used by the back end resource. Instead of issuing separate credentials or having users login twice, you rely on federation and token exchange.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User authenticates via IdP A → receives a token for domain A that is presented to ToolHive&lt;/li&gt;
&lt;li&gt;ToolHive validates the token and performs access control checks.&lt;/li&gt;
&lt;li&gt;ToolHive presents the token to an STS or federation service (e.g., Google Cloud STS) → obtains a federated token valid for domain B (cloud provider).&lt;/li&gt;
&lt;li&gt;Downstream service validates the token from domain B and executes requests under that identity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach enables your system to be IdP-agnostic and cloud-agnostic: authenticate with any IdP, then federate into any trust-configured domain.&lt;/p&gt;
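&lt;p&gt;The federation call into Google Cloud STS is again an RFC 8693 token exchange. The sketch below builds the request body only; the project number, pool, and provider IDs are placeholders (matching the placeholder style of the token examples below), and a real client would POST this to the STS token endpoint:&lt;/p&gt;

```python
# Illustrative sketch of federating an IdP-A token into Google Cloud via
# the STS endpoint (scenario B). PROJECT_NUMBER, POOL_ID, and PROVIDER_ID
# are placeholders; field names follow RFC 8693.
def build_federation_request(idp_token: str) -> dict:
    """RFC 8693 form parameters for a workload identity federation exchange."""
    audience = ("//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/"
                "workloadIdentityPools/POOL_ID/providers/PROVIDER_ID")
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token": idp_token,          # the token issued by IdP A
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
    }

req = build_federation_request("eyJ...")
print(req["audience"])
```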

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgf5czj7ibj0nmg8trmra.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgf5czj7ibj0nmg8trmra.png" alt="Flow diagram about federation" width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The token issued to the client might look like this (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "iss": "https://idp.example.com/oauth2/default",
  "aud": "mcp-server",
  "sub": "user@example.com",
  "email": "user@example.com",
  "scp": [
    "mcp:tools:call",
    "mcp:tools:list"
  ],
  "exp": 1729641600,
  "iat": 1729638000
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exchanged federated access token would have a different issuer, audience, and scopes, allowing the MCP server to authenticate to the upstream service as the federated user identity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "iss": "https://sts.googleapis.com",
  "aud": "https://bigquery.googleapis.com/",
  "sub": "principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/subject/user@example.com",
  "email": "user@example.com",
  "scp": [
    "https://www.googleapis.com/auth/bigquery"
  ],
  "exp": 1729641600,
  "iat": 1729638000
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why this matters for MCP servers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;MCP servers are often deployed to call different services on behalf of users. If they rely on static credentials or simplistic “shared identity” models, you lose user-level attribution, least-privilege control, and auditability.&lt;/li&gt;
&lt;li&gt;By using token exchange + federation, you allow your MCP server to operate under the right identity context, even when the target service sits in a different trust domain.&lt;/li&gt;
&lt;li&gt;It also lets you design your architecture so the authentication piece (login, token issuance) is decoupled from the MCP server logic — the server can remain auth-agnostic and medium-agnostic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where ToolHive fits
&lt;/h2&gt;

&lt;p&gt;ToolHive simplifies deployment of MCP servers by handling the operational and security heavy-lifting.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You run your MCP servers in containers with minimal permissions and network access — ToolHive manages that.&lt;/li&gt;
&lt;li&gt;ToolHive acts as a gateway: it verifies the user's token (via your IdP), enforces access policies, then acquires the appropriate backend token—either through exchange or federation—before passing that to your MCP server.&lt;/li&gt;
&lt;li&gt;This separation means your MCP server remains auth-agnostic — ToolHive handles authN/authZ and you plug in any IdP or downstream STS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This blog post is the first in a series&lt;/strong&gt;. Over the coming posts we’ll dive into a set of &lt;strong&gt;practical examples using ToolHive&lt;/strong&gt; — showing how to wire up different IdPs, federate into different clouds, run MCP servers securely, and deal with real-world edge cases.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: ToolHive is an open source project, and we encourage you to download it (from &lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;toolhive.dev&lt;/a&gt;) and start using it. We value your feedback and would love to engage with you via our &lt;a href="https://github.com/stacklok/toolhive" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; and/or &lt;a href="https://discord.gg/stacklok" rel="noopener noreferrer"&gt;Discord channel&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>devops</category>
      <category>security</category>
    </item>
    <item>
      <title>Cut token waste from your AI workflow with the ToolHive MCP Optimizer</title>
      <dc:creator>Dan Barr</dc:creator>
      <pubDate>Tue, 28 Oct 2025 17:12:08 +0000</pubDate>
      <link>https://dev.to/stacklok/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer-3oo6</link>
      <guid>https://dev.to/stacklok/cut-token-waste-from-your-ai-workflow-with-the-toolhive-mcp-optimizer-3oo6</guid>
      <description>&lt;p&gt;If you’ve ever hit a rate limit in your AI assistant or felt the sting of regret after checking your usage bill, you’re not alone. Whether you’re exploring an open source repo or triaging issues for a sprint, running into token walls is disruptive. It breaks your flow and burns your time and money.&lt;/p&gt;

&lt;p&gt;Turns out, there’s a hidden cost in many of today’s AI-enhanced dev workflows: &lt;strong&gt;tool metadata bloat&lt;/strong&gt;. When dozens (or hundreds) of tools get injected into each prompt, it drives up token usage and slows down responses. Input tokens aren’t free, and cluttering the context window with irrelevant content degrades model performance.&lt;/p&gt;

&lt;p&gt;At Stacklok, we’ve been working with the &lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt; and discovered something surprising. A significant chunk of the tokens burned during AI coding sessions doesn’t come from your prompt, or even the code. It comes from tool descriptions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP Optimizer&lt;/strong&gt;, now available in ToolHive, tackles this problem at the root. It reduces token waste by acting as a smart broker between your AI assistant and MCP servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the waste comes from
&lt;/h2&gt;

&lt;p&gt;Let’s say you’ve installed MCP servers for GitHub, Grafana, and Notion. You ask your assistant:&lt;/p&gt;

&lt;p&gt;“List the 10 most recent issues from my GitHub repo.”&lt;/p&gt;

&lt;p&gt;That simple prompt uses &lt;strong&gt;102,000 tokens&lt;/strong&gt; &lt;em&gt;(total input &amp;amp; output)&lt;/em&gt;, not because the task is complex, but because the model receives metadata for &lt;strong&gt;114 tools&lt;/strong&gt;, most of which have nothing to do with the request.&lt;/p&gt;

&lt;p&gt;Other common prompts create similar waste:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;“Summarize my meeting notes from October 19, 2025”&lt;br&gt;
uses &lt;strong&gt;240,600 tokens&lt;/strong&gt;, again with &lt;strong&gt;114 tools&lt;/strong&gt; injected, even though only the Notion server is relevant&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“Search dashboards related to RDS”&lt;br&gt;
consumes &lt;strong&gt;93,600 tokens&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In each case, only a small fraction of those tokens are relevant to the task. Even saying “hello” burns more than 46,000 tokens.&lt;/p&gt;

&lt;p&gt;Multiply that across even a few dozen prompts per day, and you’re burning &lt;strong&gt;millions of tokens&lt;/strong&gt; on context the model doesn’t need. That’s not just expensive, it’s disruptive. In rate-limited enterprise environments or time-sensitive projects, this inefficiency slows down responses, breaks flow, and cuts directly into productivity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing MCP Optimizer: Smarter tool selection for leaner prompts
&lt;/h2&gt;

&lt;p&gt;Instead of flooding the model with all available tools, MCP Optimizer introduces two lightweight primitives:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;find_tool&lt;/code&gt;: Searches for the most relevant tools using hybrid semantic + keyword search
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;call_tool&lt;/code&gt;: Routes the selected tool request to the appropriate MCP server&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s how it works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You send a prompt that requires tool assistance (for example, interacting with a GitHub repo)
&lt;/li&gt;
&lt;li&gt;The assistant calls &lt;code&gt;find_tool&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;MCP Optimizer returns the most relevant tools (up to 8 by default, but this is configurable)
&lt;/li&gt;
&lt;li&gt;Only those tools are included in the context
&lt;/li&gt;
&lt;li&gt;The assistant uses &lt;code&gt;call_tool&lt;/code&gt; to execute the task&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The results are dramatic. Using the GitHub, Grafana, and Notion MCP servers from the example above:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Prompt&lt;/th&gt;
&lt;th&gt;MCP server used&lt;/th&gt;
&lt;th&gt;Without MCP Optimizer&lt;/th&gt;
&lt;th&gt;With MCP Optimizer&lt;/th&gt;
&lt;th&gt;Token reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Hello&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Tokens*: 46.8k Tools sent: 114&lt;/td&gt;
&lt;td&gt;Tokens: 11.2k Tools sent: 3&lt;/td&gt;
&lt;td&gt;76%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List the latest 10 issues from the stacklok/toolhive repository.&lt;/td&gt;
&lt;td&gt;GitHub&lt;/td&gt;
&lt;td&gt;Tokens: 102k Tools sent: 114&lt;/td&gt;
&lt;td&gt;Tokens: 32.4k Tools sent: 11&lt;/td&gt;
&lt;td&gt;68%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Summarize my meeting notes from Oct 19th 2025&lt;/td&gt;
&lt;td&gt;Notion&lt;/td&gt;
&lt;td&gt;Tokens: 240.6k Tools sent: 114&lt;/td&gt;
&lt;td&gt;Tokens: 86.8k Tools sent: 11&lt;/td&gt;
&lt;td&gt;64%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search the dashboards related to "RDS" in my Grafana workspace&lt;/td&gt;
&lt;td&gt;Grafana&lt;/td&gt;
&lt;td&gt;Tokens: 93.6k Tools sent: 114&lt;/td&gt;
&lt;td&gt;Tokens: 13.7k Tools sent: 11&lt;/td&gt;
&lt;td&gt;85%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;* Total input &amp;amp; output tokens for the request&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;By sending only what’s needed, MCP Optimizer reduces total token usage, shortens response times, and prevents the assistant from thrashing through irrelevant tools.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy33zspy7ovfkbo418bx9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy33zspy7ovfkbo418bx9.png" alt="Bar chart comparing token usage before and after the MCP Optimizer" width="800" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No tokens wasted on excessive metadata. No LLMs spiraling as they try to reason through 100+ tools. Just fast, efficient execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it now
&lt;/h2&gt;

&lt;p&gt;MCP Optimizer is available today as an experimental feature in the ToolHive desktop app. Here’s how to get started:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://toolhive.dev/download/" rel="noopener noreferrer"&gt;Download ToolHive&lt;/a&gt; for your platform.
&lt;/li&gt;
&lt;li&gt;Follow the &lt;a href="https://docs.stacklok.com/toolhive/tutorials/quickstart-ui" rel="noopener noreferrer"&gt;Quickstart guide&lt;/a&gt; and &lt;a href="https://docs.stacklok.com/toolhive/guides-mcp" rel="noopener noreferrer"&gt;MCP usage guides&lt;/a&gt; to install a few MCP servers into the &lt;code&gt;default&lt;/code&gt; group (or another group of your choice).
&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Settings&lt;/strong&gt; (⚙️) screen, enable &lt;em&gt;MCP Optimizer&lt;/em&gt; under &lt;strong&gt;Experimental Features&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;On the &lt;strong&gt;MCP Servers&lt;/strong&gt; screen, click &lt;strong&gt;MCP Optimizer&lt;/strong&gt;, and enable optimization for the &lt;code&gt;default&lt;/code&gt; group.
&lt;/li&gt;
&lt;li&gt;Open the &lt;code&gt;default&lt;/code&gt; group and click &lt;strong&gt;Manage Clients&lt;/strong&gt; to connect your favorite AI client.
&lt;/li&gt;
&lt;li&gt;The optimizer discovers the MCP servers and tools in the default group, and ToolHive automatically connects your clients to the optimizer MCP server.
&lt;/li&gt;
&lt;li&gt;In your AI client, send prompts that require tool usage, like:
“Find a good first issue in the stacklok/toolhive repo to start working on.”&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6kscupkt4mga0zq52kqu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6kscupkt4mga0zq52kqu.gif" alt=" " width="1328" height="708"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For more, see the &lt;a href="https://docs.stacklok.com/toolhive/tutorials/mcp-optimizer" rel="noopener noreferrer"&gt;full tutorial&lt;/a&gt; in the ToolHive documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s next
&lt;/h2&gt;

&lt;p&gt;We’re building ToolHive and MCP Optimizer in the open, and your feedback helps shape what comes next.&lt;/p&gt;

&lt;p&gt;Explore the project at &lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;toolhive.dev&lt;/a&gt; and join our &lt;a href="https://discord.gg/stacklok" rel="noopener noreferrer"&gt;community on Discord&lt;/a&gt; to share your experiences, suggest features, and help make tool-driven AI workflows faster, safer, and more developer-friendly.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
    </item>
    <item>
      <title>Simplify Your AI Agent Development: Test and Tune MCP Servers Instantly with the ToolHive Playground</title>
      <dc:creator>Samuele Verzi</dc:creator>
      <pubDate>Tue, 14 Oct 2025 12:20:05 +0000</pubDate>
      <link>https://dev.to/stacklok/simplify-your-ai-agent-development-test-and-tune-mcp-servers-instantly-with-the-toolhive-playground-5c3a</link>
      <guid>https://dev.to/stacklok/simplify-your-ai-agent-development-test-and-tune-mcp-servers-instantly-with-the-toolhive-playground-5c3a</guid>
      <description>&lt;p&gt;Developing capable AI agents means more than just connecting to a model. It requires testing, tuning, and managing the external tools and servers your agents rely on. That’s where the Model Context Protocol (MCP) comes in, enabling agents to interact with real-world systems through well-defined interfaces.&lt;/p&gt;

&lt;p&gt;But validating and iterating on those MCP servers can be tedious. The ToolHive playground streamlines that process by giving you a sandboxed, conversational environment to test and tune your MCP servers instantly, no complex configuration required. With the playground, you can move from debugging tools to building smarter, production-ready agents in record time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;ToolHive UI&lt;/a&gt; is an open-source project that makes it easy to test and manage MCP servers and their connection to AI clients. You can see the full source code on &lt;a href="https://github.com/stacklok/toolhive-studio" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is how you can leverage the ToolHive playground to simplify your AI agent workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the playground offers
&lt;/h2&gt;

&lt;p&gt;The playground delivers powerful capabilities, all wrapped in a single, unified interface:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Instant testing&lt;/strong&gt;: You can immediately validate MCP server functionality. Just enter your AI model API key (such as for &lt;strong&gt;Anthropic&lt;/strong&gt; or &lt;strong&gt;OpenAI&lt;/strong&gt;), select the MCP servers, and begin testing. This eliminates the need for external tooling just to confirm your MCP server works correctly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detailed information&lt;/strong&gt;: Every interaction with your AI agent is meticulously logged. You see the tool's name, the exact input parameters passed to it, the execution status (success or failure), the raw response data, and the timing information. This visibility ensures you understand exactly how your MCP servers respond.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conversational server management&lt;/strong&gt;: The playground's built-in MCP server (&lt;code&gt;toolhive mcp&lt;/code&gt;) lets you manage your infrastructure using simple natural language commands: no command line, no manual setup. It's integrated, clear management that feels like a conversation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local and remote server support&lt;/strong&gt;: ToolHive lets you run both local MCP servers (on your machine using Docker) and remote MCP servers (accessed via URL), giving you flexibility in how you deploy and test your tools.&lt;/li&gt;
&lt;/ul&gt;
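&lt;p&gt;To make the "detailed information" point concrete, here is a minimal Python sketch of what a per-call log record might look like; the field names are illustrative assumptions, not ToolHive's actual log schema:&lt;/p&gt;

```python
import time

# Hypothetical shape of a tool-call log record, mirroring the fields the
# playground surfaces: tool name, input parameters, status, response, timing.
# Field names here are illustrative, not ToolHive's real schema.
def make_tool_log(tool_name, params, run_tool):
    start = time.monotonic()
    try:
        response = run_tool(params)
        status = "success"
    except Exception as exc:
        response = str(exc)
        status = "failure"
    return {
        "tool": tool_name,
        "input": params,
        "status": status,
        "response": response,
        "duration_ms": round((time.monotonic() - start) * 1000, 2),
    }

# Example: a stubbed-out "list_servers" call that succeeds.
record = make_tool_log("list_servers", {}, lambda p: ["fetch", "filesystem"])
print(record["status"], record["tool"])
```

&lt;p&gt;Capturing both successes and failures in one record shape is what makes the playground's execution log useful for debugging.&lt;/p&gt;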

&lt;h2&gt;
  
  
  Getting started in the playground
&lt;/h2&gt;

&lt;p&gt;Starting with the playground is straightforward. You only need to complete a few simple setup steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  Access the playground
&lt;/h3&gt;

&lt;p&gt;Click the &lt;strong&gt;playground&lt;/strong&gt; tab in the ToolHive UI navigation bar.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F75u2e9i0sa2hknp9azwd.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F75u2e9i0sa2hknp9azwd.webp" alt="ToolHive playground starting page" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure a provider
&lt;/h3&gt;

&lt;p&gt;Click &lt;strong&gt;Configure your API Keys&lt;/strong&gt; to set up access to your chosen AI model providers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft57on86tzsg5cyc2kv45.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft57on86tzsg5cyc2kv45.webp" alt="ToolHive playground API Keys configuration panel showing multiple provider options" width="800" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can configure multiple accounts to test different models and providers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI&lt;/strong&gt; (for GPT models)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic&lt;/strong&gt; (for Claude models)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google&lt;/strong&gt; (for Gemini models)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;xAI&lt;/strong&gt; (for Grok models)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenRouter&lt;/strong&gt; (for access to multiple model providers)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Select MCP tools
&lt;/h3&gt;

&lt;p&gt;Click the tools icon to manage which MCP servers and tools are available to the AI model in the playground. Here, you can toggle the availability of tools from each server, and search or filter them. The &lt;strong&gt;&lt;code&gt;toolhive mcp&lt;/code&gt;&lt;/strong&gt; management server is enabled by default, providing infrastructure management capabilities for both your local and remote MCP servers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbj1z3lkjueg53ige0v26.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbj1z3lkjueg53ige0v26.webp" alt="ToolHive playground tools selection panel with enabled MCP servers and searchable tools list" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Start testing
&lt;/h3&gt;

&lt;p&gt;Once configured, you can start a conversation. The model will utilize all enabled MCP tools to respond to your queries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5vap0cvon17wwik15dz.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5vap0cvon17wwik15dz.webp" alt="ToolHive playground main chat interface showing conversation with AI and MCP tool execution" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing complex workflows
&lt;/h2&gt;

&lt;p&gt;The playground isn't just for simple server validation; it offers an end-to-end testing environment with the features you'd expect from a modern AI client, like rich media attachments and multi-server orchestration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-server orchestration
&lt;/h3&gt;

&lt;p&gt;You can combine multiple MCP servers to create powerful workflows. For example, enable both a &lt;a href="https://docs.stacklok.com/toolhive/guides-mcp/filesystem" rel="noopener noreferrer"&gt;filesystem MCP&lt;/a&gt; server and a data processing server simultaneously. The AI can intelligently coordinate between them to read files, process data, and write results—all through natural conversation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before testing&lt;/strong&gt;: Make sure your MCP servers are enabled in ToolHive, running, and also enabled in the playground's tool selection panel.&lt;/p&gt;

&lt;p&gt;Example workflow:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Read the JSON file from /projects/data/products.json, analyze the inventory levels, and create a summary report&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The AI will use the filesystem server to read the file, process the data using available tools, and provide structured insights.&lt;/p&gt;
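&lt;p&gt;As a rough illustration, the analysis step of that workflow boils down to something like the following Python sketch; the file schema (&lt;code&gt;name&lt;/code&gt; and &lt;code&gt;stock&lt;/code&gt; fields) and the low-stock threshold are hypothetical, not part of ToolHive:&lt;/p&gt;

```python
import json

# A minimal sketch of what the orchestrated workflow amounts to once the AI
# has read the file via the filesystem server: parse the JSON, check
# inventory levels, and build a summary. Field names and the threshold are
# illustrative assumptions.
def summarize_inventory(raw_json, low_stock_threshold=10):
    products = json.loads(raw_json)
    low = [p["name"] for p in products if low_stock_threshold > p["stock"]]
    total = sum(p["stock"] for p in products)
    return {
        "product_count": len(products),
        "total_units": total,
        "low_stock": sorted(low),
    }

sample = '[{"name": "widget", "stock": 3}, {"name": "gear", "stock": 42}]'
print(summarize_inventory(sample))
```

&lt;p&gt;In the playground, the AI reaches this kind of result through tool calls rather than hand-written code; the sketch only shows the shape of the analysis.&lt;/p&gt;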

&lt;h3&gt;
  
  
  Rich media attachments
&lt;/h3&gt;

&lt;p&gt;The playground supports attaching images and PDF documents directly in the conversation, just like any modern AI client. This capability is essential for testing document analysis, image processing, or multimodal workflows.&lt;/p&gt;

&lt;p&gt;Use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document processing&lt;/strong&gt;: Upload a PDF invoice and ask the AI to extract key information using your custom MCP tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image analysis&lt;/strong&gt;: Attach screenshots or diagrams and test how your MCP servers interact with visual data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data validation&lt;/strong&gt;: Share files that your MCP servers need to process and verify the output in real time&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  End-to-end testing
&lt;/h3&gt;

&lt;p&gt;Because the playground behaves like a production MCP client, you can validate complete user journeys before deployment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test tool discovery and selection by the AI&lt;/li&gt;
&lt;li&gt;Verify parameter passing and error handling&lt;/li&gt;
&lt;li&gt;Validate multi-step workflows that require tool chaining&lt;/li&gt;
&lt;li&gt;Confirm proper handling of different file formats and media types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This comprehensive testing environment means you can catch integration issues early, reducing the risk of problems when you connect your MCP servers to external AI clients like GitHub Copilot, Cursor, or other applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversational power: managing servers with natural language
&lt;/h2&gt;

&lt;p&gt;The true elegance of the playground lies in managing your MCP infrastructure using the same chat interface you use to test its functionality.&lt;/p&gt;

&lt;p&gt;The built-in &lt;code&gt;toolhive mcp&lt;/code&gt; server enables powerful, conversational commands, offering a streamlined approach with significant benefits:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Benefit to you&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Unified interface&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manage infrastructure using the exact same conversational interface as testing.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Contextual operations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The AI understands your current server state and can make intelligent decisions about which servers to manage.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reduced complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;You don't need to switch between traditional command-line interfaces and the chat interface. Everything can be done through conversation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Observability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;All management actions are logged alongside tool executions, providing clear visibility.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For example, to check the running state of all hosted servers, you can simply ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Can you list all my MCP servers and show their current status?&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The AI executes the &lt;code&gt;list_servers&lt;/code&gt; tool, providing immediate, structured feedback directly in the conversation panel:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw6y38qzo9621cpgpzqg.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw6y38qzo9621cpgpzqg.webp" alt="ToolHive playground showing list_servers tool execution results with server statuses" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also carry out complex, maintenance-focused requests easily, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Start the fetch MCP server for me&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Stop all unhealthy MCP servers&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Show me the logs for the meta-mcp server&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Recommended practices for effective testing
&lt;/h2&gt;

&lt;p&gt;To get the most out of the playground, keep these best practices in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Isolated testing&lt;/strong&gt;: Test individual MCP servers one at a time to validate their core functionality.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration testing&lt;/strong&gt;: Enable multiple servers to test how they work together and to surface tool conflicts. Use the same models as in production to ensure consistent behavior and expected tool calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance validation&lt;/strong&gt;: Monitor tool execution times under different loads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling&lt;/strong&gt;: Intentionally create error conditions to ensure your tools, and the AI's response, handle failures gracefully.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ToolHive playground transforms the intricate process of setting up, managing, and validating Model Context Protocol servers into an intuitive, seamless experience. It provides you with the visibility and control you need to confidently deploy secure and effective AI agents.&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://toolhive.dev" rel="noopener noreferrer"&gt;
    Try ToolHive UI Now
  &lt;/a&gt;
&lt;/p&gt;

</description>
      <category>toolhive</category>
      <category>agents</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
