If you're working with AI coding assistants like GitHub Copilot or Claude, you've probably encountered MCP (Model Context Protocol) servers. They're powerful, connecting your AI to GitHub, Jira, Slack, cloud providers, and more. But here's the problem: each connection requires separate configuration, authentication, and maintenance.
Managing MCP server connections gets messy fast. That’s why we built the Virtual MCP Server (vMCP) in ToolHive: it aggregates multiple MCP servers into a single unified endpoint.
The problem: connection overload
Picture this: you're an engineer on a platform team. Your AI assistant needs access to GitHub for code, Jira for tickets, Slack for notifications, PagerDuty for incidents, Datadog for metrics, AWS for infrastructure, Confluence for docs, and your internal knowledge base. That's 8 separate MCP server connections, each exposing 10-20+ tools. Now your AI's context window is filling up with 80+ tool descriptions, burning tokens and degrading performance as the LLM struggles to select the right tools from an overwhelming list.
Each MCP server connection requires:
- Individual configuration in your AI client
- Separate authentication credentials
- Manual coordination when tasks span multiple systems
- Repeated parameter entry (same repo, same channel, same database)
- Tool filtering to avoid context bloat and wasted tokens
Want to investigate a production incident? You're manually running commands across 4 different systems and piecing together the results yourself. Deploying an app? You're orchestrating a sequence of operations: merge PR, wait for CI, get approval, deploy, notify team. It's tedious, error-prone, and not reusable.
The solution: aggregate everything
vMCP transforms those 8 connections into one. You configure a single MCP endpoint that aggregates all your backend servers.
Before vMCP:
```json
{
  "servers": {
    "github": { "url": "..." },
    "jira": { "url": "..." },
    "slack": { "url": "..." },
    "pagerduty": { "url": "..." },
    "datadog": { "url": "..." },
    "aws": { "url": "..." },
    "confluence": { "url": "..." },
    "docs": { "url": "..." }
  }
}
```
With vMCP:
```json
{
  "servers": {
    "company-tools": {
      "url": "http://vmcp.company.com/mcp"
    }
  }
}
```
One connection. One authentication flow. All your tools available.
And here’s the key: you can run as many vMCP instances as you need. Your frontend team connects to one vMCP with their specific tools. Your platform team connects to another with infrastructure access. Each vMCP aggregates exactly the backends that each team needs, with appropriate security policies and permissions.
This matters for two reasons: security (no more giving everyone access to everything) and efficiency (fewer tools means smaller context windows, which means lower token costs and better AI performance).
What vMCP does
vMCP is part of the ToolHive Kubernetes Operator. It acts as an intelligent aggregation layer that sits between your AI client and your backend MCP servers.
1. Multi-server aggregation with tool filtering
All MCP tools appear through a single endpoint, but you cherry-pick exactly which tools to expose.
Example: An engineer on the ToolHive team gets a single vMCP connection with:
- GitHub’s `search_code` tool (scoped to the `stacklok/toolhive` repo only)
- The ToolHive docs MCP server
- An internal docs server hooked up to Google Drive and filtered to ToolHive design docs
- Slack (only the `#toolhive-team` channel)
No irrelevant tools cluttering the LLM's context. No wasted tokens on unused tool descriptions. Just the tools needed for their work, making it easier for the AI to select the right tool every time.
When multiple MCP servers have tools with the same name (both GitHub and Jira have `create_issue`), vMCP automatically prefixes them: `github_create_issue` and `jira_create_issue`. You can customize these names however you want.
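To make this concrete, a `VirtualMCPServer` spec along these lines could express the filtering and renaming described above. The field names (`groupRef`, `tools`, `include`, `overrides`) and the `apiVersion` here are illustrative sketches, not the operator's exact schema; consult the ToolHive documentation for the real one.

```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: toolhive-team
spec:
  groupRef: platform-tools          # aggregate backends from this MCPGroup
  tools:
    - server: github
      include: [search_code]        # expose only this tool from the GitHub backend
    - server: jira
      include: [create_issue]
      overrides:
        create_issue: jira_create_issue   # resolve the name clash with GitHub explicitly
```

The point is that the curation lives in one declarative resource, so "which tools does this team see" becomes reviewable configuration rather than per-client setup.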
2. Declarative multi-system workflows
Real tasks often require coordinating across multiple systems. vMCP lets you define deterministic workflows that execute in parallel with conditionals, error handling, and approval gates.
Example: Incident investigation
Instead of manually jumping between 4 different systems, copy/pasting data, and aggregating the results, a single “composite tool” could:
→ Query logs from the logging system
→ Fetch metrics from the monitoring platform
→ Pull traces from the tracing service
→ Check infrastructure status from the cloud provider
→ Combine everything into a report
→ Create a Jira ticket with the findings
vMCP executes all queries in parallel, automatically aggregates the data, and creates the ticket. Define the workflow once, use it for every incident.
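As a sketch, a composite tool along these lines could encode the investigation. The step syntax and field names here are assumptions for illustration (the actual ToolHive workflow schema may differ), and tool names like `datadog_get_metrics` stand in for whatever prefixed backend tools you have aggregated.

```yaml
compositeTools:
  - name: investigate_incident
    parameters:
      - name: incident_id
        type: string
    steps:
      # steps with no dependencies can run in parallel
      - id: logs
        tool: logging_query_logs
      - id: metrics
        tool: datadog_get_metrics
      - id: traces
        tool: tracing_get_traces
      - id: infra
        tool: aws_describe_status
      # the final step waits for all four queries, then files the ticket
      - id: ticket
        tool: jira_create_issue
        dependsOn: [logs, metrics, traces, infra]
```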
Example: App deployment
A typical deployment workflow handled end-to-end:
→ Merge pull request in GitHub
→ Wait for CI tests to pass
→ Request human approval (using MCP elicitation)
→ Deploy (only if approved)
→ Notify team in Slack
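The same idea, sketched for the deployment case: a sequential chain with an approval gate and a conditional step. Again, the schema is illustrative (the `type: elicitation` and `condition` fields are assumptions, not documented syntax), and the tool names are hypothetical.

```yaml
compositeTools:
  - name: deploy_app
    steps:
      - id: merge
        tool: github_merge_pull_request
      - id: ci
        tool: github_wait_for_checks
        dependsOn: [merge]
      - id: approval
        type: elicitation               # pause and ask a human via MCP elicitation
        prompt: "CI passed. Deploy to production?"
        dependsOn: [ci]
      - id: deploy
        tool: deployer_sync             # hypothetical deployment tool
        condition: approval.approved    # only runs if the gate passed
        dependsOn: [approval]
      - id: notify
        tool: slack_post_message
        dependsOn: [deploy]
```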
3. Pre-configured defaults and guardrails
Stop typing the same parameters repeatedly. Configure defaults once in vMCP.
Before: Every GitHub query requires specifying `repo: stacklok/toolhive`
After: The repo is pre-configured. Engineers never specify it, and they can't accidentally query the wrong one.
This isn’t just convenience; it’s about deterministic behavior and security. By pre-configuring parameters, you ensure tools behave consistently, and users can only access resources you’ve explicitly exposed. No more accidental queries against the wrong repo, Slack channel, database, cloud region, or anything else you reference repeatedly.
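In configuration terms, a pinned default might look like the fragment below. The `defaults` field name is an illustrative assumption rather than the operator's documented schema.

```yaml
tools:
  - server: github
    defaults:
      repo: stacklok/toolhive   # injected into every GitHub call; clients never see or override it
```

Because the parameter is injected server-side, it also disappears from the tool description the LLM sees, which both shrinks the context and removes a whole class of wrong-argument mistakes.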
4. Tool customization and security policies
Third-party MCP servers often expose generic, unrestricted tools. vMCP lets you wrap and restrict them without modifying upstream servers.
Security policy enforcement: Restrict a website fetch tool to internal domains only (*.company.com), validate URLs before calling the backend, and provide clear error messages for violations.
Simplified interfaces: That AWS EC2 tool with 20+ parameters? Create a wrapper that only exposes the 3 parameters your frontend team actually needs, with safe defaults for everything else.
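A hedged sketch of the fetch restriction described above, with illustrative field names (`constraints`, `pattern`, and `errorMessage` are assumptions, not the documented schema):

```yaml
tools:
  - server: fetch
    overrides:
      fetch:
        constraints:
          url:
            pattern: "https://*.company.com/*"   # validate before the backend is ever called
        errorMessage: "Only internal company domains may be fetched"
```

The backend server stays untouched; the policy lives entirely in the vMCP layer, so upstream updates don't erase your restrictions.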
5. Centralized authentication
vMCP implements a two-boundary authentication model with a complete audit trail. Your AI client authenticates once to vMCP using the OAuth 2.1 methods defined in the official MCP spec. vMCP handles authorization to each backend independently based on its requirements.
When it’s time to revoke access, disable the user in your identity provider, and all backend access is revoked instantly.
Real-world benefits
Let's look at the incident investigation example with concrete numbers:
Without vMCP:
- 4 sequential manual commands
- 2-3 minutes per command
- 5-10 minutes aggregating and formatting
- 15-20 minutes total per incident
- Results vary by engineer
- Process isn't documented or reusable
With vMCP:
- One command triggers the workflow
- Parallel execution: 30 seconds
- Automatic aggregation and formatting
- Consistent results every time
- Workflow is documented as code
- Any team member can use it
For a team handling 20 incidents per week, that's 5-6 hours saved. More importantly, the response is faster, more consistent, and doesn't require senior engineers to handle routine investigations.
How it works
vMCP runs in Kubernetes alongside your backend MCP servers. You define three types of resources:
MCPGroup: Organizes backend servers logically (e.g., "platform-tools")
MCPServer: Individual backend MCP servers (GitHub, Jira, etc.)
VirtualMCPServer: The aggregation layer that combines servers from a group
The ToolHive operator discovers backends, resolves tool name conflicts, applies security policies, and exposes everything through a single endpoint. Your AI client connects to vMCP just like any other MCP server.
Since each VirtualMCPServer is a separate Kubernetes resource, you can deploy as many as needed. One per team, one per environment, or organized however makes sense for your security model.
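Roughly, the resources fit together like this. The `apiVersion` and field names are illustrative, not the operator's exact schema; the quickstart linked below has the canonical manifests.

```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPGroup
metadata:
  name: platform-tools
spec:
  servers: [github, jira]     # names of MCPServer resources in this group
---
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: platform-vmcp
spec:
  groupRef: platform-tools    # aggregate every backend in the group behind one endpoint
```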
For a working example, check out the quickstart tutorial.
When to use vMCP
vMCP makes sense when you're managing multiple MCP servers (typically 5+), curating a subset of MCP tools for specific teams and workflows, or need tasks that coordinate across systems. It's especially valuable for:
- Teams requiring centralized authentication and authorization
- Workflows that should be reusable across the entire team
- Security policies that need centralized enforcement
- Reducing onboarding complexity for new engineers
If you're using a single MCP server for simple one-step operations, you probably don't need vMCP. It's built for managing complexity at scale.
Get started
vMCP is available now as part of ToolHive. To try it out:
- Install the ToolHive Kubernetes Operator
- Follow the vMCP quickstart
- Connect your AI client to the aggregated endpoint
We'd love to hear how you're using vMCP. What workflows are you building? Which MCP servers are you aggregating? Join the ToolHive community on Discord and let us know.
Looking to leverage vMCP within your enterprise organization? Book a demo with us.
ToolHive is an open-source MCP platform focused on security and enterprise operationalization. Learn more at toolhive.dev.