Hosting MCP Servers at Scale: The Orchestrator


How MCPNest deploys isolated Docker containers for MCP servers and routes tool calls intelligently across your workspace.


MCP servers have a deployment problem.

Most servers are stdio-based — they run as a subprocess via npx and communicate over stdin/stdout. That works fine on a developer's laptop. It does not work for a team.

You cannot share a stdio process across machines. You cannot health-check it remotely. When it crashes, it stays down until someone notices and restarts it. You cannot give five engineers access to the same instance running on one person's laptop.

The standard solution is to wrap stdio servers in an HTTP adapter and deploy them to a server. The problem: that is significant infrastructure work for every MCP server you want to use. Dockerfile, networking, health checks, restart policies, resource limits — multiply that by every server your team needs.

We built MCPNest Hosted Servers to handle all of it.


How hosted MCP servers work

When you click Deploy in your workspace Hosted tab, MCPNest:

  1. Creates an instance record in the database (status: pending)
  2. Sends a deploy request to the MCPNest Orchestrator service
  3. Orchestrator pulls the verified image from our allowlist
  4. Orchestrator starts the container with security hardening applied
  5. Orchestrator polls health until the container responds
  6. Updates the instance to status: running, health_status: healthy
  7. The server automatically appears in your Gateway tools/list

Total time measured in production today: 6 seconds from click to RUNNING + HEALTHY.
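
For a rough sense of the shape of that flow, here is a minimal sketch in Python. The db and containers handles, the /health path, the poll interval and the failed status are assumptions for illustration; only the pending/running/healthy statuses and the overall sequence come from the list above.

```python
import asyncio
import httpx

HEALTH_POLL_INTERVAL = 0.5   # seconds between health probes (assumed)
HEALTH_POLL_TIMEOUT = 30.0   # give up after 30 s (assumed)

async def deploy_instance(db, containers, instance_id: str, image: str) -> None:
    # Step 1: record the instance as pending before anything starts.
    await db.set_status(instance_id, status="pending")

    # Steps 2-4: pull the verified image and start the hardened container
    # (see the security sketch further down).
    endpoint = await containers.start(image)

    # Step 5: poll the container until the Bridge answers.
    deadline = asyncio.get_running_loop().time() + HEALTH_POLL_TIMEOUT
    async with httpx.AsyncClient() as client:
        while True:
            try:
                resp = await client.get(f"{endpoint}/health", timeout=2.0)
                if resp.status_code == 200:
                    break
            except httpx.HTTPError:
                pass
            if asyncio.get_running_loop().time() > deadline:
                await db.set_status(instance_id, status="failed")
                return
            await asyncio.sleep(HEALTH_POLL_INTERVAL)

    # Steps 6-7: mark the instance live; the Gateway picks it up on its
    # next tools/list aggregation.
    await db.set_status(instance_id, status="running", health_status="healthy")
```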


The Bridge layer

Inside each hosted container runs MCPNest Bridge — a FastAPI server on port 8080 that translates between HTTP (external) and stdio JSON-RPC (internal).

Bridge handles:

  • MCP protocol handshake (initialize → notifications/initialized)
  • Managing the stdio subprocess lifecycle
  • Translating POST /tools/list → tools/list JSON-RPC request → response
  • Translating POST /tools/call → tools/call JSON-RPC request → response
  • Draining stderr in the background so it never blocks the main process
  • asyncio.Lock per request to prevent concurrent stdio corruption

The subprocess command is configurable per image via the mcp_allowed_images table. For node:20-alpine with @modelcontextprotocol/server-filesystem, the command is npx -y @modelcontextprotocol/server-filesystem /workspace.
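
To make that concrete, here is a stripped-down sketch of a Bridge-style HTTP-to-stdio adapter: FastAPI in front, the npx subprocess behind, newline-delimited JSON-RPC in between, and an asyncio.Lock around every stdio exchange. The framing details, handshake parameters and helper names are assumptions, not MCPNest's actual Bridge code, and it skips things the real one has to handle (interleaved notifications, restarts, timeouts).

```python
# inside the container, run with something like:
#   uvicorn bridge:app --host 0.0.0.0 --port 8080
import asyncio
import itertools
import json

from fastapi import FastAPI

app = FastAPI()
proc: asyncio.subprocess.Process | None = None
stdio_lock = asyncio.Lock()        # one stdio exchange at a time
request_ids = itertools.count(1)   # JSON-RPC request ids


@app.on_event("startup")
async def start_subprocess() -> None:
    # Spawn the stdio MCP server and run the MCP handshake.
    global proc
    proc = await asyncio.create_subprocess_exec(
        "npx", "-y", "@modelcontextprotocol/server-filesystem", "/workspace",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    asyncio.create_task(drain_stderr())   # never let stderr back-pressure block us
    await rpc("initialize", {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "bridge", "version": "0.1"},
    })
    await notify("notifications/initialized")


async def drain_stderr() -> None:
    # Read stderr continuously and forward it to container logs.
    while line := await proc.stderr.readline():
        print(line.decode().rstrip())


async def rpc(method: str, params: dict) -> dict:
    # Send one JSON-RPC request over stdin and read one response from stdout.
    async with stdio_lock:
        req = {"jsonrpc": "2.0", "id": next(request_ids), "method": method, "params": params}
        proc.stdin.write((json.dumps(req) + "\n").encode())
        await proc.stdin.drain()
        return json.loads(await proc.stdout.readline())


async def notify(method: str) -> None:
    # Send a JSON-RPC notification (no id, no response expected).
    async with stdio_lock:
        proc.stdin.write((json.dumps({"jsonrpc": "2.0", "method": method}) + "\n").encode())
        await proc.stdin.drain()


@app.post("/tools/list")
async def tools_list() -> dict:
    return await rpc("tools/list", {})


@app.post("/tools/call")
async def tools_call(body: dict) -> dict:
    return await rpc("tools/call", body)
```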


Security model

Every container runs with these constraints — no exceptions, not configurable by the user:

Process isolation:

  • no-new-privileges: true — no privilege escalation
  • cap_drop: ALL — all Linux capabilities dropped
  • cap_add: [CHOWN, SETUID, SETGID] — only minimum needed

Network isolation:

  • Containers bind to 127.0.0.1 only — never exposed on a public interface
  • Traffic flows: Gateway → Orchestrator → container (all internal)

Filesystem:

  • /tmp mounted as tmpfs with noexec, nosuid, nodev
  • No host filesystem mounts

Resources:

  • CPU and memory caps enforced at container creation by plan profile
  • Small: 0.25 vCPU, 256 MB RAM
  • Medium: 0.5 vCPU, 512 MB RAM
  • Large: 1.0 vCPU, 1 GB RAM

Image allowlist:

  • Only MCPNest-verified images can be deployed
  • Images are validated against a regex pattern before pull
  • The DB allowlist is the primary gate; the regex is defence-in-depth
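
If you want to picture what those constraints look like at container-creation time, here is a sketch using the Docker SDK for Python. The hardening flags and plan profiles are the ones listed above; the allowlist regex and the function itself are illustrative, not the Orchestrator's real code.

```python
import re
import docker

# Illustrative defence-in-depth pattern; the DB allowlist remains the primary gate.
ALLOWED_IMAGE_RE = re.compile(
    r"^(node:20-alpine|ghcr\.io/modelcontextprotocol/servers:[a-z-]+)$"
)

PLAN_PROFILES = {
    "small":  {"nano_cpus": 250_000_000,   "mem_limit": "256m"},  # 0.25 vCPU, 256 MB
    "medium": {"nano_cpus": 500_000_000,   "mem_limit": "512m"},  # 0.5 vCPU, 512 MB
    "large":  {"nano_cpus": 1_000_000_000, "mem_limit": "1g"},    # 1.0 vCPU, 1 GB
}

def start_hardened_container(image: str, plan: str, host_port: int):
    if not ALLOWED_IMAGE_RE.match(image):
        raise ValueError(f"image not allowlisted: {image}")

    client = docker.from_env()
    return client.containers.run(
        image,
        detach=True,
        security_opt=["no-new-privileges:true"],       # no privilege escalation
        cap_drop=["ALL"],                              # drop every Linux capability...
        cap_add=["CHOWN", "SETUID", "SETGID"],         # ...re-add only the minimum
        tmpfs={"/tmp": "rw,noexec,nosuid,nodev"},      # scratch space, nothing executable
        ports={"8080/tcp": ("127.0.0.1", host_port)},  # loopback only, never a public interface
        mem_limit=PLAN_PROFILES[plan]["mem_limit"],
        nano_cpus=PLAN_PROFILES[plan]["nano_cpus"],
        # note: no volumes= argument, so no host filesystem mounts
    )
```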

The Orchestrator

Above the containers sits the MCPNest Orchestrator — a FastAPI service that aggregates tools from all running instances in a workspace and routes tool calls to the correct container.

tools/list aggregation:

When your Gateway receives a tools/list request and orchestrator_enabled is true, it calls the Orchestrator. The Orchestrator:

  1. Queries the database for all running instances in the workspace
  2. Fans out POST /tools/list to each container in parallel
  3. Collects results, detects naming conflicts
  4. Applies namespacing: if two servers expose a tool called query, they become postgres_mcp__query and mysql_mcp__query
  5. Caches the result for 30 seconds
  6. Returns a unified tool list
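
A condensed sketch of that aggregation path. Endpoint discovery, response shapes and the cache structure are assumptions; the 30-second TTL and the namespacing rule come from the list above.

```python
import asyncio
import time
import httpx

CACHE_TTL = 30  # seconds, per the article
_cache: dict[str, tuple[float, list[dict]]] = {}

async def aggregate_tools(workspace_id: str, instances: list[dict]) -> list[dict]:
    cached = _cache.get(workspace_id)
    if cached and time.monotonic() - cached[0] < CACHE_TTL:
        return cached[1]

    # Fan out POST /tools/list to every running instance in parallel.
    async with httpx.AsyncClient(timeout=5.0) as client:
        responses = await asyncio.gather(
            *(client.post(f"{inst['endpoint']}/tools/list") for inst in instances)
        )

    # Count bare tool names so we know which ones collide across servers.
    all_tools: list[tuple[str, dict]] = []   # (server_name, tool)
    name_counts: dict[str, int] = {}
    for inst, resp in zip(instances, responses):
        for tool in resp.json().get("tools", []):
            all_tools.append((inst["name"], tool))
            name_counts[tool["name"]] = name_counts.get(tool["name"], 0) + 1

    # Namespace only the conflicting names: query -> postgres_mcp__query, etc.
    unified = []
    for server_name, tool in all_tools:
        if name_counts[tool["name"]] > 1:
            tool = {**tool, "name": f"{server_name}__{tool['name']}"}
        unified.append(tool)

    _cache[workspace_id] = (time.monotonic(), unified)
    return unified
```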

tools/call routing:

When a tools/call arrives, the Orchestrator:

  1. Finds the ToolInfo record for the requested tool name (from cache)
  2. Looks up the live endpoint URL from the database
  3. Strips the namespace prefix if present (postgres_mcp__query → query)
  4. Forwards the call to the correct container
  5. Logs the call to mcp_tool_calls (best-effort, non-blocking)
  6. Returns the result
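
And the routing path, in the same hedged spirit; the tool_map shape, db helpers and timeout are illustrative:

```python
import asyncio
import httpx

async def route_tool_call(db, tool_map: dict[str, dict], name: str, arguments: dict) -> dict:
    info = tool_map[name]                                   # ToolInfo from the 30 s cache
    endpoint = await db.get_endpoint(info["instance_id"])   # live URL from the database

    # postgres_mcp__query -> query before the container sees it
    bare_name = name.split("__", 1)[1] if "__" in name else name

    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(
            f"{endpoint}/tools/call",
            json={"name": bare_name, "arguments": arguments},
        )

    # Best-effort logging: never block or fail the call on a logging error.
    asyncio.create_task(db.log_tool_call(name, arguments, resp.status_code))

    return resp.json()
```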

Fallback behaviour:

If the Orchestrator is unreachable, or if a workspace has no running hosted instances, the Gateway automatically falls back to direct fan-out to remote HTTP servers. No downtime. No manual intervention.
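
The fallback decision itself is small; something like this on the Gateway side, with endpoint paths and response shapes assumed:

```python
import asyncio
import httpx

async def gateway_tools_list(orchestrator_url: str, remote_servers: list[str]) -> list[dict]:
    # Preferred path: ask the Orchestrator for the aggregated, namespaced list.
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            resp = await client.post(f"{orchestrator_url}/tools/list")
            resp.raise_for_status()
            tools = resp.json().get("tools", [])
            if tools:                      # hosted instances are running
                return tools
    except httpx.HTTPError:
        pass                               # Orchestrator unreachable: fall through

    # Fallback: direct fan-out to the workspace's remote HTTP servers.
    async with httpx.AsyncClient(timeout=5.0) as client:
        responses = await asyncio.gather(
            *(client.post(f"{url}/tools/list") for url in remote_servers)
        )
    return [tool for r in responses for tool in r.json().get("tools", [])]
```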


Current state

Three verified images are available today:

Server          Image                                                      Tools
filesystem-mcp  node:20-alpine + @modelcontextprotocol/server-filesystem  read_file, write_file, list_directory
github-mcp      ghcr.io/modelcontextprotocol/servers:github               create_issue, list_prs, search_code
postgres-mcp    ghcr.io/modelcontextprotocol/servers:postgres              query, list_tables, describe_table

Plan limits: Team plan supports 3 running instances per workspace. Enterprise supports 20.


What's next

  • Auth headers per server (bring your own API keys — fixes degraded status on auth-required servers)
  • More verified images: Slack, Notion, Linear
  • Usage metering per instance for billing
  • Auto-deploy via workspace registry URL (mcpnest-registry-client npm package)

Ricardo Rodrigues — Platform Engineer @ BCP, Founder @ MCPNest
Porto, Portugal

MCPNest — The App Store for MCP Servers
mcpnest.io/workspace → Hosted tab
