Hosting MCP Servers at Scale: The Orchestrator


How MCPNest deploys isolated Docker containers for MCP servers and routes tool calls intelligently across your workspace.


MCP servers have a deployment problem.

Most servers are stdio-based — they run as a subprocess via npx and communicate over stdin/stdout. That works fine on a developer's laptop. It does not work for a team.

You cannot share a stdio process across machines. You cannot health-check it remotely. When it crashes, it stays down until someone notices and restarts it. You cannot give five engineers access to the same instance running on one person's laptop.

The standard solution is to wrap stdio servers in an HTTP adapter and deploy them to a server. The problem: that is significant infrastructure work for every MCP server you want to use. Dockerfile, networking, health checks, restart policies, resource limits — multiply that by every server your team needs.

We built MCPNest Hosted Servers to handle all of it.


How hosted MCP servers work

When you click Deploy in your workspace Hosted tab, MCPNest:

  1. Creates an instance record in the database (status: pending)
  2. Sends a deploy request to the MCPNest Orchestrator service
  3. Orchestrator pulls the verified image from our allowlist
  4. Orchestrator starts the container with security hardening applied
  5. Orchestrator polls health until the container responds
  6. Updates the instance to status: running, health_status: healthy
  7. The server automatically appears in your Gateway tools/list

Total time measured in production today: 6 seconds from click to RUNNING + HEALTHY.
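
For a rough sense of the shape of that flow, here is a minimal sketch in Python. The db and containers handles, the /health path, the poll interval and the failed status are assumptions for illustration; only the pending/running/healthy statuses and the overall sequence come from the list above.

```python
import asyncio
import httpx

HEALTH_POLL_INTERVAL = 0.5   # seconds between health probes (assumed)
HEALTH_POLL_TIMEOUT = 30.0   # give up after 30 s (assumed)

async def deploy_instance(db, containers, instance_id: str, image: str) -> None:
    # Step 1: record the instance as pending before anything starts.
    await db.set_status(instance_id, status="pending")

    # Steps 2-4: pull the verified image and start the hardened container
    # (see the security sketch further down).
    endpoint = await containers.start(image)

    # Step 5: poll the container until the Bridge answers.
    deadline = asyncio.get_running_loop().time() + HEALTH_POLL_TIMEOUT
    async with httpx.AsyncClient() as client:
        while True:
            try:
                resp = await client.get(f"{endpoint}/health", timeout=2.0)
                if resp.status_code == 200:
                    break
            except httpx.HTTPError:
                pass
            if asyncio.get_running_loop().time() > deadline:
                await db.set_status(instance_id, status="failed")
                return
            await asyncio.sleep(HEALTH_POLL_INTERVAL)

    # Steps 6-7: mark the instance live; the Gateway picks it up on its
    # next tools/list aggregation.
    await db.set_status(instance_id, status="running", health_status="healthy")
```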


The Bridge layer

Inside each hosted container runs MCPNest Bridge — a FastAPI server on port 8080 that translates between HTTP (external) and stdio JSON-RPC (internal).

Bridge handles:

  • MCP protocol handshake (initialize → notifications/initialized)
  • Managing the stdio subprocess lifecycle
  • Translating POST /tools/list → tools/list JSON-RPC request → response
  • Translating POST /tools/call → tools/call JSON-RPC request → response
  • Draining stderr in the background so it never blocks the main process
  • asyncio.Lock per request to prevent concurrent stdio corruption

The subprocess command is configurable per image via the mcp_allowed_images table. For node:20-alpine with @modelcontextprotocol/server-filesystem, the command is npx -y @modelcontextprotocol/server-filesystem /workspace.
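
To make that concrete, here is a stripped-down sketch of a Bridge-style HTTP-to-stdio adapter: FastAPI in front, the npx subprocess behind, newline-delimited JSON-RPC in between, and an asyncio.Lock around every stdio exchange. The framing details, handshake parameters and helper names are assumptions, not MCPNest's actual Bridge code, and it skips things the real one has to handle (interleaved notifications, restarts, timeouts).

```python
# inside the container, run with something like:
#   uvicorn bridge:app --host 0.0.0.0 --port 8080
import asyncio
import itertools
import json

from fastapi import FastAPI

app = FastAPI()
proc: asyncio.subprocess.Process | None = None
stdio_lock = asyncio.Lock()        # one stdio exchange at a time
request_ids = itertools.count(1)   # JSON-RPC request ids


@app.on_event("startup")
async def start_subprocess() -> None:
    # Spawn the stdio MCP server and run the MCP handshake.
    global proc
    proc = await asyncio.create_subprocess_exec(
        "npx", "-y", "@modelcontextprotocol/server-filesystem", "/workspace",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    asyncio.create_task(drain_stderr())   # never let stderr back-pressure block us
    await rpc("initialize", {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "bridge", "version": "0.1"},
    })
    await notify("notifications/initialized")


async def drain_stderr() -> None:
    # Read stderr continuously and forward it to container logs.
    while line := await proc.stderr.readline():
        print(line.decode().rstrip())


async def rpc(method: str, params: dict) -> dict:
    # Send one JSON-RPC request over stdin and read one response from stdout.
    async with stdio_lock:
        req = {"jsonrpc": "2.0", "id": next(request_ids), "method": method, "params": params}
        proc.stdin.write((json.dumps(req) + "\n").encode())
        await proc.stdin.drain()
        return json.loads(await proc.stdout.readline())


async def notify(method: str) -> None:
    # Send a JSON-RPC notification (no id, no response expected).
    async with stdio_lock:
        proc.stdin.write((json.dumps({"jsonrpc": "2.0", "method": method}) + "\n").encode())
        await proc.stdin.drain()


@app.post("/tools/list")
async def tools_list() -> dict:
    return await rpc("tools/list", {})


@app.post("/tools/call")
async def tools_call(body: dict) -> dict:
    return await rpc("tools/call", body)
```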


Security model

Every container runs with these constraints — no exceptions, not configurable by the user:

Process isolation:

  • no-new-privileges: true — no privilege escalation
  • cap_drop: ALL — all Linux capabilities dropped
  • cap_add: [CHOWN, SETUID, SETGID] — only minimum needed

Network isolation:

  • Containers bind to 127.0.0.1 only — never exposed on a public interface
  • Traffic flows: Gateway → Orchestrator → container (all internal)

Filesystem:

  • /tmp mounted as tmpfs with noexec, nosuid, nodev
  • No host filesystem mounts

Resources:

  • CPU and memory caps enforced at container creation by plan profile
  • Small: 0.25 vCPU, 256 MB RAM
  • Medium: 0.5 vCPU, 512 MB RAM
  • Large: 1.0 vCPU, 1 GB RAM

Image allowlist:

  • Only MCPNest-verified images can be deployed
  • Images are validated against a regex pattern before pull
  • The DB allowlist is the primary gate; the regex is defence-in-depth
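
If you want to picture what those constraints look like at container-creation time, here is a sketch using the Docker SDK for Python. The hardening flags and plan profiles are the ones listed above; the allowlist regex and the function itself are illustrative, not the Orchestrator's real code.

```python
import re
import docker

# Illustrative defence-in-depth pattern; the DB allowlist remains the primary gate.
ALLOWED_IMAGE_RE = re.compile(
    r"^(node:20-alpine|ghcr\.io/modelcontextprotocol/servers:[a-z-]+)$"
)

PLAN_PROFILES = {
    "small":  {"nano_cpus": 250_000_000,   "mem_limit": "256m"},  # 0.25 vCPU, 256 MB
    "medium": {"nano_cpus": 500_000_000,   "mem_limit": "512m"},  # 0.5 vCPU, 512 MB
    "large":  {"nano_cpus": 1_000_000_000, "mem_limit": "1g"},    # 1.0 vCPU, 1 GB
}

def start_hardened_container(image: str, plan: str, host_port: int):
    if not ALLOWED_IMAGE_RE.match(image):
        raise ValueError(f"image not allowlisted: {image}")

    client = docker.from_env()
    return client.containers.run(
        image,
        detach=True,
        security_opt=["no-new-privileges:true"],       # no privilege escalation
        cap_drop=["ALL"],                              # drop every Linux capability...
        cap_add=["CHOWN", "SETUID", "SETGID"],         # ...re-add only the minimum
        tmpfs={"/tmp": "rw,noexec,nosuid,nodev"},      # scratch space, nothing executable
        ports={"8080/tcp": ("127.0.0.1", host_port)},  # loopback only, never a public interface
        mem_limit=PLAN_PROFILES[plan]["mem_limit"],
        nano_cpus=PLAN_PROFILES[plan]["nano_cpus"],
        # note: no volumes= argument, so no host filesystem mounts
    )
```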

The Orchestrator

Above the containers sits the MCPNest Orchestrator — a FastAPI service that aggregates tools from all running instances in a workspace and routes tool calls to the correct container.

tools/list aggregation:

When your Gateway receives a tools/list request and orchestrator_enabled is true, it calls the Orchestrator. The Orchestrator:

  1. Queries the database for all running instances in the workspace
  2. Fans out POST /tools/list to each container in parallel
  3. Collects results, detects naming conflicts
  4. Applies namespacing: if two servers expose a tool called query, they become postgres_mcp__query and mysql_mcp__query
  5. Caches the result for 30 seconds
  6. Returns a unified tool list
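
A condensed sketch of that aggregation path. Endpoint discovery, response shapes and the cache structure are assumptions; the 30-second TTL and the namespacing rule come from the list above.

```python
import asyncio
import time
import httpx

CACHE_TTL = 30  # seconds, per the article
_cache: dict[str, tuple[float, list[dict]]] = {}

async def aggregate_tools(workspace_id: str, instances: list[dict]) -> list[dict]:
    cached = _cache.get(workspace_id)
    if cached and time.monotonic() - cached[0] < CACHE_TTL:
        return cached[1]

    # Fan out POST /tools/list to every running instance in parallel.
    async with httpx.AsyncClient(timeout=5.0) as client:
        responses = await asyncio.gather(
            *(client.post(f"{inst['endpoint']}/tools/list") for inst in instances)
        )

    # Count bare tool names so we know which ones collide across servers.
    all_tools: list[tuple[str, dict]] = []   # (server_name, tool)
    name_counts: dict[str, int] = {}
    for inst, resp in zip(instances, responses):
        for tool in resp.json().get("tools", []):
            all_tools.append((inst["name"], tool))
            name_counts[tool["name"]] = name_counts.get(tool["name"], 0) + 1

    # Namespace only the conflicting names: query -> postgres_mcp__query, etc.
    unified = []
    for server_name, tool in all_tools:
        if name_counts[tool["name"]] > 1:
            tool = {**tool, "name": f"{server_name}__{tool['name']}"}
        unified.append(tool)

    _cache[workspace_id] = (time.monotonic(), unified)
    return unified
```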

tools/call routing:

When a tools/call arrives, the Orchestrator:

  1. Finds the ToolInfo record for the requested tool name (from cache)
  2. Looks up the live endpoint URL from the database
  3. Strips the namespace prefix if present (postgres_mcp__query → query)
  4. Forwards the call to the correct container
  5. Logs the call to mcp_tool_calls (best-effort, non-blocking)
  6. Returns the result
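
And the routing path, in the same hedged spirit; the tool_map shape, db helpers and timeout are illustrative:

```python
import asyncio
import httpx

async def route_tool_call(db, tool_map: dict[str, dict], name: str, arguments: dict) -> dict:
    info = tool_map[name]                                   # ToolInfo from the 30 s cache
    endpoint = await db.get_endpoint(info["instance_id"])   # live URL from the database

    # postgres_mcp__query -> query before the container sees it
    bare_name = name.split("__", 1)[1] if "__" in name else name

    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(
            f"{endpoint}/tools/call",
            json={"name": bare_name, "arguments": arguments},
        )

    # Best-effort logging: never block or fail the call on a logging error.
    asyncio.create_task(db.log_tool_call(name, arguments, resp.status_code))

    return resp.json()
```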

Fallback behaviour:

If the Orchestrator is unreachable, or if a workspace has no running hosted instances, the Gateway automatically falls back to direct fan-out to remote HTTP servers. No downtime. No manual intervention.
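
The fallback decision itself is small; something like this on the Gateway side, with endpoint paths and response shapes assumed:

```python
import asyncio
import httpx

async def gateway_tools_list(orchestrator_url: str, remote_servers: list[str]) -> list[dict]:
    # Preferred path: ask the Orchestrator for the aggregated, namespaced list.
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            resp = await client.post(f"{orchestrator_url}/tools/list")
            resp.raise_for_status()
            tools = resp.json().get("tools", [])
            if tools:                      # hosted instances are running
                return tools
    except httpx.HTTPError:
        pass                               # Orchestrator unreachable: fall through

    # Fallback: direct fan-out to the workspace's remote HTTP servers.
    async with httpx.AsyncClient(timeout=5.0) as client:
        responses = await asyncio.gather(
            *(client.post(f"{url}/tools/list") for url in remote_servers)
        )
    return [tool for r in responses for tool in r.json().get("tools", [])]
```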


Current state

Three verified images are available today:

Server          Image                                                      Tools
filesystem-mcp  node:20-alpine + @modelcontextprotocol/server-filesystem  read_file, write_file, list_directory
github-mcp      ghcr.io/modelcontextprotocol/servers:github               create_issue, list_prs, search_code
postgres-mcp    ghcr.io/modelcontextprotocol/servers:postgres              query, list_tables, describe_table

Plan limits: Team plan supports 3 running instances per workspace. Enterprise supports 20.


What's next

  • Auth headers per server (bring your own API keys — fixes degraded status on auth-required servers)
  • More verified images: Slack, Notion, Linear
  • Usage metering per instance for billing
  • Auto-deploy via workspace registry URL (mcpnest-registry-client npm package)

Ricardo Rodrigues — Platform Engineer @ BCP, Founder @ MCPNest
Porto, Portugal

MCPNest — The App Store for MCP Servers
mcpnest.io/workspace → Hosted tab
