When you have six MCP servers and ten colleagues, "just run npx locally" stops working. Not everyone wants to install Node.js, managers don't have Docker, and your local claude_desktop_config.json starts looking like a secrets vault for every production system.
I went from remote MCP → local setup → Docker → Kubernetes with a universal Helm chart and JWT auth via Envoy. Here's what I hit along the way, what worked, and what's still unsolved.
## Level 1: Remote MCP — When the Vendor Did the Work
My first MCP experience was dead simple. I added the Atlassian MCP server to Claude as a remote MCP, authenticated, and it just worked:
```json
{
  "mcpServers": {
    "atlassian": {
      "type": "http",
      "url": "https://mcp.atlassian.com/v1/sse"
    }
  }
}
```
The problem? Very few SaaS products offer this. Everything self-hosted or without native MCP support is a different story.
## Level 2: Local Setup — The Dependency Zoo
Next, I wanted to connect my IDE to Kubernetes. No built-in MCP support here, so dependencies it is:
```json
{
  "mcpServers": {
    "kubernetes": {
      "command": "npx",
      "args": ["-y", "kubernetes-mcp-server@latest"]
    }
  }
}
```

It worked, but one server needs Node.js, another needs Python and uvx, a third needs a Go binary. The runtime zoo on your machine grows with every new MCP server. Not great when you're not even a developer.
## Level 3: Docker — Isolation Without the Mess
The logical next step — containers. Each MCP server with its own runtime, no host pollution:
```json
{
  "mcpServers": {
    "grafana": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "GRAFANA_URL",
        "-e", "GRAFANA_SERVICE_ACCOUNT_TOKEN",
        "grafana/mcp-grafana",
        "-t", "stdio"
      ],
      "env": {
        "GRAFANA_URL": "https://grafana.example.com",
        "GRAFANA_SERVICE_ACCOUNT_TOKEN": "<token>"
      }
    }
  }
}
```
For one engineer on one machine — enough. But when ten people need access, questions pile up:
- Production tokens are scattered across laptops.
- Automated workflows (n8n, CI/CD) need MCP access too — and they run remotely.
- Managers and analysts want AI tools but aren't ready to deal with `docker run`.
One conclusion: MCP servers need to move into shared infrastructure.
## Level 4: Kubernetes — Centralized Deployment
The initial idea was straightforward: deploy remote MCP servers inside your infrastructure perimeter. At minimum, you can restrict access via corporate VPN.

Anyone who's tackled this has hit the same wall: most MCP servers communicate via stdio (stdin/stdout). You can't reach them over HTTP directly.
This is where MCP Gateway comes in — a proxy that translates Streamable HTTP to stdio and back.
The flow: client (Claude Desktop, IDE, n8n) → HTTPS → Ingress → Kubernetes Service → Pod with MCP Gateway sidecar (HTTP → stdin) → MCP server process.
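From the client's point of view, a server exposed this way looks just like a vendor-hosted remote MCP. A sketch of the client config, using the illustrative host and path from the ingress setup in this post:

```json
{
  "mcpServers": {
    "digitalocean": {
      "type": "http",
      "url": "https://aitool.example.com/digitalocean/mcp"
    }
  }
}
```

No Node.js, no Docker, no tokens on the laptop: the runtime and credentials live in the cluster.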
### Universal Helm Chart
To avoid writing manifests for every MCP server, I built a universal Helm chart: mcp-helm-chart on ArtifactHub.
What it supports:
- `mode: proxy` — runs MCP Gateway as a sidecar, translating HTTP ↔ stdio
- `mode: native` — for servers that already support HTTP (no sidecar needed)
- Vault and ExternalSecrets integration for secrets management
- Gateway API and classic Ingress support
- HPA for horizontal scaling
Installation with Ingress-nginx (no auth):
```shell
helm repo add mcp https://javdet.github.io/mcp-helm-chart
helm install my-mcp mcp/mcp -f values.yaml
```
Key sections of values.yaml for deploying DigitalOcean MCP:
```yaml
mode: proxy

proxy:
  image:
    repository: node
    tag: "20-bookworm"
    pullPolicy: IfNotPresent
  gateway:
    package: "@michlyn/mcpgateway"
    stdioCommand: "npx -y @digitalocean/mcp --services apps,droplets,doks,networking"
    outputTransport: streamable-http
    port: 8080
    httpPath: /mcp

# Token stored in HashiCorp Vault, injected via Vault Webhook
vault:
  enabled: true
  role: "mcp"
  path: "kubernetes_dev-fra1-01"

env:
  - name: DIGITALOCEAN_API_TOKEN
    value: vault:devops/data/ai/mcp/digitalocean#token

ingress:
  enabled: true
  className: "internal"
  annotations:
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
  hosts:
    - host: aitool.example.com
      paths:
        - path: /digitalocean(/|$)(.*)
          pathType: ImplementationSpecific
  tls:
    - secretName: ssl-certificate
      hosts:
        - aitool.example.com
```

MCP servers in Streamable HTTP mode are stateless. They scale horizontally with a standard HPA without any issues.
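For reference, the equivalent standalone manifest is a plain `autoscaling/v2` HPA; the deployment name and thresholds below are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-mcp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-mcp    # illustrative; use your release's deployment name
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```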
The most pressing question here is authentication — or better yet, authorization. Most MCP servers don't support incoming authentication, so you have to handle it yourself.
## Authentication: JWT via Envoy
Basic auth is barely better than nothing, so I went straight to JWT. I used Envoy as the API gateway, since it natively supports JWT validation and was already in our stack.
### Key and Token Generation
```shell
# 1. Generate RSA keys
openssl genrsa -out mcp-jwt-private.pem 4096
openssl rsa -in mcp-jwt-private.pem -pubout -out mcp-jwt-public.pem

# 2. Generate Key ID
KID=$(openssl rand -hex 16)

# 3. Build JWT header (base64url)
HEADER=$(echo -n "{\"alg\":\"RS256\",\"typ\":\"JWT\",\"kid\":\"${KID}\"}" \
  | base64 -w0 | tr '+/' '-_' | tr -d '=')

# 4. Build JWT payload (1 year expiry).
#    The iss and aud claims must match the issuer/audiences configured
#    in the Envoy JWT provider, or validation will reject the token.
PAYLOAD=$(echo -n "{\"sub\":\"claude-desktop\",\"aud\":\"mcp-servers\",\"iss\":\"https://your-domain.com\",\"iat\":$(date +%s),\"exp\":$(( $(date +%s) + 31536000 ))}" \
  | base64 -w0 | tr '+/' '-_' | tr -d '=')

# 5. Sign
SIGNATURE=$(echo -n "${HEADER}.${PAYLOAD}" \
  | openssl dgst -sha256 -sign mcp-jwt-private.pem \
  | base64 -w0 | tr '+/' '-_' | tr -d '=')

# 6. Final token
echo "${HEADER}.${PAYLOAD}.${SIGNATURE}"
```
The public key is packaged into JWKS and stored in a ConfigMap. Envoy validates every incoming request by checking issuer, audience, and signature.
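For illustration, that ConfigMap has roughly this shape: a standard JWKS document (RFC 7517), where `n` and `e` are the base64url-encoded RSA modulus and exponent extracted from the public key, and `kid` must match the one baked into the token header. The data key name is an assumption; check what your Envoy Gateway version expects.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: jwks-config
data:
  # data key name may need to match your Envoy Gateway configuration
  jwks.json: |
    {
      "keys": [
        {
          "kty": "RSA",
          "alg": "RS256",
          "use": "sig",
          "kid": "<same KID as in the token header>",
          "n": "<base64url-encoded RSA modulus>",
          "e": "AQAB"
        }
      ]
    }
```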
Auth configuration in the chart values (Gateway API variant):
```yaml
gatewayApi:
  enabled: true
  parentRefs:
    - name: internal
      namespace: ai-infra
      sectionName: https
  hostnames:
    - mcptools.example.com
  timeouts:
    request: "3600s"
    backendRequest: "3600s"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /digitalocean
      filters:
        - type: URLRewrite
          urlRewrite:
            path:
              type: ReplacePrefixMatch
              replacePrefixMatch: /

auth:
  type: jwt
  jwt:
    providers:
      - name: mcp-jwt-auth
        issuer: mcp-issuer    # must match the iss claim in issued tokens
        audiences:
          - mcptools.example.com    # must match the aud claim in issued tokens
        localJWKS:
          type: ValueRef
          valueRef:
            group: ""
            kind: ConfigMap
            name: jwks-config

# If you use External Secrets Operator, secrets can be fetched through it
externalSecrets:
  enabled: true
  refreshInterval: 1h
  secretStoreRef:
    name: aws
    kind: ClusterSecretStore
  target:
    creationPolicy: Owner
  dataFrom:
    - extract:
        key: infra/mcp/digitalocean
```
Currently, access to target systems (DigitalOcean, Grafana, Kubernetes) goes through a single service account. For read-only tasks — monitoring, diagnostics, fetching info — this is enough. For write operations, the question remains open.
### Automated Access
Periodic tasks (n8n workflows, CI/CD pipelines) connect to the same MCP servers over Streamable HTTP with separate service JWT tokens. The setup is identical — only the subject in the token payload differs, and optionally the access scope at the Gateway level.
## What's Working, What's Not
MCP tooling and infrastructure still have a few steps to take toward each other before usage becomes truly simple, reliable, and secure.
The current setup works: six MCP servers in Kubernetes, one Helm chart, JWT auth via Envoy, secrets in Vault. Colleagues connect to remote MCP servers with zero local dependencies, automation uses the same endpoints.
What's still missing:
- Per-user authorization. The MCP protocol doesn't support passing user context. We're living with service accounts for now.
- Audit logging. Who called which tool with what parameters — not logged at the MCP level. You can collect this at the Envoy layer, but without call context.
- Auth standard. Every vendor does it differently. OAuth, API Key, Bearer — no unified approach.
The Helm chart is open source: ArtifactHub.
How do you handle per-user authorization for MCP? We're still on a single service account — would love to hear from anyone who's moved past that.
