Michael Levan

Posted on • Originally published at cloudnativedeepdive.com

Routing Observable and Secure Traffic Through Claude

Observability and security for AI traffic in the enterprise should cover everything from servers and cloud environments to laptops, desktops, and mobile devices. This level of oversight isn't "new"; the industry has had it for years with Mobile Device Management (MDM) software. With AI workloads, however, the practice of properly observing and securing local systems seems to have been forgotten.

And we can't forget about AI traffic.

In this blog post, you'll learn how to route local AI traffic through agentgateway when tools like Claude Desktop interact with MCP Servers.

Prerequisites

To follow along from a hands-on perspective, you'll need the following:

  1. A Kubernetes cluster.
  2. Claude Desktop.
  3. Agentgateway installed.

The Low-Hanging Fruit

Organizations, enterprises, teams, and engineers are working on consistent ways to implement Agentic infrastructure, whether that's on systems, domain-specific Agents, generic Agents, MCP, or everything in between. Today, most of this work happens at what we can call the "backend layer": the cloud environments, servers running AI workloads, and networks.

However, there's one piece of the puzzle that seems to be overlooked: the "frontend layer". These are the user devices (laptops, desktops, mobile devices) within the organization that are being used for work.

In the engineering space, that typically means the LLMs, Agents, or desktop software that engineers use (Claude Code, Claude Desktop, Gemini CLI, etc.). Today, these "frontend layer" tools are open to all with zero observability or security. The goal isn't to lock everything down so that no one can use AI, but there needs to be defense in depth, sound security practices, and, perhaps most importantly, observability for all AI traffic, even (and especially) when it comes from a local machine.

Much like all systems (laptops, desktops, mobile devices) send traffic through the enterprise's internal networks, where it passes through a router, is filtered by firewall rules, and is observed at the packet level, AI traffic needs to be looked at the same way.

Deploying An MCP Server

The first step in the journey is to give Claude Desktop "something" to route to. This could be another Agent, various Models, or an MCP Server for specific tool-selection needs. This section will walk you through how to deploy an MCP Server on a Kubernetes cluster.

  1. Deploy the following configuration, which contains a ConfigMap with the MCP Server code, a Kubernetes Deployment, and a Kubernetes Service.
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-math-script
  namespace: default
data:
  server.py: |
    import uvicorn
    from mcp.server.fastmcp import FastMCP
    from starlette.applications import Starlette
    from starlette.routing import Route
    from starlette.requests import Request
    from starlette.responses import JSONResponse, Response

    mcp = FastMCP("Math-Service")

    @mcp.tool()
    def add(a: int, b: int) -> int:
        return a + b

    @mcp.tool()
    def multiply(a: int, b: int) -> int:
        return a * b

    async def handle_mcp(request: Request):
        try:
            data = await request.json()
            method = data.get("method")
            msg_id = data.get("id")
            result = None

            if method == "initialize":
                result = {
                    "protocolVersion": "2024-11-05",
                    "capabilities": {"tools": {}},
                    "serverInfo": {"name": "Math-Service", "version": "1.0"}
                }

            elif method == "notifications/initialized":
                # Notifications are fire-and-forget, return empty 202 response
                return Response(status_code=202)

            elif method == "tools/list":
                tools_list = await mcp.list_tools()
                result = {
                    "tools": [
                        {
                            "name": t.name,
                            "description": t.description,
                            "inputSchema": t.inputSchema
                        } for t in tools_list
                    ]
                }

            elif method == "tools/call":
                params = data.get("params", {})
                name = params.get("name")
                args = params.get("arguments", {})

                # Call the tool
                tool_result = await mcp.call_tool(name, args)

                # --- FIX: Serialize the content objects manually ---
                serialized_content = []
                for content in tool_result:
                    if hasattr(content, "type") and content.type == "text":
                        serialized_content.append({"type": "text", "text": content.text})
                    elif hasattr(content, "type") and content.type == "image":
                         serialized_content.append({
                             "type": "image",
                             "data": content.data,
                             "mimeType": content.mimeType
                         })
                    else:
                        # Fallback for dictionaries or other types
                        serialized_content.append(content if isinstance(content, dict) else str(content))

                result = {
                    "content": serialized_content,
                    "isError": False
                }

            elif method == "ping":
                result = {}

            else:
                return JSONResponse(
                    {"jsonrpc": "2.0", "id": msg_id, "error": {"code": -32601, "message": "Method not found"}},
                    status_code=404
                )

            return JSONResponse({"jsonrpc": "2.0", "id": msg_id, "result": result})

        except Exception as e:
            # Print error to logs for debugging
            import traceback
            traceback.print_exc()
            return JSONResponse(
                {"jsonrpc": "2.0", "id": None, "error": {"code": -32603, "message": str(e)}},
                status_code=500
            )

    app = Starlette(routes=[
        Route("/mcp", handle_mcp, methods=["POST"]),
        Route("/", lambda r: JSONResponse({"status": "ok"}), methods=["GET"])
    ])

    if __name__ == "__main__":
        print("Starting Fixed Math Server on port 8000...")
        uvicorn.run(app, host="0.0.0.0", port=8000)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-math-server
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mcp-math-server
  template:
    metadata:
      labels:
        app: mcp-math-server
    spec:
      containers:
      - name: math
        image: python:3.11-slim
        command: ["/bin/sh", "-c"]
        args:
        - |
          pip install "mcp[cli]" uvicorn starlette &&
          python /app/server.py
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: script-volume
          mountPath: /app
        readinessProbe:
          httpGet:
            path: /
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: script-volume
        configMap:
          name: mcp-math-script
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-math-server
  namespace: default
spec:
  selector:
    app: mcp-math-server
  ports:
  - port: 80
    targetPort: 8000
EOF

The MCP Server should now be running in a Pod in the default Namespace, with the mcp-math-server Kubernetes Service sitting in front of it.
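Before putting a gateway in front of it, it can help to see the raw protocol the server speaks. The snippet below is a sketch of the JSON-RPC 2.0 payloads that the handle_mcp function above understands; you could POST them to the Service (for example, after port-forwarding it) with curl or any HTTP client. The payload values here are illustrative, not taken from a live session.

```python
import json

# JSON-RPC 2.0 payloads matching the handlers in server.py above.
# The protocolVersion string mirrors the one hard-coded in the server.
initialize = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {"protocolVersion": "2024-11-05", "capabilities": {}},
}

list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

# Calls the add tool with a=2, b=2; the server replies with a
# result containing a text content block.
call_add = {
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 2}},
}

for msg in (initialize, list_tools, call_add):
    print(json.dumps(msg))
```

Each of these printed lines is a complete request body you can send to the /mcp route.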

Configuring A Gateway

With the MCP Server deployed, you need a way to pass traffic through to it. When Agents communicate with other Agents, MCP Servers, or LLMs, there's a "middle layer" that gets the Agent from point A (itself) to point B (the MCP Server, in this case). That "middle layer", where the packets flow, is the Gateway.

If your Kubernetes cluster can't create a public load balancer with an externally accessible IP address, you can use something like MetalLB or port-forward the Gateway service from your terminal.

  1. Create a new Gateway, which uses the enterprise-agentgateway GatewayClass. It will listen on port 8080 and allow traffic from the same Namespace the Gateway is deployed in (agentgateway-system).
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: agentgateway-mcp
  namespace: agentgateway-system
spec:
  gatewayClassName: enterprise-agentgateway
  listeners:
  - name: http
    port: 8080
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Same
EOF
  2. Implement an agentgateway Backend, which tells the Gateway what to route to. In this case, it's the MCP Server that you deployed in the previous section.
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: demo-mcp-server
  namespace: agentgateway-system
spec:
  mcp:
    targets:
      - name: demo-mcp-server
        static:
          host: mcp-math-server.default.svc.cluster.local
          port: 80
          path: /mcp
          protocol: StreamableHTTP
EOF
  3. Create an HTTPRoute so there's a path for the Gateway to route to. In this case, the "path" is the MCP Server via the agentgateway Backend.
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: mcp-route
  namespace: agentgateway-system
spec:
  parentRefs:
  - name: agentgateway-mcp
  rules:
  - backendRefs:
    - name: demo-mcp-server
      namespace: agentgateway-system
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF
  4. Retrieve the IP address of the Gateway. If an external one doesn't exist, you can port-forward the Gateway service.
export GATEWAY_IP=$(kubectl get svc agentgateway-mcp -n agentgateway-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $GATEWAY_IP
  5. Open MCP Inspector to test the traffic to the MCP Server.
npx @modelcontextprotocol/inspector@0.16.2
  6. Add the following URL into MCP Inspector. If you're port-forwarding the Gateway service, use localhost instead of an IP address.
http://YOUR_ALB_LB_IP:8080/mcp

If you search for tools, you should see an add and multiply tool.
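The tools/list response behind that Inspector view should be shaped like the sketch below (the structure is inferred from the server code earlier in this post; descriptions and schemas are abbreviated placeholders, not captured output):

```python
# Illustrative tools/list response from the Math-Service server above.
# Only the tool names are guaranteed by the code; other fields abbreviated.
sample = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "tools": [
            {"name": "add", "inputSchema": {"type": "object"}},
            {"name": "multiply", "inputSchema": {"type": "object"}},
        ]
    },
}

names = [t["name"] for t in sample["result"]["tools"]]
print(names)
```

Seeing both tool names come back through the Gateway confirms end-to-end routing to the MCP Server.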

Configure Claude Desktop With An MCP Server

The last step is to configure Claude Desktop to route through the AI gateway (agentgateway) that you deployed in the previous section. This ensures that the traffic flowing from Claude Desktop to the MCP Server is observable, can be secured, and goes through a properly built Gateway designed specifically for AI workloads.

  1. Create a new file called claude_desktop_config.json in the path where Claude exists (like in the following example).
mkdir -p ~/Library/Application\ Support/Claude
cat > ~/Library/Application\ Support/Claude/claude_desktop_config.json << 'EOF'
{
  "mcpServers": {
    "math-service": {
      "command": "npx",
      "args": ["-y", "supergateway", "--streamableHttp", "http://YOUR_ALB_LB_IP:8080/mcp"]
    }
  }
}
EOF
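If you're port-forwarding the Gateway service instead of using an external IP, the same config can point at localhost. This is a sketch; port 8080 assumes the listener port from the Gateway you created earlier.

```json
{
  "mcpServers": {
    "math-service": {
      "command": "npx",
      "args": ["-y", "supergateway", "--streamableHttp", "http://localhost:8080/mcp"]
    }
  }
}
```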
  2. After saving the config, restart Claude Desktop for the changes to take effect. If you don't see any errors when opening Claude Desktop, the configuration that you added in step 1 worked as expected.
  3. With Claude Desktop open, ask it a simple question like "What is 2 + 2?".

Traffic is now routing through agentgateway via Claude Desktop!
