Discussion on: We Built 17 MCP Servers to Let AI Run Our Internal Operations

Renato Marinho

Running 17 MCP servers across internal operations is a significant architectural achievement — most teams struggle to maintain even 3 or 4 reliably. Coordinating Slack, GitHub, BigQuery, and custom internal tools alone must have taken substantial work to get right.

The question that becomes unavoidable at that scale is governance: when 17 agents are running with write access across your internal systems, how do you reconstruct exactly what each one did when something unexpected happens? How do you prevent PII from customer data flowing through to the LLM context during routine operations? And if one server starts misbehaving at 2am, how do you shut it down without taking down the others?

These are the questions Vinkius (vinkius.com) was built to answer. It runs pre-governed MCP servers inside V8 Isolate sandboxes — each call generates a SHA-256 cryptographic audit trail, PII is redacted at the protocol level before reaching the model, and there's a global kill switch per server. The SDK is Vurb.ts, which wraps MCP tool calls with these controls natively rather than as middleware.
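To make that concrete, the audit and kill-switch portion of that pattern looks roughly like this (a generic TypeScript sketch, not the actual Vurb.ts API; every name here is illustrative):

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes for illustration only -- not the Vurb.ts API.
type ToolCall = { server: string; tool: string; args: Record<string, unknown> };
type ToolHandler = (call: ToolCall) => Promise<unknown>;

const killSwitch = new Set<string>(); // servers currently disabled by operators

function govern(handler: ToolHandler): ToolHandler {
  return async (call) => {
    // Kill switch: refuse calls to a disabled server without touching the others.
    if (killSwitch.has(call.server)) {
      throw new Error(`server ${call.server} is disabled`);
    }
    const result = await handler(call);
    // Audit trail: hash the call and its result so the record can be verified later.
    const digest = createHash("sha256")
      .update(JSON.stringify({ call, result }))
      .digest("hex");
    console.log(JSON.stringify({ audit: digest, server: call.server, tool: call.tool }));
    return result;
  };
}
```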

Your architecture proves the technical feasibility of agent-driven internal operations. The governance layer is what makes it auditable and safe enough to trust at scale. Really impressive operational experiment — would love to see a follow-up on how you handle incidents across 17 concurrent servers.

Ryosuke Tsuji • Edited

@renato_marinho
PII Protection:
PII redaction is handled at the data layer, not as middleware. Our DB Graph MCP server (detailed in this post) automatically anonymizes query results from production databases — emails become ***@***.com, names become ***, phone numbers/addresses/card numbers are all masked before they ever reach the LLM context. This runs at both the MCP layer and the Lambda layer (dual validation), so even if one layer is compromised, PII doesn't leak. On the observability side, Grafana's log data is also protected — structured logs are designed to exclude PII fields, and Grafana access itself is scoped via Google OAuth with domain restrictions. Servers that don't touch customer data (infrastructure, CI/CD, documentation) simply don't have access to PII-containing databases in the first place — scope separation by design.
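To make the masking concrete, the redaction pass looks roughly like this (the patterns, replacement strings, and function names are illustrative, not the actual DB Graph MCP code):

```typescript
// Illustrative sketch of result-set masking; column-aware rules
// (e.g. name fields) are handled separately and omitted here.
type Row = Record<string, unknown>;

const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/g;
const CARD = /\b(?:\d[ -]?){13,16}\b/g;
const PHONE = /\+?\d[\d\s()-]{7,}\d/g;

function redactValue(value: unknown): unknown {
  if (typeof value !== "string") return value;
  return value
    .replace(EMAIL, "***@***.com")
    .replace(CARD, "****-****-****-****") // run before PHONE so card numbers aren't half-masked
    .replace(PHONE, "***-****");
}

// Applied to every row before the result enters the LLM context;
// an equivalent pass at the Lambda layer acts as the second line of defense.
export function redactRows(rows: Row[]): Row[] {
  return rows.map((row) =>
    Object.fromEntries(Object.entries(row).map(([k, v]) => [k, redactValue(v)]))
  );
}
```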

Observability & Incident Response:
Every MCP server is instrumented with OpenTelemetry, and all logs/traces/metrics are aggregated in Grafana. Grafana alerting rules are configured per server — latency spikes, error rate thresholds, and availability checks all trigger Slack notifications automatically. So if a server misbehaves at 2am, the on-call engineer gets a Slack alert immediately.
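As a sketch of what that instrumentation looks like per tool call (the tracer name, span attributes, and the executeTool helper are assumptions, not our actual server code):

```typescript
import { trace, SpanStatusCode } from "@opentelemetry/api";

// Assumes the OpenTelemetry SDK and exporter are configured elsewhere for the service.
const tracer = trace.getTracer("db-graph-mcp");

async function handleToolCall(tool: string, args: unknown): Promise<unknown> {
  return tracer.startActiveSpan(`mcp.tool/${tool}`, async (span) => {
    span.setAttribute("mcp.tool", tool);
    try {
      return await executeTool(tool, args); // the server's own tool logic
    } catch (err) {
      // Recorded errors feed the per-server error-rate alert rules in Grafana.
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}

declare function executeTool(tool: string, args: unknown): Promise<unknown>; // placeholder
```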

For investigation, we built a Grafana MCP server — meaning Claude Code itself can query logs and metrics. "Show me error logs from the DB Graph MCP in the last hour" returns structured results directly in the AI context. This closes the loop: the same AI that uses the MCP servers can also diagnose issues with them.
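A stripped-down sketch of such a tool using the MCP TypeScript SDK (the tool name, parameters, and the queryLoki helper are illustrative; the real Grafana server differs):

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "grafana-mcp", version: "0.1.0" });

// Lets the AI ask for logs from a given MCP server over a recent time window.
server.tool(
  "query_logs",
  "Query aggregated logs for an MCP server over a recent time window",
  { service: z.string(), sinceMinutes: z.number().default(60), level: z.string().optional() },
  async ({ service, sinceMinutes, level }) => {
    const logs = await queryLoki(service, sinceMinutes, level); // hypothetical log-store client
    return { content: [{ type: "text", text: JSON.stringify(logs, null, 2) }] };
  }
);

declare function queryLoki(service: string, sinceMinutes: number, level?: string): Promise<unknown[]>; // placeholder
```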

Independent Deployment:
Each server is a separate Cloud Run service with its own Pulumi stack, service account, and IAM roles. Deploying, scaling, or shutting down one server has zero impact on the others. There's no shared runtime or process — they're fully isolated at the infrastructure level.
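Roughly, each per-server stack looks like this in Pulumi (resource names, region, and image are placeholders, not the actual stack code):

```typescript
import * as gcp from "@pulumi/gcp";

const name = "db-graph-mcp";

// Dedicated service account per server keeps IAM grants narrowly scoped.
const sa = new gcp.serviceaccount.Account(`${name}-sa`, {
  accountId: `${name}-sa`,
  displayName: `${name} MCP server`,
});

// One Cloud Run service per server: deploying, scaling, or deleting it
// touches nothing else.
const service = new gcp.cloudrunv2.Service(name, {
  location: "us-central1",
  template: {
    serviceAccount: sa.email,
    containers: [{ image: "us-docker.pkg.dev/example-project/mcp/db-graph:latest" }],
  },
});

export const url = service.uri;
```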