Ryosuke Tsuji
We Built 17 MCP Servers to Let AI Run Our Internal Operations

Introduction

In a previous article, I introduced "DB Graph MCP" — a system that enables safe, cross-schema search and query execution across our entire database estate of 17 DBs and 994 tables.

https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5

Thanks to the positive response, this time I'd like to introduce the rest of our MCP server fleet beyond DB Graph.

These were all built in roughly 3 months starting January 2026. We now have 17 MCP servers in production, covering databases, infrastructure, documentation, project management, observability, CI/CD, and even code editing and deployment by non-engineers — making virtually every aspect of our operations accessible to AI.

Overview

Here's the full lineup:

| Category | Server | Description |
| --- | --- | --- |
| Data | DB Graph | Company-wide DB dictionary + query execution (previous article) |
| Infrastructure | GCloud | GCP resources, read-only |
| Infrastructure | AWS | AWS resources, read-only |
| Docs & Knowledge | GWS | Full Google Workspace access |
| Docs & Knowledge | Git Server | All Git repos, read-only |
| Graph | Code Graph | Codebase analysis (function → API → DB → event dependency tracking) |
| Graph | Product Graph | Unified knowledge graph: code + DB + docs |
| Graph | Biz Graph | Business initiative × KPI relationship graph |
| Observability | Grafana | Logs, metrics, and alert inspection |
| CI/CD | CircleCI | Pipeline execution, build logs, test results |
| Project Management | Project Management | BQ/Firestore/Sheets-integrated PM support |
| Domain-Specific | Stylist Insights | Stylist performance & KPI data |
| Domain-Specific | UX Insights | UX analytics from BQ |
| Domain-Specific | freee | Accounting API integration |
| Dev Platform | Workspace | ACL-gated monorepo editing & deployment |
| Dev Platform | Sandbox | App deployment for non-engineers |

All servers are implemented in TypeScript, deployed to GCP via Pulumi, and authenticated with Google OAuth.

Design Philosophy

Why So Many Servers?

We could have built one monolithic MCP server, but we deliberately split them. Here's why:

  • Auth scope isolation — GWS needs Workspace API scopes; the DB query server doesn't. Minimizing scopes prevents privilege escalation.
  • Deploy independence — A Grafana server change doesn't affect DB queries. Blast radius stays small.
  • Per-user selection — Engineers add everything; marketing adds only GWS. Just put what you need in .mcp.json.

Shared Foundation

Every server shares common patterns:

Auth: A shared package implements Google OAuth 2.0 + PKCE with RFC 8414 auto-discovery. Just add the URL to .mcp.json and Claude Code handles the auth flow automatically. For business users, we simply register them as custom connectors in the Claude organization settings.

```json
{
  "mcpServers": {
    "server-name": {
      "type": "http",
      "url": "https://mcp-xxx.your-domain.example/mcp"
    }
  }
}
```

That's it. No auth block needed. Same format for every server.

Session management: Upstash Redis as a shared session store across all servers. SSO cookies mean one login grants access to everything.

Tool usage logging: Every tool invocation is recorded in BigQuery. Who used what, when — fully auditable. We monitor usage rates, error rates, and usage patterns to drive improvements.
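
As a sketch, a wrapper like the following is enough to get that audit trail — assuming a BigQuery table like mcp_tool_calls; the dataset, table, and field names here are illustrative:

```typescript
import { BigQuery } from '@google-cloud/bigquery';

const bq = new BigQuery();
// Illustrative dataset/table names.
const auditTable = bq.dataset('mcp_audit').table('mcp_tool_calls');

// Wrap a tool handler so every invocation is recorded, success or failure.
export function withAudit<T>(
  serverName: string,
  toolName: string,
  handler: (params: Record<string, unknown>, userEmail: string) => Promise<T>,
) {
  return async (params: Record<string, unknown>, userEmail: string): Promise<T> => {
    const start = Date.now();
    let status = 'ok';
    try {
      return await handler(params, userEmail);
    } catch (err) {
      status = 'error';
      throw err;
    } finally {
      // Fire-and-forget streaming insert; a logging failure must not fail the tool call.
      auditTable.insert([{
        timestamp: new Date().toISOString(),
        server_name: serverName,
        tool_name: toolName,
        user_email: userEmail,
        params: JSON.stringify(params),
        duration_ms: Date.now() - start,
        status,
      }]).catch(console.error);
    }
  };
}
```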

Infrastructure: GCloud / AWS

Have you ever wanted to let AI investigate your cloud environment? And simultaneously thought: "Is it safe to let it do that?"

In my case, I have admin-level privileges, which makes it even scarier. So I built MCP servers that are physically incapable of writing anything.

Two key design decisions:

  1. OIDC / STS / Impersonate for secure auth — Zero persistent credentials
  2. Per-account audit logging — Individual email addresses recorded in GCP Audit Log / CloudTrail

GCloud MCP

```
Claude Code → MCP Server → gcloud CLI subprocess → GCP APIs
```

Runs gcloud CLI on Cloud Run. The key point: writes are made impossible at the OAuth scope level.

  • OAuth scope: cloud-platform.read-only
  • GCP APIs check both scope and IAM — even admin users cannot write
  • GCP Audit Log records the user's email address
  • Account revocation on departure: just disable the Google Workspace account

```
# What you can do
"Show me the Cloud Run services in prod"
"Check the env vars for this service"
"List the Secret Manager secrets"
```
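
A sketch of how such a read-only invocation can be wired — the server shells out to gcloud with the caller's token via the standard `--access-token-file` flag; everything else here is illustrative:

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { writeFile, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { randomUUID } from 'node:crypto';

const execFileAsync = promisify(execFile);

// Run a gcloud command under the caller's OAuth token. The token carries only
// the cloud-platform.read-only scope, so GCP APIs reject writes regardless of IAM.
async function runGcloud(args: string[], userAccessToken: string): Promise<string> {
  const tokenFile = join(tmpdir(), `tok-${randomUUID()}`);
  await writeFile(tokenFile, userAccessToken, { mode: 0o600 });
  try {
    const { stdout } = await execFileAsync('gcloud', [
      ...args,
      `--access-token-file=${tokenFile}`,
      '--format=json',
    ]);
    return stdout;
  } finally {
    await rm(tokenFile, { force: true });
  }
}

// e.g. "Show me the Cloud Run services in prod"
// runGcloud(['run', 'services', 'list', '--project', 'prod-project'], token);
```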

AWS MCP

Same philosophy, but AWS can't accept Google OAuth directly, so we use STS as a bridge.

```
Claude Code → MCP Server → GCP metadata → ID Token
                         → AWS STS AssumeRoleWithWebIdentity → temp credentials
                         → aws CLI subprocess → AWS APIs
```

Two layers of safety:

  1. IAM Role with ReadOnlyAccess policy only
  2. Temporary credentials with 1-hour expiry

Supports multiple AWS accounts via a profile parameter. CloudTrail records assumed-role/mcp-aws-readonly/user@example.com.
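
A sketch of that bridge with the AWS SDK v3 — the metadata-server fetch and STS call are standard; the role ARN, region, and audience are illustrative:

```typescript
import { STSClient, AssumeRoleWithWebIdentityCommand } from '@aws-sdk/client-sts';

// 1. Fetch a Google-signed ID token from the Cloud Run metadata server.
async function fetchGcpIdToken(audience: string): Promise<string> {
  const res = await fetch(
    'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity' +
      `?audience=${encodeURIComponent(audience)}`,
    { headers: { 'Metadata-Flavor': 'Google' } },
  );
  return res.text();
}

// 2. Trade it for 1-hour AWS credentials on a read-only role.
async function assumeReadOnlyRole(userEmail: string) {
  // The audience must match the IAM role's OIDC trust policy.
  const idToken = await fetchGcpIdToken('https://sts.amazonaws.com');
  const sts = new STSClient({ region: 'us-east-1' });
  const { Credentials } = await sts.send(new AssumeRoleWithWebIdentityCommand({
    RoleArn: 'arn:aws:iam::123456789012:role/mcp-aws-readonly', // illustrative ARN
    RoleSessionName: userEmail, // appears in CloudTrail as assumed-role/.../email
    WebIdentityToken: idToken,
    DurationSeconds: 3600, // temporary credentials, 1-hour expiry
  }));
  return Credentials; // AccessKeyId / SecretAccessKey / SessionToken for the aws CLI
}
```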

Docs & Knowledge: GWS / Git Server

GWS (Google Workspace) MCP

Operate all Google Workspace services from Claude Code.

```
Claude Code → MCP Server → gws CLI subprocess → Google Workspace APIs
```

Runs gws CLI remotely, passing the user's OAuth access token directly. Each user accesses resources with their own permissions — you can see your Drive but not someone else's.

Since OAuth authentication and Google Workspace authorization happen simultaneously, the moment you connect to the MCP you have immediate access to your Workspace resources. No additional login or token setup required — the experience is seamless.

```
# What you can do
"Summarize the sales data in this spreadsheet"
"Extract meeting notes from last week's calendar"
"Summarize this document"
```

Git Server MCP

A read-only server for all company Git repositories.

The motivation: bypassing GitHub MCP rate limits. GitHub's official MCP server hits the GitHub API under the hood, and the rate limit kicks in surprisingly fast when AI is investigating a codebase.

Git Server MCP keeps main-branch clones of all repos on a GCE VM, operating via local git commands with zero rate limiting. Query as much as you want.

| Tool | Description |
| --- | --- |
| git_blame | Last change commit per line |
| git_log | Commit history |
| git_grep | Cross-repo text search |
| git_show | Commit details |
| git_diff | Diff between commits |
| read_file | Read file contents |
| list_files | List directory contents |
| search_repos | Search repositories |

No GitHub account needed — OAuth authentication is sufficient.
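
A sketch of what a tool like git_grep reduces to — a plain git subprocess against the local clones (the root path and buffer size are illustrative):

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { join } from 'node:path';

const execFileAsync = promisify(execFile);
const REPOS_ROOT = '/var/repos'; // illustrative: where the main-branch clones live

// git_grep: plain `git grep` against a local clone — no GitHub API, no rate limits.
async function gitGrep(repo: string, pattern: string): Promise<string> {
  // Reject path tricks so `repo` can't escape the clones directory.
  if (!/^[\w.-]+$/.test(repo)) throw new Error('invalid repo name');
  try {
    const { stdout } = await execFileAsync(
      'git',
      ['grep', '--line-number', '--fixed-strings', pattern],
      { cwd: join(REPOS_ROOT, repo), maxBuffer: 10 * 1024 * 1024 },
    );
    return stdout;
  } catch (err: any) {
    if (err.code === 1) return ''; // git grep exits 1 when nothing matches
    throw err;
  }
}
```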

Observability: Grafana MCP

The official mcp/grafana Docker image deployed on Cloud Run, with an OAuth proxy in front.

```
Claude Code → OAuth Proxy → mcp-grafana → Grafana Cloud
```

Supports PromQL/LogQL queries, dashboard inspection, and alert rule review.

What's important is that Grafana dashboards and alert rules are also defined as Pulumi (TypeScript) code in the same repository as the application. This means:

  1. Write application code
  2. Define alert rules in the same repo
  3. Alert fires in production
  4. Claude Code reads logs via Grafana MCP
  5. Fix the code in the same repo

The code → infra → observability → investigation → fix loop is completely closed.

CI/CD: CircleCI MCP

Integrates with CircleCI API v2. A shared CircleCI token sits behind Google SSO, so the whole team uses it without managing tokens.

```
Claude Code → OAuth Proxy → CircleCI MCP (sidecar) → CircleCI API v2
```

Cloud Run multi-container setup: the official @circleci/mcp-server-circleci runs as a sidecar, with our OAuth proxy in front.
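
A Pulumi sketch of that layout, assuming a recent @pulumi/gcp; the images, region, and env wiring are illustrative:

```typescript
import * as pulumi from '@pulumi/pulumi';
import * as gcp from '@pulumi/gcp';

const circleciToken = new pulumi.Config().requireSecret('circleciToken');

// One Cloud Run service, two containers: the OAuth proxy receives all ingress,
// the CircleCI MCP server listens only on localhost as a sidecar.
new gcp.cloudrunv2.Service('mcp-circleci', {
  location: 'asia-northeast1', // illustrative region
  template: {
    containers: [
      {
        name: 'oauth-proxy',
        image: 'gcr.io/my-project/mcp-oauth-proxy:latest', // illustrative image
        ports: { containerPort: 8080 }, // only this container receives traffic
        envs: [{ name: 'UPSTREAM_URL', value: 'http://localhost:3000' }],
      },
      {
        name: 'circleci-mcp',
        image: 'gcr.io/my-project/circleci-mcp:latest', // wraps @circleci/mcp-server-circleci
        envs: [{ name: 'CIRCLECI_TOKEN', value: circleciToken }], // shared token behind SSO
      },
    ],
  },
});
```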

```
# What you can do
"What's the status of the latest pipeline on main?"
"Show me the failure logs for this build"
"Find flaky tests"
```

Project Management MCP

A server for managing issues in Firestore and semantically searching Slack/Meet conversations.

Key capabilities:

  • Issue management: Create, update status, and list Issues in Firestore (with spreadsheet dual-write; sketched after this list)
  • Context search: Vector search + Gemini summarization across Meet notes and Slack conversations
  • Project overview: View milestones, members, design docs, and test cases for your projects
  • Backlog integration: Retrieve ticket parent-child relationships via BQ
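
A sketch of the Firestore + spreadsheet dual-write from the first bullet, using @google-cloud/firestore and googleapis; the collection, sheet ID, and fields are illustrative:

```typescript
import { Firestore } from '@google-cloud/firestore';
import { google } from 'googleapis';

const firestore = new Firestore();
const SHEET_ID = 'xxxxxxxx'; // illustrative spreadsheet ID

// Create an Issue in Firestore and mirror it to the tracking spreadsheet.
async function createIssue(title: string, assignee: string, userEmail: string) {
  const doc = await firestore.collection('issues').add({
    title,
    assignee,
    status: 'open',
    createdBy: userEmail,
    createdAt: new Date(),
  });

  // Dual-write: append the same row to the shared spreadsheet.
  const auth = new google.auth.GoogleAuth({
    scopes: ['https://www.googleapis.com/auth/spreadsheets'],
  });
  const sheets = google.sheets({ version: 'v4', auth });
  await sheets.spreadsheets.values.append({
    spreadsheetId: SHEET_ID,
    range: 'Issues!A:E',
    valueInputOption: 'RAW',
    requestBody: { values: [[doc.id, title, assignee, 'open', userEmail]] },
  });
  return doc.id;
}
```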

Domain-Specific

Stylist Insights / UX Insights MCP

Servers providing access to stylist performance/KPI data and UX analytics, respectively. Query interfaces over BQ aggregate tables.

freee MCP

An OAuth-authenticated proxy to the freee API for accounting data access.

Dev Platform: Workspace / Sandbox

This might be the most unique part.

Workspace MCP — Code Editing Without a GitHub Account

Provides ACL-gated file editing, commits, PR creation, and deployment for our internal monorepo.

No GitHub account required. Only a Google Workspace account (OAuth) is needed.

```
1. workspace_init          → Create worktree, initialize branch
2. workspace_write_file    → Edit code
3. workspace_diff          → Review changes
4. workspace_commit        → Commit
5. workspace_push          → Push to GitHub
6. workspace_deploy        → Deploy from feature branch (test)
7. Verify it works
8. workspace_create_pr     → Request review
```

Access control is managed in Firestore. Admins configure which stacks (directories) each user can edit and deploy.

```json
{
  "allowedPaths": ["apps/web/xxx/", "apps/api/xxx/"],
  "allowedStacks": ["api-xxx", "pages-xxx"],
  "role": "developer"
}
```

Non-engineers can safely edit and deploy only the stacks they're authorized for. In practice, a non-engineer team member is already using AI + Workspace MCP to improve a KPI dashboard built from scratch.

Sandbox MCP — App Deployment for Non-Engineers

Going even further: non-engineers can deploy their own apps for internal use.

```
1. sandbox_init_repo(app_name: "my-tool")     → Initialize repo
2. sandbox_write_file(...)                    → Write files
3. sandbox_publish(app_name: "my-tool")       → Deploy to Cloud Run
   → https://sbx-{nickname}--my-tool.example.com/
```

No gcloud, no Docker. Just tell Claude "I want a tool that does X" and it's published on an internal URL.

Deployed apps are protected by Cloudflare Access with Google Workspace authentication, so only internal members can access them. Even though they're on the public internet, access from outside the organization is impossible.

Graph Servers: Code Graph / Product Graph / Biz Graph

A family of servers that analyze codebases and business logic as graph structures.

Server Scope Key Feature
DB Graph Company-wide DBs (previous article) Table dictionary + semantic search + live DB queries + PII anonymization
Code Graph All source code (cross-repository) Static analysis tracking function → API → DB → event dependencies across repos
Product Graph Internal monorepo Unified knowledge graph of code + DB + docs. Every node has business context
Biz Graph Business initiatives & metrics Initiative × metric relationship graph

Each has a different design philosophy and solves different problems. See the previous article for DB Graph; details on the others are coming in future posts.

Security Model

Here's the security approach shared across all servers.

Defense in Depth

```
Layer 1: Google Workspace OAuth + domain restriction
  → Organization domain only. External users cannot log in.

Layer 2: SSO + session management
  → Upstash Redis, 7-day TTL, sliding window

Layer 3: Per-server scope restrictions
  → GCloud: cloud-platform.read-only
  → AWS: ReadOnlyAccess policy
  → DB Graph: SELECT only + PII anonymization

Layer 4: Data-level protection
  → Automatic PII anonymization (40+ column patterns)
  → Confidential datasets controlled by BQ IAM
  → Production DBs via read replicas only

Layer 5: Audit logging
  → All tool invocations recorded in BQ
  → Individual email in GCP Audit Log / CloudTrail
```

Automatic Revocation on Departure

Since every server depends on Google OAuth, disabling a Google Workspace account instantly revokes access to all MCP servers. No individual token revocation or account cleanup needed.

Takeaways

Lessons learned from building and operating our MCP server fleet:

1. Centralize authentication
Building OAuth as a shared package made adding new servers dramatically easier. Auth code per server is about 10 lines.
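
For a sense of what those ~10 lines buy, here's an illustrative setup — @cortex/mcp-oauth is their internal package, so the exported function and option names below are hypothetical:

```typescript
import express from 'express';
// Internal shared package; this API shape is hypothetical.
import { createOAuthMiddleware } from '@cortex/mcp-oauth';

declare const handleMcpRequest: express.RequestHandler; // server-specific endpoint (not shown)

const app = express();

// The shared package would handle PKCE, RFC 8414 discovery, the domain
// restriction, and the Upstash Redis SSO session — the server just mounts it.
app.use(createOAuthMiddleware({
  serverName: 'grafana-mcp',            // hypothetical option names
  allowedDomain: 'your-domain.example',
}));

// Everything below this line only runs for authenticated org members.
app.post('/mcp', handleMcpRequest);
```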

2. Start read-only
GCloud, AWS, and Git Server are all read-only. Allow reads first; add writes only when truly needed. This keeps security discussions simple.

3. Wrap existing tools
gcloud CLI, aws CLI, gws CLI, CircleCI MCP — put existing CLIs and MCP servers behind an OAuth proxy and the whole team can use them safely. No need to build from scratch.

4. Non-engineer access is the most exciting frontier
Workspace MCP and Sandbox MCP provide the foundation for non-engineers to edit code and deploy without a GitHub account. It's still early and the big wins are ahead, but this is where the most potential lies.

5. Keep everything in one repository
Application code, infrastructure (Pulumi), observability (Grafana alert rules), MCP servers — all in a single monorepo. This closes the loop: write code → deploy → monitor → find issues → fix.


In the DB Graph article, I described the problem of "how tables relate to each other existing only in specific people's heads." Looking at the full MCP server fleet, it's clear this isn't limited to databases.

Infrastructure state, code dependencies, document contents, project progress, user behavior logs — all of these were trapped in people's heads. Eliminating that is the essential role of our MCP server fleet.

Externalizing knowledge into a form that AI can access. That's the common theme across all our MCP servers.

Top comments (4)

SidClaw

17 MCP servers is a lot of surface area. curious how you're handling the "who can call what" question across all of them. when one agent has access to 17 different tool servers, the blast radius of a bad decision gets wide fast.

do you have any per-action controls, or is it more of a trust-the-agent-and-audit-after approach?

Ryosuke Tsuji

@sidclaw
Great question — "who can call what" is exactly the right thing to worry about. The honest answer is that we use a different approach per server depending on the risk, and stack multiple layers. It's definitely not a "trust-the-agent-and-audit-after" setup.

First, a prerequisite: agents don't actually see all 17 at once

The headline says "17 MCP servers" but a single agent session almost never has all 17 loaded at the same time. For engineer-facing tools, we drop a .mcp.json at the root of each repository listing only the servers that make sense for that repo. A monorepo's .mcp.json might list code-graph, DB graph, CI, Grafana, git workspace, etc., while a standalone frontend repo gets just a single graph-rag entry. The result is that when you start Claude Code in a given workspace, only the relevant tools get loaded, naturally.

This is by far the simplest and most effective way to shrink blast radius: we solve the agent's "which of 17 tools should I pick?" problem upstream, at the developer-workflow level, before the model ever has to reason about it.

On top of that, each server internally stacks five more layers.

Layer 1: Enforce read-only at the cloud API level (gcloud / aws / gws)

The gcloud MCP is the clearest example. The user authenticates with the cloud-platform OAuth scope, but when the server actually runs a command, it impersonates a dedicated read-only Service Account and executes the gcloud CLI as that SA. That SA only has roles/viewer, so even if the agent tries to generate something destructive like gcloud run services delete, GCP's IAM itself rejects it with permission denied. Even if our server-side validation has a gap, the cloud API makes it physically impossible.

```typescript
const readonlySa = new gcp.serviceaccount.Account(...);
new gcp.projects.IAMMember('readonly-viewer', {
  role: 'roles/viewer',  // ← the real safety net
  member: serviceAccountMember(readonlySa.email),
});
```

aws-server does the same via Google OIDC → AWS STS AssumeRoleWithWebIdentity, swapping into a read-only IAM role. The key point is that the server's own SA is deliberately given no data permissions — all operations happen under the user's token, which has the nice side effect of getting individual email addresses into GCP Audit Log / CloudTrail automatically.

Layer 2: Defense in depth via input validation (DB access)

DB access MCPs do something more aggressive (SELECT against production DBs), so we add another layer in front. A SQL validator allows only SELECT / SHOW / DESCRIBE / EXPLAIN / WITH, and rejects DROP / TRUNCATE / DELETE / INSERT / UPDATE / ALTER / CREATE and multi-statement queries via regex before the query ever touches a driver. On top of that, the DB users themselves are split into view / edit / delete permission levels, and MCP traffic is pinned to view — a read-only account at the DB grant level.

So you get "MCP-side parser rejection × DB-side GRANT" as two independent layers. Slipping through one is plausible, slipping through both is essentially not.
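
A sketch of that allow-list validator — the real one presumably handles more edge cases, but the shape is simple:

```typescript
// Allow-list validator: only read statements, single statement only.
const ALLOWED = /^\s*(SELECT|SHOW|DESCRIBE|EXPLAIN|WITH)\b/i;
const FORBIDDEN = /\b(DROP|TRUNCATE|DELETE|INSERT|UPDATE|ALTER|CREATE)\b/i;

function validateSql(sql: string): void {
  // Strip comments so keywords can't hide inside them.
  const stripped = sql.replace(/--.*$/gm, '').replace(/\/\*[\s\S]*?\*\//g, '').trim();
  if (!ALLOWED.test(stripped)) {
    throw new Error('Only SELECT / SHOW / DESCRIBE / EXPLAIN / WITH are allowed');
  }
  if (FORBIDDEN.test(stripped)) {
    throw new Error('Write keywords are rejected before the query reaches a driver');
  }
  // Reject multi-statement queries: any ';' other than a trailing one.
  // (Conservative: also rejects string literals containing ';'.)
  if (stripped.replace(/;\s*$/, '').includes(';')) {
    throw new Error('Multi-statement queries are not allowed');
  }
  // Even if this validator is bypassed, the DB user is GRANT-ed read-only.
}
```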

Layer 3: Per-user ACLs in Firestore (write-capable tools for non-engineers)

For any MCP server that can write, we keep per-user ACLs in Firestore.

The permission doc for the non-engineer "edit and deploy from chat" workspace MCP looks like this:

```json
{
  "email": "user@example.com",
  "allowedPaths": ["apps/some-service/", "packages/some-domain/"],
  "allowedStacks": ["api-some-service", "pages-some-service"],
  "role": "developer"
}
```

write_file / commit tools do a prefix match against allowedPaths, and deploy tools additionally require allowedStacks membership plus role === 'developer'. Path traversal (..) is rejected by regex. Because both file paths and deployable stacks are whitelisted, a team member can't accidentally reach infra they don't own.
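
A sketch of those two guards, following the permission doc above (error handling and type shapes are illustrative):

```typescript
interface WorkspaceAcl {
  email: string;
  allowedPaths: string[];
  allowedStacks: string[];
  role: 'developer' | 'viewer';
}

// Guard used by write_file / commit tools: traversal rejection + prefix match.
function assertPathAllowed(acl: WorkspaceAcl, filePath: string): void {
  if (/(^|\/)\.\.(\/|$)/.test(filePath)) {
    throw new Error('Path traversal rejected');
  }
  if (!acl.allowedPaths.some((prefix) => filePath.startsWith(prefix))) {
    throw new Error(`${filePath} is outside your allowed paths`);
  }
}

// Guard used by deploy tools: stack whitelist plus role check.
function assertDeployAllowed(acl: WorkspaceAcl, stack: string): void {
  if (acl.role !== 'developer' || !acl.allowedStacks.includes(stack)) {
    throw new Error(`You are not authorized to deploy ${stack}`);
  }
}
```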

One step earlier in the stack, some servers skip individual ACLs and instead gate on users.division / groupName / teamName read from Firestore — a single org-level allow-list per server. When you don't need per-person granularity, it's much cheaper to operate.

Layer 4: PII anonymization at the data layer

This is the strongest layer because it doesn't rely on agent judgment at all. When the DB access MCP reads production data, a PII anonymization step strips the result before returning it: email becomes a***@example.com, personal names become 田***, phone numbers become ***-****-1234. It's two-tiered: a global rule set (matching column names like email, phone, password, etc.) plus DB-specific rules keyed by database.table.column.
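
A sketch of the two-tier lookup, following the examples above (the rules and keys are illustrative):

```typescript
// Global rules match column names; DB-specific rules key on database.table.column.
const GLOBAL_RULES: Record<string, (v: string) => string> = {
  email: (v) => v.replace(/^(.).*(@.*)$/, '$1***$2'),    // a***@example.com
  phone: (v) => v.replace(/^.*(\d{4})$/, '***-****-$1'), // ***-****-1234
  name:  (v) => v.slice(0, 1) + '***',                   // 田***
};

// Illustrative per-DB override, keyed by database.table.column.
const DB_RULES: Record<string, (v: string) => string> = {
  'crm.customers.home_address': () => '***',
};

function anonymizeRow(db: string, table: string, row: Record<string, string>) {
  const out: Record<string, string> = {};
  for (const [column, value] of Object.entries(row)) {
    // DB-specific rule wins; otherwise fall back to the global column-name match.
    const dbRule = DB_RULES[`${db}.${table}.${column}`];
    const globalRule = Object.entries(GLOBAL_RULES)
      .find(([key]) => column.toLowerCase().includes(key))?.[1];
    const rule = dbRule ?? globalRule;
    out[column] = rule ? rule(value) : value;
  }
  return out;
}
```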

Highly sensitive datasets (HR, etc.) go further — they're locked down at the BQ IAM level to a very small set of owners, so the MCP server's own SA can't even read them. We're not filtering these out in application code; BigQuery itself refuses the query.

Layer 5: Audit logging for every tool call

Every MCP server routes tool executions through a shared package (@cortex/mcp-tool-analytics) that writes to a BigQuery mcp_tool_calls table. Schema: timestamp, server_name, tool_name, user_email, nick_name, department, params, duration_ms, status, result_size, estimated_tokens.

One important detail: writes happen via impersonation of a dedicated writer SA, not from each server's own SA, which keeps the log-tampering surface tiny. We roll this up into dashboards — not just for post-hoc audit, but also for usage analysis (finding tools nobody actually calls, so we can delete them).

Summary

| Risk | Primary defense |
| --- | --- |
| Agent picks the wrong server out of 17 | Layer 0: per-repository .mcp.json narrows the choice set upstream |
| Agent generates a destructive command | Layer 1: IAM physically rejects it |
| Input sanitization gap | Layer 2: SQL validator + DB GRANT |
| Out-of-scope resource modification | Layer 3: Firestore ACL (paths / stacks / role / org) |
| PII leakage | Layer 4: result masking + BQ IAM |
| Detection miss | Layer 5: audit log through the shared package |

The overall design principle is: start read-only, and if you need writes, explicitly whitelist exactly those operations. Because implementers get OAuth and audit logging for free from the shared packages (@cortex/mcp-oauth, @cortex/mcp-tool-analytics), new-server development only needs to focus on the ACL that's specific to that tool. That's a big part of how we got to 17 without the blast radius getting out of hand.

Renato Marinho

Running 17 MCP servers across internal operations is a significant architectural achievement — most teams struggle to maintain even 3 or 4 reliably. The coordination overhead alone across Slack, GitHub, BigQuery, and custom internal tools must have been substantial to get right.

The question that becomes unavoidable at that scale is governance: when 17 agents are running with write access across your internal systems, how do you reconstruct exactly what each one did when something unexpected happens? How do you prevent PII from customer data flowing through to the LLM context during routine operations? And if one server starts misbehaving at 2am, how do you shut it down without taking down the others?

These are the questions Vinkius (vinkius.com) was built to answer. It runs pre-governed MCP servers inside V8 Isolate sandboxes — each call generates a SHA-256 cryptographic audit trail, PII is redacted at the protocol level before reaching the model, and there's a global kill switch per server. The SDK is Vurb.ts, which wraps MCP tool calls with these controls natively rather than as middleware.

Your architecture proves the technical feasibility of agent-driven internal operations. The governance layer is what makes it auditable and safe enough to trust at scale. Really impressive operational experiment — would love to see a follow-up on how you handle incidents across 17 concurrent servers.

Ryosuke Tsuji

@renato_marinho
PII Protection:
PII redaction is handled at the data layer, not as middleware. Our DB Graph MCP server (detailed in this post) automatically anonymizes query results from production databases — emails become a***@example.com, names become 田***, and phone numbers/addresses/card numbers are all masked before they ever reach the LLM context. This runs at both the MCP layer and the Lambda layer (dual validation), so even if one layer is compromised, PII doesn't leak. On the observability side, Grafana's log data is also protected — structured logs are designed to exclude PII fields, and Grafana access itself is scoped via Google OAuth with domain restrictions. Servers that don't touch customer data (infrastructure, CI/CD, documentation) simply don't have access to PII-containing databases in the first place — scope separation by design.

Observability & Incident Response:
Every MCP server is instrumented with OpenTelemetry, and all logs/traces/metrics are aggregated in Grafana. Grafana alerting rules are configured per server — latency spikes, error rate thresholds, and availability checks all trigger Slack notifications automatically. So if a server misbehaves at 2am, the on-call engineer gets a Slack alert immediately.

For investigation, we built a Grafana MCP server — meaning Claude Code itself can query logs and metrics. "Show me error logs from the DB Graph MCP in the last hour" returns structured results directly in the AI context. This closes the loop: the same AI that uses the MCP servers can also diagnose issues with them.

Independent Deployment:
Each server is a separate Cloud Run service with its own Pulumi stack, service account, and IAM roles. Deploying, scaling, or shutting down one server has zero impact on the others. There's no shared runtime or process — they're fully isolated at the infrastructure level.