Athreya aka Maneshwar

What Building with MCP Taught Me About Its Biggest Gap

I spent the last few weeks wiring up MCP at my org, stitching a handful of internal tools (GitHub, Slack, Datadog) into a shared layer that multiple teams' AI agents could call into.

Useful. Powerful. And, about a week in, slightly alarming.

The same four or five "wait, doesn't MCP handle this?" questions kept coming up. Who's allowed to call this tool? What happens if a tool returns 50MB of data? Where are we logging any of this? How do I give Team A read-only access when Team B needs write?

Turns out: MCP doesn't handle any of it. Not because it's broken, but because that's not what it's for.

MCP standardizes how agents talk to tools. It says nothing about who gets to, how much they can pull, or whether anyone's keeping receipts.

I can't drop my org's internal code into a blog post, so I rebuilt the same shape of problem in a tiny public repo.

Three MCP servers, one Gemini-driven agent, one minimal gateway, all runnable in five minutes.

So, MCP. What is it, again?

A thirty-second version of MCP, straight from the official docs:

MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems. Think of it like a USB-C port for AI applications — a standardized way to plug data sources, tools, and workflows into Claude, ChatGPT, or whatever model you're wiring up.

modelcontextprotocol.io

The mental model that finally made it click for me: MCP standardizes the plug, not the power grid.

Your agent speaks MCP.

Your tools (GitHub, Slack, Datadog, your database) speak MCP.

They meet in the middle and everything Just Works.

Well. Almost everything.

The demo: one agent, three MCP servers, a Gemini brain

To make this concrete, I built the smallest possible setup, a repo anyone can clone and run in five minutes:

  • A GitHub MCP server that exposes get_readme, get_latest_commit, get_repo_files
  • A Slack MCP server that exposes send_message
  • A SQLite MCP server that exposes log_event, get_logs
  • A Gemini-driven agent that picks a tool, calls it, summarizes the result, and posts to Slack

Five processes, one loop. Here's what that actually looks like on screen:

All five terminals running together

Point the agent at a repo and off it goes:

taco@TCSIND-4shZvZXk:~/mcp$ node agent/agent.js
[agent] starting one-loop run
[agent] chosen tool: github.get_readme
[agent] summary: Ragfolio is an AI-powered portfolio template that uses RAG to answer professional questions based on your resume data. It is built with a modern stack including React, FastAPI, and Google Gemini for high-performance retrieval and generation.
[agent] demo loop complete

Try a bigger, more famous repo? Same agent, no code change:

taco@TCSIND-4shZvZXk:~/mcp$ node agent/agent.js https://github.com/vercel/next.js
[agent] starting one-loop run
[agent] target repo: https://github.com/vercel/next.js
[agent] objective: Summarize the project highlights for a dev audience.
[agent] chosen tool: github.get_readme
[agent] summary: Next.js is a full-stack React framework designed for building high-performance web applications with integrated Rust-based tooling. It extends the latest React features while providing optimized build processes and a robust ecosystem for enterprise-scale development.
[agent] demo loop complete

Slack confirms the summaries landed:

Slack showing ragfolio and Next.js summaries

Everything works. High-fives all around. I'm ready to ship this to teams.

And then I actually think about what I just built.

What MCP quietly does not give you

MCP is a protocol. That's wonderful and that's also exactly the problem. Out of the box, vanilla MCP has:

  • No auth. Anyone who can reach port 4001 can call every tool on the GitHub server. In prod, that's a problem.
  • No RBAC. Every caller gets every tool, or no tools.
  • No audit. Unless you add logging to every server, by hand, there is no record of who called what.
  • No guardrails on outputs. If a tool returns a 2MB README, your agent happily eats 2MB of its context window. If a tool's output contains rm -rf /, nothing in the protocol stops your agent from acting on it.
  • No shared policy layer. Every team ends up copy-pasting the same "validate tool name, wrap in { output, error }" boilerplate, each with its own subtly different bugs.

This is not a knock on MCP.

USB-C also doesn't come with a surge protector.

Those are separate products for good reasons.

But if you're running agents in an environment where the blast radius of "oops" is meaningful, you need that separate product.

The obvious place for that product to live? A gateway, sitting between every agent and every MCP server.

Putting a tiny gateway in front of everything

In my demo repo, the gateway is a single 90-line Express file (gateway/gateway.js). It does three things. Together, they cover 80% of the complaints above.

1. An allowlist — capability control in one Set

Every tool call is namespaced (github.get_readme, slack.send_message, db.log_event).

The allowlist is quite literally a JS Set:

const TOOL_ALLOWLIST = new Set([
  TOOL_NAMES.GITHUB_GET_README,
  TOOL_NAMES.SLACK_SEND_MESSAGE,
  TOOL_NAMES.DB_LOG_EVENT
]);

If an agent (or a prompt-injected-into-misbehavior agent) tries to call github.delete_repo, it never reaches the GitHub server.

The gateway refuses in three lines, logs the attempt, and sends back a polite error.

Notice what this isn't: a prompt that says "please don't call delete_repo."

Prompts are suggestions.

Allowlists are rules.
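That refusal can be sketched as a plain function the gateway runs before anything is forwarded upstream (same tool names as above; the exact error envelope shape is an assumption):

```javascript
// Sketch of the gateway's allowlist check. The Set lookup happens
// before any request is forwarded, so a disallowed tool never
// reaches its server. Envelope shape is an assumption.
const TOOL_ALLOWLIST = new Set([
  "github.get_readme",
  "slack.send_message",
  "db.log_event",
]);

// Returns null when the call may proceed, or an error envelope to send back
function checkAllowlist(tool) {
  if (TOOL_ALLOWLIST.has(tool)) return null;
  return {
    output: null,
    error: { message: `Tool not allowed by gateway policy: ${tool}` },
  };
}
```

A prompt can be argued with; this function can't.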

2. A guardrail — the content contract

Some tools return unbounded blobs.

READMEs in particular love to be 40KB of badges and marketing copy.

The gateway has a hardcoded cap:

if (
  tool === TOOL_NAMES.GITHUB_GET_README &&
  serverResponse.output.content.length > 5000
) {
  return { output: null, error: createError("README content exceeded 5000 characters") };
}

Here's that guardrail earning its keep on a deliberately gnarly repo:

taco@TCSIND-4shZvZXk:~/mcp$ node agent/agent.js https://github.com/juice-shop/juice-shop 
[agent] starting one-loop run
[agent] target repo: https://github.com/juice-shop/juice-shop
[agent] objective: Summarize the project highlights for a dev audience.
[agent] chosen tool: github.get_readme
[agent] first tool call failed: {
  message: 'Guardrail blocked response: README content exceeded 5000 characters',
  details: null
}

Juice Shop's README is enormous.

Without the guardrail, my agent would've burned half its context on emoji-laden marketing.

With the guardrail, the agent got a clean "nope, try something else" and my context window stayed intact.


3. Logging — audit trail for free

Every single call through the gateway (success, failure, allowlist rejection, guardrail block) gets recorded to SQLite via the db.log_event tool.

Best-effort, fire-and-forget, one await in the middleware.

Now when someone asks "what did the agent do yesterday?" the answer is a query, not a shrug.
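A sketch of what that fire-and-forget call can look like, where logEvent stands in for the db.log_event tool call and the entry fields are illustrative:

```javascript
// Best-effort audit logging: one awaited call in the middleware,
// with failures swallowed so a broken log store never breaks the
// tool call itself. logEvent stands in for the db.log_event call.
async function auditLog(logEvent, entry) {
  try {
    await logEvent({
      tool: entry.tool,
      status: entry.status, // "ok" | "error" | "blocked"
      at: new Date().toISOString(),
    });
  } catch {
    // fire-and-forget: auditing must never take down the request path
  }
}
```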

That's it.

That's the whole governance layer.

An allowlist, a guardrail, a log: roughly 200 lines of Node, no framework, readable in a single sitting.

But a toy gateway is still a toy

Here's where I have to be honest with myself.

My gateway works for the demo.

It would not survive contact with a real organization.

  • The allowlist is one Set shared by everyone. No per-team, per-agent, per-use-case scoping.
  • Guardrails are hardcoded conditionals. Adding a new one means a code change and a redeploy.
  • Authentication is nonexistent. Anyone who can curl :3000/mcp is an agent now.
  • Routing is three localhost URLs in a map. No service discovery, no health checks, no retries.
  • Adding a new tool means editing three files.

Solving each of those is a weekend project.

Solving all of them, operating them, and keeping them maintained as tools come and go across teams: that's a platform team's full-time job.

The feature I wish I'd built first: the Virtual MCP Server

While researching what a grown-up version of this gateway looks like, I came across TrueFoundry's MCP Gateway and specifically their concept of a Virtual MCP Server.

It's one of those ideas that's obvious in retrospect and I'm mildly annoyed I didn't think of it first.


The idea:

You have a bunch of real MCP servers, each exposing lots of tools.

Some tools are safe.

Some are dangerous.

Some are fine for one team and a footgun for another.

Rather than giving teams access to whole servers, you compose a Virtual MCP Server: a curated, named collection of tools pulled from whichever upstream servers you want.

Concretely:

  • Your doc-summary-bot Virtual MCP Server exposes just github.get_readme and slack.send_message. That's the full surface area.
  • Your release-bot Virtual MCP Server exposes github.create_release, github.tag_commit, and slack.send_message — but not github.delete_repo, even though the upstream GitHub server technically supports it.

No new deployments.

The virtual server is just configuration on the gateway, and each one gets its own allowlist, its own guardrails, its own auth scope.
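The shape of that configuration could be as simple as the following. This is a hypothetical format I made up to illustrate the idea; TrueFoundry's actual config looks different:

```javascript
// Hypothetical Virtual MCP Server config: each named virtual server
// is a curated list of upstream tools plus its own policy knobs.
const virtualServers = {
  "doc-summary-bot": {
    tools: ["github.get_readme", "slack.send_message"],
    maxOutputChars: 5000,
  },
  "release-bot": {
    tools: ["github.create_release", "github.tag_commit", "slack.send_message"],
    requireApproval: true,
  },
};

// A call is valid only if that virtual server exposes the tool;
// github.delete_repo isn't in either list, so it simply can't be called
function canCall(virtualServer, tool) {
  const vs = virtualServers[virtualServer];
  return Boolean(vs && vs.tools.includes(tool));
}
```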

This matters because of the failure mode it quietly prevents.

Here's a solid demo video explaining Virtual MCP Server.

What this looks like in a real workflow

Let me walk through the kind of agent I'd actually want to run in production, a compliance automation bot, operating entirely through a TrueFoundry MCP Gateway endpoint:

  1. A PR merges to main. A webhook wakes the agent.
  2. The agent calls github.get_diff via its Virtual MCP Server. Authenticated, not with a bare PAT pasted into an env var, but with a service token the gateway issued and can rotate.
  3. The diff comes back. The gateway's guardrail notices it's 12,000 lines, well over the "unsupervised review" threshold, and pauses the run, requesting human approval before continuing. (Try getting that out of a lone MCP server.)
  4. A reviewer approves. The agent writes the diff plus metadata to MongoDB via db.store_diff.
  5. It opens a Jira ticket via jira.create_issue, linking back to the diff.
  6. It posts a summary to Slack via slack.send_message.
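The steps above can be sketched as one loop against a single call function standing in for the gateway endpoint. Every tool name comes from the walkthrough; every argument shape is a guess, not a real gateway API:

```javascript
// Sketch of the compliance bot. `call` is the one gateway endpoint;
// tool names are from the walkthrough, argument shapes are stand-ins.
async function complianceRun(call, prNumber) {
  // Steps 1-2: fetch the merged PR's diff through the virtual server
  const diff = await call("github.get_diff", { pr: prNumber });
  if (diff.error) return diff; // e.g. the guardrail paused for human approval

  // Step 4: persist the diff plus metadata
  await call("db.store_diff", { pr: prNumber, diff: diff.output });

  // Step 5: open a ticket linking back to the diff
  const issue = await call("jira.create_issue", {
    summary: `Compliance review for PR #${prNumber}`,
  });

  // Step 6: announce in Slack
  await call("slack.send_message", {
    text: `PR #${prNumber} reviewed, ticket ${issue.output.key}`,
  });

  return { output: "done", error: null };
}
```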

Six tool calls.

Four different upstream MCP servers.

One endpoint. Every call authenticated. Every call logged to the audit trail.

The single dangerous tool the agent isn't supposed to touch, even if a prompt injection convinces it to try, isn't prompted against.

It's simply not in the Virtual MCP Server, so calling it is a 404, not a judgment call.

That, to me, is the jump from protocol to platform.

Wrapping up

MCP gave us a clean, shared language for AI agents to talk to tools.

That's a big deal, and it's easy to underrate how much of a pain this was before MCP existed.

But a shared language isn't a shared policy.

If you're running more than one agent, or letting more than one team build agents, you will need the thing that sits between "call the tool" and "did we mean to let it call the tool."

That thing is a gateway.

You can build a toy version in an afternoon.

My demo repo is proof.

But for anything real (auth, RBAC, audit, per-scope capability boundaries, and the Virtual MCP Server trick), you want a platform that treats governance as the product, not the afterthought.

Take a look at TrueFoundry's MCP Gateway and the Virtual MCP Server feature if you're at the "I'm giving real agents real tools and someone in security wants to talk to me" stage.

If you build something interesting on top of either, I'd love to see it. Happy gatewaying.
