Gursharan Singh
MCP in Practice — Part 6: Your MCP Server Worked Locally. What Changes in Production?

Part 6 of the MCP in Practice Series · Back: Part 5 — Build Your First MCP Server (and Client)

In Part 5, you built an order assistant that ran on your laptop. Claude Desktop launched it as a subprocess, communicated over stdio, and everything worked. The server could look up orders, check statuses, and cancel items. It was a working MCP server.

Then someone on your team asked: can I use it too?

That question changes everything. Not because the protocol changes — JSON-RPC messages stay identical — but because the deployment changes. This article follows one server, the TechNova order assistant, as it grows from a local prototype to a production system. At each stage, something breaks, something gets added, and ownership shifts. By the end, you will have the complete production picture of MCP before we go deeper on transport or auth in follow-ups.

You do not need to implement every production layer yourself. But you do need to understand where each one appears.

If you already run MCP servers in production, treat this part as the big-picture map. You can skim it for the overall model and jump to the next part for transport implementation details.

One MCP Server Grows Up — six stages from local prototype to production deployment

Each stage in the diagram above maps to a section below. Start at the top left — that is where you are now.


1. Local Prototype — Your MCP Server Worked Locally

The order assistant from Part 5 runs entirely on your machine. Claude Desktop is the host application. It launches the MCP server as a child process and communicates through standard input and output — the stdio transport. The server reads JSON-RPC requests from stdin, processes them, and writes responses to stdout.
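To make that loop concrete, here is a minimal stdlib-only sketch of the read-dispatch-write cycle. This is not the MCP SDK; the `get-order-status` tool and its canned response are hypothetical stand-ins for the order assistant's real logic:

```python
import json
import sys

def handle_message(raw: str) -> dict:
    """Dispatch one JSON-RPC request. Only a hypothetical
    get-order-status tool is wired up in this sketch."""
    req = json.loads(raw)
    if req.get("method") == "tools/call" and \
            req.get("params", {}).get("name") == "get-order-status":
        result = {"content": [{"type": "text", "text": "Order 1001: shipped"}]}
        return {"jsonrpc": "2.0", "id": req["id"], "result": result}
    return {"jsonrpc": "2.0", "id": req.get("id"),
            "error": {"code": -32601, "message": "Method not found"}}

def serve_stdio():
    """The stdio transport itself: one JSON-RPC message per line on
    stdin, one response per line on stdout. The host (Claude Desktop)
    owns both ends of the pipe."""
    for line in sys.stdin:
        if line.strip():
            sys.stdout.write(json.dumps(handle_message(line)) + "\n")
            sys.stdout.flush()
```

The real SDK adds initialization handshakes, schema validation, and capability negotiation, but the transport underneath is exactly this: newline-delimited JSON-RPC over two pipes.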

Everything lives inside one machine boundary. The host, the client, the server, and the local SQLite database are all running in the same operating system context. Trust is implicit: if you can launch the process, you are trusted.

There is no network, no token, no authentication handshake. The operating system's process isolation is the only security boundary that exists.

This is not a limitation — it is the correct design for local development. Stdio is fast, simple, and requires zero configuration. Every MCP client is expected to support it. For a single developer building and testing a server, nothing else is needed.

Nothing is broken yet.


2. Team Wants It Too — What Breaks When More Than One Person Needs It

The server still works. What changes is that a second developer on the support team wants to use it too. With stdio, there is only one option: they clone the repository, install the dependencies, configure their own Claude Desktop, and run their own copy of the server on their own machine.
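Concretely, "configure their own Claude Desktop" means each developer adds an entry like this to their own `claude_desktop_config.json`, pointing at their own local clone (the server name and path here are hypothetical):

```json
{
  "mcpServers": {
    "technova-orders": {
      "command": "python",
      "args": ["/path/to/your/clone/server.py"]
    }
  }
}
```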

Now there are two copies. Each has its own process, its own local database connection, its own configuration. If you fix a bug or add a tool, the other developer does not get the update until they pull and restart. If a third person wants access, they duplicate everything again. The pattern does not scale — every new user means another full copy of the server.

The protocol itself is fine. JSON-RPC works the same way on every machine. What broke is the deployment model. Stdio assumes a single user running a single process on a single machine. The moment a second person needs access to the same server, that assumption fails.

This is the point where the server needs to stop being a local process and start being a shared service.


3. Shared Remote Server — Moving from stdio to a Shared Remote Server

Once duplication becomes the problem, the next move is straightforward: stop copying the server and make it shared. The order assistant moves off your laptop onto a server, and everyone on the team connects to that one shared deployment instead of running their own copy.

Instead of stdio, the server now speaks Streamable HTTP — the MCP specification's standard transport for remote servers. It exposes a single HTTP endpoint, something like https://technova-mcp.internal/mcp, and accepts JSON-RPC messages as HTTP POST requests.
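From the client side, a sketch of what one such request looks like with nothing but the standard library (the endpoint URL is the one from this article; the `Accept` header reflects that Streamable HTTP responses may come back as plain JSON or as an SSE stream):

```python
import json
import urllib.request

MCP_ENDPOINT = "https://technova-mcp.internal/mcp"

def build_mcp_request(method: str, params: dict, req_id: int) -> urllib.request.Request:
    """Wrap one JSON-RPC message in an HTTP POST, as Streamable HTTP does."""
    body = json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params}).encode()
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # The server may answer with a single JSON body or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )

# Actually sending it requires network reach to the server:
# resp = urllib.request.urlopen(build_mcp_request(
#     "tools/call", {"name": "get-order-status", "arguments": {"order_id": "1001"}}, 1))
```

Compare this with the stdio version: the JSON-RPC payload is byte-for-byte the same message, only wrapped in an HTTP envelope.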

The messages themselves did not change. What changed is how they travel — instead of stdin and stdout within a single process, they now cross a network.

That network crossing is the single most important change in the entire journey. Before, the server was only reachable by the process that launched it. Now, anyone who can reach the URL can send it a request. The implicit trust model of stdio — if you can launch it, you are trusted — is gone.

Why Auth Appears — the trust boundary shift from local stdio to remote Streamable HTTP

On the left, everything is inside one boundary. On the right, a network separates the client from the server — and that gap is where auth has to live.


4. Auth Enters — Why Auth Appears the Moment You Go Remote

Auth did not appear because someone decided the server needed more features. It appeared because the deployment boundary changed. Locally, the operating system answered the question "who can talk to this server?" Once the server goes remote, you have to answer that question explicitly. Something has to replace the trust that stdio provided for free.

The MCP specification uses OAuth 2.1 as its standard for this. The server's job becomes validating tokens — not issuing them.

An external authorization server, something like Entra, Keycloak, or Auth0, handles user login and token issuance. The client obtains a token from the authorization server and presents it with every request. The MCP server checks whether that token is valid and either allows the request or rejects it.

The key architectural point is separation. The MCP server does not manage users, does not store passwords, and does not issue tokens. The authorization server is a separate system, typically managed by a platform or security team.

But there is an important gap. The token tells the server who the caller is. It does not tell the server what the caller is allowed to do at the tool level. A token might carry a scope like tools.read, but deciding whether that scope allows calling the cancel-order tool versus just the get-order-status tool — that mapping is not part of the specification. It is your responsibility as the server developer.
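That custom mapping can be as small as a dictionary. A hypothetical version for the order assistant, with the scope names and tool names as assumptions rather than anything the spec defines:

```python
# Which OAuth scopes unlock which tools. The MCP specification does not
# define this table; the server developer does.
SCOPE_TO_TOOLS = {
    "tools.read": {"get-order-status", "lookup-order"},
    "tools.write": {"cancel-order"},
}

def can_call(granted_scopes: set[str], tool_name: str) -> bool:
    """The per-tool authorization decision: does any granted scope
    cover this tool?"""
    return any(tool_name in SCOPE_TO_TOOLS.get(scope, set())
               for scope in granted_scopes)
```

A support agent with only `tools.read` can check an order's status but gets rejected before `cancel-order` ever runs, which is exactly the decision the token alone cannot make for you.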

Authentication is what the specification and SDK handle. Authorization — the per-tool, per-resource access decisions — is always custom.


5. Multiple Servers — When One Server Becomes Several

TechNova does not just need order lookups. The support team also needs to search the product catalog and check inventory availability. Each of these is a separate MCP server — Order Assistant, Product Catalog, Inventory Service — each exposing its own tools, each connecting to its own backend.

The host application now manages multiple MCP clients, one per server. This is how MCP was designed: one client per server connection, with the host coordinating across all of them. The protocol did not change. What changed is the policy surface. Three servers means three sets of tools, three sets of backend credentials, three sets of access decisions. What gets harder is not just the connection count — it is keeping all of those servers consistent and safe.
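A sketch of the host side of that design, with a merged tool index so a tool call is routed to the one server that owns it. The class shape here is an illustration of the one-client-per-server model, not any particular host's internals:

```python
class Host:
    """Coordinates one MCP client per connected server."""
    def __init__(self):
        self.clients = {}      # server name -> client connection
        self.tool_index = {}   # tool name -> owning server name

    def register(self, server_name: str, client, tools: list[str]):
        """Add a server connection and index the tools it exposes."""
        self.clients[server_name] = client
        for tool in tools:
            self.tool_index[tool] = server_name

    def call_tool(self, tool: str, arguments: dict):
        """Route a tool call to the server that owns that tool."""
        server = self.tool_index[tool]
        return self.clients[server].call(tool, arguments)
```

Each `register` call is a separate connection with its own auth, its own tool list, and its own backend behind it; the index is the only thing holding the three policy surfaces together.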

At this scale, some teams introduce a gateway — a proxy that sits in front of all the MCP servers and centralizes authentication, rate limiting, and logging. This is not required by the specification, and many deployments work fine without one. But more servers means more policy surface, and that surface needs to be managed — either per-server or centrally.


6. Production Controls — The Operational Layer Around the Server

The servers are deployed, authenticated, and serving the support team. Now the operational layer matters: rate limiting to protect against overload, monitoring to track tool invocations and error rates, and audit logging to create the compliance trail of who called what and when.

There is one production concern specific to MCP that deserves attention. Each MCP server needs its own credentials to reach its backend systems — the order database, the product catalog API, the inventory service. These backend credentials are completely separate from the user's OAuth token. The user's token proves who is calling the MCP server. The server's own credentials prove that the server is authorized to reach the backend. These two credential chains must never be mixed.
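One way to keep the chains separate is structural: build the backend call from a function whose inputs do not include the user's token at all. The environment variable name and backend URL below are hypothetical:

```python
import os
import urllib.request

def backend_headers() -> dict:
    """The server's OWN credential for the order database, read from the
    server's environment. ORDER_DB_SERVICE_TOKEN is a hypothetical name."""
    return {"Authorization": f"Bearer {os.environ['ORDER_DB_SERVICE_TOKEN']}"}

def build_backend_request(order_id: str) -> urllib.request.Request:
    """Outbound call to the backend. Note what is absent: the user's OAuth
    token is not among this function's inputs, so it cannot leak downstream."""
    return urllib.request.Request(
        f"https://orders.internal/api/orders/{order_id}",  # hypothetical backend
        headers=backend_headers())
```

The user's token stops at the MCP server's front door; everything past that point runs on the service credential.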

The MCP specification explicitly prohibits passing the user's token through to backend services — doing so creates a confused deputy vulnerability where the backend trusts a token that was never intended for it.

MCP also introduces security concerns that traditional APIs do not have. Tool descriptions are visible to the LLM, which means a malicious server can embed hidden instructions to manipulate the model's behavior. A server can change its tool descriptions after the client has approved them. And multiple servers connected to the same host can interfere with each other through their descriptions. These threats — tool poisoning, rug pulls, cross-server shadowing — are the subject of the next article.


What You Own vs What Your Platform Team Owns

Who Owns What — developer-owned, platform/security-owned, and shared responsibilities

Scan the three columns. The left column is yours. The middle column is your platform team's. The right column is the conversation between you.

If you remember one practical thing from this article, remember this ownership split. Understanding what you build versus what your platform and security teams manage is the difference between feeling overwhelmed by production and knowing exactly where your responsibility starts and stops.

As the server developer, you own the tool layer. Tool design, tool scope, what each tool can access, and how it interacts with backend systems — these are decisions that only you can make because only you understand the domain. You also own your server's backend credentials: the API keys, service account tokens, or database connection strings that let your server reach the systems it wraps. The principle of least privilege applies here — your server should have access to exactly what it needs and nothing more.

Your platform and security teams typically own the infrastructure layer. TLS termination, ingress configuration, the authorization server itself, token validation middleware or gateway, rate limiting, and the monitoring and audit stack. These are not MCP-specific — they are the same infrastructure concerns that exist for any service your organization deploys.

Some responsibilities are shared. Scope-to-tool mapping — deciding which OAuth scopes grant access to which tools — requires the developer to design it and the security team to review it. Secrets management requires the platform team to provide the infrastructure and the developer to use it correctly.

The clearest way to think about it: you own what the server does. Your platform team owns how it is protected. And you both own the boundary between those two.


Three Takeaways

First, the protocol does not change when you go to production — JSON-RPC messages are identical over stdio and Streamable HTTP. What changes is the deployment boundary, and every production decision flows from that.

Second, auth appears because the trust model changes, not because someone adds a feature. Local stdio has implicit trust through process isolation. Remote HTTP has no implicit trust at all. OAuth 2.1 is how MCP fills that gap — but it fills only the authentication side. Authorization at the tool level is always your job.

Third, know what you own. Tool design, tool scope, backend credentials, and the least-privilege boundary around your server — these are yours. TLS, token issuance, rate limiting, and the monitoring stack — these are your platform team's. The boundary between those two is where production readiness lives.


Next: MCP Transport and Auth in Practice — two transports, three auth phases, one decision guide.

More in the next part — I'd love to hear your thoughts on this one.
