Guy for AWS Heroes

Posted on Jun 7

Security for MCP Servers: Governed Access Beats Uploading Spreadsheets to ChatGPT

#mcp #security #ai

An analyst exports a spreadsheet with customer data, uploads it into a generic AI chat, asks for a sales summary, and gets an answer in seconds. It feels productive. It is also one of the least governed ways to bring AI into an enterprise. The model sees the raw file. The organization has little control over which fields were exposed, whether PII was included, whether the same data should have been aggregated first, or whether the user should have been allowed to access every row in the first place.

There is another failure mode that matters just as much in enterprise settings: inconsistent business aggregation. I saw this directly with one company where two managers prepared reports for the same management meeting from the same underlying report, but got dramatically different answers from ChatGPT. One manager instructed it to use the company's fiscal year, which starts in July; the other did not. One mentioned Canada as part of North America; the other forgot to ask for it. Both reports looked polished; both were based on the same raw data, but because the business definitions were stored in ad hoc prompts and sent to a probabilistic LLM rather than a governed interface, the meeting started with conflicting numbers.

That is the wrong operating model for enterprise AI.

The safer pattern is to connect the AI client to an MCP server instead. The MCP server becomes the controlled interface between the model and the underlying data systems. It determines which tools exist, what inputs they accept, what data shape they return, and which authenticated user they act on behalf of. The model does not get the whole spreadsheet. It gets the smallest, governed slice of data needed to answer the question.

A well-designed MCP server prevents this kind of semantic drift. The business analyst can define that fiscal years start in July, that North America includes Canada, that revenue is recognized by a specific accounting rule, and that management reports return a standard aggregation. Every user and every model then works from the same, governed business definition, rather than relitigating those choices in every chat.
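To make the fiscal-year example concrete, here is a minimal sketch of how such definitions could live in server code rather than in prompts. The function names, the convention that a fiscal year is named after the calendar year in which it ends, and the country list beyond Canada are all assumptions for illustration, not definitions from the company in the story.

```rust
/// Hypothetical business rule: the fiscal year starts in July and is named
/// after the calendar year in which it ends (July 2025 falls in FY2026).
/// Returns (fiscal_year, fiscal_quarter), or None for an invalid month.
pub fn fiscal_period(calendar_year: i32, month: u32) -> Option<(i32, u8)> {
    if !(1..=12).contains(&month) {
        return None;
    }
    // Shift months so July becomes index 0 and June becomes index 11.
    let shifted = (month + 5) % 12;
    let quarter = (shifted / 3 + 1) as u8;
    let fiscal_year = if month >= 7 { calendar_year + 1 } else { calendar_year };
    Some((fiscal_year, quarter))
}

/// Hypothetical region definition: one shared list instead of per-prompt wording.
pub fn in_north_america(country_code: &str) -> bool {
    matches!(country_code, "US" | "CA" | "MX")
}
```

With definitions like these behind a tool, both managers in the story would get the same fiscal quarter and the same region scope regardless of how they phrase the request.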

This is where the recurring motif from the previous articles comes into play again. Good MCP design is always a balance of strengths across multiple parties. Security is not the job of one component. The business analyst defines the safe business surface. The business user brings the runtime question. The LLM interprets intent. The MCP server validates and executes deterministically. And when the surface becomes more powerful, as we saw in the code mode article, the IT administrator governs policy, approval, and audit. Each party covers the other parties' weaknesses.

This article shows how that balance becomes a security architecture for production MCP servers, with layers of security that collectively enhance the overall trust in AI interactions.

What Is MCP? (The 30-Second Version)

The Model Context Protocol (MCP, spec 2025-11-25) connects AI clients to external systems through tools, prompts, and resources. In enterprise deployments, the MCP server should be a thin, remote, mostly stateless interface layer over internal data systems. If HTTP-based applications are the human-facing interface to enterprise systems, MCP servers are the AI-facing interface.

That framing matters for security. Once you see the MCP server as an interface tier, the right controls become obvious: authenticated requests, typed inputs, least-privilege downstream access, output filtering, audit trails, rate limits, and active security testing. The server is not a shortcut around enterprise controls. It is where those controls should be made explicit for AI.

There is also a larger industry lesson here. We have already spent decades learning, often painfully, how vulnerable data-facing interfaces become when teams trust inputs too much, over-privilege backends, skip per-request authorization, or expose broad attack surfaces and hope the client behaves well. SQL injection, broken access control, credential leakage, over-broad service identities, tenant isolation failures, and data exfiltration were not abstract risks. They were the history of web security. MCP is the next layer of interfaces between users, models, and data systems. It should start with those lessons, not relearn them from scratch.

The Real Alternative Is Not "MCP vs. No MCP"

A lot of enterprise security discussions start from the wrong comparison. The practical comparison is not:

MCP server
nothing at all

The real comparison is usually:

a governed MCP server
business users pasting or uploading sensitive material into a general AI chat

When a user uploads a spreadsheet directly, the organization loses most of its control points:

no tracing of how fresh the data is
no typed or validated input contract
no server-side output shaping
no per-tool authorization boundary
no guaranteed row-level or field-level policy enforcement
no reliable audit trail of which business operation was performed

An MCP server restores those control points. Instead of exposing "the file," it can expose a tool such as quarterly_sales_summary, customer_health_report, or query_sales_cube, each with deliberately constrained inputs and outputs. That is the security benefit of MCP in enterprise settings: not security through obscurity, but governance by interface design.

Security Starts With Tool Design

The first article argued that tool design is MCP's UX discipline. It is also one of MCP's first security layers.

The business analyst has a direct security role at design time. They decide:

which business actions should exist as tools at all
which inputs are valid for those actions
which outputs the model actually needs
which fields should never be returned
which workflows should be packaged as prompts rather than left to ad hoc exploration

This matters because most enterprise data exposure happens long before a cryptographic control fails. It happens when the wrong interface is published.

If the analyst defines a tool as "run any SQL against finance," the server surface is already too broad. If they define it as "summarize revenue by region for a selected quarter" with an enum for region scope, a validated quarter field, and a fixed output shape, most of the risk is removed before runtime.

That is the same balance we have used throughout this series. The analyst contributes domain judgment. The server contributes deterministic enforcement. The model contributes language understanding. The user contributes the concrete business question. Security improves because the parties are not all doing the same job badly.

Typed Inputs Are A Security Boundary

One useful pattern, and one that Rust SDKs such as PMCP support well, is to make tool contracts both model-readable and runtime-enforced from the same Rust types.

In earlier articles, we used schemars constraints and deny_unknown_fields to improve tool usability. Those same patterns are security controls:

type safety rejects malformed inputs before business logic runs
range and length constraints reject obviously abusive input early
enums narrow the space of valid values
deny_unknown_fields prevents undeclared parameters from slipping through

That is not the whole security story, but it is an important first layer. A business analyst can define the valid business shape, and the server implementation can enforce it consistently.

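use schemars::JsonSchema;
use serde::Deserialize;

// BusinessUnit and GroupBy are types defined elsewhere in the server.
#[derive(Debug, Deserialize, JsonSchema)]
// serde rejects unknown fields at deserialization time; schemars reads the
// same attribute and reflects it in the generated schema.
#[serde(deny_unknown_fields)]
pub struct SalesSummaryInput {
    /// Fiscal quarter to summarize, for example, 2026-Q1
    #[schemars(length(min = 7, max = 7))]
    pub quarter: String,

    /// Business unit to analyze
    pub business_unit: BusinessUnit,

    /// Aggregation granularity
    pub group_by: GroupBy,

    /// Maximum number of rows to return
    #[schemars(range(min = 1, max = 100))]
    pub limit: Option<u32>,
}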

This is more than schema decoration. It is the business contract. The LLM sees the same structure during tool discovery that the server enforces at runtime.

This is also where the business analyst's role becomes concrete. They are not writing Rust, but they are deciding that quarters should follow a fiscal format, that group_by should be an enum rather than free text, and that a sales summary tool should never return an unbounded result set. The engineer implements that contract in code. A good MCP server then turns it into a discoverable and enforceable interface.
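The fiscal format and the row bound can also be checked explicitly before any business logic runs. The sketch below uses only the standard library; the names parse_quarter and effective_limit and the default of 20 rows are assumptions for illustration and are not part of PMCP or of the struct above.

```rust
/// A parsed fiscal quarter such as "2026-Q1".
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct Quarter {
    pub year: u16,
    pub number: u8,
}

/// Parses the exact form YYYY-Qn, with n in 1..=4. Anything else is rejected.
pub fn parse_quarter(raw: &str) -> Result<Quarter, String> {
    let bytes = raw.as_bytes();
    // The length check comes first so the slices below cannot go out of bounds.
    let well_formed = bytes.len() == 7
        && bytes[..4].iter().all(u8::is_ascii_digit)
        && &bytes[4..6] == b"-Q"
        && (b'1'..=b'4').contains(&bytes[6]);
    if !well_formed {
        return Err(format!("invalid quarter {raw:?}, expected YYYY-Qn"));
    }
    let year = raw[..4].parse::<u16>().map_err(|e| e.to_string())?;
    Ok(Quarter { year, number: bytes[6] - b'0' })
}

/// Applies the row bound: an assumed default of 20, allowed range 1..=100.
pub fn effective_limit(requested: Option<u32>) -> Result<u32, String> {
    match requested {
        None => Ok(20),
        Some(n) if (1..=100).contains(&n) => Ok(n),
        Some(n) => Err(format!("limit {n} is outside 1..=100")),
    }
}
```

A value such as "2026-Q1; DROP TABLE sales" fails the length check and never reaches the data layer, which is the point of treating the contract as a boundary rather than documentation.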

Output Boundaries Matter More Than Most Teams Think

Enterprises often focus on who may call a tool and underinvest in what the tool can return. For AI workloads, output boundaries are just as important.

This is one reason MCP is safer than directly uploading documents. The server can expose a shaped result rather than raw records. It can return aggregates instead of rows. It can omit PII fields entirely. It can redact or mask values that are useful for joins or filtering, but should never be emitted back into the model context.

That distinction matters in the long-tail cases covered by the code mode article. A database query may legitimately need sensitive fields for internal computation or joins, while still forbidding those fields from appearing in the result.

For example, a code-mode policy can allow a query to join the customer and support tables on the server side, while blocking sensitive output fields such as ssn, salary, or even raw email addresses from appearing in the returned payload. The sensitive field can participate in the computation. It does not have to be exposed to the model.

That is a much better enterprise pattern than "upload the spreadsheet and ask the model to be careful."

The same principle applies outside code mode:

Use fixed output schemas for curated tools. Controlling the shape of the data returned by the data system optimizes AI flows for security, privacy, and cost (fewer tokens).
Prefer aggregates over row dumps. Don't trust the LLM to accurately calculate sums or other statistical measures; MCP tools are much better suited to such symbolic computation.
Keep tool responses aligned to the actual business question for consistency, accuracy, and security.
Block fields that are unnecessary for the answer.

Security is not only about stopping bad requests. It is also about making oversharing hard by design. As a side effect, this usually saves tokens and processing time too.
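As a rough illustration of "aggregates instead of rows," the sketch below sums sales per region on the server and returns a fixed shape that has no customer fields at all. The SaleRow and RegionTotal types and their field names are hypothetical; a real server would map them from its own backend.

```rust
use std::collections::BTreeMap;

/// Hypothetical raw row as the backend might return it.
pub struct SaleRow {
    pub region: String,
    pub customer_email: String, // sensitive: available server-side, never returned
    pub amount_cents: i64,
}

/// The only shape the tool returns: one total per region, no customer fields.
#[derive(Debug, PartialEq, Eq)]
pub struct RegionTotal {
    pub region: String,
    pub total_cents: i64,
    pub order_count: u32,
}

/// Aggregates on the server so the model receives totals, not rows.
/// Results are ordered by region name because BTreeMap iterates in key order.
pub fn summarize_by_region(rows: &[SaleRow]) -> Vec<RegionTotal> {
    let mut totals: BTreeMap<&str, (i64, u32)> = BTreeMap::new();
    for row in rows {
        let entry = totals.entry(row.region.as_str()).or_insert((0, 0));
        entry.0 += row.amount_cents;
        entry.1 += 1;
    }
    totals
        .into_iter()
        .map(|(region, (total_cents, order_count))| RegionTotal {
            region: region.to_string(),
            total_cents,
            order_count,
        })
        .collect()
}
```

Because the output type simply has no email field, oversharing would require changing the contract, not just a careless query.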

OAuth Matters Because The Server Must Act On Behalf Of The User

This is the most important point about authentication when building agentic AI workflows.

An enterprise MCP server should not sit in front of a database or application using a single shared application API key, a single database username/password, or a single generic service identity that grants every end user the same access. That recreates the oldest enterprise security mistake: every user gets the power of the integration account.

The correct pattern is that the MCP client and MCP server work on behalf of the authenticated user.

That is why OAuth 2.0 and OIDC matter so much in a serious MCP security model:

The MCP client handles the OAuth flow and token refresh, so users log in once and the client securely manages the access and refresh tokens.
The MCP server validates the access token on every request and extracts user identity, tenant context, groups, and scopes.
Downstream systems enforce the user's own permissions whenever possible, using the user's access token passed through by the server.

In the strongest design, that delegated identity continues past the MCP boundary. If the downstream API or data platform supports OAuth, the MCP server should forward the user's token rather than substitute a broad application credential. If the backend uses a different enforcement model, the server should still propagate token-derived user and tenant context into that system so row-level, field-level, or tenant-level policies execute for the real user. The point is to preserve user identity end-to-end, not collapse it into a shared super-account in the middle.

The practical benefit is huge. When a company already uses Entra ID, Okta, Cognito, Auth0, or another identity provider, the MCP server can integrate with existing SSO, group membership, access reviews, and offboarding. When IT disables an employee account, MCP access goes with it as soon as the user's outstanding tokens expire or are revoked.

That is categorically better than a static API key model.

In a well-factored implementation, the server code can stay provider-agnostic and work with an AuthContext rather than baking identity-provider details into every tool:

fn handle_request(auth: &AuthContext) -> Result<(), Error> {
    auth.require_auth()?;
    auth.require_scope("read:sales")?;

    let user_id = auth.user_id();
    let tenant_id = auth.tenant_id();

    // Pass identity downstream so backend policies act on behalf of the user
    tracing::info!(%user_id, ?tenant_id, "authorized sales request");
    Ok(())
}

The design principle underneath this is simple: the server should validate identity, not replace identity.
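One consequence of that principle is that tenant and user context should come from the validated token, never from arguments the model supplies. The sketch below is a standard-library illustration of that idea; VerifiedIdentity, SalesArgs, ScopedQuery, and the sales table are hypothetical stand-ins and not the AuthContext type of any SDK.

```rust
/// Hypothetical identity extracted from an already validated access token.
pub struct VerifiedIdentity {
    pub user_id: String,
    pub tenant_id: String,
}

/// Arguments the model is allowed to supply. Note there is no tenant field.
pub struct SalesArgs {
    pub region: String,
}

/// A parameterized query plus its bound values, ready for a database driver.
#[derive(Debug, PartialEq, Eq)]
pub struct ScopedQuery {
    pub sql: &'static str,
    pub params: Vec<String>,
}

/// Tenant comes from the token, region from the tool call; both values are
/// bound as parameters rather than concatenated into the SQL text.
pub fn scoped_sales_query(identity: &VerifiedIdentity, args: &SalesArgs) -> ScopedQuery {
    ScopedQuery {
        sql: "SELECT region, SUM(amount) FROM sales \
              WHERE tenant_id = $1 AND region = $2 GROUP BY region",
        params: vec![identity.tenant_id.clone(), args.region.clone()],
    }
}
```

Since the tool input has no tenant field, a prompt-injected request cannot ask for another tenant's data through this path; the most it can do is supply an unusual region string, which stays a bound value.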

Token Validation Is Not Optional Plumbing

Because MCP servers follow the web server model, token validation must be treated as a first-class responsibility of the server.

For JWT access tokens, that means validating at least:

signature
algorithm
expiration
not-before time
issuer
audience
required scopes

The principle is straightforward. The MCP server should not invent a custom auth flow. It should validate the tokens it receives and return a stable authenticated context for the rest of the codebase.
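The claim checks in that list can be sketched without any particular JWT library. The example below assumes signature and algorithm verification has already been done by a proper library against the identity provider's keys; the Claims struct, the ClaimError enum, and check_claims are hypothetical names, and the sketch applies no clock-skew leeway.

```rust
/// Claims from a token whose signature and algorithm were already verified.
pub struct Claims {
    pub iss: String,
    pub aud: Vec<String>,
    pub exp: u64, // seconds since the Unix epoch
    pub nbf: Option<u64>,
    pub scopes: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ClaimError {
    Expired,
    NotYetValid,
    WrongIssuer,
    WrongAudience,
    MissingScope,
}

/// Checks time, issuer, audience, and scope claims. `now` is passed in so the
/// check is deterministic and easy to test.
pub fn check_claims(
    claims: &Claims,
    now: u64,
    expected_issuer: &str,
    expected_audience: &str,
    required_scope: &str,
) -> Result<(), ClaimError> {
    if now >= claims.exp {
        return Err(ClaimError::Expired);
    }
    if claims.nbf.map_or(false, |nbf| now < nbf) {
        return Err(ClaimError::NotYetValid);
    }
    if claims.iss != expected_issuer {
        return Err(ClaimError::WrongIssuer);
    }
    if !claims.aud.iter().any(|a| a == expected_audience) {
        return Err(ClaimError::WrongAudience);
    }
    if !claims.scopes.iter().any(|s| s == required_scope) {
        return Err(ClaimError::MissingScope);
    }
    Ok(())
}
```

The audience check is the one teams most often skip; without it, a token issued for a different API in the same identity provider would be accepted here.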

This is another place where mature SDKs help enterprise teams. A provider-agnostic authentication model lets you switch between Cognito, Entra, Google, Okta, and Auth0 by changing configuration rather than rewriting tool logic. That keeps authentication a deployment concern rather than scattering auth conditionals throughout the business code.

Security Happens In Layers

Many teams miss an important point: you do not need to implement every security rule inside the MCP server itself. You need to place each rule in the correct layer.

For enterprise MCP, a clean mental model is:

Layer 1: Server access

Validate the token. Reject invalid, expired, or misissued requests.

Layer 2: Tool authorization

Check whether this user may call this tool or workflow at all.

Layer 3: Data-level security in the backend

Let the database, API gateway, GraphQL layer, or data platform enforce row-level security, field-level authorization, column masking, or tenant isolation.

That is the design you want because the MCP server is the interface tier, not the entire security platform. If your warehouse already supports column masking, or your database already supports row-level security, the MCP server should pass through the user identity and let the backend do what it is already good at.

This is also the answer to a common enterprise objection: "Do we need to reimplement all our data security logic in the AI layer?" No. The AI-facing layer should validate access, constrain the interface, and carry the user identity through. The data systems should continue enforcing the data rules closest to the data.
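The middle layer can be as small as a deny-by-default lookup. The sketch below is one possible shape, assuming a static mapping from tool name to required scope; the read:customers scope and the Decision enum are hypothetical, and a real deployment might load this mapping from policy configuration instead.

```rust
/// Hypothetical mapping from tool name to the scope a caller must hold.
/// Tools that are not listed are denied, so new tools start closed.
pub fn required_scope(tool: &str) -> Option<&'static str> {
    match tool {
        "quarterly_sales_summary" => Some("read:sales"),
        "customer_health_report" => Some("read:customers"),
        _ => None,
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    DenyUnknownTool,
    DenyMissingScope(&'static str),
}

/// Layer 2 check: runs after token validation (layer 1) and before the
/// backend applies its own row and column policies (layer 3).
pub fn authorize_tool(tool: &str, granted_scopes: &[&str]) -> Decision {
    match required_scope(tool) {
        None => Decision::DenyUnknownTool,
        Some(scope) if granted_scopes.contains(&scope) => Decision::Allow,
        Some(scope) => Decision::DenyMissingScope(scope),
    }
}
```

Nothing here knows about rows, columns, or tenants. That is deliberate: those rules stay in the backend, and this layer only answers whether the call may happen at all.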

Security Testing Is Another Layer, Not A Final Checkbox

Security controls designed into the server still need to be tested as part of the release process.

That is where the testing article fits directly into the security story. We argued there that MCP production testing has five gates: smoke, conformance, scenarios, load, and pentest. Security is not separate from that stack. It is one of the gates, and it also cuts through the others.

For MCP workloads, pentesting matters because the client is programmable and partially adversarial by default. You are not only defending against a careless user. You are also defending against:

prompt-injection-shaped inputs
malformed parameters
tool misuse across role boundaries
schema edge cases
attempts to exfiltrate blocked fields
tenant boundary violations

So the enterprise posture should be layered:

design-time narrowing of tools and workflows
typed validation at the interface
OAuth-based user identity
backend-enforced data permissions
policy and approval for powerful surfaces
penetration testing before production rollout and after meaningful changes

That is how you turn "secure by design" from a slogan into a release discipline.
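Two of the attack classes above, blocked-field exfiltration and tenant boundary violations, lend themselves to simple automated checks on tool responses. The helper below is a sketch of such a check, assuming a test harness that has already called the tool and parsed the response into rows; ResponseRow and find_leaks are hypothetical names.

```rust
/// Hypothetical row shape as it appears in a tool response under test.
pub struct ResponseRow {
    pub tenant_id: String,
    pub fields: Vec<(String, String)>,
}

/// Returns the problems found in a response: rows from another tenant and
/// blocked field names that made it into the payload.
pub fn find_leaks(rows: &[ResponseRow], caller_tenant: &str, blocked: &[&str]) -> Vec<String> {
    let mut problems = Vec::new();
    for (index, row) in rows.iter().enumerate() {
        if row.tenant_id != caller_tenant {
            problems.push(format!("row {index}: tenant {} leaked", row.tenant_id));
        }
        for (name, _value) in &row.fields {
            if blocked.contains(&name.as_str()) {
                problems.push(format!("row {index}: blocked field {name} returned"));
            }
        }
    }
    problems
}
```

A release gate could run hostile scenarios against a staging server and fail the build whenever find_leaks returns a non-empty list.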

Do Not Throw Away Web Security Lessons

One of the easiest mistakes in AI infrastructure is to treat it as so new that older security practices no longer apply. That is usually how teams recreate old failures with new tooling.

MCP servers should start from the hard-learned lessons of web and API security:

never trust client input, even when the client is an LLM
authenticate every request
authorize every operation
prefer least-privilege credentials and delegated user identity
narrow the interface surface instead of wrapping whole backends
validate and shape outputs, not only inputs
log and audit meaningful actions
assume attackers will probe every exposed edge
test for injection, exfiltration, broken access control, and tenant leaks before production

MCP changes the interface, but it does not repeal these rules. If anything, it makes some of them more important, because the caller is now a probabilistic system that can be induced to misuse the interface in unexpected ways.

Implementation Note: Rust Helps

This article is about MCP best practices, not about any single SDK. Still, language and tooling choices do matter. Rust is a strong fit for this layer because memory safety, strong typing, and explicit contracts are useful properties for security-sensitive interface services. SDKs such as PMCP are valuable when they make those practices easier to apply consistently, but the architectural lesson comes first: narrow interfaces, user-scoped identity, layered authorization, and governed outputs.

The Core Security Argument

The core argument of this article is not that MCP automatically makes AI safe. Poorly designed MCP servers can absolutely be insecure.

The argument is that a well-designed MCP server is a more secure enterprise AI pattern than allowing users to upload raw business documents to general AI chat interfaces.

Why?

It can limit the interface to approved business operations.
It can execute symbolic computation on the server side without exposing internal data to the outside world.
It can validate inputs before execution.
It can shape outputs before data reaches the model.
It can block sensitive fields from being returned.
It can act on behalf of the authenticated user instead of a shared super-account.
It can preserve existing backend security controls.
It can be tested, audited, and governed like any other production interface.

The Security Stack In One View

To wrap the article up, here is the layered model worth carrying forward:

Business design layer

The business analyst defines the approved operations, shared business definitions, and safe aggregation rules so users do not reinvent them in ad hoc prompts.

Interface contract layer

Tools expose typed inputs, bounded schemas, constrained outputs, and narrow outcome-oriented surfaces.

Authentication layer

The MCP client and server work on behalf of the authenticated user via OAuth and validated tokens, rather than shared integration credentials.

Authorization layer

Each tool, workflow, or code-mode action is checked against the user's scopes, roles, groups, and policy rules.

Data protection layer

Backend systems enforce row-level security, field-level permissions, masking, tenant isolation, and other controls closest to the data.

Governance layer

Powerful surfaces, such as code mode, add approval, policy administration, blocked fields, execution limits, and audit trails.

Validation and testing layer

Smoke tests, conformance tests, scenario tests, load tests, and penetration tests verify that the security design actually holds in production.

If enterprise teams start MCP with those seven layers, they will begin with the lessons the industry has already paid to learn, rather than paying for them again through avoidable AI-era security failures.

DEV Community