DEV Community

viola for AWS Community Builders

How to secure MCP tools on AWS for AI agents with authentication, authorization, and least privilege

Model Context Protocol (MCP) makes it easier for AI agents to reach your existing backend capabilities: it gives them a standard way to call your services and use tools such as Lambda functions. That convenience comes with a real trade-off, because it demands a much stronger access model around those interactions. Once an agent can reach tools, you have to ask who is calling what, on whose behalf, with which scope, through which boundary, and, most importantly, how to stop the whole thing from becoming an overprivileged mess that ruins the experience for the real humans using your product.

The issue is real, and AWS is already building for it through Bedrock AgentCore Gateway and AgentCore Identity, while the MCP roadmap is moving in the same direction with enterprise-managed auth, audit trails, gateway patterns, and finer-grained least-privilege scopes.

But authentication is no longer the main event, even though a lot of teams still treat it like it is. Authentication answers who got in, authorization answers what they can do, and least privilege answers how much damage is possible when things go sideways. It is important to think in layers here: in MCP-based agent systems you usually need all three at multiple layers; inbound authentication to the agent or gateway, outbound authentication from the gateway to the tool, and policy decisions on whether a given tool call should be allowed at all. AWS's current guidance reflects that layered split, and its product model is built around those layers in that exact order. For instance, AgentCore Gateway supports inbound OAuth-based authorization for incoming tool calls and multiple outbound authorization modes depending on the target type, including IAM-based auth with SigV4, OAuth client credentials, authorization code grants, and API keys. We will dig deeper into this later in the article.

So why does MCP change the way we should think about security?

Mostly because it gives AI agents a standard way to reach tools, services, and execution paths that sit outside the model itself. Once that access exists, the problem stops being about connectivity and becomes an access-control problem: you need to know who is calling which tool, under what identity, with which scope, and across which boundary.

That gets messy quickly when the same system has to support both user-delegated actions and machine-to-machine actions. Without a tight identity model, agents can end up with broad standing access, weak separation between human and non-human callers, and very little control over how requests move from one system to another.

This is where the AWS model starts to make sense. Bedrock AgentCore Gateway gives you a controlled entry point between agents and tools, while AgentCore Identity adds a dedicated layer for identity and credential handling for agents and automated workloads. The direction of the MCP ecosystem also reflects that reality, with current roadmap priorities including enterprise-managed auth, audit trails, gateway patterns, and more fine-grained least-privilege scopes.

The security model I'd use on AWS if I were setting it up today

The cleanest way to secure MCP tool access on AWS is to stop treating it as one big authentication problem. In reality, it breaks down into four separate control layers: inbound authentication, outbound authentication, authorization, and infrastructure-level least privilege.

That split matters because an MCP tool call is not a single trust decision. First, you need to control who is allowed to reach the agent-facing layer. Then you need a safe way for the gateway or runtime to authenticate downstream to the target tool. After that, you still need to decide whether the exact action should be allowed in context. Finally, the underlying AWS roles, scopes, and permissions need to stay narrow enough that a mistake in one layer does not turn into broad access everywhere else.

This is the structure I find easiest to reason about while still keeping it close to how these systems behave in production on AWS:

First, inbound authentication for the caller.
Second, outbound authentication for downstream tool access.
Third, authorization for the action itself.
Fourth, least privilege for the infrastructure underneath it all.

Breaking it down like that gives you a much cleaner outline of the problem before even starting the implementation.

Number one: Inbound authentication to the agent-facing layer

The first control point is the agent-facing layer itself. Before an agent can reach a tool, you need to decide who is allowed to invoke the gateway or runtime in the first place.

On AWS, AgentCore Gateway follows the MCP authorization model for inbound requests and can validate incoming calls against an OAuth provider such as Amazon Cognito, Okta, Auth0, or another compatible provider. That gives you a clear front-door identity check before any downstream tool access happens. AWS also supports different inbound flows depending on the caller, including authorization code flow for user-delegated access and client credentials for service-to-service access, with the ability to restrict access by approved client IDs and audiences.
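To make the front-door check concrete, here is a minimal sketch of inbound token validation against a Cognito user pool, plus the client allowlist described above. The region, pool ID, and client IDs are placeholders, and signature verification assumes the PyJWT library (`pip install "PyJWT[crypto]"`); this is an illustration of the pattern, not AgentCore Gateway's internal implementation.

```python
# Sketch: validate an inbound bearer token before any tool access happens.
# REGION, USER_POOL_ID, and ALLOWED_CLIENT_IDS are placeholders.
REGION = "us-east-1"
USER_POOL_ID = "us-east-1_EXAMPLE"
ISSUER = f"https://cognito-idp.{REGION}.amazonaws.com/{USER_POOL_ID}"
JWKS_URL = f"{ISSUER}/.well-known/jwks.json"
ALLOWED_CLIENT_IDS = {"agent-app-client", "service-m2m-client"}  # approved app clients

def check_client_allowlist(claims: dict) -> None:
    """Reject tokens from app clients that were never approved for this gateway."""
    client_id = claims.get("client_id") or claims.get("aud")
    if client_id not in ALLOWED_CLIENT_IDS:
        raise PermissionError(f"client {client_id!r} is not approved for this gateway")

def validate_inbound_token(token: str) -> dict:
    """Verify signature, issuer, and expiry, then apply the client allowlist."""
    import jwt  # PyJWT; imported lazily so the allowlist logic stays dependency-free

    signing_key = jwt.PyJWKClient(JWKS_URL).get_signing_key_from_jwt(token)
    claims = jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        issuer=ISSUER,
        # Cognito access tokens carry the app client in "client_id", not "aud"
        options={"verify_aud": False},
    )
    check_client_allowlist(claims)
    return claims
```

The allowlist check is deliberately a separate function: even a cryptographically valid token from an unapproved client should never reach a tool.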

Distinguishing between types of calls matters for a very good reason: each kind of MCP call represents a different kind of trust. Some calls come from a signed-in user acting through an application, while others come from a background service, automated workload, or non-human agent. In production this gets complicated enough that fixing it later can cost more than setting it up correctly from the start. That is why these different types of calls must never be treated as equivalent: they do not carry the same identity context, and they carry a completely different level of user intent.

This is where problems start to snowball, because when every inbound caller is simply labelled as authenticated, the distinction between a human user session and a machine credential vanishes. A safer model keeps the concerns separated from the beginning. For example, all user-delegated access goes through the authorization code flow (with PKCE), while all machine-to-machine access uses client credentials.
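One way to keep that separation from eroding is to make it explicit in code rather than implicit in whatever token happens to arrive. A small sketch, with illustrative caller categories and standard OAuth 2.0 grant names:

```python
# Sketch: map each caller category to the OAuth flow it should be using,
# so human and machine access paths can never silently blur together.

def grant_for_caller(caller_type: str) -> dict:
    """Pick the OAuth grant for a caller category (categories are illustrative)."""
    if caller_type == "user":
        # User-delegated access: authorization code, with PKCE
        return {"grant_type": "authorization_code", "pkce": True}
    if caller_type in ("service", "agent"):
        # Machine-to-machine access: client credentials, no user context implied
        return {"grant_type": "client_credentials", "pkce": False}
    raise ValueError(f"unknown caller type: {caller_type!r}")
```

The point of the hard `ValueError` is that an unknown caller category should fail loudly instead of falling through to some default credential.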

Number two: Outbound authentication from the gateway to the tool

This is usually the point where things start getting messy. A team does a decent job on inbound authentication, then quietly lets the gateway or agent call downstream tools with whatever credentials happen to work. That might get the system running, but it is not much of a security model.

Outbound authentication needs to be treated as its own control layer. Once the gateway, runtime, or agent starts talking to tools, APIs, or MCP servers, it still needs a clear and deliberate way to prove its identity to those downstream targets.

AWS separates that part properly. Depending on the target, AgentCore Gateway can use IAM-based authorization with a service role and SigV4, OAuth flows, or API keys. For MCP server targets, OAuth client credentials are supported as well. That matters because not every downstream target should be trusted in the same way, and not every tool should accept the same type of credential.
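It helps to see what "IAM-based authorization with SigV4" means mechanically: each request is signed with a key derived from the role's secret and scoped to a date, region, and service. In practice boto3/botocore's signers do all of this for you; this stdlib-only sketch of the documented SigV4 key derivation just shows why a signed request is inherently narrower than a static bearer credential.

```python
# Sketch: AWS SigV4 signing-key derivation (normally handled by botocore).
# The derived key is only valid for one date, region, and service, which is
# part of why SigV4 beats a long-lived static secret for outbound auth.
import hashlib
import hmac

def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Chain: "AWS4"+secret -> date -> region -> service -> "aws4_request"."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```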

This is also where AgentCore Identity starts to become genuinely useful. Instead of scattering tokens, secrets, and auth logic across runtimes, tools, and bits of glue code, you can centralize that machinery in a service designed for agent identity and credential handling. That is a much cleaner setup, especially once the number of tools starts growing.

The main thing I would avoid here is treating downstream access as an implementation detail. It is not. If the gateway or runtime can call tools, then the way it authenticates to those tools needs to be deliberate, narrow, and easy to reason about.

Number three: Authorization inside the application path

Being authenticated does not automatically mean being allowed to use every tool or perform every action. That sounds obvious, but this is exactly where systems start to drift. A token is valid, the request gets through, and before long the system is treating "known caller" as if it means "allowed to do whatever comes next."

That is why I would treat authorization as its own layer. Once the caller is authenticated and the gateway can reach the tool, the system still needs to decide whether the specific action should be allowed in that context.

Cognito gives you a good starting point for broad API permissions through resource servers and custom scopes. That works well when you want to express coarse-grained capabilities such as billing.read, orders.update, or reports.export. It gives you a cleaner way to separate what a token can generally do from who the caller is.
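A scope check like that is simple enough to sketch directly. The `scope` claim is the space-delimited string OAuth access tokens carry; note that with Cognito resource servers, custom scopes come prefixed with the resource server identifier (for example `api/billing.read`). Scope names here are illustrative.

```python
# Sketch: coarse-grained capability checks using the OAuth "scope" claim.

def has_scope(claims: dict, required: str) -> bool:
    """Check whether the token's scope claim grants the required capability."""
    granted = set(claims.get("scope", "").split())
    return required in granted

def require_scope(claims: dict, required: str) -> None:
    """Raise instead of returning False, so callers cannot ignore the result."""
    if not has_scope(claims, required):
        raise PermissionError(f"token lacks required scope {required!r}")
```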

But scopes only get you so far. The moment the decision depends on tenant membership, resource ownership, role, environment, or some other piece of context, you are no longer dealing with simple scope checks. You are in fine-grained authorization territory.

That is where something like Amazon Verified Permissions starts to make more sense. Instead of burying authorization logic across handlers, services, and bits of application code, you can move those decisions into a more explicit policy layer. That tends to be easier to reason about and much easier to change later without creating a mess.
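As a rough sketch of what that looks like, here is a fine-grained check delegated to Verified Permissions. The entity types, action IDs, and policy store ID are illustrative Cedar-style names I made up for the example; the request shape follows the `verifiedpermissions` `IsAuthorized` API, and the actual call is left to an injected boto3 client.

```python
# Sketch: push a context-heavy decision into a Verified Permissions policy store
# instead of burying it in handler code. All entity/action names are placeholders.

def build_authz_request(policy_store_id: str, user_id: str,
                        action_id: str, resource_id: str) -> dict:
    """Build the IsAuthorized request for a single tool action."""
    return {
        "policyStoreId": policy_store_id,
        "principal": {"entityType": "App::User", "entityId": user_id},
        "action": {"actionType": "App::Action", "actionId": action_id},
        "resource": {"entityType": "App::Report", "entityId": resource_id},
    }

def is_action_allowed(avp_client, request: dict) -> bool:
    """avp_client would be boto3.client('verifiedpermissions') in real use."""
    return avp_client.is_authorized(**request)["decision"] == "ALLOW"
```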

The split I would use reflects that:

Authentication establishes identity, OAuth scopes establish broad capabilities, and policy checks decide whether the exact action should be allowed in context. That separation is much healthier than trying to force every decision into token validation or scope checks alone, because not everything can be solved with a token.

Number four: Least privilege for the infrastructure layer

Even if the identity and authorization layers look good on paper, the system is still weak if the underlying roles and permissions are too broad.

This is the part teams often underestimate. They spend time on tokens, OAuth flows, and gateway design, then quietly give the runtime or supporting services far more access than they actually need. At that point, the front door may look secure, but the blast radius behind it is still too large.

Least privilege matters here because MCP-based systems usually involve several moving parts: the gateway, runtimes, identity services, tokens, and the downstream AWS APIs or tools those components need to reach. If one of those layers is over-scoped, the whole system becomes easier to abuse.

AWS's own AgentCore examples point in a much healthier direction. In the FinOps agent architecture, the gateway uses IAM authentication to call runtimes, AgentCore Identity handles the OAuth credential lifecycle, and Cognito client credentials are used so the gateway can obtain tokens for MCP runtimes. The runtime roles themselves are then scoped to the AWS APIs they actually need, such as billing or pricing. That is a much better model than handing the entire agent stack one broad role and hoping the application layer keeps everything under control.

In practice, least privilege here means a few simple things:

the gateway role should only have permission to invoke the specific targets it actually needs
MCP runtimes should only have access to the AWS APIs required for their own domain
tokens and scopes should stay as narrow as they can while still allowing the system to work
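The first of those points can be sketched as an actual policy document, along with a tiny check that catches wildcard drift. The account ID, runtime ARN, and action name are placeholders, not the exact AgentCore action set; scope them to your real targets.

```python
# Sketch: a narrowly scoped policy for the gateway's service role, plus a
# small least-privilege smell test. ARNs and action names are placeholders.

GATEWAY_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeApprovedRuntimesOnly",
            "Effect": "Allow",
            "Action": ["bedrock-agentcore:InvokeAgentRuntime"],  # placeholder action
            "Resource": [
                "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/finops-runtime"
            ],
        }
    ],
}

def find_wildcard_statements(policy: dict) -> list:
    """Return the Sids of statements that allow '*' actions or resources."""
    hits = []
    for stmt in policy.get("Statement", []):
        if "*" in stmt.get("Action", []) or "*" in stmt.get("Resource", []):
            hits.append(stmt.get("Sid", "<no Sid>"))
    return hits
```

A check like `find_wildcard_statements` is the kind of thing worth running in CI, so a "temporary" broad grant cannot quietly ship.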

The same applies to flow selection. If you use the wrong OAuth flow for the wrong kind of caller, the access model gets messy very quickly. Keeping human and non-human access paths separate from the start makes it much easier to keep scopes, roles, and permissions under control later.

Where Cognito actually fits in this design

Cognito is useful in this design, but it is not the whole story.

It fits well at the front of the system when you want an OAuth provider for inbound gateway authorization, especially if you want familiar OAuth flows, JWT validation, app-client control, and support for machine-to-machine access through client credentials. That makes it a practical option when the gateway, agent, or MCP runtime needs token-based identity rather than a human session.

Where people get confused is assuming Cognito solves the whole access model on its own. It does not. It can help establish identity and broad access boundaries, but it does not automatically solve downstream tool authentication, cross-system credential handling, or fine-grained authorization decisions.

That is why I see Cognito as one building block in the overall design, not the design itself. It works well for inbound identity and token issuance, but once you move into downstream credentials, agent-to-tool access, or context-heavy policy decisions, you need other layers around it. That is where services like AgentCore Identity and a proper authorization layer start to matter much more.

Don't ignore private connectivity

If your gateway or tool layer is reachable over the public internet by default, you are increasing exposure before you even get to identity or policy. That does not automatically make the design wrong, but it does mean you need to be much more deliberate about where your control paths actually live.

Private connectivity is not the whole security model, but it should still be part of it. Once you have agents, gateways, policy services, and downstream tools talking to each other, it makes sense to keep the more sensitive service-to-service paths inside private network boundaries where you can. That reduces unnecessary exposure and gives you a cleaner production shape around the parts of the system that matter most.

It is also worth being clear about what private connectivity does and does not solve. It does not replace authentication, authorization, or least privilege. It does not make a bad access model good. What it does do is reduce the attack surface and make those identity and policy controls easier to enforce within tighter boundaries.

So I would treat private connectivity as a supporting layer in the design: not the main event, but definitely not something to bolt on at the very end either.

A practical reference architecture

If I were putting this together for a real team, the shape would be fairly straightforward.

A user signs into the application through Cognito or another OIDC provider, and the application calls the agent-facing layer with a token that matches that user journey. AgentCore Gateway then validates the inbound token and checks that the client and audience are actually allowed. From there, downstream tool calls use the auth mode that makes sense for the target, whether that is IAM SigV4 for AWS-native targets or OAuth client credentials for MCP runtimes and APIs.

AgentCore Identity handles the OAuth client setup and token retrieval so that each component does not have to manage its own secrets and token logic. On top of that, the application or policy layer still needs to decide whether a sensitive action should be allowed, ideally using narrow scopes and more fine-grained rules where the context matters. Underneath all of that, the IAM roles for the gateway and runtimes should stay tightly scoped to their real responsibilities.
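The overall shape can be sketched as one small orchestration function. The concrete checks (JWT validation, scope and policy decisions, SigV4-signed invocation) are injected as callables so each layer stays replaceable; every name here is illustrative, not a real AWS API.

```python
# Sketch: how the four layers compose on a single tool call.

def handle_tool_call(token, action, resource, *, authenticate, authorize, invoke):
    claims = authenticate(token)                 # layer 1: inbound authentication
    if not authorize(claims, action, resource):  # layer 3: per-action authorization
        raise PermissionError(f"{action} on {resource} denied")
    # layers 2 and 4: the invoker authenticates downstream under a tightly
    # scoped role, so even an allowed call cannot reach beyond its own targets
    return invoke(action, resource)
```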

That is much closer to a production-grade setup than the very loose model where an agent has a token and can just start calling things.

The mistakes I would actively avoid

The first mistake is using one broad machine credential for every tool call. Inbound access and outbound access are not the same thing, and different targets should not automatically inherit the same trust model. Once one credential starts working for everything, the system gets convenient very quickly and safe very slowly.

The second mistake is mixing user-delegated access with autonomous machine access without being explicit about the difference. Authorization code and client credentials exist for different reasons. If you blur those paths together, it becomes much harder to tell who actually authorized the action and what kind of trust you are relying on.

The third mistake is assuming JWT validation equals authorization. It does not. Token validation tells you the token is valid. It does not tell you whether the caller should be allowed to perform a specific action on a specific tool or resource in the current context. That gap is where a lot of bad access decisions get hidden.

The fourth mistake is relying on static secrets when a managed OAuth or IAM-based pattern is available. Static credentials tend to spread, linger, and get reused in places they should not. The more agent and tool integrations you add, the worse that gets.

The fifth mistake is treating networking as irrelevant. Identity and authorization matter more, but that does not mean network boundaries stop mattering. If you can keep sensitive control paths private, you should. It is a sensible extra layer, especially once you have multiple services making sensitive calls to each other.

The common thread across all of these mistakes is pretty simple: they make the system easier to build in the short term, but much harder to trust later.

The dangerous version of an AI agent is not the one that can call tools. It is the one that can call tools with vague identity, broad standing access, and no policy boundary.

Final take

Once AI agents can reach tools, the security model has to get more serious. At that point, this stops being just an integration problem and becomes an access-control problem across identity, credentials, authorization, and least privilege.

AWS already gives you the right building blocks to design for that more deliberately: Cognito for inbound identity, AgentCore Gateway for controlled MCP tool access, AgentCore Identity for agent credential handling, IAM for scoped AWS permissions, and Verified Permissions when broader token scopes are no longer enough.

The main thing is not to collapse all of that into one vague idea of "auth." Authenticate every boundary, authorize every sensitive action, and keep every identity and permission narrower than feels convenient. That is the difference between an agent system you can trust and one that only looks tidy until something goes wrong.

If you're working on securing agent tool access on AWS, I'd be curious to hear how you're handling inbound auth, downstream credentials, and policy checks in practice.

Top comments (2)

Ali Muwwakkil

A surprising observation: most security issues with MCP tools on AWS stem from overly complex IAM policies, not the tools themselves. Simplifying access controls by focusing on least privilege can drastically reduce risk. In our experience with enterprise teams, using AWS IAM Access Analyzer early in the process helps identify and refine permissions more effectively. This ensures AI agents access only what's necessary, minimizing potential vulnerabilities. - Ali Muwwakkil (ali-muwwakkil on LinkedIn)

viola AWS Community Builders

Thanks for your comment, and yes, you are correct.