Kazuya

Posted on Dec 4

AWS re:Invent 2025 - Advanced AI Security: Architecting Defense-in-Depth for AI Workloads (SEC410)

🦄 Making great presentations more accessible.
This project aims to enhances multilingual accessibility and discoverability while maintaining the integrity of original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Advanced AI Security: Architecting Defense-in-Depth for AI Workloads (SEC410)

In this video, Riggs Goodman and Jason Garman present a 400-level session on securing AI workloads using AWS native capabilities. They explain that LLMs perform matrix multiplication without implementing data authorization or identity, emphasizing the critical principle: "implement security outside the model." The session covers four phases: foundational LLM security, data sources (including RAG with metadata filtering and memory namespaces), tools (featuring Model Context Protocol and OAuth integration with Agent Core Identity for two-legged and three-legged flows), and agents (comparing single-agent versus multi-agent workflows using Strands Agent SDK). They demonstrate security through actual API calls and code examples, showing how to use Amazon Bedrock Guardrails, handle permissions with vector databases, and implement deterministic controls like hooks for human-in-the-loop approval. The key takeaway: avoid combining sensitive data, untrusted data, and external access simultaneously, and always authorize data before it reaches the LLM.

; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Welcome to Advanced AI Security: Setting the Stage for a 400-Level Deep Dive

All right. How many people here is this their first re:Invent? Fifth re:Invent or five or over, ten or over? Any hints? No hands. Okay, so five and over. Welcome to day one of re:Invent. I hope you're enjoying yourself so far. This is your first session, so welcome to your first session. This is an advanced AI security session. This is a 400-level session where we're going to go deep into how to secure AI workloads using AWS native capabilities, open source frameworks, and other things like that. My name is Riggs Goodman. I'm a Principal SA for AI Security at AWS focused on AI security. With me is Jason. Hey everyone, I'm Jason Garman. I'm also a Principal Security Solutions Architect here at AWS focused on our AWS industries customers. So I'm happy to be here today and excited to help you out.

All right. The way we have this presentation structured, we break it up into different phases. The reason we call it phases is that depending on where you are building different AI workloads, we'll determine exactly what type of security you're looking at, whether you're doing things with tools, whether you're doing things with agents, data sources, or other things like that. Throughout this presentation, we'll have these little thought bubbles at the bottom. They might be thought-provoking questions or just additional comments about the presentation. We also have QR codes. QR codes will pop up while I'm still speaking, so it's not just like a click on it and then it goes. We will have QR codes throughout, so if you're looking for documentation, there are a couple of blog posts that we mention here, but just additional information to get the knowledge that you need for what's on the slide.

Again, this is a 400-level session, and the way we're trying to break apart those black boxes is to show it through API calls and also show it through code. People hear LLMs, people hear agents, people hear tools, but a lot of times you just understand that an agent does something, but you don't understand how it works. The overall goal, whether it's showing Amazon Bedrock, AgentCore, Strands, or all those different types of capabilities and managed services from AWS, is to break apart those black boxes so you can understand it from a security perspective and what needs to be done.

How many people here are familiar with AgentCore? There's a reason I put this slide in here. AgentCore is a service that we launched at New York Summit that includes a lot of different primitives in order to build agents on AWS. It includes things like AgentCore runtime, memory, and identity gateway. I put the documentation and the QR code on this to get more information. We're not going to spend a lot of time talking about what AgentCore is. There are a ton of sessions this week if you want to dive into exactly what those are.

Phase 1: Understanding the Foundation - Applications Connecting to LLMs

But with that, let's talk about the first phase. When you're first building anything with generative AI workloads, this is normally how your application looks. You have an application talking to an LLM. It could just be a simple generative AI chatbot, something like that, but it's just an LLM. It doesn't have data sources, it doesn't have tools, it doesn't have agents, or anything like that. This is a 400-level session, so let's dive into what that actually looks like under the covers.

Amazon Bedrock is a managed service. We deploy it in an Amazon-owned account. For your app to actually connect into it, you have to deploy it in a VPC and connect to things like VPC private endpoints. You can also connect through public endpoints if you want to do that. On top of that, you might do things like an ALB or some type of thing that will load balance across your application in order to get access to the application that then goes to the LLM.

Let's talk about security with this. Security groups, roles, and permissions all still come into play when you're talking about generative AI workloads, because those are the traditional controls that still come into play. Then you have to think about things like whether you want DDoS protection, whether you want to do something with WAF at the application level, whether you want to do something with identity, or whether you want to do something with permissions. And then on top of all this, you have things like CloudWatch, CloudTrail, and GuardDuty to get visibility into what the application is doing. These are traditional security controls. Probably 80 percent of what you do with anything with AI workloads is traditional security. It's that extra 20 percent that we're going to spend most of the time on, where you had to do things differently and had to think about from a threat modeling perspective and other things. A lot of that comes down to this guy right here, the LLM.

Breaking Down the Black Box: How Large Language Models Actually Work

How many people could talk about exactly what a large language model is? I want to see one hand. I got one hand. Okay, so large language models. A lot of people view this as a black box because it is complicated. If anybody has ever read the "Attention is All You Need" paper that came out in 2017 or 2018, it goes into granular detail about how to build these transformer architectures.

For instance, to ask questions like, how many R's are in strawberry? Sometimes the LLM gets that right, sometimes it doesn't. But if you look at exactly what the LLM is doing, it is complex math. It's looking at associations between words, associations between tokens. When I ask the question, how many R's are in strawberry, it can convert that into numbers. It can put it through multiple different layers, and then at the outcome it makes a prediction. That prediction could be the number 3. Then it turns it back around again. Then what's the next token? What's the next token until it gets to a stop sequence that says, I am done and you can send that back to the user.

My question on this is, where is identity? In this architecture, do you see anything about rows, columns, tables, anything like a database? The reason I point that out is from an LLM perspective, all it's doing is matrix multiplication—complex math on the data that it's trained on. When you think about it from what it's doing and what it's not, it's not an object store, it's not a database, and it's just doing the complex math. But you had to figure out how to put security around it to make sure if it comes back with something that's not going to leak sensitive information or anything like that.

You can fine-tune the models. You can take one of those large language models that was trained on petabytes upon petabytes of data, and then put your own data with it. This is called fine-tuning, and it's question and answer pairs that you can add to it. But then the question comes into play of what happens when you put sensitive information with a large language model or a fine-tuned model when there's no identity that exists inside that architecture. What type of data or what type of users should get access to that if it does have sensitive information?

Going back to this architecture, we talked about those traditional controls. Now let's dive into this part. This is where we're going to spend the majority of our time today talking about how applications talk to a large language model, add data sources, add tools, and other things like that in order to produce the outputs to provide value to your customers. One of the APIs with Amazon Bedrock is called the Converse API. This is one of the APIs that you can use in order to interact with models on Amazon Bedrock. When you actually call these models, it's a certain API that you can call. You add some natural language query like, shall we play a game? Now this is a very special model that likes movies, and so it can come back with something like, how about global thermonuclear war? Or a nice game of chess.

It's very dependent on what the context is, how you ask the question, what temperature and top P, top K settings that determine exactly what that output is. But remember, it doesn't implement anything with authorization. All it's doing is just predicting that next token depending on what the context is when you're adding stuff into the API. This gets into one of the biggest points that we'll make in this presentation, and it's about data reaching the model. Because the LLM doesn't implement anything with data authorization or identity, anything that you send to the model, either the user or the agent needs to be authorized for that.

Why is that the case? Like I said, LLMs do not implement anything with data authorization. Once that data hits an LLM, it's just going to do what it knows to do. It's going to do the complex math and make a prediction of exactly what the next token should be. It's not saying, okay, who's the identity provider? What type of JWT that it has or anything like that. Because it doesn't have anything with identity in it, when you try to do any type of authorization afterwards, because it's natural language, it turns something that could be a deterministic control into a non-deterministic control because it is natural language. It's about interpretation at that point compared to what data source you're going to send context with or who the user is from an identity perspective and other things.

Implementing Security Outside the Model: Amazon Bedrock Guardrails and Key Principles

The term that we always use is implement security outside the model. Don't hope in an LLM that sometimes acts like a two year old, sometimes listens, sometimes doesn't, to try to implement security inside the model. Now there are some things with the types of content that you want to come back with, or whether you want to prevent hallucinations or harmful content or prompt injections, where you do want to look and do interpretation of exactly what the content is and what it's coming back with. We have something called Amazon Bedrock Guardrails, and the focus on Amazon Bedrock Guardrails is not on the security, but more on the responsible AI. What do I mean by this? It can do things like deny topics. I don't want it to talk about a certain topic. It can do content filters, sensitive information filters.

When I say sensitive information, I'm talking about PII data and PHI data. Because the model doesn't understand identity, it can't determine whether a user should be authorized to access the data being sent to the model or not. Information filters are critical safeguards. We also have word filters and automated reasoning. The overall goal is that you put the user input and then the output through Amazon Bedrock Guardrails, and it will tell you whether it's hitting some of these filters and whether it needs to filter things out.

Identity doesn't exist in these guardrails. There are deterministic controls that exist here, like word filters where you can do a pattern match. However, most of these are non-deterministic controls. For example, if I do a prompt injection and say "ignore all previous instructions and do something," the API has a guardrail config that can be configured to match on prompt injections, word filters, or other things. Depending on whether it matches on something, it can come back with a response saying "sorry, I'm afraid I can't do that."

One of the most important things with guardrails is the information that comes back in the API response. You need to know whether it actually hit a guardrail, what the reason was, and what part of the guardrail it hit. This serves two important purposes. First, it ensures that harmful content, hallucinations, or bias is not coming back. Second, if someone is not trying to do something malicious but is hitting guardrails, you get the visibility you need to determine whether you should tweak the guardrail or add additional context to make sure it works properly for your customers.

In summary on the foundational layer, there are a couple of things to think about. Are large language models deterministic? The answer is no. We say they are functionally non-deterministic because the math with matrix multiplication is deterministic. What's not deterministic is the types of hyperparameters you can use. If you use a temperature of 0.05, the model can be more creative compared to if you set the temperature to zero. It won't predict the most probable token every time.

Can you filter specific data out of LLMs? The answer is no. It's not a table, not a row, not a column, not an object store. Whatever data the model is trained on, you have to assume that a user could get access to that data in the LLM if they ask the right question. Do models do continuous training? The answer is no. Once you train a model, the model is static. It doesn't matter what you put into it; all it's doing is matrix multiplication to predict an output. It doesn't do continuous training on your data. There's no authorization that exists in the model.

Phase 2: Data Sources - Context Engineering and the Authorization Challenge

Phase 2 covers data sources. There are many places you can get data to include as part of the prompt for the large language model. There's data that exists in the model, but also a lot of data that can exist outside the model that you send to it. Things like context engineering or system prompts can be used. You have vector databases or knowledge bases with RAG, where you can do a vector search to understand what data you can include that is similar to the request. Tools are a big one, but I'm going to let Jason talk about that because we have an entire section on that. Memory is one that's coming up a lot with agents, specifically on how you can add additional information, whether it's old session data or facts, that you can include as part of the context to get the LLM to respond the way you want.

We're going to go through each one of these. Context engineering is when you have an authorized user interacting with a generative AI application or agent. You can put additional context as part of that prompt. That can come from user data, system prompts, or other data that exists in the application. The overall goal is that when the user asks a question, you can add additional context so the LLM responds the way you want. All of that goes into the prompt. When we talk about authorization and what data should be included, you have to think about what data you're going to send to the LLM. Is that agent and user authorized for that data to make sure you're not going to leak sensitive information or give them data they shouldn't have access to?

The second one is retrieval augmented generation.

Retrieval Augmented Generation: Vector Databases, Metadata Filtering, and Permission Patterns

Some people call this a knowledge base, and some people call this vector databases. The overall goal is that you have unstructured data that you want to include as part of the search in order to get specific data that you can include in the prompt. Let's talk about how that works. The first thing you have to do with RAG databases is indexing. You have all these documents or unstructured data, and you have to divide those into chunks. We are not going to talk about how to do the chunking strategy, as that gets into implementation of the application. However, chunking strategy is actually very important to make sure that you're including the right data in each chunk. Those chunks are then converted into numbers that get stored in a vector database. The overall goal of the vector database is being able to search in order to get access to the chunks and data that you need.

When you're actually querying the vector database, you have a user query that you send, convert it into embeddings, and send it to the vector database. The overall goal is to find similar chunks to what the user is asking for. If you think of a multi-dimensional space, the user question gets close to some of these chunks, and the database returns back some of those chunks depending on what the user asked for. Those chunks then go into the prompt. One of the things you can do with vector databases is add something called metadata. Metadata allows additional context to be added as part of each individual chunk in order for you to do things like filtering. They're like key-value pairs that you can add on top of it.

One important note with metadata is that it's applied to the entire document, not to every single chunk. You can't say this chunk has this metadata and this chunk has that metadata. It's applied to the entire document. It's important to remember that when you're actually doing this metadata, you can't take a single document that has different sensitive information and put specific metadata in there. Let's say we're at a wizard school and we want to add metadata to this vector database on the different types of defense spells that you have. For example, you have a student year, so let's say I'm a student in year 4. Spell type is charms, difficulty level is 7, and use cases is defense. As part of the query that you send using the Bedrock API, you can do filters with this. The overall goal is to filter out anything that's above year 4 and also filter out anything that is not a defense spell. So I'm matching on defense spell and anything that's less than or equal to year 4.

I can ask that question of the retrieve query, go into the knowledge base, and say, what are the defensive spells that I should know? I include that retrieval configuration and that vector configuration with the metadata in order to filter out anything that's above year 4 or anything that's not defensive. When it comes back, it's going to come back with multiple things. First, the first chunk is going to come back and give you exactly what the text was of the chunk and what the location that chunk came from. So if it's an S3 bucket, it's going to give you the S3 URI. It's also going to include information about the chunk from a metadata perspective. If there are additional chunks, you can also get that additional chunk. It's also going to include the score or the accuracy, showing how close it is to what the user is asking for. You can see on here that one of the defense spells is on year 4 and one is on year 3 because of what the query is, and it's looking for stuff that's less than or equal to that.

Now, one of the important things with anything with Retrieval Augmented Generation or vector databases is permissions. One of the things you have to think about with vector databases is you're taking a data source that has certain permissions and now you're copying it to somewhere else, copying it into a vector database. Permissions is one of the very important things that you have to think about because permissions could be lost when you're copying. When you are copying anything from a data source, the permissions that you have usually stay at the data source. Yes, you can copy them over, but what happens if you change the permissions that exist at that underlying data source? It can cause things where you have to reindex and other things like that. There are multiple different ways that you can configure permissions to make sure that the only thing that you're returning back, especially if it has multiple different permissions, is what certain users should only get access to certain data and other users should only get access to other data. You need to configure that to make sure that you're not leaking sensitive information. Let's talk about a couple of those architecture patterns.

The first one is if everybody who's getting access to that vector database is authorized to any of the data in the vector database, you don't have to do filtering from a permissions perspective. You can do it from metadata in order to filter out specific chunks that you don't want, but from a permissions perspective, you don't really have to do too much because everybody should have access to it. You can do post-retrieval filtering. What that is, is when you receive chunks back, you can look at things like where did this chunk come from, and look at the underlying source database or source data source to see what the permissions are.

One of the examples, and this is the blog that I wrote, is S3 access grants. If it's coming from an S3 bucket, you can say, OK, does this user with this user identity or does this group have access to what this underlying data source is. You can do things like per user and per group vector databases. What this is, is each group or each user is going to have their separate vector database. So the application is making the decision: should I send this user to this vector database? Should I send this user to that vector database? So it separates it.

The last one is pre-retrieval metadata filtering, where you can add a metadata search in order to filter things out. That is a filtering process. It's not using anything with JWTs or anything like that, but it allows you to filter things out as long as you understand exactly what data exists in that data source, which is what the thought bubble is with the data governance.

Memory Systems and Data Source Authorization: Ensuring Proper Separation Before the LLM

Last thing with data sources is memory. Memory is one of the things that I said is coming up more and more with agents, which we'll talk about. There are two types of memory. Short-term memory, which is keeping track of things in the existing conversation. It might summarize the context of what exists if it gets too long, and other things like that. And then long-term memory, where it maintains things like facts, process information, and previous conversations that you can do semantic search on in order to get context with a current session.

So the overall goal, just like you have with RAG or vector databases, is to add additional context as part of the search query that you have in order to send the data that you need to that large language model. One of the most important things with this thought bubble is that it is dependent on the application to configure the memory properly to make sure you have separation with things with memory. For example, if you only want a certain user to get access to certain group information, the same thing that you did with retrieval augmented generation, you want to make sure that you have that separation.

One of the ways to do that is with memory namespaces. This is something that we're implementing with Bedrock AgentCore memory, where it allows you to separate memory using a hierarchical format with slashes and other things in order to say, this is one group, this is another group, this is just this user session, being able to separate that in a database type format in order to give access to the right data that the user needs to get access to.

Let's say that you are working in a restaurant and they have a certain rule about the number of pieces of flair that you have to have every time that you are working there as a waiter or waitress. So as part of the memory, you can say, what is the minimum number of pieces of flair that I have to have? One of the things that memory could be stored in memory is the number of pieces of flair, and it can come back saying that with this policy, you have to have a minimum of 15 pieces of flair. It's similar to what RAG is. It's just using it in a different way, in different contexts. It's not just unstructured data. This is more structured data that you use primarily with agents.

So in summary, authorized users interacting with a generative AI application or an agent with an LLM can have data sources that come from multiple different places. They can come from the query that gets added to the prompt, system prompts, RAG, memory, and tools. There are a lot of different places where this can come from. But the most important thing to think about when you're building security with data sources is that authorization needs to happen before it actually hits the LLM.

Phase 3: Tools - Enabling AI Applications to Take Actions with External Data Sources

With that, I'm going to turn it over to Jason to talk a little bit about tools. Excellent. Thank you very much, Rich. All right, let's talk a little bit about tools. So we went through all of the basics of the large language model itself. We went into some of the data sources using techniques such as retrieval augmented generation. So now let's enter into tools where we want to add additional context into our large language model from other external data sources.

So what are tools? Tools let AI applications take actions. For example, you could control a web browser, you could write a file, you could call different APIs. These are all external data sources that provide additional context or the ability to take actions to the large language model. When you think about using tools with LLMs, you are thinking about how you are going to define why the LLM should choose your tool and then define what are the different parameters that your tool will both take in and will generate back out for the LLM to reason about afterwards. All of that is wrapped up in what's called a tool definition.

When an LLM receives a list of tool definitions and a user prompt, the LLM will reason about what tools, or if any tools, should be selected in order to answer that user's prompt, and then it will generate a tool call based on the parameters that it needs to call with. One very important thing to note here is that the LLM never interacts directly with the tool. The LLM is simply generating context, additional text, which is then interpreted by the application. The application is what's actually calling the tool. This is a way for you to automatically instrument your application so that it will do additional security checks based upon the identity of the user, the type of tool that's trying to be called, any additional parameters, and so on. This way you will be able to instrument that security outside of the scope of the LLM instead of trying to get the LLM to do that security for you.

Let's take a look at what this tool definition actually looks like under the hood. It's a structured format and it's a way that you define what the tool looks like to the LLM. We have an example here for a very simple tool to add two numbers together. You can see the tool definition has a definition of what the name of the tool is. It includes an English language description of what the tool is. This is very important because this is what will provide the LLM the reasoning as to why it should call your tool or not. And then of course there's a list of different properties that are going to be the tool's inputs and a definition of what the tool will be generating as a result.

So what are some security implications from these tool definitions? We'll take a look at those in a second. If we look at taking our tool definition and inputting it into the converse query API, here's an example. We say, here's a message from our user, what is 15 + 27? But you'll see at the very bottom of the JSON structure, here's where we add in the list of tool definitions. We're just going to substitute what we saw from the last slide right there under the tool config tools object in that JSON structure. If the LLM is doing their job correctly, they're going to reason that it should call the add numbers tool because that's what the definition of that tool describes and that's what the user is asking for.

What will happen with the converse query is not that it's going to come back with an answer. Instead, what it's going to come back with is a request to the application: can you please run this tool on my behalf and give me the output, and then I can answer the user's query? How did it know what the input for the tool should be? The LLM is going to be generating those parameters for you and putting that into the converse query response based upon its natural language understanding of what the user's query is combined with the tool definitions that you gave it.

To summarize all of that together, the LLM is going to decide based on those tool definitions and the context, any additional context that was there from either your system prompt or user prompts, what variables and inputs are used for tool calls and really to do a tool call at all. It's going to make those decisions. It has that autonomy and that agency to make those decisions. It's going to convert that natural language context, make that decision, and then policy enforcement you can implement inside of your application because you are in charge of that deterministic call that you're going to make on behalf of the large language model.

This is why permissions and identity are one of the most important things that you can do with tool calls, especially when you're talking about interfacing with very sensitive data or actions.

Once we get that tool call request into our application, we can go ahead and make the tool call and call the converse query API back with the result of our tool call. You can see here in the user segment of the request the result of 42, which is the result that was given by the add numbers tool. You'll notice that this is a multi-turn conversation, so every message both from the user and the request from the LLM to call that tool all get concatenated into the converse query API because we have to give the LLM that context every single time so it can keep track of what was already said and requested. All of those come as part of your converse query API request. The response comes back, interprets the tool result, and gives you a natural language response based upon whatever that tool result was.

Now this can get really cumbersome when you are dealing with lots and lots of tools. The idea of creating tool definitions is very model specific. Different models take different types of tool definitions, and there can be multiple different ways of specifying them. This became very problematic, so a year ago, Anthropic came up with the idea of the Model Context Protocol, or MCP for short. That's where MCP comes into play. It's how we can define a standard way of creating tool definitions and exposing them into a large language model so it can consume those and use those for all of their processing.

Here's an example of what MCP looks like from a communications perspective. You have an application on the left-hand side with a number of MCP clients, and those MCP clients will communicate with an MCP server to go ahead and take an action or retrieve a result. That communication protocol can happen either remotely over HTTPS or it can happen locally on your desktop or laptop using standard IO. Using MCP, AI applications can connect to data sources, tools, and workflows. It allows you to access key information and perform tasks. Think of MCP as this universal protocol for defining these tasks and these tool definitions so that you can connect your AI into those APIs and data sources.

Let's look at how this request flow actually looks under the hood. When we have these different principles, I have them listed at the top. There are five different principles: the end user who is actually using the tool, the MCP host which is the application itself, the MCP client, the MCP server, and then whatever backend APIs that MCP server might be using to get the results. When you launch an application that's using MCP, the first thing that application is going to do is get configured with a list of MCP servers it knows about. It's going to go ahead and query for all of the different tool definitions that are available on that MCP server through the MCP client. The MCP client calls into the MCP server, and the MCP server responds back with a list of tools. Those tool definitions look just like what we saw a couple of slides ago with an English-level description of the tool, a list of the parameters, and the output definitions that gets pumped back into the MCP host. It can store that data away. Now when the user comes back and says they want you to answer a query, the MCP host, that application, can take the list of available tools, concatenate it with the natural language query. The large language model will select a tool based upon that and a set of parameters, which then get communicated back into the MCP server through the call tool command. That may end up triggering additional downstream requests. It could be calling another API or querying a database, whatever that MCP server needs to do in order to fulfill the structured request that it just created. Then that response comes back as a structured response. The MCP server formats that back into the client, and then that gets back into the MCP host with that downstream application, and then that large language model can interpret it.

The large language model can then give the user a natural language response. I'm going to circle back to that question we had earlier when we talked about tool definitions. What are some of the security impacts of exposing an agent in this case to an untrusted MCP server? It's the same thing as we had with those tool definitions. As you can see here, before the user even has a chance to interact with the application, we are already ingesting potentially untrusted data inside of our large language model. Those tool definitions, or in this case the MCP server's list of tools, can include natural language descriptions of those tools themselves. You can prompt inject directly from those descriptions, and these are the sorts of considerations you need to make when you are exposing your large language model based applications into MCP servers and other tools.

Kind of rewinding a little bit, this is how you can create MCP servers in Python code. MCP servers, as I mentioned before, expose tools. There are also two other types of resources that you can expose in MCP. You can expose resources, which are things like files and other things that have a static identifier associated with them, as well as prompts. These are the three different types of things that you can expose via an MCP server. I'm going to focus primarily on tools here.

Here you see an example of using the FastMCP library in Python to create that very simple add numbers tool that we described before. You can see in here it's simply a matter of adding a decorator into your Python application. That add numbers function could come from anywhere. You add that decorator of MCP.tool on top of the function definition. What that will do is take the docstring that was defined there, "add two numbers together," and that becomes the tool definition for the large language model. It will parse through the list of parameters automatically, and that will become the list of parameters and their types as well as the return type that the LLM will be expecting. It actually does a bunch of magic under the hood for you.

What happens here is when you do the MCP communication from the MCP client into the MCP server, it's going to create this JSON RPC structure which then says, "Hey, I want to call your add numbers MCP server tool with these parameters." That's what it looks like under the hood. We use that same flow that we described before where this is a request that comes from the LLM back to the application, the application handles it, and it goes ahead and makes that call for you.

OAuth and Identity in MCP: Two-Legged vs. Three-Legged Flows for Secure Tool Authorization

So what if you want to require authorization as part of MCP for an MCP server or a tool? OAuth was recently added into the MCP specification, not long ago in about June of 2025. OAuth allows you to do both delegated access as well as service authorization to tools using MCP. This allows you to assign distinct user and agent identities so that you can secure those agent actions at scale. There are two different OAuth patterns that we have seen customers use in order to use OAuth with MCP. Those patterns are two-legged OAuth and three-legged OAuth, and we're going to look at the use cases for both here.

Really, the reason why you would want to choose one or the other is this main question right here: What identity do you want to use in order to authorize access to this tool, to this data source, whatever it happens to be? Should you use the user's identity, in which case you would be looking potentially at three-legged OAuth flow, or is it something where the agent's identity itself, sort of like a service identity, would be sufficient, and that would be more of a two-legged OAuth flow?

How do you decide this? First of all, you think about data ownership. If you are dealing with user-specific data such as emails or documents, then you want to think about three-legged OAuth and a user delegated approach versus something that's system or organization owned data that might be shared resources. Then you might be able to use that service approach of two-legged OAuth.

User interaction is another consideration. If the user is present and can perform an authorization step, that again is a user delegated action.

Versus something where you have no user interaction and maybe an automated system, for example. That goes into the operation timing consideration. And then finally, permission scope. If you're thinking about how permissions may vary by the user and their consent choices versus consistent permissions that are designed at the agent level, think about these things. We're going to look at how we can actually implement some of these as well.

Identity really at the end of the day is one of the most important aspects of architecting tools securely. We know that from our traditional workloads. This is not new technology or new terminology, but it is now something we need to figure out how to apply appropriately inside of our AI application infrastructures as well. So how can we actually implement it? Agent Core Identity is going to be the primitive service that you will use inside of AWS. If you're using AWS Agent Core, then this is how you can implement a two-legged or three-legged OAuth flow for your MCP tool calls.

If we think about creating an MCP server configuration that has a tool that we need to provide user-delegated access to, that's going to be a three-legged OAuth call. In this case, if we have a tool that is going to search a personal travel log for a variety of information—here's what I went to Vegas on this date and time, and so forth—that sort of information should be locked down to only authorized principals to access it. Here's an example of using that FastMCP Python library. We can see creating that MCP tool with the tool decorator at the top, but now we've also included a new decorator, which is the requires_access_token. We've included the idea that we need to use user federation or a three-legged OAuth flow for access into this particular function call.

This is going to get us a token that's associated with the end user so that we can make further downstream authorized calls to get those logs, for example, and to create that request and response back to the user. With three-legged OAuth, you see that last one where it's the on_auth_url, so that is going to be the URL that the user will be redirected to so that they can grant that authorization. In contrast, the two-legged OAuth is going to use a static token that's associated with this particular agent. It's a machine-to-machine authorization flow, and so that's going to have no authorization URL obviously because there's not going to be a user there to authorize that access. But it's going to provide the MCP server here in this case a static token that can be used to access protected resources behind the scenes.

Two-legged OAuth expects client credentials, whereas three-legged OAuth expects authorization codes with user federation. So what does this mean from a security perspective with tools? Well, first of all, you need to set the right permissions for tool calls and what identity you want to use. You want to understand as a decision maker what the LLM is going to be authorized to do. Decide what are the different decisions that you're going to delegate to the large language model. Are you comfortable with it deciding what parameters to call to this particular tool?

Think of it this way: if a tool has a parameter that says authorized username, am I going to let the large language model decide what value to put in for the authorized username for my parameter for my tool call? The answer should be no. The identity will be piped through to the tool call through OAuth or some external mechanism, and then the LLM's decision-making power is limited to what are the different parameters maybe for that query, right, as opposed to who that identity is in the first place.

MCP provides a standard way to connect the AI with the outside world, and finally OAuth is your standard for communicating whether a user or service is authorized for an action. And again, don't forget that this all comes down to putting things into a context window for the LLM. Everything just gets pushed into that large context window for the large language model, and where is the authorization inside the LLM?

Phase 4: Agents - Delegating Decision-Making Power Through Agentic Loops and Workflows

It's not there, right? Very important to keep in mind. Alright, phase 4: agents. We're moving up the stack, becoming even more complex. Here we have agents where we are delegating even more decision-making power into the large language model. Before, we had very specific workflows that were well-defined inside code, but with agents, we're taking away some of that decision-making power and giving it to the large language model.

An agent has many definitions. I think every software company defines agents differently. The way we like to think of it is as a software program that can interact with the environment, collect data, and perform self-directed tasks that meet predetermined goals. It's a goal-oriented architecture rather than a task-oriented one like step 1, step 2, step 3. AI agents act autonomously without constant human intervention. They interact with the environment by collecting data sources and combining that environmental data with domain knowledge and past context. They use tools such as agent core memory, which Riggs just talked about, to learn from past interactions and improve over time.

How does this actually work under the hood? Everything with agents works with what's called an agentic loop. An agentic loop is a piece of deterministic software that enables intelligent autonomous behaviors through a cycle of reasoning, tool use, and response generation. A prompt comes in as the input, and the agent can invoke the model to get a response and reasoning to figure out what tools should be called as part of this agentic loop to respond to the user's query. The agent takes that response from the model, which could include a request to call a specific tool. The agent then executes that tool and gets data. The result of that data gets back into the agent, which returns it back to the model, and this loop continues until the model decides that the answer is complete. The agentic loop then terminates and returns the final response to the user.

It's important to realize what parts of this diagram are deterministic code and what parts are non-deterministic AI-based large language models. We have a tool library called the Strands Agent SDK. How many of you have played with Strands in this audience? I've got a couple of hands up, that's excellent. Strands is a great library that allows you to create production-ready agents in just a few lines of Python code. We've got a QR code up there if you want to learn more about it.

What does it look like when you're actually building an agent with Strands with the agent class? It's one of the key components of the Strands SDK. We put together pseudocode for what this looks like. Inside the agentic loop, we can see it is a loop. While true, we're going to call the language model based on the user's prompt. We're going to give it a list of tools and the messages we've already processed through this agentic loop, and then we're going to see what the model thinks we should do. We're going to ask it to make a decision for us. Should it say that we are done, that's our end turn, and we'll return the final answer back to the user. But most likely, the model says we need to call a tool to further answer this user's question.

I'll go ahead and take a look at the tools it has requested me to execute. Again, this is deterministic code sitting inside the Strands Agent SDK, and it's going to call that tool for each one of those tool requests and add the results back into the messages that are going to be concatenated back into this agentic loop for the next time around. Of course, you have error conditions. If I've overflowed the context window, I have to handle that. But you get the idea that this is a constant loop happening, and it's basically a loop of LLM asking what should I do next?

It executes that action on your behalf, and then reports back when it's complete. So what changes with agents? Generative AI applications are really about answering questions, maybe one shot or a couple shot questions. Agents accomplish goals. Agents are all about autonomy because they can act autonomously. We are delegating more decision-making power into the agent to achieve a goal by planning and making its own decisions. They become more complex because they can be multi-step, complex workflows. They do increase our risk profile because we are giving agents more autonomy, we are taking on more risk. We don't necessarily know what the LLM is going to decide for us and whether it's aligned with what we want it to do. So we're going to put guardrails around it. Finally, there's learning and adaptability. Agents can use memory to actively learn and adapt based on the outcomes of their previous actions.

So what do multi-agent workflows look like? There are a couple of different ways you can implement agents. One way is through a deterministic workflow. Let's start with an example where we want to take incoming logs from a SIEM, enrich them, figure out what we should do, and start interacting with our ticketing system. We may even want to do some auto-remediation actions based on existing runbooks that we have. I'll show you two different ways that you can organize this.

In this first way, we're going to have several different agents, each one with a specific task, and we are going to wire them up as a specific workflow. We have our enrichment agent that's going to query the SIEM logs and check threat intelligence to see what it might know about the particular indicators that we're seeing from the logs. That will be fed into a separate agent, a triage agent, to figure out what we should do about it. We can take a decision from the result of that triage agent. We could either execute an automated remediation action which is going to block that IP from our firewall, or we can just go ahead and document what we did. It might be a false positive, but we're going to update the ticket with the reasoning that we have in the resolution agent, and that is the end of our multi-agent workflow. So this way we are breaking apart our workflow into multiple different agents and then we're going to wire them together in code.

In this design, it's a specific workflow that's followed with each agent having specific tools that they can call. You can see here in our pseudocode that we are defining each one of those agents, giving each one of those agents its prompt for what it should be doing, and then it has a list of just a couple of tools associated with that agent. The first enrichment agent has two tools associated with it. The execute agent has a tool associated with it. The triage agent has no tools because the purpose of the triage agent is simply to render a decision based upon the previous context that we've collected. Each agent has a small number of tools with which it's going to be working.

Then what we're going to do is wire all of that together in this graph-based workflow. We'll define our should_execute_action, which is going to take the result of our large language model's decision that it makes as part of the if-then statement that you saw in the flowchart in the previous slide. We have the security workflow, and it's all wired together as you can see there. We have a graph builder, we've added a node for the enrichment, we put the result of the enrichment into the triage node, the triage node enters into the execution and resolution nodes, and so forth. The point here is that we are still relying upon deterministic code for the actual workflow and wiring all of these pieces together, but we are now delegating smaller pieces of decision-making and contextual generation into the large language model.

This is different from another way to design the system with the same goal in mind, in which we are going to use a single agent.

If I wanted to create this and use one agent to do this entire process, I do something a little similar but very different, and it's a nuance. So let's take a look at how this works. We still take our prompting from the user or from an automated process. We put it into this agentic loop, but in this case, the agentic loop is going to have access to all of the tools and all of the capabilities all at once. And so the agentic loop is going to be doing a lot more of the decision making and in fact there is no set workflow here. The agentic loop is going to rely upon the LLM to make a decision about what to do next for every single step of this operation.

So the LLM could simply decide it could query the SIEM logs and then go straight to updating the ticket without even checking the threat intelligence, let's say, right? So this is delegating a lot more autonomy into the AI system, which can be very powerful. However, you have to weigh those risks with how much control you want to have on how that workflow is actually executed, right? Do you want to be more prescriptive about it and use the multi-agent workflow, or do you want to have more autonomy that you've delegated into the AI and you're going to use the single agent design?

So with this design, right, there's one agent it autonomously decides which tools to call and you can see literally there is one agent here in our definition and that agent just happens to have all of the different tools associated with our design at its disposal. So now we have that agent has a property that says you are a tier one network analyst, here are the tools that you have. Give it some alignment and tell it what you want it to do, but ultimately the agentic loop and the LLM will be deciding how it's going to walk through and use the tools at its disposal rather than a deterministic system wiring them all up ahead of time.

Human-in-the-Loop Controls and Final Security Principles: Protecting AI Systems at Scale

So compared to the multi-agent solution, there is way less complexity here as you can tell. However, as I mentioned before, you're giving the agent a lot more autonomy to decide what actions to take and when to take them. So how do we put human in the loop in some of these things, right? We would like to have all this autonomy, but we also want to have the ability to have some way to put human approval in there. This is where you can put things like Strands hooks into play. So hooks provide deterministic controls as part of your agentic loop.

So for example, you can control that autonomy and perform checks as part of the agentic architecture. I'm going to give you an example of an approval hook in this case. So let's say that we have that high risk security remediation action such as blocking an IP from the firewall. We don't want to necessarily have that happen without somebody looking at that first, lest it interact with our production systems. So in this case we can define our approval hook and we can say here are some high risk tools such as block_ip_firewall.

And so when we see that tool call, we can go ahead and say, here's a request for approval, and now we can maybe send a Slack message, we could send a ticket to the appropriate team, and then they can review it and approve it or deny it before the agentic loop can continue. So there's a lot of use cases that you can have for hooks. We just talked about one example, you can have intent breaking, goal manipulation, you want to understand and control for disruptive or deceptive behaviors, all that kind of good stuff.

So in summary, what does this mean from a security perspective with agents? Well, we understand now the agentic loop enables autonomous actions while having an ability to still inject deterministic controls into play. Human in the loop is going to be one of those ways that you can keep understanding of what is going on with your agentic systems, even if you do delegate a lot of autonomy to them. Agent design does have some implications here. We talked about single versus multi-agent, and of course you always want to understand the agency that you're delegating into the AI system.

So in conclusion, we want to understand that security and risk in AI systems is dependent on what the application or agent has access to and what actions it's authorized to make on its own. So this is a very classic thing here. This is sensitive data, untrusted data, and external access. You want to pick two, not all three, because at the center of this Venn diagram is danger, right? If you have an LLM that has access to all three of these things at the same time, this causes security issues. You always want to think about that.

All three of those provide a concern. Place security controls outside the model. If you take nothing away from this session, take a picture of this slide, you always want to have deterministic control around that data. You want to validate your input and output, and you want to use identity to do your tool calls, protecting the entire flow. Agentic identity is a combination of your agent identity and any human identities that you're operating on behalf of.

And finally, I'm going to give you one last slide, which is a list of all of the resources from earlier today. We've got a bunch of different QR codes for you to take a look at for our security blog, reference architectures, secure AI landing pages, and then we have over 100 different sessions today related to AI security. So thank you so much for attending. Have a great re:Invent. Thank you.

; This article is entirely auto-generated using Amazon Bedrock.

DEV Community