
Kazuya


AWS re:Invent 2025 - A practitioner’s guide to data for agentic AI (DAT315)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - A practitioner’s guide to data for agentic AI (DAT315)

In this video, Tim and Siva from AWS present a practitioner's guide to data for Agentic AI, tracing the evolution from 2023 chatbots to 2025 autonomous agents. Using an auto insurance example with customer Terry, they explain the three pillars of Agentic AI: reasoning (LLMs), action (tool execution), and memory (context management). The session covers the ReAct loop, Model Context Protocol (MCP) for standardizing agent-tool interactions, and various caching strategies to optimize performance. They demonstrate how to transform traditional applications into agentic experiences by exposing APIs as MCP tools, address data governance challenges through data marketplace architecture, and show how trusted identity propagation works using JWT tokens. The presentation concludes with a comprehensive reference architecture using Amazon Bedrock AgentCore, emphasizing the importance of building specialized data APIs and implementing end-to-end security from day one.


; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction to DAT315: A Practitioner's Guide to Data for Agentic AI

Alright. Hi, welcome to DAT315. Hopefully some of you are practitioners in data. Maybe, right, I'm a practitioner in data. Maybe you're a practitioner in Agentic AI, anybody? Well, by the end of this, hopefully you'll probably be practitioners in data and Agentic AI together, so that's what we aim to do. My name's Tim. I'm a senior principal engineer in Aurora. I'll be joined shortly by my friend Siva who's the director of worldwide specialist solutions architects. And we're going to talk about a practitioner's guide to data for agentic AI.

Thumbnail 40

So, if anybody thinks they've completely got their arms around all of this, congratulations, you're doing better than most of the rest of the world. This is changing really fast, right, so the first thing I want to do is just look at this evolving landscape a little bit. So we started off in 2023, not very long ago. We had these chatbots, we were talking about vector search, we were talking about LLMs just beginning, right? We had these single shot interactions with the agents, that's what we could do.

Thumbnail 70

Thumbnail 90

Thumbnail 100

Then in 2024 we got these more advanced chatbots, we got context, the ability for an agent to remember a little bit about what it was doing, to do hybrid search, to bring in other data types in here, we had RAG, Retrieval Augmented Generation, that was all the rage last year. That was so 2024, right. Then near the end of 2024 came MCP, Model Context Protocol, which gave these agents some tools to talk to each other and data sources that standardized things. And then now 2025, nearly the end, we got to autonomous AI agents. They can reason and plan and they can execute tasks.

Thumbnail 120

This all happened in, count them, just a few years, right, and we're still, I don't think anywhere near the end. But now we can have a much more natural conversation with our agents and they can do things much more autonomously for us. So today we're going to follow an example. We're going to buy some car insurance, which is I think what everybody loves to do all the time. Buy some car insurance. And we're going to pretend that we're Terry who wants to buy some car insurance.

Thumbnail 140

Thumbnail 160

And back in 2023, your AI experience would have been interesting, but it would have been probably just a thin wrapper around some pre-processing, and then hand off to a human to actually do the work. Now in 2025, with our auto insurer example, the agent's able to have a much more fluid conversation, and it's able to actually do probably the entire task for Terry rather than handing off to a human. Whether we like it or not, that's what's going to happen, that's what we're going to dig through today.

Thumbnail 180

Thumbnail 190

The Three Pillars of Agentic AI: Reasoning, Action, and Memory

So we'll cover some fundamentals, we've got these three pillars of Agentic AI. So first off we have reasoning, this is where we use our large language models. And then we have action, we're able to take action from these models, we're going to call some tools, we're going to do some things as instructed by the LLMs. And then we have this memory, we have to remember things, otherwise it's like talking to a goldfish who can't remember anything, right? It's very frustrating, not very useful.

Thumbnail 210

The agentic AI special sauce is that we put this in a loop, we're able to iterate it, and if you've done any computer science stuff, writing a program without loops is pretty difficult, pretty boring. Once we've got this loop, this ReAct loop, the reason-and-act loop, now we're in agentic AI space. That's where we are today. Okay, so AnyCompany car insurance. Terry needs to buy some car insurance, and we'll come through this loop probably a ton of times before we arrive at an answer like "your insurance quote is ten dollars, thanks for your business."

Thumbnail 240

Okay, so we'll dig in just in one iteration of that loop to start with, reasoning and planning. So the user puts in their request. Inside here we have the system prompt. We're managing context, the system prompt comes in, we have some existing state maybe, we have the user request come in saying I need insurance for my Buick or whatever. Next part, we go through the LLM. The LLM does magic stuff. I'm not a scientist who can tell you about how LLMs actually do their work. And then we pass the output from this thing. We make an action, the agent makes an action plan. We work out what the next step in the action plan is. We might be stopping, we might be sending a response back to the user.
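To make that single iteration concrete, here is a minimal, framework-free sketch in Python. The `call_llm` and `call_tool` functions are hypothetical placeholders for whatever model endpoint and tool interface you actually use; this illustrates the shape of the loop, not any specific SDK.

```python
# Minimal ReAct-style iteration: reason with the LLM, pick an action, act, repeat.
# call_llm() and call_tool() are hypothetical placeholders, not a specific SDK.

def react_step(system_prompt, state, user_request, call_llm, call_tool):
    # Assemble the context the LLM sees: system prompt, accumulated state, user input.
    context = {
        "system": system_prompt,
        "state": state,                # what we've learned so far (a list of results)
        "user": user_request,
    }
    plan = call_llm(context)           # the LLM proposes the next action

    if plan["action"] == "respond":
        # The action plan says we're done: send a response back to the user.
        return {"done": True, "response": plan["text"], "state": state}

    # Otherwise, execute the chosen tool and fold the result back into state.
    result = call_tool(plan["tool"], plan["arguments"])
    state = state + [{"tool": plan["tool"], "result": result}]
    return {"done": False, "response": None, "state": state}
```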

Thumbnail 260

Thumbnail 270

We talk about this acting and managing context thing, because that's where we can first start to talk about data, right? After we've called a tool or maybe we call the tool to get the data and send a response back to the user. So that tool does whatever the tool's going to do. And then it maybe accesses some data store, this is what we're going to talk about today. And then we're going to feed these results into this context manager.

Thumbnail 300

And that's going to maybe do some caching, it's going to do some updating the conversation history so we can remember, and it's also going to evaluate what it was doing for reinforcement learning, so that we can make a better experience of this iteration for the next time we come through, maybe for the next customer. That's important as well.

Thumbnail 320

Thumbnail 330

Walking Through the ReAct Loop: Car Insurance Example with Context Management

So, knowing all of that, we'll actually step through our car insurance example just a little bit, and I'll need you to pay attention to the colors here. So the first interaction says from the customer, Terry, I need some car insurance. We get the system prompt, that's the purple thing here. We go to the agent, we give that, we give whatever other context we have, maybe from previous conversation history or something. We format this up, we give it to the LLM, it does whatever it does, we get the output out again. We determine the next step, and maybe the next step is to call a tool.

Thumbnail 350

Thumbnail 360

Thumbnail 370

Now this is the first interaction, so we don't know anything about this customer yet, so we need to just call a tool called get driver details. We need to work out what this driver's history is about. So this gives us a purple kind of result. So now we're going to go around this loop again. You can see in the state we've remembered the purple result, we remember the driver's information and now we're going to go around and we're going to invoke a different tool to find out about the vehicle's information. Again, the pink information comes in, we remember that, now we're going to calculate some risk about that vehicle and that person. Again, we're going to get a quote, so see how the context is stacking up.

Thumbnail 380

Thumbnail 390

Finally, we have enough information to generate a response for the user, give it to the user and wait for their input returning again. You'll see how the context, all those colors, is stacking up together. So this is great. What happens though when we need to keep going? That context is not infinitely large. I'm a really good artist, you can see from my great colors, so we need to do something called compaction on our context here. What we've done is we've merged those colors.

So there's a few different algorithms we can be using here. You don't have to worry about this too much. The framework provides it for you, but we can be doing summarization, where we ask the LLM to summarize the history for us, or we can be doing something simple like a least-recently-used eviction. Different algorithms to throw away the bits that we maybe don't want and remember the bits that we do. The important thing is that we've gone around this loop seven times in this case, and we've remembered most of what we've done. That's the ReAct loop.
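As a rough illustration of those two compaction strategies (summarization and a simple least-recently-used style window), here is a sketch. The `summarize` argument stands in for an LLM call, and the message format is an assumption rather than any particular framework's.

```python
# Two common compaction strategies for conversation context.
# summarize() stands in for an LLM call that condenses a list of messages.

def compact_lru(messages, max_messages):
    """Keep only the most recent messages (a simple sliding-window / LRU-style trim)."""
    return messages[-max_messages:]

def compact_summarize(messages, max_messages, summarize):
    """Summarize the oldest messages into one entry, keep the recent ones verbatim."""
    if len(messages) <= max_messages:
        return messages
    old, recent = messages[:-max_messages], messages[-max_messages:]
    summary = {"role": "system", "content": summarize(old)}   # "smoosh the colors together"
    return [summary] + recent
```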

Thumbnail 440

Understanding Memory Types: From Short-Term Agent State to Long-Term Prompts

So the memory thing, this is the interesting part. We've got this continuum, short-term memory on the left, long-term memory on the right. We're going to dig into this agentic memory some more. In short-term memory we have the agent state. This is kind of like the agent's RAM if you like to think about it that way, it's working memory. This is where the framework's probably dealing with it for you. You probably don't actually have to code this thing up yourself. You want it to be fast, so it might be something like ElastiCache for Valkey here. Strands Agents as a framework, LangGraph as a framework, they'll typically provide these things for you. Right, and they're simple to use, small, short-term.

Thumbnail 470

And in the middle, we have this medium-term semantic memory, episodic memory. Typically we're retrieving it using tools. We're going through a tool to get to the source of the memory or to put stuff into the memory there. This is where we'll talk about enterprise databases, vector stores, other things like that. So you're probably more aware of what's going on here. Then on the far right, we have prompts. These you're probably treating more like Infrastructure as Code kinds of things. This is where your builders are putting in the system prompt to say you are an insurance agent, here are your guardrails and things like that. You're probably managing them like code, in a code artifact repository or a key value store or something like that. You're not accessing them very often. You're not modifying them very often. That's the way to think about these three kinds of memory.

Thumbnail 530

Thumbnail 510

Okay, so now we're just going to see a little bit of code. I'm an engineer, this makes it look easier to me. We have two different examples here. I've chosen Strands Agents and LangChain and LangGraph. Now, if you don't follow code, that's completely fine because there's not really very much here. All we really need to focus on is this couple of lines at the top there which says I'm going to use an AgentCore session memory manager in this case. You could use another one. I'm going to use a conversation manager, summarizing conversation manager in this case, that's one that uses that summarization algorithm to smoosh the colors together like I showed before. And then I'm just going to pass it down to the framework, that's it. In the LangChain version it's pretty similar, they just use a different name for things and bundle it all together into a thing called a checkpointer. Same idea. So you're not really writing all this code, the agent's probably writing it for you anyway, right? But very simple.
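The slide code isn't reproduced in this article, but the Strands Agents version Tim describes looks roughly like the sketch below. Class names and import paths follow current Strands documentation and may drift between SDK versions; the model ID is a placeholder, and the AgentCore session memory manager is indicated only as a comment.

```python
# Rough Strands Agents sketch of the pattern described above.
# Import paths follow current Strands docs and may change between versions.
from strands import Agent
from strands.agent.conversation_manager import SummarizingConversationManager
from strands.models import BedrockModel

model = BedrockModel(model_id="<your Bedrock model ID>")  # placeholder model ID

# The summarizing manager is the one that "smooshes the colors together":
# older turns get condensed once the context grows past a threshold.
conversation_manager = SummarizingConversationManager()

agent = Agent(
    model=model,
    system_prompt="You are an insurance agent for AnyCompany.",
    conversation_manager=conversation_manager,
    # A session manager (for example the AgentCore session memory integration
    # mentioned in the talk) would also be passed here to persist state.
)
```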

Thumbnail 590

Connecting Agents to Data Through MCP Servers and Tools

Okay, so we'll dig in a little bit more, we'll just talk about how we connect the agents to the data. So we get our inbound user query. We'll go through an MCP server probably, we'll dig into what that means in a little bit, and we'll talk to some database to maybe get customer profiles. And then we'll go around again, maybe we'll talk to a different database to get vehicle data and telemetry, and we'll go around again, and you can see we're talking to different kinds of databases, maybe they're relational ones, maybe they're key value ones, maybe they're vector ones, maybe some of them belong to our company already, maybe some of them are third party ones like the Department of Motor Vehicles in my example. You're calling all of these things through tools and MCP servers. Right.

Thumbnail 610

The same basic pattern is applied across everything here, and that's the joy of MCP. We don't have to write all this special code to deal with this.

Thumbnail 620

So how did we get there? Those tools, we need to find them first. In the same example, we just look at the other part there. I'm creating an agent, instantiating an agent. I'm creating a Bedrock model to go with it, and I'm giving it a list of tools. There are four tools there, search documents and so on, and a system prompt that says you are an insurance agent. Now I'm not in love with how I had to specify or hardcode the tool names there, so I'm going to come back to that one in a minute, but you can still see this is pretty simple.

Thumbnail 650

The Power of Caching: Four Strategic Layers to Optimize Performance and Cost

Going to change gears a little bit and talk about the power of caching here for a minute. So you know all the fundamentals here. You would've probably noticed that every time we go around this loop, we seem like we're doing a bunch of maybe duplicate work. We're fetching some data again, over and over again. Maybe some of that will make things slower, maybe some of that will cost us some more money. We don't like that. Caching is a pretty natural approach here.

Thumbnail 670

So here's our similar picture and we'll talk about places where we can put caching into this. Why would we do it? We can reduce latency, that's pretty standard. These LLMs, large language models, they're not called SLMs, small language models, right? They have a bit of a cost to them. So if we can avoid calling them or we can avoid using up their token space, we can save ourselves money and time. So caching's a really good way to do that. Even if the backend system is not an LLM, if it's some other kind of database, it still has a finite capacity to deliver some performance, so reducing load on that's probably good as well. And again with this reinforcement learning thing, caching is a really important way to get signals to see if something is popular. So if I'm getting lots of hits in the cache, then it's probably a popular thing and I should cache it more and I should update my plan to do more of this thing.

Thumbnail 720

So where can we insert some caches? Over on the right-hand side, this is the closest to the data source, so the lowest in the stack if you like. This is probably the most familiar to what you do already today. You've got a database, you put a cache in front of it, right, an ElastiCache or something like that. This makes your tool responses faster, but it doesn't do anything to avoid going around this loop. It just lets you go around this loop a bit quicker. So it probably doesn't save you any money or context on your LLMs.
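That lowest layer is the familiar read-through cache pattern. Here is a minimal sketch, assuming a Valkey/Redis-compatible endpoint and a hypothetical `fetch_from_database` function standing in for the real query.

```python
# Read-through cache in front of the tool's data store (the lowest layer in the stack).
# Uses the redis-py client, which also speaks to Valkey; fetch_from_database()
# is a hypothetical stand-in for the real database call.
import json
import redis

cache = redis.Redis(host="my-cache.example.internal", port=6379)  # placeholder endpoint

def get_vehicle_by_vin(vin, fetch_from_database, ttl_seconds=300):
    key = f"vehicle:{vin}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit: skip the database entirely
    record = fetch_from_database(vin)      # cache miss: go to the source
    cache.set(key, json.dumps(record), ex=ttl_seconds)
    return record
```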

Thumbnail 750

Next step up, we can put a cache on in the memory context manager just after it's called the tool. This is on the tool results. So this might reduce your LLM usage. This is the first one that does this. You can reuse an existing action plan maybe and the sequence of tools that can be called, so you don't have to ask the LLM to do this every single time. This is pretty optimistic if it helps you, but it's useful to think about.

Thumbnail 780

Next we can cache on the other side of the memory context manager, on the output side if you like. So this is within the scope of a user's session probably. So if your session contains a lot of cacheable information, messages that don't change very often, like in our example, we're going around and telling them the vehicle information over and over again, that kind of thing, that's the kind of thing we'll be using here. This won't affect your LLM usage appreciably, but it will make things go a lot quicker.

Thumbnail 810

And finally, you can implement a semantic cache which is as close to the user as you can get before this whole loop iterates at all. So if this works for you, this is really, really powerful because you can avoid calling everything. Now this works for things like where you don't have personal information inside your agent responses. Maybe your agent was being asked a question like, just tell me the legal requirements for insurance in Nevada, right? There's nothing specific to me about that, so I could cache that response. That's the kind of thing we would use there. So this saves latency, saves cost, makes your customers happier because their interactions are more smooth.
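A semantic cache can be sketched in a few lines: embed the incoming question, compare it against questions already answered, and return the stored answer when similarity clears a threshold. The `embed` function below is a placeholder for whatever embedding model you use, and the threshold is an arbitrary example value.

```python
# Conceptual semantic cache: answer repeated, non-personal questions without
# running the agent loop at all. embed() is a placeholder for an embedding model.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class SemanticCache:
    def __init__(self, embed, threshold=0.92):
        self.embed = embed
        self.threshold = threshold
        self.entries = []                      # list of (embedding, answer)

    def lookup(self, question):
        query_vector = self.embed(question)
        for vector, answer in self.entries:
            if cosine(query_vector, vector) >= self.threshold:
                return answer                  # close enough: reuse the cached answer
        return None                            # miss: fall through to the agent loop

    def store(self, question, answer):
        self.entries.append((self.embed(question), answer))
```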

Thumbnail 840

Drawing Parallels: Agentic AI and Relational Database Architecture

Right, so now you're all kind of experts, or maybe you're all just bedazzled. So one thing, I'm a crusty old Unix Greybeard, right? And so I would like to share with you a bit of an analogy to see if this helps understand what's going on here a little bit. So I'm going to draw an analogy between Agentic AI and relational databases. If you know relational databases, great, maybe this works. If you don't, don't worry, it's not going to be on the take home assignment.

Thumbnail 870

Thumbnail 880

So first we'll talk about the interface. What's the interface into a relational database? Well it's SQL and it's JDBC. It's all this very standardized stuff. And Agentic AI is getting that as well with MCP so they're kind of similar. Then there's this idea of memory we've been talking about along today. You've got short-term memory and long-term memory for Agentic AI. And inside a relational database, of course you've got memory. You've got B-trees, you've got heaps, you've got indexes, you've got all that kind of thing.

Then there's the execution angle to this, so we just talked about this ReAct loop where we plan and then we execute the steps of the plan.

Well, in a SQL engine, you have a SQL query planner that's planning how to use indexes and heap sorts and all those other kinds of things, quite similar. And we've got caching we just talked about, that's very one-to-one. Databases work heavily on caching. And finally, we have this thing we call canned execution.

Thumbnail 920

So with MCP, you can put whatever tool you like behind an MCP server to execute whatever operation you want. And inside a SQL engine, you have stored procedures, you have PL/pgSQL, and lots of other pre-canned execution. If that makes you draw a complete blank, I'm sorry. If this helps, maybe it's good for you. Alright, so Model Context Protocol, we'll dig into this a little bit more.

Thumbnail 940

Thumbnail 960

Model Context Protocol (MCP): Standardizing Tool Discovery and Invocation

So this standardizes the interaction between your agents and the tools that they're calling in two ways. It lets them find the tools that they're going to call, work out what they're for, and it lets them call them. They're two very important separate steps. So here's our car insurance agent. Inside that agent, there's an MCP client, standard Python. It's talking to an MCP server. The MCP server advertises, in this case, three tools: get owner vehicles and so on. The tools have parameters. They have some documentation that goes with them to tell the humans and to tell the tool what they do.

Thumbnail 980

Thumbnail 990

That tool probably calls off to a database, so it's just doing normal stuff. It's like running a regular database driver, talking to a regular relational database, maybe. It's nothing magic once you get underneath the MCP server. So we'll zoom in, next level down. That agent embeds that MCP client. What is the MCP server actually doing?

Thumbnail 1000

So inside the agent, we have a tool catalog. It's now got two tools inside it. How did they get there? Well, the MCP server advertises these two tools. There's an MCP operation called tools/list. MCP works on JSON-RPC, so the first thing that the MCP client does is say list me the tools. And you'll get back get vehicle by VIN, get vehicle owners, and get owner vehicles. We know how to call them and add them into our catalog. And the agent then understands how it's useful to call that tool, so you don't have to tell the agent call the tool which is named X with parameters Y. It's already worked all this out.
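For reference, that tools/list exchange on the wire is plain JSON-RPC 2.0. Written out as Python dictionaries (the tool name mirrors the talk's example; the schema contents are illustrative), it looks roughly like this:

```python
# What the tools/list exchange looks like on the wire (JSON-RPC 2.0),
# written out as Python dictionaries. The inputSchema contents are illustrative.
list_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_vehicle_by_vin",
                "description": "Look up a vehicle record by its VIN.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"vin": {"type": "string"}},
                    "required": ["vin"],
                },
            }
            # ...get_vehicle_owners and get_owner_vehicles are advertised the same way
        ]
    },
}
```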

Thumbnail 1060

That's the job of the tool catalog, which got discovered by using MCP. So that hard coding that I didn't like in the example gets replaced by this. We don't hard code tools anymore. You can if you like, but you don't have to. Now we can have many MCP servers plugged into one of our agents, one of our MCP clients. So we can continue to build our catalog by querying multiple MCP servers along the way, and this is how we build, for example, our internal data stores, and then we talk to the DMV ones and the other ones that we're trying to build up over time. So we're building this catalog. This is all internal to how MCP works.

Thumbnail 1100

So let's implement an MCP client and server, those purple boxes that we had before, a little bit more Python. On the left, standard MCP SDK, nothing to do with me. All we're doing is we're creating a client session. On the right-hand side, MCP server, this is the thing that's maybe off talking to your database and so on. Important part on the client side, we initialize a session and we call that list tools function. That's it. On the right-hand side, there's actually no Python code there at all. That's interesting. It's just a couple of decorators, the little pink things, and all they say is this function is the function I want you to call when you want to list the tools, for example, or call the tools.
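Here is a minimal sketch of both sides using the Python MCP SDK, in the spirit of the slides (the actual slide code isn't reproduced here, and the database lookup is stubbed out). The server side really is mostly decorators:

```python
# Server side: mostly decorators on functions you probably already have.
# Uses the Python MCP SDK's FastMCP helper; the database lookup is stubbed out.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("vehicle-data")

@mcp.tool()
def get_vehicle_by_vin(vin: str) -> dict:
    """Return the vehicle record for a given VIN."""
    # In reality this would go through your normal database driver.
    return {"vin": vin, "make": "Buick", "model": "Regal", "year": 2021}

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```

And the client side initializes a session and calls list_tools, which is the tools/list operation from a moment ago:

```python
# Client side: initialize a session and ask the server which tools it offers.
# Assumes the server above was saved as vehicle_server.py.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="python", args=["vehicle_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # the tools/list call
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```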

Thumbnail 1140

Choosing the Right Tools: A Taxonomy from General-Purpose to Specialized Operations

So the takeaway for you here, I hope, is that it's a very, very low barrier to entry here. Making an MCP server is not too difficult. Just put a couple of annotations on the code that you probably already have. Alright. So as we go on, we're going to talk about how to choose the right tools for the right job. So I've kind of made this taxonomy here. We've got the X axis, which is read-only data retrieval versus data mutation. And the Y axis is whether it's a general-purpose tool, whether it can do pretty much anything, or a specialized tool like we've been talking about so far.

Thumbnail 1160

Thumbnail 1190

So on the top left there, we have general-purpose retrieval tools, read-only tools. So this is maybe your business analyst inside the insurance company, which needs the full power of a SQL query, a read-only SQL query, and it wants to call this through an agentic AI workflow. So this might be the MCP server tools that we ship inside AWS for a bunch of our databases like Aurora DSQL. And on the right-hand side of that, you've got the data mutation version of that. This is very powerful. This is basically general-purpose SQL execution or something like that. They can create, they can delete, and they can read general-purpose data in your data stores. You're probably not giving that to your customers. You're probably giving that to your internal developers.

Thumbnail 1220

Thumbnail 1230

Maybe you're accessing that through Kiro or something like that, or Claude Code. Now down at the bottom, the specialized read-only calls, that's like what we've been making in our examples so far, the get vehicle by VIN and that kind of thing. So it's read-only, but it's specialized. I can't go and learn about tractors. I can only learn about cars or something, right? And then the data mutating versions of that, the specialized ones. So in our example, this is maybe when we get to the end and we want to save the progress on the quote, or we want to calculate some premium or something like that.

Thumbnail 1270

So this is your highest business value is packed into probably that bottom row there, the specialized ones. They're the tools that you're building for your customers to call. They're for frequently executed tasks, so it's worthwhile you investing the effort in there, and they'll give the higher performance as well. So now that we have this taxonomy, we can see how that maps to the personas of the users that you're maybe trying to have inside your business or as your customers. So the top left, you have the data analyst or the engineer. They're doing those read-only queries against the general-purpose data.

If I took the words MCP out of your mind right now, this picture probably looks pretty familiar anyway, right? This is what business intelligence tools and so on are doing. MCP is just giving you a different lens to think about it. Top right, that's your developers and your DevOps people, and down the bottom it's your end users and consumers, depending on if they're doing reads and writes. So hopefully that sets the framework here so that we can think about this some more. I'm going to hand over to Siva now, and he's going to tell us about the data behind the tools.

Data Sources and Business Challenges: From Operational Databases to Data Silos

Thank you, Tim. So good afternoon, everyone. So where does the data behind the tool come from? And you're probably wondering, I already have a lot of challenges on my data side. You know, I have data quality issues. Maybe I need to figure out how to get data lineage. And then, I have an existing system that's working, and maybe I have some APIs: how do I convert that into an MCP tool, hand it off to the agent, and what happens to security end to end? This is what customers are asking us.

Thumbnail 1360

Thumbnail 1380

The next section I'm going to go through how to address these challenges. Also, as an architect, I always think about whether there's a reference architecture that I can give customers. So I kept the best for last; we're going to end the presentation with the reference architecture and a call to action, obviously. So let's dive in. So on a technical level, where does the data come from? You know, the data comes from your operational databases. This could be MySQL or Postgres, Oracle, or SQL Server that you're running, or it could be NoSQL databases such as DynamoDB, DocumentDB, MongoDB, and other things.

Thumbnail 1430

It could also come from your data warehouses, such as Redshift or Snowflake or others, right? Or you probably have data in an open table format like Iceberg, and you probably have this data in Parquet format on top of S3, right? SageMaker Lakehouse, right, or other lakehouses. Or maybe you have streaming data coming into your organization in a streaming data source such as Kinesis or Kafka, right? Now let's look at this challenge from a business level, right?

This is AnyCompany.com Insurance, right? This is an example of the company that Tim alluded to, that Terry is going to buy insurance from. When you look at this, in any company, there are multiple departments, right? For example, the claims history comes from the claims department. They are data producers of this dataset. And then maybe the legal department has information about rules and regulations, and then potentially this company sources data from the DMV in terms of accident records and other things, right? If they have pay-as-you-go insurance, maybe that data has to be sourced from some telematics datasets, right?

Thumbnail 1480

So essentially you have data producers and data consumers in the setup. And then, well, any time you have multiple departments, producers and consumers, you're dealing with data silos. And you know, data quality issues come in. You know, if for example the sales and marketing department have information about the customer, you know, all the fields have to kind of be merged, and then the customer 360 should reflect that, right? And then the data lineage issues, right?

How do we track data lineage? There's personally identifiable information such as Social Security numbers. If there is a collision and there's a claim, then there's some medical information involved. You cannot pass this on to every user. And then access control: who has access to this, what are their entitlements? All of these challenges get magnified in the world of agents because agents are applications. They're going much faster than humans. They're accessing and they know how to navigate various paths. You want to make sure you're sending them along the right paths, that there's proper governance along the way and proper security checks along the way and filters along the way so they don't hand off the incorrect data set to the end user or an internal user.

Thumbnail 1560

Building a Data Marketplace Architecture: From Data Products to MCP Tools

So how do we address that as an architect? You're probably familiar with the data marketplace architecture. People call this data mesh and other things. I'm going to use the term data marketplace architecture. What this architecture says is let's imagine our company as data producers and data consumers. For example, the claims department is a data producer of claims. And then the underwriting department, which consumes this data set and evaluates the customer risk, is actually a data consumer. Typically, data producers are responsible for data quality because they produce the data. It's probably much easier for them to actually put some data quality rules in place and ensure that the data they produce actually conforms to the standards that they expect the data to be at.

And then data consumers like the underwriting department typically discover the data set and then they subscribe to the data set. So we can use these primitives to help us address some of these issues as well. Let's see. There are probably three steps that you want to do. First, build a data products marketplace. Now when I say this to customers, they often ask us, well, that's going to take a couple of years, maybe more to do that. So here's where I think we need to be a little more pragmatic. You can boil the ocean and build all these data products for your company, and you should probably keep on doing that. But as you identify some important use cases for Agentic AI, you should probably prioritize those use cases and build data products relevant to them, identify the data producers and consumers, and get started there to be fairly pragmatic.

Thumbnail 1630

Thumbnail 1670

And then now how do we feed these data products to agents? Let's build some data APIs on top of them. If you have a claims history, given the customer information, it could actually get all the claims information pertaining to that customer. Why should we do that? The fact that you created a data product, maybe that's a table in your lakehouse. The agent is able to actually select everything from the table, but actually you want to be a little more careful. You want to create a specialized path so if you're an external customer coming through your agent and asking for that, you want to actually restrict that use to just the data that belongs to them. Building an API on top of that makes a lot of sense.

The other thing is nobody can do this for you. You own your data. You know what this data set is. You know what your business procedures and practices are, and it belongs to us as data producers and consumers to actually build these APIs. And then again, one of the other challenges with agents in our industry right now is that everything is changing by the day or by the week. How do I, as an architect, think about building things in a completely changing world? I build primitives. Those can be extremely helpful not just today, but tomorrow and the day after. And then obviously I can leverage agentic AI to build these APIs. That's making my life a lot easier, but you do have to verify them. You do have to build them yourself.

Thumbnail 1760

After we do that, then the next thing is to hand this off. How do we hand this off to the agent? We can expose these APIs as MCP tools. To run through an example, here's where things like Amazon Bedrock AgentCore Gateway are going to come in fairly handy. As long as you have an API, you can actually define that API using OpenAPI Standard. You can register that with the AgentCore. It'll actually expose that API as an MCP server and the tool on your behalf, so you don't necessarily have to do that. Or if you want to actually create an API, you could simply write a Lambda function in Python or your favorite language and then expose that to AgentCore Gateway, and therefore magically these tools are available for your agents. Now when you do that, what happens?
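As a concrete example of the Lambda path just described, here is a sketch of a specialized data API ("get the claims for one customer") written as a plain Lambda handler that AgentCore Gateway could front as an MCP tool. The event shape, table name, and environment variable are illustrative assumptions, not the Gateway's documented contract.

```python
# A plain Lambda handler implementing a specialized data API
# ("get the claims for one customer"). AgentCore Gateway can front a function
# like this as an MCP tool. The event shape, table name, and environment
# variable are illustrative assumptions, not the Gateway's documented contract.
import json
import os

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
claims_table = dynamodb.Table(os.environ.get("CLAIMS_TABLE", "claims"))

def lambda_handler(event, context):
    # Assume the tool is invoked with a customer_id argument.
    customer_id = event["customer_id"]

    # Specialized path: only this customer's claims, never the whole table.
    response = claims_table.query(
        KeyConditionExpression=Key("customer_id").eq(customer_id)
    )
    return {"statusCode": 200, "body": json.dumps(response["Items"], default=str)}
```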

Thumbnail 1810

Your agent has the brain to pick the right tool for the job. When the user says they want a specific functionality, the agent is able to reason out that request. It looks in the tool repository that it has and identifies the right tool for the job, or you can guide this by prompt templates as well.

Thumbnail 1830

Thumbnail 1840

Transforming Traditional Applications: External Customers, Internal Users, and RAG Integration

Okay, so we maybe addressed some of the data silos issues. Let's take these principles and apply this to how you move a traditional application to an agentic AI application, right? Before we get there, I want to address who the audience is, who the user is of this agentic experience that we're creating. Again, I'm going to go back into the AnyCompany Insurance example. It's easier to explain there.

Thumbnail 1860

Thumbnail 1870

So Tim talked about Terry wanting to buy car insurance or maybe filing a claim. That's one class of users, right? People outside our company who want to get something done by talking to us. That's the external customer. Now when you look within the company, there may be data producers and data consumers. In this case I'm going to use Nikki as an example. Nikki is a claims investigator, right? Nikki wants to actually look at all the claims that happened in a specific state.

Thumbnail 1900

Nikki lives in and is responsible for the state of Nevada, right? And then when she investigates the claim, she wants to get all the claims beyond a certain value and take a look at this just to see if there's any fraud, right? That's an internal user. And then again you could have data engineers. In this case Jane is a data engineer and wants to build a new data product. I talked about telematics, right? Maybe she wants to subscribe to a telematics data set from a telematics provider and then host the data set as a data product that the underwriting department and others can consume, right?

Thumbnail 1930

So in general you can generalize this to there are potentially three access paths for our agentic applications. Now with that as the backdrop, maybe AnyCompany doesn't have an agentic experience. Today what Terry does is maybe go through an insurance website, fill out the appropriate form, and say I want to buy insurance or I want to file a claim, right? Now, how do we transform that into an agentic experience, right?

Thumbnail 1950

You potentially already have the API. When Terry fills out the form and submits it, chances are your web server is already calling an API, and that's implemented in an application server. By simply converting that into an MCP tool, all of a sudden you can hook that tool up to your agentic application. And great, you know, voila, you have the agentic experience that you can present and start building on, right?

So in general what we see play out today, it's not a complete revolution, right? Moving an application from a non-agentic experience to an agentic experience is an evolution for most customers, right? And this thinking of the existing functionality as an API, exposing that as an MCP tool and handing it off to the agent is a powerful mechanism to actually plug that in.

Now, we talked about RAG. Two years ago RAG was the answer no matter what the question was, right? RAG hasn't gone anywhere, right? The beauty of RAG is that if I say I live in the state of Nevada and I want insurance, it's able to say maybe there's some rules and regulations associated with that. If you have a vector database with all these documents that explain the rules of buying insurance in the state of Nevada, maybe that's already chunked and then stored in a vector data store.

And what this does is, given a string like I want to buy car insurance, it retrieves the related information and then hands it off to the agent, to the reasoning loop, right? And then now you can simply expose that as an API too. Input is a string and output is a set of documents that comes in. So you could potentially expose them as semantic APIs as well, right?
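Such a semantic API can be sketched as a single function: string in, related documents out. The `embed` and `vector_store` objects below are placeholders for whichever embedding model and vector store (pgvector, OpenSearch, S3 Vectors, and so on) you use, and the result attributes are assumed.

```python
# A "semantic API": input is a string, output is a set of related documents.
# embed() and vector_store.search() are placeholders for your embedding model
# and vector store; the hit attributes (text, source, score) are assumed.

def retrieve_related_documents(query: str, embed, vector_store, top_k: int = 5):
    query_vector = embed(query)                        # e.g. "I want to buy car insurance in Nevada"
    hits = vector_store.search(query_vector, k=top_k)  # nearest-neighbor lookup over chunked docs
    # Hand the retrieved chunks to the agent's reasoning loop as extra context.
    return [{"text": h.text, "source": h.source, "score": h.score} for h in hits]
```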

Thumbnail 2060

Internal User Experience: Data Consumers and Data Producers with MCP Servers

Now let's look at the internal user experience. How does this play out? So in the case of internal users, again, I'm going to take Nikki, who's a data consumer, right? And then in this case, Jane is the data producer. Typically today, Nikki goes to a BI tool of some sort and says, hey, give me all the list of the claims that are greater than a certain value. Maybe sort this and then let's pick the top five or ten and then she starts investigating that, right? And that's the current experience.

Now, to bring this data set in, Jane probably wants to figure out how to subscribe to this telematics provider.

Thumbnail 2120

Since it's IoT data, land this data in Kinesis, for example, or Kafka, and build some kind of streaming workflow with data pipelines to land this in an open table format on S3, right? That's the current path. Now, AWS is very busy building a lot of MCP servers for all the services that we have, right? When I looked at it, there are various MCP servers at various levels of maturity. As Tim pointed out, you can build these things yourself if you want to enhance them as well. And then imagine all your query engines and data processing engines front-ended by MCP servers and tools, right? As soon as you register the tool with your agent, now all of a sudden Nikki, rather than specifying or writing the SQL query, could simply ask the agent, "Hey, give me a list of high value claims in the state of Nevada," right? As simple as that. The underlying tool knows how to convert that, and we'll go through that complete experience in a little bit.

Similarly, Jane could come up with a long-term memory, a prompt template in the data engineering team. Maybe there's a data engineering agent that they have trained, right, with templates that say, "Hey, here's what you do when you actually have IoT data." Maybe you want to use, you know, maybe the organizational standard is Kafka. Maybe use MSK, which is a managed service for Kafka. They're familiar with Spark. Maybe they want to use Spark Streaming. If all of this is templated again, Jane could simply say, "Hey, by the way, can you please create me an MSK, a Kafka topic, and get the data from the third-party provider and then stage this in S3," right? It knows what the tools are. It's able to actually go through the reasoning loop, understand, get the confirmation from Jane, and deploy this whole thing, right? That magic happens by simply exposing your existing functionality as tools to your agentic applications.

Thumbnail 2230

Thumbnail 2240

Thumbnail 2250

Thumbnail 2290

Governance in Action: Nikki's Claims Investigation with Fine-Grained Access Control

And with that, now we're getting into: okay, this looks pretty good, but how does governance really work under the covers? That's pretty important, right? Let's actually play this out. Nikki, as a claims investigator, has a role, and she uses that claims investigator role to say, "Show me all the high value claims in the state of Nevada." What happens? The moment Nikki connects to the agent, the agent inherits that characteristic and assumes the role of an investigator. And then it is smart enough to say, "By the way, Nikki is asking something about collision claims, et cetera. I need to look up in the business catalog what this means," assuming you have the tool registered. What it does is it says, "Let me find the data product," and then it actually calls the SageMaker business catalog and says, "Hey, do you have any data products pertaining to claims?" There's metadata in this catalog that says what those data products are, claims and policies, et cetera.

Thumbnail 2310

Thumbnail 2350

Then as soon as the first call goes out, it gets a list of claims and policies. Now, since the agent here inherits the investigator role, it has restrictions. It cannot look at all the data products. It only has access to what the claims investigation department has access to. It basically only lists the data products from the catalog that are relevant to the user and this use case, and that Nikki has access to, right? Again, a similar thing happens. The next thing it does is it takes the data product and sees it's backed by a couple of tables, the claims table and the policies table, right? Then it looks up the technical catalog and says, "Let me get the structure of these tables," right? And then once it gets that, again, the context keeps on building, as Tim showed you. We're in that loop, building the context for this agentic application.

Thumbnail 2370

And the next thing it does is, well, I have the table and I need to retrieve data from this. It says, "I have a list of tools; let me call one." AWS has created the data processing MCP server that has Athena as an engine, so it says, "Hey, let me actually use Athena, which is like query as a service, so I don't have to provision a cluster, et cetera. I can simply pass on the query. It'll execute that and get me the results," right? It fires off a session into Athena and simply packages the select statement it has built from the table metadata. Remember, Nikki doesn't say anything about the query. The agent builds it for her and then passes it on to Athena.

Now, before Athena retrieves this data, it talks to Lake Formation and applies filters, including column-level filters and row-level filters. Let's say the dataset in the claims has some personally identifiable information, and it has data for all the states. Lake Formation actually looks at the query and says, "Well, you're asking for all this data, every dataset. Now, I'm not going to allow you to do that." Lake Formation says, "I'm going to put a column filter. I'm going to rewrite your query and put a WHERE clause because you're only responsible for Nevada. I'm only going to show you records from Nevada, and then I'm actually going to mask and remove the Social Security number column because you shouldn't have access to that." That happens automatically. Isn't that pretty amazing, right?
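To show the effect conceptually, here is what that row- and column-level filtering amounts to, written as before/after query strings. Lake Formation enforces this itself and does not expose its internal rewrite, so the table and column names here are purely illustrative, following the talk's example.

```python
# Conceptual illustration of row- and column-level filtering applied to
# Nikki's query. Lake Formation performs this enforcement itself; the
# "rewritten" query below only shows the effect, not the service's internals.

original_query = """
SELECT *
FROM claims
WHERE claim_amount > 50000
"""

filtered_query = """
SELECT claim_id, customer_id, state, claim_amount   -- ssn column removed (column filter)
FROM claims
WHERE claim_amount > 50000
  AND state = 'NV'                                   -- row filter: Nikki only covers Nevada
"""
```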

Thumbnail 2460

Thumbnail 2470

Thumbnail 2480

Now, remember, the user didn't ask anything about it. Lake Formation is simply looking at the permissions, and this thing magically happens under the covers. So then the regular thing happens after that. Athena executes the query in the context of an investigator and then ensures that only the portion of the S3 key space that she has access to, or the agent has access to, is used to retrieve this data. Then it sends back to the user, and then the agent creates a report and sends it back to Nikki. This is how governance plays out for an internal consumer.

Thumbnail 2490

Now let's actually look at some of the critical pieces: the agentic data consumer priorities. If you're catering an experience to a data consumer, you should be worried about data quality right in the loop. If there is a data quality score associated with that, the agent on behalf of Nikki might also put a constraint saying, "Expose this dataset only if the data quality rule on a column is greater than 95%." I didn't show that in the agentic loop, but that could be part of your prompt template as well. If you want data quality, you can bake that into your agentic calls, right?

Data discoverability is going to be pretty important, right? Your agents have to look up your business catalog, obtain potentially data products, and then associate them with the column metadata in your technical catalog and retrieve the results. I'm going to talk in a little more detail about how this trusted identity propagation works. Let's hold off for a second. But you also have to be aware that when Nikki logs in, I simply said, "Hey, by the way, this agent is going to actually take on that role, right?" How does that really happen? I'll actually dig into that towards the end. Let's hold on to that.

Thumbnail 2610

Fine-grained access control means users only have access to the dataset that they need to look at. We demonstrated this with Athena looking up Lake Formation and the query being rewritten, right? And then the other piece is performance, which is super important, right? For example, if this report takes forever to come back, Nikki has to go get a coffee or something like that, right? Maybe what we can do is actually use a materialized view. We launched new materialized views last week, I think, so Nikki can potentially create a materialized view that says, "Hey, create a materialized view for me that only has records from the state of Nevada," so Athena doesn't have to actually scan all of that. That should be baked in as a consumer priority. So far, so good.

Thumbnail 2640

Data Producer Experience: Jane's Journey Creating a Telematics Data Product

Okay, let's keep going. Now let's look at the data producer experience, like what is the experience for Jane. Jane, as I mentioned, is a data engineer. In this example, Jane's task is going to be creating a data product, and then Jane is going to actually get this dataset from a telematics provider and host this in that environment, right? Now, again here, as Jane starts talking to the agent, the first thing the agent does is inherit the user's role, which is a data engineer. Then it has all the various tools required, like Amazon MSK. MSK has an MCP server and tools to provision a cluster that's already exposed to this agent, right?

And then it simply says, "Okay, I'm going to create an MSK cluster," and then creates the cluster and actually fires it up. Then it realizes it needs to also create a topic. This is a placeholder for all the streaming data that's coming in, right? Interestingly, the MCP server there for MSK does not have data plane access yet. So it's going to realize, "Hey, I don't see that tool there. Maybe I should go through the front end and actually create that." The agents are smart enough to come up with code templates, guided by some prompts, or to automatically realize, "Hey, I don't know how to log on and create a topic," and work around that. And then that happens.

Thumbnail 2710

Thumbnail 2720

The thing unfolds. The next step is to provision a Glue job; since the template says to use Spark, maybe it uses Spark Streaming to actually retrieve the data set. The next thing it does is create a table where it needs to put that data set, right? Remember the data quality that we talked about? This is the place where the producer has to think about data quality.

Now as part of doing that, Jane also writes data quality definition language (DQDL), which is something that Amazon invented and open sourced as well. A lot of our tools automatically use this. What you describe in the data quality definition language is what you want to enforce. For example, if you have GPS coordinates or speeds, you can simply say GPS coordinates have to be within this range or speed has to be within this range. All that is baked in. So as part of creating this table in the Glue catalog, those quality rules can also be specified.
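Here is a sketch of what registering those rules can look like with Glue Data Quality via boto3. The rule expressions follow DQDL syntax, and the database, table, and ruleset names are illustrative, taken from the running example.

```python
# Registering data quality rules (DQDL) for the telematics table via Glue.
# The rule expressions follow DQDL syntax; database, table, and ruleset names
# are illustrative examples.
import boto3

glue = boto3.client("glue")

ruleset = """
Rules = [
    ColumnValues "speed_kmh" between 0 and 300,
    ColumnValues "gps_latitude" between -90 and 90,
    ColumnValues "gps_longitude" between -180 and 180,
    Completeness "vin" > 0.95
]
"""

glue.create_data_quality_ruleset(
    Name="telematics-quality-rules",
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "telematics_db", "TableName": "vehicle_telemetry"},
)
```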

Thumbnail 2790

When you specify that, Glue actually runs a batch job and actually evaluates this rule, and at the column level says, here's the data quality for your specific column. It's at 95% or 90% based on how many records are out of sync with the rules specified. That's automatically built in. It is Jane's responsibility to go ahead and do that. And then obviously the table gets created, and remember in this case Jane simply said like hey don't create all this stuff. I'm in my beta environment. Give me the infrastructure as code. I want to validate this code and then deploy this using my classic CI/CD pipelines.

Thumbnail 2820

It's probably a good practice, so you don't have this running around in your production cluster: use a beta cluster if you want, or at least generate infrastructure as code that you inspect before you deploy, because otherwise you would be affecting a running system. And again, as the agentic loop spins, it also realizes it needs to publish. Jane is trying to publish this entire data set in the business catalog, so it also publishes an entry in the business catalog for consumers to consume this, and again, that takes a data engineer who has the ability to publish to the business catalog. That is a role that you already created.

Thumbnail 2840

Thumbnail 2870

And finally, it also says, let me actually create Lake Formation fine-grained access control rules, so that only these consumers can access this data set; that's baked into Lake Formation. So all of this magic happens behind the covers. This is the data producer experience in the world of agents. So let's look at the priorities of the data producer in the world of agents.

We talked about creating infrastructure as code and verifying it. The accuracy is still your job. The agent is going to come up with some code. You want to ensure that you have test scripts and others to ensure that this is actually doing what you intended to do. That is the responsibility of the producer, and that responsibility doesn't go away. And data quality rules, we saw how data quality definition rules are associated with this data set. That's a data producer's responsibility.

And also putting the information in the data catalog. What do these columns mean, what are the various bounds, and how should they be used? All of that should be part of your business catalog. And then auditing. Sometimes you need to figure out why something happened, so make sure that auditing is enabled; all of these accesses through Lake Formation are already logged and show up in CloudTrail. That's a responsibility of the data producer. So far so good.

Thumbnail 2950

Thumbnail 2960

Trusted Identity Propagation: Understanding JWT Tokens and Cross-System Authentication

I skipped one detail as I was telling you this. I said, by the way, the agent inherits the role of the user coming in. How does that really happen? You're going to need a couple of primitives: AWS IAM as well as OpenID Connect and OAuth 2.0. Rather than delving into the details of these, I'm going to actually show you how this really works under the covers. Let's say you have a user. The user first logs in through the agent portal.

What happens at that point is the user gets authenticated with the federated identity provider. For enterprise identity it could be Okta. Basically the user says, hey, here's my user ID. Here's my login. Here is my MFA token from my device. You plug that in. What that does is send back this thing called a JWT token. I didn't quite understand what this was at first, so I started to dig into it in detail. Here's where, as data practitioners, we have to understand the primitives and what these things mean.

The easiest way to imagine this is your driver's license.

We're talking about insurance, and a driver's license is easy to use. Your driver's license has your name, your picture, your signature, and age details. That's your ID. When you give that ID to someone else, they look at it, they look at your picture, and they say, yeah, I think that's the person. It also has entitlements, which is what class of car you can drive. I live in the state of Washington, and the default is a Class C driver's license. It doesn't say that, but that's the default. It also has a restriction for me called Restriction B, which in the US means I have to wear glasses or contact lenses to drive my car. That's the entitlements.

Thumbnail 3050

Thumbnail 3060

Thumbnail 3070

Thumbnail 3080

Thumbnail 3090

So that's what the JWT token is: it's the access token and the permissions token. You send the token to the agent, and that token has to be passed along. There are a couple of use cases here. One is where the identity provider doesn't change in your organization. That is Nikki's use case. Nikki runs a query, and then that gets passed on to the tool. At each stage, it checks with the identity provider if need be. Then it faithfully passes the identity token and the access token along to the APIs. Beyond the APIs, it's the same path. Those credentials may be exchanged for temporary AWS credentials through single sign-on, and then it talks to the database. Or it could be using a user ID and password, and then you get the details from your secure store.
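To demystify what's inside that token, here is a small sketch that decodes a JWT's payload (without verifying the signature, which a real implementation must leave to a proper OIDC/JWT library) and then forwards the token unchanged on a downstream API call. The claim names are typical OIDC examples, not tied to any specific identity provider.

```python
# Peek inside a JWT (the driver's-license analogy: who you are plus what
# you're allowed to do), then pass it along unchanged to a downstream API call.
# Decoding here skips signature verification -- that belongs to a proper
# OIDC/JWT library in real code. Claim names are typical OIDC examples.
import base64
import json
import urllib.request

def decode_jwt_claims(token: str) -> dict:
    payload_b64 = token.split(".")[1]                # header.payload.signature
    payload_b64 += "=" * (-len(payload_b64) % 4)     # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def call_downstream_api(url: str, token: str) -> bytes:
    # Trusted identity propagation at its simplest: forward the bearer token.
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# claims = decode_jwt_claims(token)
# claims.get("sub"), claims.get("email"), claims.get("groups")  # identity + entitlements
```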

Thumbnail 3100

Then there's another path sometimes. Remember in the case of Jane, Jane is actually getting a dataset from a telematics provider. The telematics provider might use a different identity provider. In that case, what happens is that you need to exchange your credentials. It's like you're showing your driver's license, and if you're going to go to another country, they don't take a US driver's license. You've got to go to AAA and say, can you give me an international driver's license? What they do is they look at your ID, and then they give you new credentials. They print this paper and say this is your international driver's license. Then if you go to another country, you show them and they let you rent the car. That's really what happens.

Thumbnail 3150

Thumbnail 3170

Bedrock AgentCore Identity allows you to actually do that. On your behalf, it acts as a token vending machine. It takes that and gives you other tokens that you can use in your other systems. So that's what happens, as you can see here. Then you send those tokens faithfully to your tools, and then that gets propagated to the API call that you're making. At that point, those credentials could get transformed into IAM credentials that can be used to access your data store, or in this case, whatever credentials are required to access this third-party telematics provider.

Thumbnail 3190

Reference Architecture and Call to Action: Connecting Data and AI with Amazon Bedrock AgentCore

This is super important. As data practitioners and AI practitioners, we really have to delve into how do we tell our agent to act on our behalf. It's super important, otherwise it can go do unpredictable things. Now, putting it all together, this is the final set of slides. Are you ready? Five minutes, right? All right, this is the most exciting part for me. I'm going to try to put all this in a single reference architecture that you can probably take a look at. Here we go, let's go.

Thumbnail 3200

This is the external user. This is Terry coming in and asking a question: I want to buy car insurance. So you're building an agent app. Maybe you're using Strands, maybe you're using LangChain, maybe you're using other frameworks. Now we think that Amazon Bedrock AgentCore is the best place to run this because it gives you a runtime which is pretty secure and containerized, so that's owned by that execution environment. Tim talked about memory, short-term memory and long-term memory. Those constructs are there. We talked about the gateway. If you have APIs, you can quickly expose them as tools by using the gateway. We talked about identity, this thing where you show your ID to AAA and get that other ID. All of that magic can actually be simplified by AgentCore Identity. That's why we feel, and I haven't discussed a few other things there, that AgentCore is the best place to run this right now.

Thumbnail 3260

Thumbnail 3280

Thumbnail 3300

This doesn't have to be only for your external customers. Your internal teams, as we move forward, are going to have agents that they potentially build. Or this may be IDEs, like Kiro or other agentic IDEs, that you can prompt and use for internal deployments as well, as long as you can hook up all your tools to this. This is the way this looks. Remember the Amazon Bedrock LLMs: Amazon Bedrock has a choice of LLMs all the way from Meta to Anthropic. Every year we add a few more LLMs, because you want to use the right LLM for your reasoning tasks. So we give you a choice of LLMs. And then remember Bedrock Knowledge Bases.

Amazon Bedrock can take your string and look up the knowledge base on your behalf to retrieve additional pieces of information. We have a host of functionality there. If you have a Postgres engine and you're doing SQL, if you're familiar with that and want to use the engine, you can use the vector capability in Postgres through pgvector. OpenSearch is fantastic for combining vector searches with classic searches. We have a host of tools there.

Thumbnail 3350

Thumbnail 3360

If you're doing GraphRAG, you can use Neptune as your vector store where you can do both the graph traversals as well as vector searches together. We also have S3 vectors that we added, which Milan talked about yesterday. You can use that and then expose them as MCP tools, right, like we talked about this API where you give the string and it fetches a related piece of information. If you have an MCP server, you can build an MCP server for that and expose that as a tool as well.

Thumbnail 3370

Now this is the front end. Tim talked about packaging this as APIs. Remember the special purpose path that Tim talked about? You're creating canned queries or APIs. Now if you expose those things as MCP tools, both your customer-facing agents and your internal agents can actually talk to those back-end engines. We have a lot of capability starting from Aurora to DynamoDB to ElastiCache to Redshift to Athena, and the list keeps on growing based on what tool you want and what engine runs it the best. You can expose your API or general purpose query via this tool.

Thumbnail 3400

Now what happens to your back end? There's the data processing engine. This is where Jane goes and provisions a Kafka topic and runs a Glue job. That whole thing can be orchestrated by tools. With that, this is the picture I've been very intensely drawing, and hopefully this makes sense. This is, I feel, a reference architecture, at least for me until I change it again, and we'll keep on tweaking this. Any feedback on this is welcome.

Thumbnail 3450

On the left side is the agent, on the right side is the data. MCP is this big USB-C thing that connects the data and AI together, and this is how the magic happens, folks. With that, I want to leave you with a call to action. To me, the special purpose paths are something that you have to build. I would start building data APIs, as Tim pointed out. These are very easy to build and give you a secure path, but you can also build data products and give a general purpose path that's a little more risky. You can use both, but I would urge you to build APIs and expose them as MCP tools via the MCP protocol.

Thumbnail 3500

As we talked about, data governance, end-to-end security are a top priority. You should be thinking about them day one, not after the fact when your agent vends out the data that it should not vend out. Finally, we are working very hard to simplify this experience for you so you can focus on your business and leave the other muck to us. So with that, I want to end the session. Thank you very much. I hope this was meaningful. Your feedback is very welcome.


; This article is entirely auto-generated using Amazon Bedrock.
