Kazuya
AWS re:Invent 2025 - A leader's guide to agentic AI (SNR201)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - A leader's guide to agentic AI (SNR201)

In this video, an AWS leader explores the shift from automation to high-agency agentic systems, explaining how AI agents differ from scripted automation through their goal-driven, resourceful, and adaptive behaviors. The speaker presents four leadership mental model shifts: governance like a board of directors, risk management like a trading floor with circuit breakers, organizational structure like an immune system, and culture like a research lab. Technical capabilities are examined through intelligence (model selection in Amazon Bedrock), context (knowledge graphs, vector databases, memory types, and Model Context Protocol), and trust (Amazon Bedrock Guardrails with automated reasoning checks reducing hallucination by 99%). Real-world examples include Druva's autonomous data security agents, Syngenta's Cropwise AI increasing crop yield by 5%, and Thomson Reuters reducing technical debt by 70% using AWS Transform. The session introduces AgentCore primitives for building production-ready multi-agent systems and recommends starting with software development, customer support, and knowledge work use cases.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Welcome to re:Invent: Leading Organizations Beyond Automation to High Agency

Good morning. Well, that's great. I love this crowd. Welcome to re:Invent in Las Vegas. I'm excited to be here with you today. This is my 13th re:Invent. I attended the first six, like many of you in this room, as a customer of AWS, and this is my seventh one as a builder here at AWS.

In my role at AWS, I wear a dual hat. I get to lead, build, and solve business problems using technology and AI. I also get to work with C-suite executives and leaders of some of our largest customers, like yourselves. How many of you in the room are attending re:Invent for the first time? Well, wow, many. Welcome. Any veterans here? Five or more years? Excellent. I hope you have a fantastic week because it's an amazing time to be a senior leader in an organization. What a privilege we all have to become agents of change and lead our organizations beyond just automation to high agency.

Thumbnail 90

In this session over the next 40 minutes or so, I'm going to talk about what leaders need to do to lead in this new era of high agency. What are the mental model shifts that we require as a leadership team? How do we re-envision our business processes? Then I'll go slightly deeper into some of the technical and architectural capabilities that help bring this vision to life.

Thumbnail 100

Thumbnail 110

Understanding the Autonomy Chain: From Scripted Automation to Agentic Systems

Now, I'm sure many of you have seen a lot of things being labeled as agents, but when you peel back that wrapper, underneath you find many of these things. You have just good old scripted automation that takes repetitive, predictable tasks and codes them in a fixed set of workflows. As you move up that autonomy chain, you have generative AI assistants and chatbots that provide useful information. They have access to data, can respond to queries, can summarize documents and information, but have very limited ability to act.

As you move further up the autonomy chain, we start to see higher-order business tasks and goals that these agents can accomplish. Goal- and task-based agents are optimized for completing a specific task by working with humans. This is where we start to see much higher value. At the top sits the highest level of autonomy: agentic systems, where multiple agents work together, take on much higher-order, complex, and sometimes ambiguous tasks, and are able to break those down into steps and pursue them.

Thumbnail 190

This is a good mental model to think about when your teams come and talk to you about automation or AI agents. Now, this is not to say that one is better than the other. As senior leaders, it is important that we understand what tool to apply to solve which problem. The question that I find really useful when I'm talking to my team is: what are the behaviors or attributes that make something an agentic system?

Thumbnail 210

Let's talk about what some of those behaviors are. Agentic systems are goal-driven. They take a higher-order goal or an intent and have the ability to break that into multiple chunks of tasks to complete. They are resourceful. They have the ability to access the tools, the data, the context, the hierarchy, and the roles inside your organization to take those actions. They remember. They have memory, so you don't start from scratch when you have a new transaction. They remember that there was an issue with billing the last time you tried to do something.

Thumbnail 260

They learn and adapt. They get better with feedback, and they use escalation not just as a mechanism when something is wrong, but as an inbuilt trust mechanism to involve humans in the loop. It's good to keep this framing in mind when you think about agentic systems. Now you might ask: we've been on this journey for a couple of years now, so why now? What has changed? There are two fundamental shifts that are happening very rapidly that we as senior leaders need to pay attention to.

Thumbnail 280

The first thing that is happening is that the unit of task, or the length of task that AI can do, is doubling roughly every seven months or so. You can now take much more complex tasks and give agents the ability to accomplish those tasks. Now, as senior leaders, we know that just having a technology or tool that is very intelligent and highly capable but is extremely expensive and unaffordable is not very useful, is it? So what is also happening on the other end is that the unit price or the cost to access this intelligence is continuously dropping as well.
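The compounding effect of that doubling can be made concrete with a small sketch. The baseline task length and the exact doubling period below are illustrative assumptions, not figures from the talk:

```python
# Illustrative only: compound growth of AI task length under a
# "doubling roughly every seven months" assumption.
# The 30-minute baseline is a hypothetical starting point.

def task_length_after(months: float, doubling_period: float = 7.0,
                      baseline_minutes: float = 30.0) -> float:
    """Task length an AI could handle after `months`, given a doubling period."""
    return baseline_minutes * 2 ** (months / doubling_period)

# After 21 months (three doubling periods), a 30-minute baseline becomes 4 hours.
print(task_length_after(21))  # 240.0 minutes
```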

Thumbnail 320

MMLU (Massive Multitask Language Understanding) is a standard benchmark used to evaluate language models. Just to give you context, an MMLU score of 43 is similar to a struggling high schooler: mostly guessing answers, but very confident. I have one at home, so I know how that feels. I am sure many of you do too. An MMLU score of 83, on the other hand, is like a PhD-level expert: they understand nuance and can deal with ambiguity. So we can see that capability is rising while the unit price to access it keeps dropping. We are in a sweet spot where capabilities are increasing and the unit price and cost are falling. This is why we are starting to see real-world applications of agents.

Thumbnail 370

Leadership Mental Model Shifts: Governance, Risk Management, Organization, and Culture

Druva is a data security company, and I am sure many of you in this room use them. When you are dealing with a cybersecurity incident or a ransomware attack, you do not have time to go through thousands of pieces of logs and information manually to sift through when you are in the middle of a crisis. Druva worked with AWS to build Druva AI. These are agents that do not just respond, but they take action. They look proactively at the logs and autonomously detect system issues, and then they take corrective action. They move the data around, they update the policies, and they move the data through storage that is much more cost effective based on the data retention policies that you set, driving significantly higher value.

Thumbnail 420

Thumbnail 450

To take advantage of this requires a change in how we lead. If you think about historically, leadership has been mainly about optimizing for determinism. We are all rewarded for lowering and reducing deviation, having repeatable, predictable processes that do the same thing over and over again by reducing the error rate. This works today. But non-determinism, the ability to improvise, the ability to adapt, and the ability to take different actions based on context and changing requirements with agents is a feature, not a bug. I am sure this might make some of us nervous, especially from regulated industries. This is not about stepping into the wild west. This is about us as leaders being aware that non-determinism is what gives agents their power, and how do we safely harness this by setting up the right guardrails.

Thumbnail 490

Thumbnail 510

Thumbnail 520

Thumbnail 530

Thumbnail 540

Thumbnail 550

Thumbnail 560

Thumbnail 570

The mental model that I find useful is to think about how we work with some of our high-agency teammates. You do not call your high-agency teammate every Monday and say, "Hey, I am going to sit down with you to give you specific tasks to execute every single day, and then you will come back at the end of the day and report what you did." We actually give them a strategic, higher-order intent, we give them the boundaries within which they can operate, we empower them with the right resources and access they need to get the job done, and then we trust them to escalate to us and involve us as needed. Leading in this era of high agency requires a shift in the leadership model, because we are no longer managing automation; we are managing thousands of agents at scale. Managing that intelligent autonomy at scale requires a shift in four key areas of leadership. Please keep in mind that the mental models I am going to share are just that: mental models, not recipes or strict operating models. So take them with that lens. The four areas that I am going to cover start with governance.

Thumbnail 600

Traditionally, whether in software or in compliance, we have gate-based governance. A system moves, comes to a complete stop, waits for someone to verify a series of things via a checklist that has been defined and maybe audited a couple of times a year. They approve it, then the system moves again and stops again at the next toll gate. This works today, but when you have hundreds and thousands of agents, you cannot put these toll booths in front of them consistently. That will just slow it down. Instead, the mental model that is useful is to think about providing that strategic intent and then having a policy engine.

Thumbnail 620

Having a policy engine that can govern this at scale requires thinking about what actually governs effectively. The mental model that came to mind is how boards of directors work with CEOs. They don't go to the CEO and say they're going to tell them exactly how to run their company every single day. Boards and CEOs typically align on a strategic direction and vision, and periodically they go and audit and calibrate. But there are also specific guardrails around what requires board approval, and the management team knows when to engage the board. This is a useful mental model for thinking about governance.
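The policy-engine idea can be sketched as code. This is a minimal illustration, not any particular product: the policy names, thresholds, and the three-way verdict (allow, deny, escalate) are all hypothetical examples of how in-line governance could replace toll gates:

```python
# A minimal policy-engine sketch: instead of stopping agents at toll gates,
# every proposed action is evaluated in-line against declarative policies.
# Policy names and thresholds here are hypothetical.

from dataclasses import dataclass

@dataclass
class Action:
    kind: str
    amount: float

# Each policy returns "allow", "deny", or "escalate" (bring in a human).
def spend_policy(action: Action) -> str:
    if action.kind != "payment":
        return "allow"
    if action.amount > 1_000_000:
        return "escalate"   # like board approval: humans decide
    return "allow"

def topic_policy(action: Action) -> str:
    return "deny" if action.kind == "financial_advice" else "allow"

POLICIES = [spend_policy, topic_policy]

def evaluate(action: Action) -> str:
    """Most restrictive verdict wins: deny > escalate > allow."""
    verdicts = {p(action) for p in POLICIES}
    for v in ("deny", "escalate", "allow"):
        if v in verdicts:
            return v

print(evaluate(Action("payment", 2_500_000)))  # escalate
```

The point of the sketch is that governance becomes data (a list of policies) evaluated continuously, rather than a checkpoint in a workflow.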

Thumbnail 660

The second shift is in risk management. Today, our risk management controls are much like a factory floor. On a factory floor, you have fixed thresholds. Even today in many organizations, we have fixed thresholds regardless of the exposure of the risk. For example, if a purchase order is more than two million dollars, it requires approval from a vice president. If it is beyond ten million dollars, it must go to the CFO. These controls for risk break down when you're dealing with autonomy at scale.

Thumbnail 690

Thumbnail 730

The mental model that I find useful is to think about managing risk like a trading floor. On a trading floor, risk is actually managed much more aggressively, but it is managed through real-time visibility and control. Traders have a portfolio that they can use to place trades. Not every move is managed, but if the portfolio or a particular transaction violates the policy, then circuit breakers stop the trading. For example, if the portfolio drops by twenty percent, the circuit breaker kicks in and stops the trading. We need to think about agents this way, where agents have to earn their liquidity by operating within the risk controls and the boundaries that we design, with real-time visibility and automation.
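The circuit-breaker mechanic described above can be sketched in a few lines. The 20% drawdown threshold is taken from the talk's example; everything else is an illustrative simplification:

```python
# A circuit-breaker sketch for agent risk management: actions continue
# freely until drawdown from the peak crosses a threshold, then the
# breaker trips and all further agent activity is halted.

class CircuitBreaker:
    def __init__(self, max_drawdown: float = 0.20):
        self.max_drawdown = max_drawdown
        self.peak = None
        self.tripped = False

    def record(self, portfolio_value: float) -> bool:
        """Record a new portfolio value; return True if activity may continue."""
        if self.peak is None or portfolio_value > self.peak:
            self.peak = portfolio_value
        drawdown = (self.peak - portfolio_value) / self.peak
        if drawdown >= self.max_drawdown:
            self.tripped = True   # real-time control: stop, don't micromanage
        return not self.tripped

breaker = CircuitBreaker()
print(breaker.record(100.0))  # True  : within bounds
print(breaker.record(95.0))   # True  : 5% drawdown, still fine
print(breaker.record(79.0))   # False : 21% drawdown trips the breaker
```

Notice that individual moves are never approved one by one; only the aggregate boundary is enforced, which is the shift from factory-floor to trading-floor risk management.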

Thumbnail 760

Thumbnail 780

It's not about managing every single agentic transaction, but also monitoring systematic drift in a multi-agent system. When that happens, you want the circuit breakers to kick in. One of the most fundamental changes is going to be in organizational structure. As leaders, we all know this. Every single time there has been a major technological change, it has required us to reimagine and rethink how we organize. Before cloud, we had application and development teams, and then infrastructure and technology operations teams. When cloud came, these boundaries got blurred. You started to get full-stack engineering teams that can operate at all layers of the stack.

Thumbnail 810

Thumbnail 830

Thumbnail 840

In fact, with agents, we are seeing this inside our own teams where boundaries beyond just engineering, including business process roles, product management, engineering, and technical program managers, are all getting blurred because you have the ability to execute and work with agents throughout the entire value chain. Historically, our organizational structures are vertically optimized. This structure provides stability, but it is not designed to move fast. If a customer has an issue that gets handed off sequentially between departments—from customer support to supply chain to distribution to return and refund—it slows things down. But the mental model to use for how you organize for agents is like an immune system.

How does our body's immune system work? If there is a threat or a problem, our white blood cells don't wait for a memo from the brain. They don't call a meeting with the lungs. They swarm the problem and figure out how to solve it. Cross-functional teams that are organized around a business workflow, regardless of the current departmental structure, are how you really get outcomes from agents. The other thing that happens in this kind of structure is that the system learns constantly. It's sort of like creating antibodies: when you have this responsive swarm, it gets better over time.

Thumbnail 890

Thumbnail 900

Before I go further, what we really need to think about is that agents should not be limited by your organizational chart. They should be limited by the objective that they're pursuing. That brings us to the last part of this leadership mental model, and that is around culture. Historically, we have all optimized culture for precision.

Thumbnail 920

It's almost like you have to hit a bullseye every single time. That's what we reward—perfect execution. But what happens in a research lab culture is that the culture is open to new discovery, sometimes good, sometimes bad. If those mistakes happen, they are used as a learning mechanism. They're documented, they are shared widely, and the new discovery and adaptation is once again a feature, not a bug in this kind of culture.

Thumbnail 950

Thumbnail 960

And so that's the shift that we need, because what we are trying to scale is not obedience. We are actually trying to scale intelligence throughout our organization. The four key leadership mental model shifts I talked about are: governance like a board of directors, providing strategic direction, continuous calibration, and guardrails; risk management like a trading floor, with real-time visibility and circuit breakers, where agents earn their liquidity and authority to operate by consistently showing that they are operating within those risk controls; an organizational structure like an immune system, much more fluid and responsive; and finally, a culture like a research lab, open to new discoveries. If you want to read more, you can scan that QR code and dive deeper into some of these mental models.

Thumbnail 1030

Thumbnail 1050

Reimagining Business Processes: From Fixed Workflows to Goal-Driven Orchestration

Now I want to talk about the business process, and I'm going to pick a business process that I'm pretty sure most of us in this room are familiar with: accounts payable. Traditionally, the goal of an accounts payable business process has been to pay invoices on time and accurately. I want all of us to time travel to pre-ERP days. In the pre-ERP days, what happened was that an accounts payable system had a fixed goal to pay every invoice on time and accurately, but the workflows were vertical. Procurement would process the PO, receiving would confirm the receipt of the goods, treasury would make the payment, and AP would post the invoice and so on. The SOPs were all vertically inside of each department and function and then there was sequential handoff that happened.

Thumbnail 1080

Thumbnail 1110

In this kind of system, if there was any error or issue, it would manually get kicked back into the previous process. Somebody would have to call the department to fix that issue. How many of you in the room have been part of an ERP implementation or run an ERP? Several of you. Now, what happened with ERP is that we could no longer look at these vertical processes. We had to re-engineer the AP process horizontally. We had to connect the dots between procurement, receiving, treasury, AP, and so on. That forced us to talk across departments and functions to re-lay out our business processes. But there was still a limitation, because the goal was still fixed: to pay invoices on time and accurately. If there was an issue at any time during the process, it would get kicked back, and you would have to handle the exception manually.

Thumbnail 1130

Thumbnail 1140

Thumbnail 1150

Thumbnail 1170

Now, what if, in the kind of agentic system I described earlier, we set a goal of much higher magnitude? What if the goal of AP is not just to pay invoices on time but to optimize our cash flow? The system has the freedom to operate fluidly within the boundaries that we set. A particular invoice will go through all of the steps, but then a second invoice comes in, and the agents will know: this invoice is in euros, and I have a 60-day payment term on it. That means I am going to call a forex trading agent and ask, can I invest this for 59 days and actually improve my cash flow? Or a third invoice comes in, and the system recognizes that the last four times it approved this particular vendor payment, the payment went into dispute because we did not like the quality of the vendor's deliverable. So this time it proactively brings a human into the loop and says: even though the approval process has confirmed this payment, the last four times we did not like what happened here. Do you still want me to pay?

Thumbnail 1200

And so that's a much higher order task, but also much more fluid workflows that are happening here. Today, as I was talking about, we are already seeing goal-based agents being in play. This is where you have a planning agent or a planner that takes this higher order task.

Thumbnail 1210

It breaks it into manageable chunks. Then you have an orchestrator that works across individual agents inside each of the functions like procurement, receiving, accounts payable, treasury, and goods to get to the payment and reduce the cycle time. In fact, we are seeing this in Amazon as well. I'll give you an example of an orchestrator within Amazon.

Thumbnail 1240

In our warehouses, we have over a million robots that help us safely and quickly move packages around. We have an orchestrator called Deep Fleet, which you can think of as the brain of this robotic fleet that manages these million robots distributed all over the world across our warehouses. By constantly optimizing that, we are able to reduce travel time by ten percent. Now, ten percent may not sound like a lot, but at our scale with a million robots and multiple hundreds of warehouses and distribution centers, that means we are able to get packages faster to you. That is an example of an orchestrator operating at scale and optimizing the business outcome.
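The planner/orchestrator pattern described above can be sketched as a toy example. In a real system an LLM would decompose the goal; here the plan, agent names, and task names are all hard-coded, hypothetical stand-ins:

```python
# A toy planner/orchestrator sketch: a planner decomposes a high-level goal
# into tasks, and the orchestrator routes each task to the functional agent
# that owns it. All names below are illustrative.

def planner(goal: str) -> list[str]:
    # In a real system an LLM would decompose the goal; here it is hard-coded.
    if goal == "pay invoice":
        return ["match_po", "confirm_receipt", "schedule_payment", "post_invoice"]
    return []

AGENTS = {
    "match_po": "procurement",
    "confirm_receipt": "receiving",
    "schedule_payment": "treasury",
    "post_invoice": "accounts_payable",
}

def orchestrate(goal: str) -> list[tuple[str, str]]:
    """Return (task, agent) assignments in execution order."""
    return [(task, AGENTS[task]) for task in planner(goal)]

for task, agent in orchestrate("pay invoice"):
    print(f"{agent}: {task}")
```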

Thumbnail 1300

Intelligence: Choosing the Right Models for Speed, Accuracy, and Cost

Earlier I talked about the behaviors of an agentic system. Let's dive deeper. We talked about the behaviors, the leadership mental model, and the business process changes. Let's examine what are the technical capabilities that bring these behaviors to life. There are three things that agents need. If you want to dive deeper into the architectural implications, you can scan the QR code and read an article that explores the architectural implications in your environment. But really, agents need three things: they need intelligence, they need context, and they need trust.

Thumbnail 1330

Thumbnail 1350

Thumbnail 1360

If we think of an agent as a human body, here is how that looks. Intelligence is like the brain of the agentic system. Your models, including your thinking models for chain of thought, reasoning, and reflection, provide the brain the ability to take the intent and break it into tasks. But just having a brain is like having a brain in a jar. It cannot take any action. So you then have context, which provides the agent the ability to access the right data. It provides the ability to actually take action. It provides the ability to access the tools to go and make things happen. That is like having hands to go and do the work.

Thumbnail 1380

But without trust, none of this matters. The final layer is trust. It is like having an ID badge. What is the agent's identity? On whose behalf are they operating? What are their authorizations to execute this task? What guardrails will they operate within to make sure they are acting safely? Let's dive deeper into each one and examine some of the implications in terms of the technical choices we have to make.

Thumbnail 1410

Thumbnail 1420

Thumbnail 1440

The first thing is that agents need intelligence. In an ideal world, we would want the most intelligent agent at the lowest possible price at the fastest possible speed. We want to optimize all three of them. But in real life, based on your use case and the problem you are trying to solve, you might not need all three in equal measure. In fact, there is a trade-off. If you are working with a legal agent, you might say, I cannot compromise on accuracy and intelligence here, but I am okay if it takes half a day or a few more hours or even a day to give me the response. So I am going to optimize for intelligence, but I am okay to trade off on speed of response.

Thumbnail 1460

Thumbnail 1490

On the other hand, if you are dealing with a customer service agent, you want fast, quick responses because your customers are not going to wait for ten minutes for the agent to come back with a hundred percent accurate answer. This is where you optimize for speed, but then you can trade off on intelligence or cost. This is why when we are thinking of enabling intelligence in the agentic system, the choice of models and having access to different types of models is very important because every model has a different strength. Some models are good with reasoning, others are very good with quick responses.

Some models do text-based tasks really well, while others excel at images and mathematical tasks. This is why our approach with Amazon Bedrock has been to make the broadest and widest set of models available, including open source models, Amazon's own models with Nova, as well as many third-party models. We see this application across Amazon businesses throughout our company.
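The accuracy/speed/cost trade-off can be expressed as a simple routing rule. The model names, scores, and prices below are entirely made up for illustration; this is not the Bedrock API:

```python
# A model-routing sketch: pick a model per request based on which dimension
# (accuracy, latency, cost) the use case optimizes for. The model names,
# scores, and prices are hypothetical.

MODELS = {
    "deep-reasoner":  {"accuracy": 0.95, "latency_s": 30.0, "cost": 15.0},
    "fast-responder": {"accuracy": 0.80, "latency_s": 0.5,  "cost": 0.5},
    "budget-model":   {"accuracy": 0.75, "latency_s": 2.0,  "cost": 0.1},
}

def route(optimize_for: str) -> str:
    """Choose the model that best serves the stated priority."""
    if optimize_for == "accuracy":      # e.g. a legal agent
        return max(MODELS, key=lambda m: MODELS[m]["accuracy"])
    if optimize_for == "speed":         # e.g. customer service
        return min(MODELS, key=lambda m: MODELS[m]["latency_s"])
    return min(MODELS, key=lambda m: MODELS[m]["cost"])

print(route("accuracy"))  # deep-reasoner
print(route("speed"))     # fast-responder
```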

Thumbnail 1530

Thumbnail 1560

Take the example of Amazon Pharmacy. Our goal is to reduce the time it takes to fill your prescription while doing so accurately. Sometimes prescriptions are handwritten notes, and this is where we use models and intelligence to reduce the time by 90% while reducing the error rate by 50%. On the other hand, on Amazon Ads, we want to provide the ability for brands who want to advertise to take an image of their product, touch it up, and make it better and more compelling when they list it. This is where we optimize for the ability to generate compelling images.

On the Prime Video side, are there any football fans in the house? If you're like me, you always join the game a little bit late and then wonder what you missed. We have the feature of Rapid Recap, where the model identifies the key things and key plays that happened and catches you up as you comment. Or Next Gen Stats, which provides much more probability and information to the viewer, bringing them closer to the action. All of these use cases require different behaviors and different models.

Thumbnail 1620

Take the example of our customer, Syngenta. Farming is one of the most complex multivariate problems. A farmer has to take into account so many different things: What is the weather condition? What seed should I plant? When do I plant it? What pesticide should I use? How much should I spray? Should I fertilize or not? When should I fertilize? These are just the factors that are in their control, but they are all dependent on so many other external factors like soil condition, moisture, weather patterns, and pest activity in that particular area.

This is why Syngenta worked with AWS to develop Cropwise AI, a series of agents that take information from multiple different data sources including soil condition, historical yield, seed quality, and 80,000 different growth stages of crops to provide specific action plans to farmers on what they need to do and when. This includes predictions of what might happen next week and what farmers should be doing right now, increasing yield by 5%. This is sustainable and also provides a direct business outcome, making it easier for farmers.

Thumbnail 1710

Thumbnail 1720

Context Engineering: Knowledge Graphs, Vector Databases, Memory, and Model Context Protocol

So we talked about intelligence, but just having intelligence is like having a brain without the ability to act. Let's talk about context. Context is sometimes misunderstood as being all about data, but as I'll show you, context is much more than your existing data. There is a tongue-in-cheek statement from a senior scientist at Amazon that everything out of an LLM is a hallucination: it is basically predicting the next token. It is our job to provide the right context to make sure that hallucination is relevant to what we are trying to solve for.

Thumbnail 1750

Thumbnail 1760

This is why there is growing importance of context engineering. When we think about just optimizing the prompt, you are providing a one-time input with text. Sometimes you give one shot, you give multiple examples, but it is still back and forth. With context in your organization, especially when you are targeting complex workflows, you need agents to understand the role, the hierarchy, the data, the systems, and the tools. This is why we are seeing the emergence of context engineering roles and this competency in many organizations.

Thumbnail 1800

To provide this context, let's talk about the technical capabilities you need. The first thing is that agents need to understand the relationships in your data. Take an example of such a relationship: Gartner publishes the Magic Quadrant, and the Magic Quadrant evaluates technology vendors.

Thumbnail 1810

Thumbnail 1840

That's the relationship between different objects and domains inside your data. If you're a retailer, a media company, or a bank, you have many objects—whether it's customers, transactions, trades, or viewership behavior—where you need to establish these relationships so that agents understand how to navigate your workflows. This is why knowledge graphs become important in your data strategy.
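A knowledge graph at its simplest is a set of (subject, predicate, object) triples that an agent can traverse. The helper functions and the retail-style entities below are illustrative, not any specific graph database API:

```python
# A tiny knowledge-graph sketch: relationships stored as triples that an
# agent can navigate. Entities and predicates are illustrative examples.

TRIPLES = [
    ("Gartner", "publishes", "Magic Quadrant"),
    ("Magic Quadrant", "evaluates", "technology vendors"),
    ("customer", "places", "order"),
    ("order", "contains", "product"),
]

def related(entity: str) -> list[tuple[str, str]]:
    """Everything the entity points to, with the relationship name."""
    return [(pred, obj) for subj, pred, obj in TRIPLES if subj == entity]

def path(start: str, end: str) -> bool:
    """Can an agent navigate from `start` to `end` along the graph?"""
    frontier, seen = [start], set()
    while frontier:
        node = frontier.pop()
        if node == end:
            return True
        if node in seen:
            continue
        seen.add(node)
        frontier.extend(obj for _, obj in related(node))
    return False

print(path("Gartner", "technology vendors"))  # True
```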

Thumbnail 1880

The second thing that agents need is to understand semantics, which is the adjacency. Especially around adaptation and understanding ambiguity, you're not always going to have an exact term that an agent will search for and find. This is why they need to understand what is closer to what other thing. In this example, if we envision this in three-dimensional space, cats and dogs are both pets, so they're closer to each other. Gartner Magic Quadrant and the leading cloud providers are sort of far away from cats and dogs. But if you actually go into a second dimension, cats and dogs are not in the same family, so they are farther apart, and the Gartner Magic Quadrant is a lot closer to the leading cloud provider.
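The idea of semantic closeness in embedding space can be shown with toy vectors. The 2-D coordinates below are invented purely to mirror the cats-and-dogs example; real embeddings have hundreds or thousands of dimensions:

```python
# A minimal embedding-similarity sketch: toy 2-D vectors stand in for
# real embeddings, and cosine similarity measures semantic closeness.

import math

EMBEDDINGS = {
    "cat":            (0.9, 0.1),
    "dog":            (0.8, 0.2),
    "cloud provider": (0.1, 0.9),
}

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(term: str) -> str:
    """The semantically closest other term, as a vector database would return."""
    return max((t for t in EMBEDDINGS if t != term),
               key=lambda t: cosine(EMBEDDINGS[term], EMBEDDINGS[t]))

print(nearest("cat"))  # dog
```

This is the mechanism that lets an agent find relevant material without an exact keyword match.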

Thumbnail 1910

Thumbnail 1920

This is why semantic understanding of your data is what AI and agents need, and this is where vector databases come into play, because vectors are the language that agents speak. You need to provide them relationships, and you need to provide them semantics. But all of this won't matter if you don't have memory. When it comes to memory, I want to focus on the four things on the bottom right.

Think about priming. Priming is sort of like a mindset. It gives a role to your agent: you are a financial services agent. That means staying in character and understanding the role you are playing. You then have procedural memory, which provides the agent the how: this is our email system, this is our ERP. So agents can remember who they are, what they are allowed to do, and where things are. Then you have semantic memory, which is sort of like world knowledge. This needs to cover not just the knowledge available in the world, but also organization-specific knowledge, such as: here is my Q3 sales strategy. And finally, you have episodic memory, so that agents remember what happened last time. For example, this particular invoice, as I shared in the AP example, had issues four times in previous runs, so the agent should remember to address that the next time it encounters the problem.
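The four memory types can be sketched as a simple container. The field contents are hypothetical; real systems back these with stores such as vector databases and session logs:

```python
# A sketch of the four memory types as a simple agent-memory container.
# All field contents below are invented examples.

from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    priming: str = "You are a financial services agent."  # role / mindset
    procedural: dict = field(default_factory=dict)        # the "how": systems, permissions
    semantic: dict = field(default_factory=dict)          # world + organization knowledge
    episodic: list = field(default_factory=list)          # what happened before

memory = AgentMemory()
memory.procedural["erp"] = "invoice posting endpoint"
memory.semantic["q3_strategy"] = "expand into EU market"
memory.episodic.append({"vendor": "Acme", "outcome": "disputed", "count": 4})

# Before paying an invoice, the agent consults episodic memory:
disputes = [e for e in memory.episodic if e["vendor"] == "Acme"]
escalate = bool(disputes) and disputes[-1]["count"] >= 3
print(escalate)  # True: bring a human into the loop
```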

Thumbnail 2000

Thumbnail 2030

Because if you don't provide memory to the agent, it's sort of like having a goldfish—it forgets everything, and then you don't get the outcome. That creates not only poor customer experience, it also destroys some of those behaviors I talked about earlier. The agent needs to get better, it needs to remember, it needs to constantly optimize and improve. So we talked about in context the relationships and knowledge graphs, semantics, vector databases, and four different types of memory that the agent needs. But a lot of us know, especially in large enterprises, that a lot of our content is not even in databases. It's actually in documents that look like this. They are designed typically for all of us as humans to understand. We know when we look at this memo that a large bolded text looks like a title, there are sections with numbering and bullets, which means I'm looking at a list, there's a paragraph and a break. We know how to read this, but agents don't.

Thumbnail 2060

Thumbnail 2100

This is where we need to focus on taking the content that is in a lot of these documents and converting that into machine-readable files that are friendlier to agents. Here's a wonderful example from Stripe, and you can read more about it in terms of what they have published. They are taking their SOPs, their documents, their workflows, their org charts, and making them into machine-readable, agent-friendly documentation. Because it's the combination of structured data, memory, and content that is going to give agents the ability to have this context. And then the final piece of this context is the ability to access the tools and actually take action.
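What "machine-readable" means here can be illustrated with a toy converter that turns a human-oriented memo into structured data. The sample memo and the parsing rules are invented for illustration; they are not Stripe's approach:

```python
# A sketch of converting human-oriented document structure (title,
# numbered steps, bullet rules) into agent-friendly structured data.
# The memo text and parsing conventions are hypothetical.

MEMO = """\
# Vendor Payment SOP
1. Match invoice to purchase order
2. Confirm goods receipt
- Escalate disputes over $10,000
"""

def parse(doc: str) -> dict:
    result = {"title": None, "steps": [], "rules": []}
    for line in doc.splitlines():
        if line.startswith("# "):
            result["title"] = line[2:]
        elif line[:1].isdigit():                 # numbered step
            result["steps"].append(line.split(". ", 1)[1])
        elif line.startswith("- "):              # bullet rule
            result["rules"].append(line[2:])
    return result

print(parse(MEMO)["title"])  # Vendor Payment SOP
```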

Before Model Context Protocol, those of you who have dealt with many APIs know that every time you don't have a standard protocol and you're trying to connect to a system, you're building a point-to-point integration, and that is very difficult to scale. What MCP has done is it's like a USB port for giving agents the ability to access many of your tools. It allows you to have a standard interface across your systems, your tools, and your data, so that agents have the context and the ability to access the tools and information they need. The combination of these pieces—the relationships, semantics, memory, better content, and MCP or any other protocol that allows access to tools—enables us to move from data just being an asset, which we have heard many times, to actually having knowledge as a capability. Because when we are leading in intelligent autonomy, we need the ability for humans and agents to share knowledge and exchange it with each other as well.
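The "USB port" analogy can be sketched as a single uniform interface for discovering and calling tools. The toy class below mimics the shape of MCP (list tools, call a tool) but is a conceptual illustration, not the official MCP SDK:

```python
class ToolServer:
    """One uniform interface instead of point-to-point integrations."""
    def __init__(self):
        self._tools = {}

    def register(self, name, description, fn):
        self._tools[name] = {"description": description, "fn": fn}

    def list_tools(self):
        # Agents discover capabilities the same way for every backing system.
        return [{"name": n, "description": t["description"]}
                for n, t in self._tools.items()]

    def call_tool(self, name, **kwargs):
        # One calling convention, whatever the tool wraps underneath.
        return self._tools[name]["fn"](**kwargs)

server = ToolServer()
server.register("lookup_invoice", "Fetch an invoice by id",
                lambda invoice_id: {"id": invoice_id, "status": "pending"})
print(server.list_tools())
print(server.call_tool("lookup_invoice", invoice_id="INV-204"))
```

Whether the tool wraps an ERP, an email system, or a database, the agent only ever sees `list_tools` and `call_tool`; that uniformity is what makes the integration scale.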

When you provide the context, the outcomes are remarkable. Rufus is Amazon's shopping assistant, and it provides personalized contextual advice. If you're buying something and wondering whether it needs AAA or AA batteries, or asking whether a product works with something you've already purchased, Rufus can help. This year alone, over 250 million shoppers have used Rufus throughout the year, and those who use Rufus are 60 percent more likely to complete a purchase. This happens because Rufus understands and maintains the context and provides useful information to the customer.

Trust Through Guardrails: Automated Reasoning Checks and AgentCore Primitives

The last piece is trust. An intelligent agent that has capability and context but cannot be trusted will never scale in an enterprise. One way to ensure trust is by setting guardrails. As part of Amazon Bedrock, we have Amazon Bedrock Guardrails, which lets you set specific guardrails depending on your industry and company policy. For example, before I joined AWS, I was CTO of a global media and entertainment company, which meant we could not use certain terms during prime time in our shows. Imagine having to apply that rule agent by agent throughout the enterprise; it would not scale. Amazon Bedrock Guardrails lets you specify denied topics, such as "never provide financial advice" or "do not mention competitors," and that applies to every agentic workflow and every model you use from Bedrock. You can be confident these guardrails are consistently applied.
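As a toy illustration of the define-once, apply-everywhere idea behind denied topics, here is a simple keyword-based filter. The real Amazon Bedrock Guardrails feature is configured declaratively and uses far more robust topic detection; the topics and phrases below are made up:

```python
# Denied topics defined once, checked on every interaction (made-up phrases).
DENIED_TOPICS = {
    "financial advice": ["should i invest", "stock tip", "buy this fund"],
    "competitors": ["competitor pricing", "switch to rival"],
}

def check_denied_topics(message: str) -> dict:
    lowered = message.lower()
    hits = [topic for topic, phrases in DENIED_TOPICS.items()
            if any(p in lowered for p in phrases)]
    return {"blocked": bool(hits), "topics": hits}

print(check_denied_topics("Should I invest in this stock tip?"))
print(check_denied_topics("What is your return policy?"))
```

The value is in the shape: the policy lives in one place and every agentic workflow passes through the same check, rather than each agent carrying its own copy of the rules.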

You can define sensitive information filters for personally identifiable information that you don't want exposed. Contextual grounding checks keep responses consistent with policy-level information you provide to all your agents. For example, you can specify in the guardrails that your return policy is 90 days, and every time there's an interaction that conflicts with it, the guardrail will ensure the response stays grounded in the policy you defined. One particular area I want to highlight under trust is automated reasoning checks. Automated reasoning checks search for a mathematical proof that a function, a program, or an agent did what it was supposed to do. You're doing this mathematically, and I'll give you an example of how it works.
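The return-policy example can be sketched as a grounding check that compares a drafted reply against the defined policy. Bedrock Guardrails does this with model-based scoring; this regex version, with made-up names, only illustrates where the control point sits:

```python
import re

POLICY = {"return_window_days": 90}  # defined once, at the guardrail level

def grounding_check(draft_reply: str) -> dict:
    # Flag any "N-day" claim that conflicts with the defined policy.
    claimed = [int(n) for n in re.findall(r"(\d+)[- ]day", draft_reply)]
    conflict = any(n != POLICY["return_window_days"] for n in claimed)
    return {"grounded": not conflict, "claimed_days": claimed}

print(grounding_check("You can return it within our 30-day window."))
print(grounding_check("Our return window is 90 days."))
```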

Let's say we want to verify that the Pythagorean theorem is correct. One way would be to draw every single variation of a triangle and manually measure everything to confirm it. But that is not practical, and it is not how we prove the Pythagorean theorem. We establish it with a mathematical proof that it is correct, without having to draw and measure every variation of a triangle. That is exactly how automated reasoning works. As part of Amazon Bedrock Guardrails, automated reasoning checks are available that reduce hallucination by over 99 percent. Even before agentic AI, we had been using automated reasoning for provable security across AWS for a number of years.
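Here is a toy version of the idea in Python: instead of testing a few sample inputs, we exhaustively check a policy over its entire (finite) state space, which amounts to a proof for that space. Real automated reasoning uses solvers to prove properties over infinite spaces without enumeration; the refund rule below is my own example:

```python
from itertools import product

def prove(prop, variables):
    """Return a counterexample state if one exists, else None (proved)."""
    for values in product([False, True], repeat=len(variables)):
        state = dict(zip(variables, values))
        if not prop(state):
            return state
    return None

def allow_refund(s):
    # The "implementation" under verification.
    return s["paid"] and not s["already_refunded"]

# Policy: never refund an unpaid order.
policy = lambda s: not (allow_refund(s) and not s["paid"])

print(prove(policy, ["paid", "already_refunded"]))  # None: holds in every state
```

A `None` result means the policy holds for every possible state, which is a categorically stronger guarantee than any number of spot checks, just as a proof of the Pythagorean theorem is stronger than measuring many triangles.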

S3 Block Public Access and VPC Network Access Analyzer let you ask questions like whether your database is reachable from the internet. But how do you actually prove that mathematically, so that you have provable security? We do it through automated reasoning checks, one of the tools that allows us to trust agents and scale them effectively in our companies.

I'll share an example from Amazon's tax and compliance organization. When you're dealing with tax rules and compliance all over the world, you have different policies in different documents, and it's very difficult to manage manually. Our tax and compliance team used agents to benchmark 600 different companies around the world, including their tax policies and rules. This is an example where trust is very important, and provable checks are what establish that trust.

Now, when you start to move agents into production and operate multiple agents, it is still hard. It requires a runtime that isolates each agent's execution. It needs memory, covering the different types I mentioned earlier. You need to give each agent an identity, so you know on whose behalf it is acting and what it is allowed to do. It needs access to tools, and gateways through which it can reach information. And it needs observability, so you can validate, audit, and feel confident that it is doing what you want it to do.
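A minimal sketch of why each of those pieces exists might look like the wrapper below: identity says who the agent acts as, an allow-list stands in for the gateway, and a trace provides observability. This is a hypothetical illustration of the concepts, not how AgentCore is implemented:

```python
import time
import uuid

class AgentRuntime:
    def __init__(self, agent_identity: str, allowed_tools: set):
        self.identity = agent_identity      # who this agent acts on behalf of
        self.allowed_tools = allowed_tools  # stand-in for the tool gateway
        self.trace = []                     # observability: auditable event log

    def call_tool(self, tool: str, payload: dict):
        event = {"id": str(uuid.uuid4()), "ts": time.time(),
                 "identity": self.identity, "tool": tool}
        if tool not in self.allowed_tools:
            event["outcome"] = "denied"
            self.trace.append(event)
            raise PermissionError(f"{self.identity} may not call {tool}")
        event["outcome"] = "ok"
        self.trace.append(event)
        return {"tool": tool, "payload": payload}

rt = AgentRuntime("ap-agent@example.com", {"erp.post_invoice"})
rt.call_tool("erp.post_invoice", {"invoice": "INV-204"})
print(rt.trace[0]["outcome"])
```

Every tool call, allowed or denied, leaves an auditable event tied to an identity; that is what makes "validate, audit, and feel confident" possible at scale.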

All of this requires a set of solid primitives. Go back to the basic building blocks: with just compute and storage, you could build many different applications, whether a website, an ERP, or a CRM. We believe inference is going to become another building block of the applications to come. This is why, with AgentCore and Amazon Bedrock, we provide these building blocks and primitives, so that everything I talked about, from identity to memory to tool access, works across any model and any protocol, not just models that are on Bedrock.

You can provide agent memory, manage the agent identity, provide a runtime that you know is secure and isolated, and provide the gateway that manages the connection with other tools. This is why we are excited about what we can build using these primitives.

Getting Started with Agents: Training, Support, and Building the Future Together

One of the questions we often get from leaders is: what are good places to get started? This is not automation, but it is also not full autonomy. These are still early days, and we make sure we don't treat agents as a magic wand. If you have a predictable workflow with fixed steps and very limited tool use, then basic automation or a generative AI assistant works well. Agents are really good when you want dynamic tool selection, when you want to take advantage of adaptability and learning, when you need pattern recognition and matching, and when latency and cost trade-offs depend on the business problem you're trying to solve.
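The criteria above can be condensed into a rough triage checklist. The function below is just an illustrative restatement of the talk's guidance, not an official decision framework:

```python
def recommend_approach(workflow: dict) -> str:
    # Predictable, fixed steps with limited tool use -> automation/assistant.
    if workflow.get("fixed_steps") and not workflow.get("dynamic_tools"):
        return "automation or generative AI assistant"
    # Dynamic tool selection, or a need to adapt and learn -> agent.
    if workflow.get("dynamic_tools") or workflow.get("needs_adaptation"):
        return "agent"
    return "evaluate latency and cost against the business problem"

print(recommend_approach({"fixed_steps": True, "dynamic_tools": False}))
print(recommend_approach({"fixed_steps": False, "dynamic_tools": True}))
```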

The three specific areas where we see the biggest value are software development, customer support and customer care, and knowledge work, especially processes with exception-handling loops. How many of you in the room do not have any technical debt? Well, that was a trick question. Pretty much all of us have technical debt, and that's a wonderful example of where agents can help us move fast. Paying off technical debt ties up engineering resources in work that must be done but doesn't always deliver better business outcomes or features. It is, in effect, a form of managing risk.

Here's an example from Thomson Reuters. They had a lot of .NET legacy code that they needed to modernize.

They used AWS Transform, which is our agentic AI-based modernization agent. They were able to reduce technical debt by 70% and move their modernization 4x faster. This meant that their product and engineering teams could actually solve business problems rather than just paying off technical debt. This is a wonderful example of how agents are already helping us drive better outcomes. I would encourage you to look at your businesses and find areas like that where you can start to apply agents.

Now, none of this matters if we don't prepare our organizations with skills and training. That is why, over the last 25 years, we have invested heavily in training and certification to build competency not only across companies but across countries around the world, so that we have the right talent and leaders to take advantage of this. A lot of this training is available for free, so I encourage not only all of you but also your teams to take advantage of it.

One of my favorite things is the AWS AI League. What I've found leading product and engineering teams for a long time is that gamification is always a great way to get people highly engaged. It's a low friction way of making it fun so people get to learn something. That's what AWS AI League does. You can actually host a competitive fun league inside of your own company. We also announced yesterday that there's going to be a championship league with a $50,000 prize where teams can compete.

Finally, you're going to need experts, and we are here to help. There is guidance from AWS experts, many from our executive-in-residence team and others throughout the company. We have invested over one hundred million dollars in the AWS Generative AI Innovation Center, where we bring machine learning and AI experts from across the company to work alongside you on your business problems. We also have AWS Marketplace, where you can quickly get access to prebuilt agents and tools that you can start deploying in your organization today.

It is really a fascinating and fantastic future that is ahead of you, and I'm excited to see what you all build together. I hope you have a fantastic re:Invent and rest of the week. Thank you so much.


; This article is entirely auto-generated using Amazon Bedrock.
