Kazuya

AWS re:Invent 2025 - Accelerate Game Design Reviews with Generative AI and LLM Agents (IND395)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Accelerate Game Design Reviews with Generative AI and LLM Agents (IND395)

In this video, AWS solutions architects Sam Patzer and Christina demonstrate building an AI-powered game design review system using Amazon Bedrock AgentCore and the Strands framework. They create five specialized agents (Lore, QA, Gameplay, Strategy, and a Game Analyst orchestrator) that analyze game design documents through different lenses, using the example of adding elves to Amazon's New World MMORPG. The session includes live coding demonstrations showing how to build agents with Claude Sonnet 4.5, integrate MCP servers for knowledge base retrieval, and implement Amazon Bedrock AgentCore Memory to reduce token usage and cut response times by 80+ seconds. They showcase Kiro IDE's autonomous development capabilities and demonstrate how memory-enabled agents provide faster, more cost-effective reviews while maintaining unbiased feedback based on standardized knowledge bases.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Introduction: The Role and Challenges of Game Analysts in Modern Game Development

Alright, hello and welcome everyone. Make sure you're all in the right room. This is IND 395: Accelerating Game Design Reviews with Generative AI and LLM Agents. I am Sam Patzer, a solutions architect at AWS, and I'm joined here with Christina. Hi everyone, I'm Christina, and I'm also a solutions architect at AWS.

I want to ask the room a few questions before we fully get started. Who here has ever built a game before? Awesome, we got a few folks. All right, who here has written a design doc before? Requirements doc, you know, whatever name you want to call it. Yes, yes, all right, I see a lot more folks. Well, today we're going to review how that maps to some of the software engineering practices that you use as well as game design.

Let's take a look at our agenda first. We're going to introduce to everyone what is a game analyst, what does that mean, and what does that mean to the industry, and also how it maps back to software engineering as well. Then we'll go over an overview of what we're building today. We'll go into what tools we're going to use in our toolbox and then actually spend some time doing some live coding for you all. After that, we'll have some time hopefully for some Q&A. If not, we'll also be outside to answer any questions as well.

Let's dive into what is the role of a game analyst. The modern game analyst serves as both an interpreter and a storyteller. What does that mean? Well, it means that they are able to take the business goals and the player goals and map that back to a story that they want to tell across the industry. They want to translate things like player data, business data, QA, design docs, and everything else like that into a boiled down doc. Their role is to be able to translate any doc into those different areas and be able to get feedback on it.

For those that have written requirements docs before, typically engineers are the ones writing the requirements docs, and then you have someone else come in and review it. A game analyst's job is to be the reviewer of those docs. For those that aren't familiar with the game analyst, we've broken it down into four competencies. The first one is focused on lore. Lore's entire idea is to validate that this update or this idea fits into the game itself. So think about, if we're adding a new race into New World, for example, which we'll be using later on as an example, we're going to be adding elves. Does that fit the lore of New World?

Next, we're going to be looking at QA, or quality assurance. Does this break anything in the game? Is it valid to be in the game? Is it a new overpowered item? Does it break other playtests? These are common questions that always have to be evaluated when you approach an update. Next is game design. Is this update fun? Do players want to engage with this update? We need to make sure that we design and build this game in such a way that players keep coming back. And finally, game strategy. Does this fit into our business needs? Is this something that we want to go forward with for business and strategy? Does it fit our player retention models? Does it bring new players in? These are all questions that the game analyst should be able to answer as well.
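
As a purely illustrative sketch (names are mine, not from the session's code), the four competencies described above can be captured as a small review rubric that each specialized agent's instructions could be built from:

```python
# Illustrative rubric mapping each game-analyst competency to the
# questions its specialized agent should answer about a design doc.
REVIEW_RUBRIC = {
    "lore": [
        "Does this update fit the established lore of the game?",
    ],
    "qa": [
        "Does this update break existing systems or playtests?",
        "Does it introduce an overpowered item or mechanic?",
    ],
    "gameplay": [
        "Is this update fun, and will players keep engaging with it?",
    ],
    "strategy": [
        "Does this fit our business needs and player-retention models?",
        "Does it bring new players in?",
    ],
}

def questions_for(competency: str) -> list[str]:
    """Return the review questions for one competency."""
    return REVIEW_RUBRIC[competency]
```

Keeping the rubric as data rather than prose makes it easy to hand each competency's questions to its own agent later.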

With that, this process comes with a number of challenges today. The first is slower iteration: during the review of an early proposal, there can be a lot of back and forth due to incomplete information. The designers don't have access to the same data sources as the analysts do. For example, the analysts might have docs around strategy that the game designers don't, so they might know more than the actual designers.

Next, to have a game analyst available to review docs, it takes a lot of time for them to learn how to review these docs. Everything from learning knowledge about what the game is, learning what the content of that game is going to be, and all the different silos that we talked about previously, as well as best practices and methodologies for reviewing those docs, takes a lot of time and energy to bring someone up to speed. And then finally, subjective feedback bias does exist. When someone comes to review a doc for New World, they might think, oh, this new addition is not good for the game because I'm a purist and I think that all new updates are bad, and that can negatively influence their decision making. Or if you've ever been in a design review and someone has an opinion about what language the code should be written in when objectively something else is better, that's a bias that they're bringing into that doc review. So that's what we're looking at when it comes to subjective feedback.

Building a Multi-Agent System: Specialized Agents and Orchestration Architecture

So how are we going to build it today? Well, we're going to first start off by building four different specialized agents. Each agent's purpose is to evaluate the doc we provide only through its own lens. So for example, the Lore agent is only viewing that doc from the lore perspective. What that means is that we're not taking our strategy into consideration, we're not taking our gameplay into consideration, we're just taking our lore into consideration, and we're going to do that for every single agent. Its whole purpose is to evaluate the project with that one idea.

With that, we want to connect it to different knowledge bases. These different knowledge bases contain a lot of detail around how that lore exists or how that gameplay exists, and it's going to be very well specialized in that area. For example, in lore, we have the entire New World wiki loaded into it, which contains all the different knowledge around New World inside of it.

After we have all these agents review those docs, we then use an orchestrator agent, the Game Analyst agent, which reviews the entire summary and decides how to present it back to the customer. While the elves might look like a great solution from the strategy perspective, lore-wise they go against the lore overall, so the orchestrator has to weigh the strategy feedback against the lore feedback. There are different ways to approach that, and it can then provide feedback on how to do it better.
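
The fan-out-then-summarize flow described above can be sketched in a few lines of plain Python. This is a shape-only illustration with stub reviewers; in the real system each reviewer would be an LLM agent backed by its own knowledge base:

```python
from typing import Callable

# Stub specialized reviewers standing in for the Lore/QA/Gameplay/Strategy
# agents; the real ones would call a model plus a knowledge base.
def lore_agent(doc: str) -> str:
    return "Lore: elves conflict with the game's established lore."

def strategy_agent(doc: str) -> str:
    return "Strategy: a new race could improve player retention."

def game_analyst(doc: str, reviewers: dict[str, Callable[[str], str]]) -> dict:
    """Orchestrator: fan the document out to each specialized agent,
    then combine the per-lens feedback into one summary."""
    reviews = {name: fn(doc) for name, fn in reviewers.items()}
    summary = " / ".join(reviews.values())
    return {"reviews": reviews, "summary": summary}

result = game_analyst(
    "Elven Expansion design doc",
    {"lore": lore_agent, "strategy": strategy_agent},
)
```

The key design point is that the orchestrator sees only the per-lens reviews, so conflicting verdicts (strategy likes it, lore does not) surface explicitly and can be weighed in the final summary.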

Some of the benefits behind this are that we get an accelerated innovation cycle, because we're continuously chatting with the agents and identifying how we can solve things like the lore problem or the strategy problem. We get collaborative intelligence by having these agents work together to identify different areas and different gaps. Finally, it's unbiased, right? You're literally reviewing the information right from the source. There is no bias saying I prefer this or I prefer that. No, it's just straight from the source, the knowledge base, and it's pulling that in.

The Toolbox: Strands Framework and Amazon Bedrock AgentCore Platform

So with all that, let's take a look at our toolbox to understand how we're going to build this. All right, so we have what we're going to build now. I want to talk about how we can take this idea and make it into a reality, and we can do this by using a few different tools. Who here has heard about Strands? Yeah, so Strands is our open source framework designed for building AI agents for agentic applications. It's an open source Python SDK that was designed for ease of use, so you don't have to spend a ton of time just trying to understand how to use the framework. It's supposed to be intuitive so that you can get started quickly for rapid prototyping.

It has seamless integration with tools such as MCP servers. How many of you have had the chance to use MCP servers before? So yeah, you can use MCP servers, and it's even built to integrate seamlessly with AWS services. It was also designed to be flexible, providing you with a choice of models and tools. The next tool we're going to be using is Amazon Bedrock AgentCore. Amazon Bedrock AgentCore is an agentic platform for building, deploying, and operating AI agents at scale, production ready.

AgentCore comes with a suite of services. The ones that we're going to be specifically covering today include Amazon Bedrock AgentCore Runtime, which provides you with a serverless hosting environment for deploying and running your agents on AWS without having to manage the complex infrastructure that comes with agentic applications. Additionally, we can enhance our agents and make them context aware using AgentCore Memory, which provides you with a fully managed memory infrastructure for your context-aware agents. And finally, for tracing and debugging, we have Amazon Bedrock AgentCore Observability, which provides you a unified dashboard where you can see things such as traces, so you can figure out the root causes of errors that usually take a really long time to find. Additionally, we can see performance metrics to see how well our agents are performing.

But you're not just limited to these tools in Amazon Bedrock AgentCore. There are many other services that you can bring into your workflow to further increase the capabilities of your agents. So just to list a few of them, we have identity where you can bring identity and access management into your agents and even integrate them with your own existing identity providers. We have a code interpreter for securely executing code in a sandbox environment as well as browser capabilities, and we have the AgentCore Gateway so that you can take your existing APIs and Lambda functions and make them into MCP compatible tools. Pretty cool, right?

Technical Architecture: Integrating Agents, Knowledge Bases, and Memory Systems

So now that we've talked about the tools, let's talk about how we're going to string these all together to build our agentic solution. To start off, we're going to have five different agents that we're going to be building.

We can break these into two different types of categories. We're going to have our specialized agents, which represent our core competencies that a game analyst would need. These include a Lore agent, a Quality Assurance agent, a Gameplay agent, and a Strategy agent. Then, as our main director, we have our Orchestrator agent, which is our Game Analyst agent.

The first thing to note is that these agents are all going to be written with the Strands framework, and the model we're going to be using today is the Anthropic Claude Sonnet 4.5 model. All of these agents will be deployed to their own Amazon Bedrock AgentCore runtime. Each of our agents will need to have access to various different data sources. We have our game design documents and business strategy documents, which will be stored in an Amazon S3 bucket. We also have player behavior data, which could be stored in an Amazon Redshift data warehouse.

To provide RAG capabilities, since at the basic level this is still a RAG application because we need to query data from our data sources, we're going to use Amazon Bedrock Knowledge Bases. This is a really cool tool because you can bring your enterprise data into your AI applications in a very easy way. It provides you with the ability to easily connect your AWS data sources, turn them into a vector data store, and it does all the indexing for you.
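
To make the retrieval step concrete, here is a toy stand-in for what a knowledge base does at query time: score chunks against the question and return the best matches. This uses naive word overlap purely for illustration; Bedrock Knowledge Bases uses real embeddings and a managed vector store:

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks with the most word overlap with the query."""
    q = tokens(query)
    return sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)[:top_k]

# Hypothetical wiki snippets standing in for the indexed New World wiki.
wiki = [
    "New World factions fight for control of Aeternum territories.",
    "Crafting professions include weaponsmithing and arcana.",
    "The lore of Aeternum centers on corrupted human settlers.",
]
hits = retrieve("What factions control Aeternum?", wiki)
```

The agent would then stuff the retrieved chunks into its context before answering, which is the "R" in RAG.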

To provide a standardized way for our agents to access these data sources, we're going to leverage an MCP server. This one is designed by AWS and is the AWS Labs Bedrock knowledge base retrieval MCP server. We're going to be using that today for interacting with these data sources.

Finally, to provide better performance with our agents and a more personalized experience for our users, we're going to make these agents memory-enabled by leveraging Amazon Bedrock AgentCore Memory. We can break our memories up into two different levels. We're going to have our project-based memories, which are semantic facts. Each of these agent runtimes will have their own set of facts that they think about as they go through their knowledge base. They're going to learn things about the data, and the idea is we can make the performance better because we can reduce how often the agent goes all the way back to the knowledge base to retrieve information. Over time, it will get faster and better at responding.
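
The performance benefit of project-level semantic memory can be sketched as a simple fact cache in front of the slow knowledge-base lookup. This is a conceptual illustration, not the AgentCore Memory API; the class and names are mine:

```python
# Sketch of project-level semantic memory: facts learned from knowledge-base
# lookups are remembered per project, so repeat questions skip the slow
# retrieval round-trip entirely.
class ProjectMemory:
    def __init__(self, kb_lookup):
        self._kb_lookup = kb_lookup   # slow path: query the knowledge base
        self._facts = {}              # fast path: remembered semantic facts
        self.kb_calls = 0             # how often we paid for retrieval

    def recall(self, question: str) -> str:
        if question not in self._facts:
            self.kb_calls += 1
            self._facts[question] = self._kb_lookup(question)
        return self._facts[question]

memory = ProjectMemory(lambda q: f"KB answer for: {q}")
memory.recall("Do elves fit New World lore?")
memory.recall("Do elves fit New World lore?")  # second call served from memory
```

Fewer knowledge-base round-trips means fewer retrieved tokens in the prompt and faster responses, which is exactly the effect the session demonstrates later with memory enabled.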

At the user-facing level, we have our Game Analyst agent. We have two different types of memory that we can add here. One is user preferences, which can make the experience better for users by saving preferences about that user. For example, if someone prefers having bullet points in the responses over paragraph form, you can have the agent remember that so it only produces responses in bullet points. Additionally, we want the agent to have awareness of previous conversations. We can add session memory so that if someone gets disconnected, they don't have to start the conversation all over again, or they can even refer to a previous conversation for a more seamless interaction.
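
The user-preference memory described above is easy to picture as a formatting decision applied at response time. A minimal sketch, with invented names, assuming a preference remembered from an earlier session:

```python
# Sketch of user-preference memory applied at response time: if the user
# previously asked for bullet points, format every answer that way.
def apply_preferences(points: list[str], prefs: dict) -> str:
    if prefs.get("format") == "bullets":
        return "\n".join(f"- {p}" for p in points)
    return " ".join(points)

prefs = {"format": "bullets"}   # remembered from an earlier session
reply = apply_preferences(
    ["Lore conflicts found.", "Strategy looks solid."], prefs)
```

Session memory works similarly but stores conversation turns instead of preferences, so a reconnecting user can resume where they left off.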

Throughout this entire build process, we leveraged Amazon Bedrock AgentCore Observability for all tracing and debugging, as well as monitoring the performance of our agents. I know that was a lot. Now we're going to see that in action, so I'm going to pass it on to you, Sam.

Live Demo: Creating and Analyzing the New World Elven Expansion Project

All right, let's get this machine up real quick. All right, go ahead and put the demo up. All right, so first off, we're going to have a project portal that we've created for this demo today. This project portal will demonstrate all the different agents and the different stages that were discussed.

As I mentioned, the first thing we have to build is all the different specialized agents. We're going to create a new project that is referenced throughout the entire system. The idea behind it is that, as we talked about before with memories, these projects are all different silos inside the actual project. I've already pre-populated some of the fields here. We're going to use the example I referred to before, adding elves into our New World experience today. For those that haven't played New World, it's one of Amazon's MMORPGs that really focuses on humans during medieval times. So elves should not fit that well, and we'll see some examples of that later on.

What we're going to do is go down here to the document analysis section. I've already prepared a document that I'll show you right now. Inside this document, you'll see we have some different core features that we're going to be implementing, such as characteristics for the elves, some cultural background, a brief overview of lore, and some racial abilities that come with being an elf. Finally, there are some customization options, some technical requirements, and a timeline. For those that have ever written requirement documents, this is missing a lot of details and should not pass anyone's bar for writing a document. But that's what we're going to show today. This was obviously not a well-written requirements document, but that was the whole purpose behind writing this document: to show that I'm a brand new game designer who doesn't know what I'm doing and to ask how I can improve this document.

What we're going to do is copy this over to our tool where we create a project and then we're going to be prompted with our agent configuration. By building out each individual agent separately, we've given ourselves the power to select how we want to send this document to the agents. If, for example, I already know my strategy is a great strategy, we don't want our strategy to be evaluated. Or if we know more than something else, we can just say don't use that tool and don't enable it. For this demo today, we're going to start off by removing the analyst and we're going to send it to all three different agents individually. We've also disabled memory to give us an idea of what the performance looks like without using any memory at all. We're going to go ahead and create that project. Then with some demo magic, instead of waiting for this to respond, I'm going to move over to our first page.

What we have here is an overview. The first thing it does is send over the entire document to each one of our agents, and each agent will automatically respond with a detailed overview of what it thinks about our strategy document. Inside here we can filter by gameplay, lore, and strategy. These are all the separate agent overviews, and it will give us information around how it feels the content should fit. Let me scroll down a little bit so you can see that. It will give us some details and tell us things that we want to see. For example, in the gameplay section, it says there are some unprecedented mechanics that I've added. Well, that's probably not great for other parts of the gameplay, like PVP. It might introduce bad things. Inside the faction system, it might be hard to integrate, and so on and so forth. It gives us a whole overview of all the different areas where the gameplay might be impacted by the strategy document.

We're going to do the same thing and see that also in lore. But one of the things that's interesting about lore is that we definitely did not follow the lore inside New World at all, so it's going to call out a lot of bad things, saying this is bad, no good. It's going to go over and say that this is just a bad update. But then we go into strategy, and strategy says this is pretty good. But that's the problem: strategy likes this, but lore does not. How do we combine all those together and make sure that they're all valid? We're going to get into that a little later. But first, I'm going to hand off to Christina to talk about how we built these individual agents themselves. So I have our IDE open. This is Kiro, which we just came out with. How many of you have had a chance to play around with it yet? I really like this one. Let me make sure that I open the right one. We're going to talk about how to start building our first agent, and for this one we're going to be looking at the gameplay agent.

Building Specialized Agents with Strands: From Gameplay to Strategy

At a basic level, when you're building an agent with Strands, there are three key components that are specific to a Strands agent: you're going to have tools, you're going to have a model, and you're going to have a system prompt. Tools are going to be things like custom tools that can be created through custom logic, like a function in a class, or something like that. They're designated with a tool decorator, which you'll see in a little bit. Tools can also be your MCP servers that you want to use.

We're going to be leveraging the Bedrock knowledge base retrieval MCP server that I alluded to earlier. We can see here that what we'll do is initialize the tool by initializing an MCP client, and then we'll state which MCP server we want to use. For this specific MCP server, we have a few environment variables that you can enable. We wanted to be able to increase the accuracy of our RAG process, so we have re-ranking enabled. The other piece we have is an inclusion tag key, because the way this MCP server works is you tag your resource with a specific tag that you define to say: this is part of the grouping my agent should have access to when querying through this MCP server.
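
As a hedged sketch of that configuration step, the server's behavior is driven by environment variables. The variable names below are illustrative placeholders modeled on the description above, not confirmed names from the real MCP server:

```python
import os

# Hypothetical variable names standing in for the re-ranking flag and the
# inclusion tag key described in the talk; the real server's names may differ.
os.environ["KB_INCLUSION_TAG_KEY"] = "game-analyst"
os.environ["RERANKING_ENABLED"] = "true"

def mcp_server_env() -> dict:
    """Collect the environment the MCP server process would be launched with."""
    return {
        "KB_INCLUSION_TAG_KEY": os.environ.get("KB_INCLUSION_TAG_KEY", ""),
        "RERANKING_ENABLED": os.environ.get("RERANKING_ENABLED", "false"),
    }

env = mcp_server_env()
```

The point of the tag key is access scoping: only knowledge bases carrying that tag are visible to this agent's retrieval tool.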

For our model, we have it up here, and it's defined that we're using the Amazon Bedrock model; we just have the model ID here. The most important piece of this entire thing, and I'm going to make this a little bit wider for you, is this system prompt. This system prompt is basically the brain of your agent. It's going to define its personality, its role, and instructions on how it retrieves information, how it leverages its tools, and what kind of questions it can answer.

Most of the time when working with AI applications like this, when you're using foundation models, the first thing you should check if you have any kind of unexpected output is your prompt because that's where you can start seeing the quickest improvements. For this specific agent, we're going to tell the agent that you are a gameplay expert. Your job and expertise includes anything around game mechanics. We're going to let it know that it has access to a knowledge base and we have specific instructions telling it how to leverage the knowledge base and how we want it to query the knowledge base and return information whether it found sufficient information or not.
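
Since every specialized agent's prompt follows the same pattern (role, expertise, knowledge-base instructions), it can be assembled from one template. A minimal sketch, with wording of my own rather than the session's actual prompt:

```python
# Template-based system prompt assembly for the specialized agents.
PROMPT_TEMPLATE = """You are a {role} expert for game design reviews.
Your expertise covers: {expertise}.
You have access to a knowledge base tagged '{kb_tag}'.
Always query the knowledge base before answering, and state explicitly
whether you found sufficient information to support your review."""

def build_system_prompt(role: str, expertise: str, kb_tag: str) -> str:
    return PROMPT_TEMPLATE.format(role=role, expertise=expertise, kb_tag=kb_tag)

gameplay_prompt = build_system_prompt(
    "gameplay",
    "game mechanics, combat balance, player engagement",
    "gameplay-kb",
)
```

The instruction to report whether sufficient information was found mirrors the talk's point about the agent saying when retrieval came up short, which keeps reviews grounded in the knowledge base rather than guesswork.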

Now, this is a basic Strands agent. The next thing we want to do is deploy it to Amazon Bedrock AgentCore. For that, we need to use this Bedrock AgentCore application. Basically, this is a wrapper that gives the agent all the tools it needs to be deployed to an AgentCore runtime. To see why we need this, think about how many of you have used Docker before. In Docker, if you want to make something web-service enabled, you need to turn it into some kind of web server using something like FastAPI or Flask. This is the Bedrock AgentCore flavor of that, but you can also use your own frameworks like FastAPI if you would like. This is a really easy way to get started.

With that, the last thing you need for your main logic is your app entry point, and that's going to tell your runtime where your main function is. This is how you create your first agent, and it's ready to go to Amazon Bedrock AgentCore. When you want to deploy it, you have a few different options. You can do it through the console, or you can use our CLI for AgentCore, where you would just configure it and, in the background, it creates all the resources it needs to deploy it into a Docker container. As a result, it'll give you an endpoint, and you'll use that endpoint to invoke it in your application.
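
To show the shape of that wrapper-plus-entry-point pattern without depending on the real SDK, here is a minimal imitation in plain Python. This mimics only the decorator registration; the actual bedrock-agentcore package also handles HTTP serving, sessions, and deployment concerns:

```python
# Minimal imitation of the app-wrapper pattern described above: a decorator
# marks the entry point the runtime should invoke with each request payload.
class MiniAgentApp:
    def __init__(self):
        self._entrypoint = None

    def entrypoint(self, fn):
        """Register fn as the function the runtime calls per invocation."""
        self._entrypoint = fn
        return fn

    def invoke(self, payload: dict):
        return self._entrypoint(payload)

app = MiniAgentApp()

@app.entrypoint
def handle(payload: dict) -> dict:
    # In the real agent this is where the Strands agent would be called
    # with the incoming prompt.
    return {"review": f"Reviewing: {payload['prompt']}"}

response = app.invoke({"prompt": "Elven Expansion doc"})
```

The Docker analogy in the talk maps directly: the wrapper is the web-server layer, and the decorated function is your request handler.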

With Strands and AgentCore, one of the things I've really appreciated about it is the fact that since it's so intuitive to use, once you've created one good agent, it becomes even faster and faster to create the other agents because you don't really have to change that much. I'm going to prove that to you by showing you a different agent. This one is our strategy agent.

As I mentioned before, for every Strands agent we have tools, our model, and our system prompt. We can see here that not much has changed. We still use the same code for our Bedrock model, we have the system prompt present, and then we have our MCP server. The only thing that's different here is we have a different tag, because we're using a different knowledge base. And in our system prompt, instead of telling it that it's a Gameplay agent, we're saying that it's a Strategy agent with access to strategy documents instead of game design documents. But that's really all the difference.
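
Since the agents differ only in knowledge-base tag and system prompt, a small factory can stamp out each specialization. This is an illustrative sketch with invented names, not the session's code:

```python
from dataclasses import dataclass

@dataclass
class AgentSpec:
    """Everything that varies between the specialized agents."""
    name: str
    kb_tag: str
    system_prompt: str

def make_agent_spec(name: str, kb_tag: str, domain: str) -> AgentSpec:
    """Build one specialized agent's spec from its role, tag, and domain."""
    prompt = (f"You are a {name} expert. You have access to {domain} "
              f"documents via the knowledge base tagged '{kb_tag}'.")
    return AgentSpec(name, kb_tag, prompt)

gameplay = make_agent_spec("gameplay", "gameplay-kb", "game design")
strategy = make_agent_spec("strategy", "strategy-kb", "business strategy")
```

Centralizing the differences this way is what makes spinning up a fifth or sixth specialized agent nearly free.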

Now we're going to talk about how we can use the tool Kiro to make building more agents even faster. So who here has used Kiro before? Just me, okay, cool. So Kiro is an IDE just like anyone else's IDE. You have your editor, and you can change things down here like, for example, what type of language you have. You can also change your auto-completion rules, your formats, and your themes. But the coolest part about it is this new chat feature over here on the right-hand side.

Accelerating Development with Kiro: AI-Assisted Agent Creation

Now there are two different types of chat that you can have with Kiro. First is the vibe section. If you want to vibe with Kiro, that's great. If you want to do small code changes or even single-file changes, that works. But for me, I want to build a brand new feature. In this case, I want to build a new agent. Obviously, a single agent is probably a little bit of overkill for spec-driven development, but we're just going to use it as an example. A better example in the future would be, instead of building out a single agent, building out all the separate agents and just giving some specs, like: hey, here are the MCP servers that you should use.

Inside of here, I have a prompt already pre-populated that I'm going to run, and I'll be explaining it while it's running. In this prompt, I'm telling it to only use the Kiro building folder, so that it doesn't touch any of the wonderful code that we've already prepared for you today. It will only work inside that folder and place all the new code in the new Lore Agent Basic folder. I've asked it to create just two tasks for demo purposes today, and we'll see if it follows through on that. Other than that, the prompt was: hey, build out a new agent based off the old agents that we've already created, create a Lore prompt for us, and make sure to install any needed packages. For example, since we're using Strands today, it'll go ahead and do that.

While that's running, I will review the new requirements that Kiro created. The requirements document overview is really what you would hand to an executive; this is what our requirements doc would look like for them. It contains things like: what is a Lore agent supposed to be? What is the Strands SDK? You would hand that off to your manager or your BD, and they'd say, oh, this is really helpful, because otherwise they would not understand these terms. For us engineers, we'd say, all right, the requirements look good enough to me, and move on to the more important section, which is the design doc.

The design doc contains all the implementation details and the requirements that we would typically look at as engineers. Inside of here, you'll see the key components that we want to have and the different interfaces and interactions we want with them. It gives a little bit more detail: for example, what the app entry point would be, the Lore agent functions, the different MCP connections, and some variable names that we might want to use. For example, I pasted in here which knowledge base I want to use, and I was able to keep moving forward. After that, it gives us, for example, the Bedrock model we should use and the system prompts, and it provides a detailed overview of exactly how the implementation of something like this would happen. Obviously, with more complex projects, this doc gets really cool, with a lot of detail in there. To be honest, it is a great first step to documenting what you've built and the new feature you've built.

After this is all done and it takes this entire design doc in, it will then create a task list. This task list contains all the different tasks that we have to do to build this thing. You'll notice that we only have two tasks, but typically for a lot of other features, we're going to have longer tasks or even more tasks than this, and even some subtasks. So we'll go ahead and get started on the first task, and it will run that task. At any point, if it has any questions or needs to run any commands, it will ask me before doing so. We also just released Kiro autonomous mode today, which allows us to run things in the background, or even assign GitHub pull requests and feature requests inside of Kiro, so it will do the entire thing behind the scenes instead of in the CLI.

I personally like to use it in the CLI when I'm actively building something, because I get that feedback right then and there and can change things as I see fit. So the first task is now complete, and while that second task gets kicked off, one of the really cool things it does is give you a nice little overview of what it has done. Inside of here, it tells me it created a folder, configured the Bedrock model, gave me an initial prompt, gave me the MCP client it's supposed to connect with, set up environment variables and error handling, and built the entry points as well.

Afterwards, the next task is obviously going to be building out our requirements.txt. So it's going to go through that file I created and validate that all the requirements that we need for Python are going to be properly installed and configured for us to utilize for this project. So after that we are able to have a brand new agent spun up with Kiro. This saves us a lot of time and makes our lives a lot easier. Obviously that was a very simple task and there's a lot more cool complex tasks that you can do, but I just wanted to use that as an example of how you can use Kiro.

Orchestrator Implementation and Memory Integration: Improving Performance and Reducing Costs

Now that we've talked about how we build the agents, let's dive into our next section and take a look at what it looks like to combine all these things. So obviously we can go in here and create a brand new project. Inside this project, we're going to fill this out just like we did last time. We're going to use the same doc for the Elven expansion. We're going to go in here and paste that document in, and this time we're going to configure it with the analyst. Notice how we're not able to select any other agents, because all the other agents are going to be available through the analyst. The analyst is just going to use them as tools.

Thumbnail 1910

So we're going to go ahead and create that project. Once again, using some wonderful demo magic, I've already done this ahead of time. What we have here is this wonderful review of the document, and inside it gives us a high-level overview of the three different agents that were called and a reason why we should or should not proceed with this project. For example, the Lore Agent, as I mentioned before, says elves do not fit into the context of this game at all, so we're able to look at that, evaluate it, and immediately see that.

Thumbnail 1930

Thumbnail 1940

Thumbnail 1950

Thumbnail 1970

After that we can dive into each of the core competencies, just like the original three agents' reviews, but this time it's all available in one single document. Each of these areas gives us a nice little overview, and we're able to identify different themes and dive into them as well. One thing you'll notice is that it gives us the total tokens output, the response time, and the average cost for that document review. After that we can ask additional prompts, for example: can we dive into how the Ancients could be used instead of elves, since they're a valid race inside of New World? It then takes a look at the different gameplay agents and so on, and is able to dive into how this change I've made aligns and can move forward.

Thumbnail 1980

Thumbnail 1990

Thumbnail 2000

Thumbnail 2010

If I continue the conversation with this agent, it will continue to create and design things for me. The next thing I asked was, hey, can you create a quest around this? Instead of calling all the agents, it started utilizing just certain ones. Now that strategy is pretty much satisfied, this one went ahead and called just Gameplay and Lore, and it designed an entire quest around them. I can use this to keep iterating, build a whole design doc around it, and ask more questions.

Thumbnail 2030

Thumbnail 2040

As you're all aware if you've ever written requirements docs before, this is a collaborative process. When you submit your first design doc, unless you've written it many times or have been preparing it for months, you always get feedback. This is just that iteration: the ability to ask for more details and continue that process in a more streamlined way. After multiple rounds, you'll see that we get token utilization, response times, and estimated costs, all output from the Strands SDK's metrics.
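As a rough illustration of the per-review cost figure the demo UI displays, a cost estimate can be derived from the token counts alone. The per-1K-token rates below are placeholders, not actual Bedrock pricing, so treat this as a sketch of the calculation, not the real price list:

```python
# Hypothetical cost estimate from token counts, like the figure shown in the
# demo UI. The per-1K-token rates below are placeholders, not real Bedrock
# pricing -- always check the current price list for your model and region.
def estimate_cost(input_tokens, output_tokens,
                  usd_per_1k_in=0.003, usd_per_1k_out=0.015):
    return (input_tokens / 1000) * usd_per_1k_in \
         + (output_tokens / 1000) * usd_per_1k_out

# e.g. a review that consumed 12,000 input and 2,500 output tokens
print(round(estimate_cost(12_000, 2_500), 4))  # -> 0.0735
```

Because input tokens dominate when the whole conversation history is resent every turn, this is also a quick way to see why the memory feature discussed later pays off.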

Thumbnail 2070

Thumbnail 2080

With this, we are obviously just trying to combine all the agents, so there's some areas that we can improve upon, like for example memory, but we'll dive into that a little bit later. Let's first talk about how we build this with Strands. So I like to bring up a picture because I think this is the best way to demonstrate how our orchestrator is built. This one is going to be a little bit different because this agent is not going to be directly retrieving information from the knowledge bases, but instead it's going to delegate to our specialized agents. So the way that we have this one designed is instead we're going to be having our main agent, our game analyst agent.

Our main agent is the Game Analyst Agent, which is defined the same as before except this time its tools are going to be the specialized agents. The way we'll be doing this is through an orchestrator. When the orchestrator receives a query like a game design proposal, its job is to break down the design proposal and figure out which questions need to be answered to evaluate it across gameplay, lore, strategy, and QA. It will delegate to those specialized agents to gather that information, and then those agents will return it to the orchestrator. The orchestrator will synthesize it and create a more concise response based on all of its findings with full visibility across all of the core competencies.
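The delegation flow described above can be sketched in plain Python. This is a framework-agnostic stand-in (the actual demo uses the Strands SDK, where each specialist is exposed as a tool); all names here are illustrative, not from the talk:

```python
# Framework-agnostic sketch of the orchestrator pattern described above.
# Plain functions stand in for the specialized agents so the delegation
# flow is easy to see; in the demo these are Strands agents behind tools.

def review_lore(proposal):
    return f"Lore review: elves conflict with existing factions in '{proposal}'"

def review_gameplay(proposal):
    return f"Gameplay review: a new race needs balanced starting zones in '{proposal}'"

def review_strategy(proposal):
    return f"Strategy review: market-fit analysis for '{proposal}'"

class GameAnalyst:
    """Orchestrator: delegates a design proposal to specialist reviewers,
    then synthesizes their findings into one response."""

    def __init__(self, specialists):
        self.specialists = specialists

    def review(self, proposal):
        findings = [agent(proposal) for agent in self.specialists]
        # A real orchestrator would pass these findings back through the LLM
        # to synthesize a concise answer; here we just concatenate them.
        return "\n".join(findings)

analyst = GameAnalyst([review_lore, review_gameplay, review_strategy])
print(analyst.review("Elven expansion"))
```

The key property is the same as in the demo: the orchestrator itself never touches the knowledge bases; it only routes questions and merges the specialists' answers.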

Thumbnail 2180

Thumbnail 2190

Thumbnail 2230

Now let's take a look at how that is built in the code. Let's go to the Game Analyst Agent. We can see here it is very similar to what we had before. We have our model that is still defined, still using Anthropic Claude. We are going to have our system prompt. Let me make this a little bit bigger. This time in our system prompt we're going to be saying you are a Game Analyst and your job is to take in game design documents or questions and analyze all game designs for the core competencies. With that, we're going to tell it that it has access to the following tools that we want it to use. For anything around lore, we're going to have it send it to the lore agent. For anything around gameplay, we're going to have it send questions to the gameplay agent. The same applies for the strategy agent as well.

Thumbnail 2240

Thumbnail 2260

Thumbnail 2270

With that, we're now leveraging the other type of tool that I alluded to earlier: our custom tools. That is all done by adding this tool decorator. For each specialized agent, we're just going to have a function that invokes it using the Boto3 library. That is how we use those agents. Beyond that, we still have the same structure: our main function with its entry point, and the initialization at the top of the file that wraps the agent to make it ready for an AgentCore runtime.
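The registration mechanism can be sketched with a stdlib-only stand-in for the decorator. Strands provides the real `@tool` decorator; this version just shows the shape of the pattern, and the Boto3 call the speakers mention (invoking the specialized agent's AgentCore runtime) is replaced with a canned response so the sketch runs anywhere:

```python
# Minimal stand-in for the @tool pattern described above. A decorator
# registers each wrapper function, and the orchestrator picks the right
# one per query. In the real code, each wrapper would invoke its
# specialized agent's AgentCore runtime via Boto3 instead of returning
# a canned string.

TOOLS = {}

def tool(fn):
    """Register a function as a callable tool, keyed by its name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def lore_agent(query):
    # Real version: a Boto3 call to the lore agent's AgentCore runtime.
    return f"[lore agent answer for: {query}]"

@tool
def gameplay_agent(query):
    return f"[gameplay agent answer for: {query}]"

# The orchestrator's model chooses which tool to call; we simulate that choice.
def dispatch(topic, query):
    return TOOLS[f"{topic}_agent"](query)

print(dispatch("lore", "Do elves fit the setting?"))
```

In Strands the model does the dispatching itself based on the system prompt's routing rules, so the `dispatch` helper here is purely illustrative.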

Thumbnail 2290

Now that we have our Game Analyst ready to go, we want to look at adding memories and why they matter. First of all, they let the agent remember our prompts and prior context. Right now, in the agent I've built today, I've actually decided to send the entire conversation history with every request, including the original document and all the feedback I've received on it. As I continue to iterate on this document, the number of input tokens skyrockets with every update: each turn returns another prompt or a big document, which means I'm carrying more and more detail in every request for each individual project. That's not efficient.

On top of that, each of the underlying agents goes back to the knowledge bases and queries for more data. For example, if I asked to use the Ancients instead of elves and it had to go look up the Ancients, it's going to have to do that again when I ask it to design that quest. It has to go back to the knowledge base and pull all that detail, because it pulled it one time, didn't store it anywhere, and therefore didn't remember it at all. That's where memories come in handy. The agent can remember: I already pulled that detail from the knowledge base; I already know what the Ancients are. It might still need to look up certain specifics, maybe what region they live in, but it has stored what they are. So instead of doing a deep lookup into the Ancients in general, it only has to look up specific areas of that race.

Thumbnail 2390

Thumbnail 2400

What we'll look at here is the same project spin-up as before. This time we'll create a new project, select an analyst, and actually turn on the memory-enabled option. Once again, I will go ahead and run this, because I've already run it ahead of time, and you can see we already have a response back with memories. We're then able to read through the entire response and dive into it, and you can see that the number of tokens between the two is very similar.

Thumbnail 2410

Thumbnail 2420

Thumbnail 2430

Thumbnail 2440

That's because this is the first analysis. It's reading the entire document for the first time, right? It's supposed to use everything it has available. But as we continue to call it, as I did today, let me go back to this real quick: for the sake of this demo, I asked it the same exact prompts as before, designing a quest and using the Ancients, and proceeded through the entire process. You'll notice at the very bottom here that our token usage has actually dropped. That's because we're using memories. We're no longer having to query our knowledge base every single time, and we're no longer having to read the document every single time. We're reusing what we've already retrieved and building on it back and forth.

Thumbnail 2460

The other interesting thing is that the response times start to drop as well, because the agent already understands the knowledge base and what is happening. You can also improve response times by using things like specialized agents and the new Nova models that just came out today; those are other levers for response time. But memories alone dropped our response time by over 80 seconds, which is incredible. So by using memories, we get better response times and use fewer tokens, which drives your costs down as well.

Thumbnail 2510

Thumbnail 2520

Memory Architecture Deep Dive and Key Takeaways for Production-Ready AI Agents

So I'll hand it off to Christina to talk a little bit more about how you can actually enable and utilize memories. Let me go back to my pictures; I have another one for you. We're implementing memories at a few different levels. I talked about this a little before: we have project-specific memories that store facts, and then we have session memories and user-preference memories to help out our Game Analyst agent. Our orchestrator is very similar to before, except for a few additions for the memory integration: we're actually going to add a couple of extra tools.

We'll start at the specialized-agent level. To integrate memories, you're going to define tools that can store and retrieve memories, and then you're going to have memory hooks. This is specific to Strands. Think of it like this: a Strands agent has a life cycle with different phases, from when the agent is initialized to when the conversation is over and the agent is done. With Strands hooks, you can customize the logic of how your agents interact and function at each point in that lifetime.

With hooks, we define how the agent stores and retrieves memories through its interaction with Amazon Bedrock AgentCore Memory. At the specialized-agent level, we're leveraging memories like a cache. The whole idea is to reduce how often the specialized agent has to go all the way back to the knowledge base to look for information. Instead, we'll have it first check whether it's already answered a similar question or retrieved the information it's looking for. Only if it can't find that information in its memories does it go back to the data source as normal.
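The lifecycle-hook idea can be shown with a stdlib-only stand-in. The hook names and the `Agent` class below are illustrative, not the Strands API; the point is the shape: logic attached at the start of a run (load memories into context) and at the end (persist new memories).

```python
# Sketch of lifecycle hooks: attach logic to points in an agent's life
# cycle -- load memories when the agent starts, persist them when it ends.
# Hook names and this Agent class are illustrative, not the Strands API.

class Agent:
    def __init__(self):
        self.hooks = {"on_start": [], "on_end": []}
        self.context = []

    def hook(self, event):
        def register(fn):
            self.hooks[event].append(fn)
            return fn
        return register

    def run(self, query):
        for fn in self.hooks["on_start"]:
            fn(self)                    # e.g. load session memories
        answer = f"answer to: {query}"  # the model call would happen here
        self.context.append(answer)
        for fn in self.hooks["on_end"]:
            fn(self)                    # e.g. store new memories
        return answer

saved_memories = ["previous conversation summary"]
agent = Agent()

@agent.hook("on_start")
def load_memories(a):
    a.context.extend(saved_memories)

@agent.hook("on_end")
def store_memories(a):
    saved_memories[:] = a.context

print(agent.run("Design an Ancients quest"))
```

In the real system, `load_memories` and `store_memories` would call AgentCore Memory rather than mutate a local list, but the hook placement is the same.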

Thumbnail 2690

Additionally, our Game Analyst agent is going to leverage session memories: at the beginning of its life cycle, we have it load a number of previous conversations into its context. We also have it leverage user preferences to provide a more personalized experience. Now let's dive back into the code and see what that looks like. I know I've been saying this a lot, but a lot of this is going to be very similar to what we had before; the reason I keep mentioning it is because I really want you to take away just how easy it is to work with the Strands framework. We still have the model defined just as before.

Thumbnail 2710

Thumbnail 2720

And we have our system prompt. This is actually a spoiler alert—this is going to be the most different piece. You can see we have some extra notations in here. This was an experiment I conducted when building out this agent to see how much we could improve the agent's performance with just a prompt. Who here enjoys prompt engineering and feels confident in their abilities with it? Not very many of us, I'm in the same boat as you. But with tools like Amazon Bedrock, you don't have to be an expert in prompt engineering, and I'm going to show you why.

Thumbnail 2760

Thumbnail 2770

Thumbnail 2780

Thumbnail 2790

I'm going to go into the console and show you a new tool that I love: the prompt builder in Amazon Bedrock's prompt management. I took the original prompt from our original agent and pasted it in here. You can tell it what model you're using and it will optimize the prompt for that model. I took the result, copied it, pasted it into the code, and got immediate improvements. It was really cool, and that's one of the main differences.

Thumbnail 2800

Thumbnail 2810

Thumbnail 2820

Thumbnail 2830

Thumbnail 2860

The other piece that's different is that we now tell the system prompt it has access to memory tools. It's very straightforward: we tell it to check the memories first for the information it needs, and if the information isn't there, to go about its job as normal. We can see we still have our same tools. Finally, instead of just the custom tools we had before, with the logic to go invoke the other agents, we also tell the agent to use the memory tools as part of its tool belt. That's really all you need to do, and then you proceed as normal with your AgentCore runtime initialization. You have your entry point, and that's the entire magic behind the whole thing.

Thumbnail 2870

Thumbnail 2880

With that, we'll go back to the presentation. I think I accidentally pressed a button earlier. Let me recap what we did today. We learned how to build specialized agents and orchestrator agents in Strands, and we saw through our demo how leveraging AI agents can accelerate game designers' innovation cycles by giving them a place to get quick, early feedback. This lets them bring a more fine-tuned design proposal to the human interaction with the game analyst, reducing all the back and forth.

Additionally, we saw specialized agents working together and alongside the designers and analysts to provide that collaborative intelligence of AI, information retrieval plus a human's industry experience. Finally, because all of our agents are going to have access to a standardized knowledge base, we're able to create a more unbiased standardized baseline assessment to have a predictable starting point for all of the design proposals. If there are three key takeaways that we would want you to walk away with today, these are the three I would say are the most important.

First, Amazon Bedrock AgentCore simplifies the development of production-ready agents by providing you with runtimes that are scalable and secure and all the tools that you need to help your agents evolve in a secure and performant manner. Additionally, you can enhance your agent performance and user experiences with AgentCore memory by increasing performance through lower latency by using it as a cache and providing more personalized experiences. Finally, you can accelerate your design teams by equipping them with the tools to transform subjective feedback into objective insights, and this will enable faster and more confident game design decisions.

Thumbnail 3030

We've left some links for you here. Be sure to look out for our GitHub repository. We do have it posted and we're continuing to add to it. If you are interested in having any more hands-on tutorials with Amazon Bedrock AgentCore, this is a really great place to start with the samples. I personally actually used them to learn how to use AgentCore.

Thumbnail 3080

Finally, if you want a really cool hands-on workshop with another use case for Amazon Bedrock AgentCore, check out the autonomous live ops AI agents for dynamic gaming experiences session. I actually sat in on the dry runs for it and it was really cool, so please do check it out if you have time. I want to open it up for questions and thank you all so much for being here.


; This article is entirely auto-generated using Amazon Bedrock.
