The evolution of AI has moved far beyond simple chatbots. Today, AI agents can reason, plan, and take autonomous actions to achieve goals on behalf of humans. But building production-ready agents comes with significant challenges: secure deployment, context management, tool discovery, and comprehensive observability.
In this code talk session, Erik Hanchett and Du'An Lightfoot, Senior Developer Advocates at AWS, demonstrated how to tackle these challenges using three powerful tools working in harmony: Kiro (agentic IDE), Model Context Protocol (MCP) for standardized tool integration, and Amazon Bedrock AgentCore for production deployment.
Watch the Full Session:
The Evolution of AI Agents
Du'An opened the session by tracing the rapid evolution of generative AI. "November 2022, ChatGPT changed the world, which is why this room is filled now because all of us are feeling the effects of AI," he noted. The journey from simple summarization tasks with 8,192-token context windows to today's sophisticated multi-agent systems has been remarkably swift.
At their core, agents are autonomous systems leveraging AI to reason, plan, and take actions. But as Du'An emphasized, "agents still need compute, agents still need storage, agents still need databases. So all of the knowledge that you have is still valuable up until this point and beyond."
An agent consists of several key components:
- Intelligence - The LLM at the center providing reasoning capabilities
- Tools and APIs - Access to external services for weather lookups, database queries, and more
- Memory - Both short-term (conversation state) and long-term (historical facts across sessions)
- Software infrastructure - All the traditional engineering practices still apply
The Challenge: Undifferentiated Heavy Lifting
Deploying agents to production introduces several complex challenges:
- How do you deploy an agent securely in its own isolated runtime?
- How do you maintain context memory, both short-term and long-term?
- How do you manage identity for who can access the agent and what the agent can access?
- How do you manage tools with proper policies and controls?
- How do you observe the entire system effectively?
AWS calls these challenges "undifferentiated heavy lifting," and Amazon Bedrock AgentCore was built to solve them.
Amazon Bedrock AgentCore: Production-Ready Agent Infrastructure
Amazon Bedrock AgentCore provides a comprehensive platform for deploying and managing AI agents in production. The platform includes several key components:
Runtime
AgentCore Runtime is serverless and framework-agnostic, supporting any model and any agent framework (LangGraph, LangChain, Strands Agents SDK, or custom SDKs). Agents can be deployed via Docker containers or zip files, similar to AWS Lambda. The runtime supports workloads up to eight hours, making it ideal for long-running tasks like deep research or complex analysis.
"Runtime supports workloads up to eight hours," Du'An explained. "So if you're building a deep research agent, this is the type of platform you would deploy it to. So that way when you need something like deep research or some deep analysis or some agentic function to occur, runtime can handle that, complete the task, and then that session is done, it's ephemeral."
The runtime also natively supports MCP servers, allowing developers to launch their own MCP servers directly in the runtime environment.
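To make the deployment model concrete, here is a minimal runtime entrypoint in the style of the AgentCore quickstart. It assumes the `bedrock-agentcore` and `strands-agents` Python packages; treat it as a sketch rather than the session's exact code:

```python
from bedrock_agentcore import BedrockAgentCoreApp
from strands import Agent

app = BedrockAgentCoreApp()
agent = Agent()  # defaults to a Bedrock-hosted model

@app.entrypoint
def invoke(payload):
    """Handle one runtime invocation; each session is ephemeral."""
    user_message = payload.get("prompt", "Hello")
    result = agent(user_message)
    return {"result": str(result)}

if __name__ == "__main__":
    # Serves HTTP locally; in AgentCore Runtime the same entrypoint
    # runs inside the managed, isolated session.
    app.run()
```

The same pattern works with any framework: the runtime simply invokes the decorated function once per request, and the session ends when the task completes.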
Memory Management
AgentCore Memory provides both short-term and long-term memory capabilities. Short-term memory stores chat messages and user state for the current session. Long-term memory automatically extracts information using customizable strategies:
- Semantic - Extract facts from conversations so they can be retrieved later by semantic search
- User preferences - Remember user-specific details like names and preferences
- Summary - Store conversation summaries
- Custom strategies - Define your own extraction patterns
"Once you have your memory in the short-term memory, now we automatically extract information," Du'An demonstrated. "So now across sessions, when I talk to the agent or you invoke the agent, it can retrieve that information and bring it into context and perform the task more efficiently."
Gateway
AgentCore Gateway provides authenticated connections to MCP servers and enables integration with existing APIs. Developers can expose OpenAPI schemas, Smithy specifications, or Lambda functions as MCP servers through the Gateway.
A powerful feature is semantic search for tools. Rather than passing all available tools to the agent (which increases tokens, cost, and latency), semantic search ensures "your agent only gets access to the tools it needs when it needs it," as Du'An explained. Trimming the tool list cuts token usage, which in turn reduces both cost and latency.
The Gateway also supports identity management for outbound connections, allowing agents to authenticate with third-party providers using JWT tokens or other authentication mechanisms.
Additional Capabilities
- Browser - Give agents a secure, isolated browser for searching and interacting with the web
- Code Interpreter - Offload compute-intensive workloads to an isolated execution environment
- Observability - Full generative AI observability through Amazon CloudWatch with exports to tools like Datadog
- Evaluations - Measure agent performance with built-in evaluation capabilities
- Identity - Manage authentication and authorization for both inbound and outbound connections
- Policy - Define and enforce policies around tool usage and agent behavior
Building with Kiro: The Agentic IDE
Erik introduced Kiro, AWS' new agentic IDE that recently reached general availability. "We released it into preview a few months ago and it just went into general availability a few weeks ago," Erik shared. The response was overwhelming, with hundreds of thousands of downloads in the first few days.
Kiro distinguishes itself with two development modes:
Spec Mode vs Vibe Mode
Spec mode implements spec-driven development, where developers define requirements and the IDE helps build a requirements document, design document, and task list. "We found that it's much more accurate and it's much quicker. It saves you time when you build it using spec mode," Erik noted.
Vibe mode takes a more direct approach. As Du'An described it: "When it comes to vibe coding, you just want it to go through, figure out what the problem is and solve it. You don't really need to have specs around that, just do what I asked you to do."
MCP Integration and Kiro Powers
Kiro supports both local and remote MCP servers, and the team just launched Kiro Powers, a new feature for adding MCP servers, steering files, and additional context. An AgentCore-specific Kiro Power is available, making it even easier to build agents.
Erik demonstrated using multiple MCP servers in Kiro:
- Context7 - Access to hundreds of documentation sources
- AWS Documentation MCP - Comprehensive AWS documentation
- Amazon Bedrock AgentCore MCP - Specialized for AgentCore development
- Brave Search - Web search capabilities
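Each of these is registered with a small JSON entry. A minimal sketch, assuming Kiro's workspace-level config file at `.kiro/settings/mcp.json` and using the `uvx`-launched AWS Documentation server as the example:

```json
{
  "mcpServers": {
    "aws-docs": {
      "command": "uvx",
      "args": ["awslabs.aws-documentation-mcp-server@latest"],
      "disabled": false,
      "autoApprove": []
    }
  }
}
```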
Steering Files
Steering files (similar to rules files in other IDEs) provide context-specific instructions. Erik created an AgentCore steering file with CDK best practices, ensuring that whenever Kiro works with CDK code, it follows established patterns.
Practical Demo: Multi-Agent Food Tracker
Erik demonstrated a production-ready multi-agent architecture with three layers:
Architecture Overview
- Food Tracker Agent - The user-facing agent that communicates naturally with the front end
- Recipe Agent Server - An orchestrator agent that formats responses and coordinates work between the two MCP servers
- Two MCP Servers:
  - Recipe Server MCP - Manages recipe data with tools for filtering by difficulty, cuisine, and type
  - Nutrition Ingredients MCP (via Gateway) - Provides nutritional information through Lambda functions
Building MCP Servers
Erik showed how to build an MCP server using FastMCP in Python. The recipe server exposed several tools for retrieving recipes by cuisine, difficulty, and type. Each tool uses a decorator and clear docstrings.
As Du'An emphasized: "The docstring is actually what goes to an agent. So when we talk about increasing the amount of tokens, all of this is created in a JSON Schema that goes to the LLM. So in order for an agent to know how to use that tool, you need a clearly defined docstring inside of your function."
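A minimal version of such a tool, assuming the `fastmcp` package and a hypothetical in-memory recipe list, might look like this; note how the docstring and type hints become the JSON Schema the LLM receives:

```python
from fastmcp import FastMCP

mcp = FastMCP("Recipe Server")

# Hypothetical in-memory data; the demo's server would use real recipe data.
RECIPES = [
    {"name": "Pad Thai", "cuisine": "thai", "difficulty": "medium"},
    {"name": "Margherita Pizza", "cuisine": "italian", "difficulty": "easy"},
]

@mcp.tool()
def get_recipes_by_cuisine(cuisine: str) -> list[dict]:
    """Return recipes matching a cuisine, e.g. 'thai' or 'italian'.

    Args:
        cuisine: The cuisine to filter by (case-insensitive).
    """
    return [r for r in RECIPES if r["cuisine"] == cuisine.lower()]

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```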
Converting Lambda Functions to MCP Servers
For existing Lambda functions, AgentCore Gateway provides seamless conversion to MCP servers. Developers define a schema in CDK that describes the Lambda function's inputs and outputs. The Gateway automatically creates MCP tools based on this schema, enabling existing infrastructure to work with agents without code changes.
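The demo defined this schema in CDK, but the idea is easier to see in a direct API call. Below is a rough boto3 sketch against the `bedrock-agentcore-control` client; the payload shape follows the CreateGatewayTarget API as best I can reconstruct it, and the gateway ID, Lambda ARN, and tool name are placeholders:

```python
import boto3

control = boto3.client("bedrock-agentcore-control")

# Expose an existing Lambda function as an MCP tool via the Gateway.
control.create_gateway_target(
    gatewayIdentifier="GATEWAY_ID",  # placeholder
    name="nutrition-ingredients",
    targetConfiguration={
        "mcp": {
            "lambda": {
                "lambdaArn": "arn:aws:lambda:us-west-2:123456789012:function:nutrition",
                "toolSchema": {
                    "inlinePayload": [{
                        "name": "get_nutrition",
                        "description": "Look up nutrition facts for an ingredient.",
                        "inputSchema": {
                            "type": "object",
                            "properties": {"ingredient": {"type": "string"}},
                            "required": ["ingredient"],
                        },
                    }]
                },
            }
        }
    },
    credentialProviderConfigurations=[
        {"credentialProviderType": "GATEWAY_IAM_ROLE"}
    ],
)
```

The schema, not the Lambda code, is what the Gateway turns into an MCP tool, which is why existing functions work unchanged.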
Testing with MCP Inspector
Erik demonstrated using the Model Context Protocol Inspector, an open-source tool for testing MCP servers locally. Running the inspector against the recipe server showed all available tools and allowed testing each one before deployment.
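For a stdio-based server, launching the inspector is a one-liner (assuming Node.js is installed; the file name `recipe_server.py` is hypothetical):

```bash
npx @modelcontextprotocol/inspector python recipe_server.py
```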
Memory in Action
The demo included a practical demonstration of AgentCore Memory. Erik logged into the application and introduced himself: "Hello, my name is Erik." After logging out and back in with a new session, he asked: "What is my name?"
The agent correctly retrieved "Erik" from long-term memory, demonstrating how user preferences persist across sessions. The memory configuration used semantic search to extract relevant information from short-term memory and store it in long-term memory for future retrieval.
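Cross-session retrieval looks roughly like this with the same `MemoryClient` sketched earlier; the memory ID and namespace are placeholders:

```python
from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-west-2")

# In a brand-new session, query long-term memory for facts about the user.
memories = client.retrieve_memories(
    memory_id="MEMORY_ID",     # placeholder: ID returned by create_memory
    namespace="/users/erik",   # placeholder namespace for this actor
    query="What is the user's name?",
)
for record in memories:
    print(record["content"]["text"])  # e.g. "The user's name is Erik."
```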
Observability: Understanding Agent Behavior
Du'An provided a deep dive into AgentCore Observability through Amazon CloudWatch. The observability features include:
Session and Trace Analysis
- Sessions - Complete user interactions with the agent
- Traces - Individual invocations from beginning to end
- Spans - Every step that happens within a trace
For each span, developers can view:
- Input and output tokens
- Latency measurements
- System messages, user messages, and assistant responses
- Tool calls with JSON inputs and outputs
- Complete telemetry data
"If you're really trying to analyze and get observability of your agents, you can do that for each trace," Du'An explained. The detailed view allows developers to identify bottlenecks, optimize token usage, and understand exactly how their agents are performing.
CloudWatch Logs
Beyond generative AI observability, standard CloudWatch logs provide line-by-line execution details. "If you use Lambda functions and you troubleshoot those, you can pretty much do the same thing with the runtime," Du'An noted. Developers can add print statements and debug their agents just like any other application.
Key Takeaways and Best Practices
Start with Observability
"After you have your use case, the first thing you should think about, or the next thing you should think about, is gonna be observability," Du'An emphasized. Build monitoring into your agents from the beginning to measure metrics, latency, token usage, and costs. The larger your agent gets, the harder observability becomes to add later.
Begin with Single Agents
Start with focused, single-purpose agents before scaling to multi-agent systems. Give agents only the tools they need to complete specific tasks, then expand capabilities as requirements grow.
Optimize Tool Usage
Use semantic search with AgentCore Gateway to reduce token consumption. Rather than passing hundreds of tools to your agent, semantic search ensures agents only receive relevant tools when needed, improving speed and reducing costs.
Leverage Existing Infrastructure
AgentCore Gateway allows you to expose existing Lambda functions and APIs as MCP servers without rewriting code. This enables rapid agent development using infrastructure you already have.
Choose the Right Memory Strategy
Configure long-term memory strategies based on your use case. Whether you need semantic search, user preferences, conversation summaries, or custom extraction patterns, AgentCore Memory supports flexible configuration.
Framework Agnostic Approach
AgentCore works with any framework and any model. Whether you prefer LangGraph, LangChain, Strands Agents SDK, or custom implementations, you can deploy to AgentCore without vendor lock-in.
About This Series
This post is part of DEV Track Spotlight, a series highlighting the incredible sessions from the AWS re:Invent 2025 Developer Community (DEV) track.
The DEV track featured 60 unique sessions delivered by 93 speakers from the AWS Community - including AWS Heroes, AWS Community Builders, and AWS User Group Leaders - alongside speakers from AWS and Amazon. These sessions covered cutting-edge topics including:
- 🤖 GenAI & Agentic AI - Multi-agent systems, Strands Agents SDK, Amazon Bedrock
- 🛠️ Developer Tools - Kiro, Kiro CLI, Amazon Q Developer, AI-driven development
- 🔒 Security - AI agent security, container security, automated remediation
- 🏗️ Infrastructure - Serverless, containers, edge computing, observability
- ⚡ Modernization - Legacy app transformation, CI/CD, feature flags
- 📊 Data - Amazon Aurora DSQL, real-time processing, vector databases
Each post in this series dives deep into one session, sharing key insights, practical takeaways, and links to the full recordings. Whether you attended re:Invent or are catching up remotely, these sessions represent the best of our developer community sharing real code, real demos, and real learnings.
Follow along as we spotlight these amazing sessions and celebrate the speakers who made the DEV track what it was!