
Raj Shah

Building a Production-Ready AI Agent with Amazon Bedrock AgentCore: A Complete Hands-On Guide

If you’ve used frameworks like LangChain or LlamaIndex, you know the excitement of getting your first agent working locally. But turning that prototype into a production system quickly runs into infrastructure complexity: suddenly you are dealing with scaling, security, and cloud components instead of just code. Amazon Bedrock AgentCore bridges this gap, taking agents from local scripts to production in minutes.


What is Amazon Bedrock AgentCore?

Amazon Bedrock AgentCore is an enterprise-grade framework and managed hosting service that provides the "primitives" for generative AI operations. Think of it as the foundational infrastructure that handles the boring but critical parts of agentic systems—containerization, isolation, and compliance—so you can focus on your agent's reasoning logic.

Architecturally, AgentCore is split into two primary layers:

  • Control Plane API: Used at configuration time for resource management and setup.
  • Data Plane API: Used at runtime for actual session invocation and operation.

It is strictly framework-agnostic. Whether you are using Strands Agents, LangChain, or your own custom orchestration, AgentCore provides the managed environment to run those agents at AWS scale.
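To make the two-layer split concrete, here is a minimal sketch using boto3. The service names "bedrock-agentcore-control" and "bedrock-agentcore" are my assumption of the SDK names for the two planes; verify that your boto3 version ships these service models.

```python
# Sketch of the two API layers via boto3 (service names assumed; check
# your boto3 version). The mapping itself is usable without AWS installed.
PLANES = {
    # Control plane: configuration-time calls (create/update runtimes, memory, etc.)
    "control": "bedrock-agentcore-control",
    # Data plane: runtime calls (invoking sessions on a deployed agent)
    "data": "bedrock-agentcore",
}

def make_client(plane: str, region: str = "us-east-1"):
    """Return a boto3 client for the requested plane."""
    import boto3  # imported lazily so the mapping above works without boto3
    return boto3.client(PLANES[plane], region_name=region)
```

In practice, the starter toolkit's CLI wraps the control plane for you; you mostly touch the data plane directly when invoking sessions from your own services.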

Why AgentCore? The AWS Advantage


The AgentCore Solution: Managed Infrastructure for Agents

AgentCore introduces a serverless compute environment built specifically for the agentic loop. Think of the AgentCore Runtime not as a single function call, but as a dedicated "clean room" for your agent’s session.

Unlike standard Lambda functions with a 15-minute cap, AgentCore provides a dedicated microVM for every session that can stay active for up to 8 hours.

  • Session Isolation: Every session is cryptographically isolated.
  • Persistent Connection: You can call the agent multiple times while the session is active, and it maintains its state.
  • Streaming Support: The runtime supports streaming data, allowing for the low-latency, real-time responses that production users expect.
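The persistent-connection behavior can be sketched with the data-plane client: reusing the same runtime session ID across calls keeps the microVM and its in-session state alive. This assumes the boto3 "bedrock-agentcore" client and its invoke_agent_runtime operation; the agent runtime ARN is a placeholder.

```python
# Hypothetical sketch: two calls into one live session. The boto3 service
# name, invoke_agent_runtime parameters, and response shape are assumptions
# to verify against your SDK version; the ARN is a placeholder.
import json
import uuid

# Runtime session IDs must be at least 33 characters; a UUID4 string is 36.
SESSION_ID = str(uuid.uuid4())

def build_payload(prompt: str) -> bytes:
    """Serialize the entrypoint payload the same way `agentcore invoke` does."""
    return json.dumps({"prompt": prompt}).encode("utf-8")

def invoke_twice(agent_runtime_arn: str, region: str = "us-east-1"):
    import boto3  # lazy import: the helpers above work without AWS installed
    client = boto3.client("bedrock-agentcore", region_name=region)
    for prompt in ("What is 87 * 54?", "Now add 9 to that result."):
        # Reusing runtimeSessionId is what keeps the dedicated microVM
        # (and the agent's in-session state) alive between calls.
        resp = client.invoke_agent_runtime(
            agentRuntimeArn=agent_runtime_arn,
            runtimeSessionId=SESSION_ID,
            payload=build_payload(prompt),
        )
        print(resp["response"].read())
```

The second prompt only makes sense because the first call's state is still there — exactly the property a stateless Lambda would not give you.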


Agent Deployment Workflow

1. Environment Setup
AgentCore uses uv for fast, reliable dependency management. A best practice is to split your setup into two directories: one for development with full tooling, and a lightweight deployment folder containing only the essential dependencies and your agent code. This keeps the runtime lean and secure and improves performance.

# 1. Create the project and install the starter toolkit
mkdir agentcore-demo && cd agentcore-demo
uv init --no-workspace && uv add bedrock-agentcore-starter-toolkit

# 2. Create a deployment folder with its own pyproject.toml
mkdir agent_deployment
uv init --bare ./agent_deployment && uv --directory ./agent_deployment add strands-agents bedrock-agentcore strands-agents-tools

# 3. Save the agent code into the agent_deployment folder as agent.py

2. Entrypoint
The @app.entrypoint decorator acts as the bridge between your local script and the AgentCore runtime. It defines how incoming requests are handled, making your agent cloud-compatible with minimal changes to your existing code.

"""
Production-Ready AI Agent for Amazon Bedrock AgentCore
"""
from strands import Agent
from strands_tools import calculator
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()
MODEL_ID = "us.anthropic.claude-sonnet-4-5-20250929-v1:0"

@app.entrypoint
def invoke(payload, context):
    # A fresh Agent is built per request here for simplicity; see the
    # lazy-loading section below for the per-session caching pattern.
    agent = Agent(
        model=MODEL_ID,
        system_prompt="You are a helpful assistant that can perform calculations. Use the calculator tool for any math problems.",
        tools=[calculator]
    )

    prompt = payload.get("prompt", "Hello!")
    result = agent(prompt)

    return {
        "response": result.message.get('content', [{}])[0].get('text', str(result))
    }

if __name__ == "__main__":
    app.run()
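The one-liner that pulls text out of the Strands result in the entrypoint is easier to reason about as a small helper. This is a pure-Python sketch of that same logic; the message shape ({"role": ..., "content": [{"text": ...}]}) is the one the entrypoint code already assumes.

```python
# Unpacking the entrypoint's response-extraction expression:
# result.message.get('content', [{}])[0].get('text', str(result))
def extract_text(message: dict, fallback: str = "") -> str:
    """Return the first text block of a message, else the fallback."""
    blocks = message.get("content", [{}])
    first = blocks[0] if blocks else {}
    return first.get("text", fallback)
```

Making the fallback explicit also gives you one place to handle tool-only responses, where the first content block may carry no "text" key at all.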

3. CLI Workflow

  • agentcore configure → prepares infra (no deployment)
  • agentcore launch → builds & deploys your agent
  • agentcore invoke → test your live agent from CLI
# 3. Configure and deploy
# Use all default answers for now:
uv run agentcore configure -e ./agent_deployment/agent.py

uv run agentcore launch

# 4. Test your deployed agent
uv run agentcore invoke '{"prompt": "What is 87 * 54 + 9?"}'

Architecture Deep Dive: Runtime and Memory

(Diagram: AgentCore architecture)

1. The Runtime Lifecycle and the "33-Character Rule"

When you invoke an agent, the Runtime spawns a microVM. For this to work securely, Session IDs must be 33+ characters long and sufficiently complex. This ID serves as the key for spawning the dedicated environment and prevents session hijacking. The environment automatically cleans up after 15 minutes of inactivity to optimize costs.
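A small helper makes the 33-character rule easy to enforce client-side. The exact server-side character rules aren't spelled out here, so the character set below is an assumption; the length check is the documented part.

```python
# Generate and validate runtime session IDs per the 33+ character rule.
import re
import uuid

SESSION_ID_MIN_LEN = 33
# Assumed-safe character set; confirm against the service's actual rules.
_SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_.:-]+$")

def new_session_id() -> str:
    """A UUID4 string is 36 characters, comfortably over the minimum."""
    return str(uuid.uuid4())

def is_valid_session_id(session_id: str) -> bool:
    return (
        len(session_id) >= SESSION_ID_MIN_LEN
        and bool(_SESSION_ID_RE.match(session_id))
    )
```

Validating before invoking gives you a clear client-side error instead of a rejected request when someone passes a short, guessable ID.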

2. A Two-Tier Memory System

AgentCore provides managed memory that scales independently of your compute:

  • Short-term Memory (STM): Stores the exact conversation history within a single session.
  • Long-term Memory (LTM): Uses intelligent extraction to store user facts and preferences that persist across different sessions over weeks or months.
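The two tiers can be pictured with a purely conceptual sketch — this is not the AgentCore Memory API, just an illustration of the scoping: STM is keyed by session and holds raw turns, while LTM is keyed by actor and holds extracted facts that outlive any one session.

```python
# Conceptual model of the two memory tiers (not the real managed API).
from collections import defaultdict

class TwoTierMemory:
    def __init__(self):
        self.stm = defaultdict(list)  # session_id -> raw conversation turns
        self.ltm = defaultdict(dict)  # actor_id   -> {fact: value}, cross-session

    def record_turn(self, session_id: str, role: str, text: str):
        """STM: exact history, scoped to a single session."""
        self.stm[session_id].append({"role": role, "text": text})

    def remember_fact(self, actor_id: str, key: str, value: str):
        """LTM: in AgentCore, this extraction step is managed for you."""
        self.ltm[actor_id][key] = value

    def session_history(self, session_id: str):
        return self.stm[session_id]

    def recall(self, actor_id: str, key: str):
        return self.ltm[actor_id].get(key)
```

Notice that a brand-new session starts with empty STM, yet the actor's LTM facts are still available — that is the behavior the managed service gives you across weeks or months.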

3. The Lazy Loading Pattern

You’ll often see the get_or_create_agent pattern in AgentCore code. This is necessary because the actor_id and session_id are passed in the request headers at the moment of invocation. Because you don’t have these IDs at the module’s global startup, you must "lazy load" the agent. This approach ensures the agent instance is initialized only once per session, avoiding the "cold start" cost of recreating the agent and re-connecting to memory on every request.

(Diagram: the lazy-loading pattern)


The AgentCore Toolbox: Key Capabilities

(Diagram: AgentCore key capabilities)


Observability: Seeing Inside the Agentic Loop

Debugging an autonomous agent is notoriously difficult. AgentCore automatically enables a GenAI Observability Dashboard in Amazon CloudWatch.

The standout feature here is the Service Map, which provides a visual representation of how your agent interacts with memory, tools, and the model. By using AWS X-Ray, you can perform end-to-end tracing to see exactly how long a model call took versus how long it took to hydrate state from memory. This transparency is vital for identifying bottlenecks in the agentic loop.


Key Takeaways

Amazon Bedrock AgentCore offers a modular, professional path to scale:

  • Framework Agnostic: Whether you use LangChain, CrewAI, or LlamaIndex, the infrastructure remains the same.
  • Production-Ready in Minutes: Automates the "plumbing" of ECR, IAM, and CodeBuild, allowing for deterministic deployments.
  • Managed Security: Uses isolated microVMs for session compute and secure sandboxes for code execution.
  • Modular "Bolt-on" Philosophy: You only use what you need. Need memory? Bolt it on. Need a browser? Add it in. You don’t pay the complexity tax for features you aren't using.
