In this first deep dive of our multi-framework series, I'll show you how to build a production-ready AI agent using Strands Agents and deploy it using Amazon Bedrock AgentCore. The complete code for this implementation, along with examples for other frameworks, is available on GitHub at agentcore-multi-framework-examples.
Strands Agents embodies a model-driven philosophy that aligns perfectly with the rapid improvements in foundation models. Rather than imposing complex orchestration logic, it lets the model's capabilities drive the agent's behavior, resulting in cleaner, more maintainable code.
Strands has a hook-based architecture that provides an elegant way to extend agent functionality without cluttering the main logic. This makes it perfect for integrating with AgentCore's memory system—we can handle all the complexity of conversation persistence and memory extraction in dedicated hooks while keeping our agent code focused and clean.
Setting Up the Development Environment
Let's start by cloning the complete repository with all framework examples:
git clone https://github.com/danilop/agentcore-multi-framework-examples.git
cd agentcore-multi-framework-examples
Now let's set up the Strands Agents project. I'm using uv, a fast Python package installer, to manage dependencies:
cd agentcore-strands-agents
uv sync
source .venv/bin/activate
The project dependencies include:
- strands-agents: The core framework for building our agent
- strands-agents-tools: Community-provided tools like calculator
- bedrock-agentcore: The SDK for integrating with AgentCore services
- bedrock-agentcore-starter-toolkit: CLI tools for deployment
Creating and Configuring AgentCore Memory
Before we build our agent, let's set up AgentCore Memory. This service will store our agent's conversations and extract meaningful insights that persist across sessions.
Understanding Memory Strategies
AgentCore Memory provides three built-in strategies that automatically extract different types of information from conversations:
User Preferences: Captures recurring patterns in user behavior, interaction styles, and choices. For example, if a user consistently prefers detailed explanations or always asks for code examples, this gets stored as a preference.
Semantic Facts: Maintains knowledge of facts and domain-specific information. When users mention facts like "our company has 500 employees" or "the API endpoint is api.example.com", these get extracted and stored.
Session Summaries: Creates condensed representations of conversations. After each session, the system generates a summary capturing the main topics discussed, decisions made, and action items.
Creating the Memory Instance
I'll use the provided script to create a memory instance with all three strategies:
cd scripts
uv sync
uv run create-memory
This script creates a new AgentCore Memory instance and configures it with all three strategies. Here's what happens behind the scenes:
MEMORY_STRATEGIES = [
    {
        "userPreferenceMemoryStrategy": {
            "name": "UserPreferences",
            "namespaces": ["/actor/{actorId}/strategy/{memoryStrategyId}"]
        }
    },
    {
        "semanticMemoryStrategy": {
            "name": "SemanticFacts",
            "namespaces": ["/actor/{actorId}/strategy/{memoryStrategyId}/{sessionId}"]
        }
    },
    {
        "summaryMemoryStrategy": {
            "name": "SessionSummaries",
            "namespaces": ["/actor/{actorId}/strategy/{memoryStrategyId}/{sessionId}"]
        }
    }
]
The namespaces organize memories hierarchically. Each actor (user) has their own isolated memory space, and within that, memories are organized by strategy and session. This provides data isolation between users while allowing memories to be shared across sessions for the same user.
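To make the hierarchy concrete, here's a small sketch (with made-up IDs) of how a namespace template resolves for a specific actor, strategy, and session:

```python
# Sketch: resolving an AgentCore Memory namespace template for a
# concrete actor, strategy, and session. The IDs are made-up examples.
def resolve_namespace(template: str, **values: str) -> str:
    """Substitute {placeholder} variables in a namespace template."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template

semantic_namespace = resolve_namespace(
    "/actor/{actorId}/strategy/{memoryStrategyId}/{sessionId}",
    actorId="my-user-id",
    memoryStrategyId="SemanticFacts-abc123",
    sessionId="DEFAULT",
)
print(semantic_namespace)
# /actor/my-user-id/strategy/SemanticFacts-abc123/DEFAULT
```

Because the semantic and summary namespaces include the session ID while the preferences namespace does not, preferences accumulate per user across all sessions.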
The script saves the memory configuration to ../config/memory-config.json:
{
"memory_id": "mem-abc123..."
}
Adding Sample Memory
To demonstrate how memory works, I'll add a sample memory event that we can retrieve later:
uv run add-sample-memory
This adds a simple user message to the memory: "I like apples but not bananas". The script stores this as a conversation event:
messages_to_store = [
    ("I like apples but not bananas", "USER")
]

memory_client.create_event(
    memory_id=memory_id,
    actor_id="my-user-id",
    session_id="DEFAULT",
    messages=messages_to_store
)
Now when our agent is asked about fruit preferences later, it will be able to retrieve this memory even in a completely new session. This demonstrates the power of persistent memory—the agent remembers information from past interactions.
Don't forget to copy the configuration to our project directory:
cd ..
cp config/memory-config.json agentcore-strands-agents/
cd agentcore-strands-agents
Building the Agent with Hooks
Now let's build our agent. The architecture uses the hook system in Strands to cleanly separate memory management from the main agent logic.
Understanding the Hook System
Strands provides a powerful hook system that allows us to subscribe to lifecycle events and extend agent functionality without modifying the core logic. I've created two complementary hooks for memory management:
Hook 1: Short-Term Memory (ShortMemoryHook)
This hook handles conversation persistence within and across sessions. It subscribes to two events:
AgentInitializedEvent - Loading Conversation History
When the agent starts up, this hook retrieves previous conversation history and injects it into the agent's context:
class ShortMemoryHook(HookProvider):
    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(AgentInitializedEvent, self.on_agent_initialized)
        registry.add_callback(MessageAddedEvent, self.on_message_added)

    def on_agent_initialized(self, event: AgentInitializedEvent) -> None:
        # Load conversation history when agent starts
        conversations = self.memory_client.get_last_k_turns(
            memory_id=self.memory_id,
            actor_id=event.agent.state.get("actor_id"),
            session_id=event.agent.state.get("session_id"),
            k=100  # Retrieve up to 100 previous conversation turns
        )
        if conversations:
            # Format conversation history for context
            context_messages = []
            for turn in reversed(conversations):
                for message in turn:
                    context_messages.append(f"{message['role']}: {message['content']}")
            # Inject into agent's system prompt
            history = "\n".join(context_messages)
            event.agent.system_prompt += f"\n\nRecent conversation:\n{history}"
The get_last_k_turns API retrieves previous conversation turns from AgentCore Memory. By injecting this into the system prompt, the agent maintains context even if the runtime session restarts or the user returns after a break.
MessageAddedEvent - Persisting New Messages
After each message is added to the conversation, this hook stores it in memory:
def on_message_added(self, event: MessageAddedEvent) -> None:
    # Extract the last message
    last_message = event.agent.messages[-1]
    last_message_tuple = (json.dumps(last_message["content"]), last_message["role"])
    # Store in AgentCore Memory
    self.memory_client.create_event(
        memory_id=self.memory_id,
        actor_id=event.agent.state.get("actor_id"),
        session_id=event.agent.state.get("session_id"),
        messages=[last_message_tuple]
    )
The create_event API stores each message immediately, building the conversation history in real-time.
Hook 2: Long-Term Memory (LongTermMemoryHook)
This hook retrieves relevant memories from past sessions before each model invocation:
class LongTermMemoryHook(HookProvider):
    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(BeforeInvocationEvent, self.on_before_invocation)

    def on_before_invocation(self, event: BeforeInvocationEvent) -> None:
        # Only process user messages
        last_message = event.agent.messages[-1]
        if last_message.get("role") != "USER":
            return
        user_query = last_message.get("content", "")
        # Semantic search for relevant memories
        retrieved_memories = retrieve_memories_for_actor(
            memory_id=self.memory_config.memory_id,
            actor_id=event.agent.state.get("actor_id"),
            search_query=user_query,
            memory_client=self.memory_client
        )
        if retrieved_memories:
            # Format and inject memories into context
            memory_context = format_memory_context(retrieved_memories)
            event.agent.system_prompt += f"\n\nRelevant long-term memory context:\n{memory_context}"
The RetrieveMemories operation performs semantic search across all stored memories. It finds the most relevant facts, preferences, and summaries based on the current query. This happens automatically before every model invocation, ensuring the agent always has access to relevant historical context.
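The format_memory_context helper used above lives in the shared memory module. Here's a minimal sketch of what such a helper might do, assuming each retrieved record carries its text under a content field (the record shape is an assumption for illustration, not the repository's exact code):

```python
from typing import Any, Dict, List

def format_memory_context(memories: List[Dict[str, Any]]) -> str:
    """Turn retrieved memory records into a bulleted text block
    suitable for appending to the system prompt."""
    lines = []
    for memory in memories:
        # Assumed record shape: {"content": {"text": "..."}}
        text = memory.get("content", {}).get("text", "")
        if text:
            lines.append(f"- {text}")
    return "\n".join(lines)

sample = [
    {"content": {"text": "User likes apples but not bananas"}},
    {"content": {"text": "User prefers concise answers"}},
]
print(format_memory_context(sample))
# - User likes apples but not bananas
# - User prefers concise answers
```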
The Main Agent Entry Point
import logging
from typing import Any, Dict, Optional

from bedrock_agentcore import BedrockAgentCoreApp
from bedrock_agentcore.runtime.context import RequestContext
from strands import Agent, tool
from strands_tools import calculator

logger = logging.getLogger(__name__)

app = BedrockAgentCoreApp()
agent = None

@app.entrypoint
def invoke(payload: Dict[str, Any], context: Optional[RequestContext] = None) -> Dict[str, Any]:
    """AI agent entrypoint"""
    global agent
    actor_id = payload.get("actor_id", "my-user-id")
    session_id = context.session_id if context and context.session_id else payload.get("session_id", "DEFAULT")
    if agent is None:
        agent = create_agent(actor_id, session_id)
    user_message = payload.get("prompt", "Explain what you can do for me.")
    try:
        result = agent(user_message)
        return {"result": result.message}
    except Exception as e:
        logger.error("Error during agent invocation: %s", e)
        return {"error": "An error occurred while processing your request"}

def main():
    """Main entry point for the application."""
    app.run()

if __name__ == "__main__":
    main()
The BedrockAgentCoreApp handles all infrastructure concerns—HTTP server setup, request routing, and error handling. The @entrypoint decorator marks the function that AgentCore Runtime will invoke.

The app.run() call at the bottom is crucial for local development. When executed directly (not deployed to AgentCore Runtime), it starts an HTTP server at http://localhost:8080 that listens for requests at the /invocations endpoint. This allows you to test your agent locally with the same interface it will have when deployed to production. The SDK automatically detects whether it's running locally or in a Docker container and configures itself appropriately.
Agentic Memory Retrieval: Beyond Automatic Context
While the LongTermMemoryHook provides automatic memory retrieval for every invocation (similar to standard RAG), I've also added a memory retrieval tool that enables agentic RAG capabilities:
@tool
def retrieve_memories(query: str) -> List[Dict[str, Any]]:
    """Retrieve memories from the memory client.

    Args:
        query: The search query to find relevant memories.

    Returns:
        A list of memories retrieved from the memory client.
    """
    actor_id = agent.state.get("actor_id")
    return retrieve_memories_for_actor(
        memory_id=memory_config.memory_id,
        actor_id=actor_id,
        search_query=query,
        memory_client=memory_client
    )
The @tool decorator makes this function available to the agent. But why have both automatic retrieval (the hook) and tool-based retrieval?
Understanding RAG in Our Memory System
Before diving into the differences, let's understand how our memory retrieval relates to RAG (Retrieval-Augmented Generation). RAG is a technique where an AI model retrieves relevant information from a knowledge base before generating its response, rather than relying solely on its training data. This retrieved information "augments" the generation process, providing fresh, relevant context.
In our implementation, AgentCore Memory acts as the knowledge base. When we retrieve memories—whether user preferences, semantic facts, or session summaries—we're essentially doing RAG. The memories are retrieved based on semantic similarity to the query, then injected into the agent's context (via the system prompt) to augment its response generation. This is exactly the RAG pattern, just applied to conversation memories rather than documents.
Standard RAG vs Agentic RAG
The key difference lies in who controls the retrieval process:
Standard RAG (via BeforeInvocationEvent hook):
- Automatically retrieves memories for every query
- Searches based on the user's direct input
- Static, reactive approach—always follows the same pattern
- Similar to traditional RAG systems where retrieval is hardcoded into the pipeline
- The system decides what to retrieve based on fixed rules
Agentic RAG (via retrieve_memories tool):
- The agent decides when to search memories
- The agent determines what to search for, which may differ from the initial query
- Dynamic, adaptive approach—the agent can:
- Skip retrieval if it already has sufficient context
- Search for related concepts not mentioned in the query
- Iteratively refine searches based on initial results
- Cross-reference multiple topics to build comprehensive understanding
- Reason about what information would be most helpful
For example, if a user asks "What should I cook for dinner?", the automatic retrieval might search for "dinner" memories. But with the agentic approach, the agent might decide to search for "dietary restrictions", then "favorite cuisines", and finally "ingredients on hand"—building a more complete picture through multiple targeted searches. The agent is reasoning about what information it needs, not just reacting to keywords.
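Conceptually, the agentic pattern amounts to several targeted searches whose results are merged before answering. Here's a toy sketch with a stubbed memory store (names and data are illustrative, not from the repository):

```python
from typing import Dict, List

# Stubbed memory store standing in for AgentCore Memory semantic search
FAKE_MEMORIES: Dict[str, List[str]] = {
    "dietary restrictions": ["User is vegetarian"],
    "favorite cuisines": ["User loves Italian food"],
    "ingredients on hand": ["User has tomatoes and basil"],
}

def retrieve(query: str) -> List[str]:
    """Stub for the retrieve_memories tool."""
    return FAKE_MEMORIES.get(query, [])

def gather_context(queries: List[str]) -> List[str]:
    """Merge the results of several targeted searches."""
    results: List[str] = []
    for query in queries:
        results.extend(retrieve(query))
    return results

# The agent, not the pipeline, chooses which searches to run:
context = gather_context(
    ["dietary restrictions", "favorite cuisines", "ingredients on hand"]
)
print(context)
# ['User is vegetarian', 'User loves Italian food', 'User has tomatoes and basil']
```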
This combination gives us the best of both worlds: guaranteed context from automatic retrieval (ensuring we never miss obvious relevant memories) plus the flexibility for the agent to explore memories strategically for complex reasoning tasks.
I also include the calculator tool from the strands-agents-tools package, demonstrating how easy it is to combine custom and pre-built tools.
Bringing It All Together
The agent creation combines hooks and tools:
def create_agent(actor_id: str, session_id: str) -> Agent:
    """Create and configure the agent with hooks and tools."""
    agent = Agent(
        hooks=[
            ShortMemoryHook(memory_id=memory_config.memory_id),
            LongTermMemoryHook(memory_id=memory_config.memory_id)
        ],
        tools=[calculator, retrieve_memories],
        state={"actor_id": actor_id, "session_id": session_id}
    )
    return agent
Testing Locally
Before deploying to the cloud, let's test our agent locally. First, configure it for AgentCore:
agentcore configure -n strandsagent -e src/agentcore_strands_agents/agent.py
Press Enter to accept the defaults. This creates the necessary AWS resources like IAM roles and ECR repositories.
Now launch the agent locally:
agentcore launch --local
This starts a local container running your agent. In another terminal, test it:
agentcore invoke --local '{ "prompt": "What did I say about fruit?" }'
The agent should retrieve the sample memory we added earlier and respond with something like "You mentioned that you like apples but not bananas." This confirms that our memory system is working!
Try a follow-up question:
agentcore invoke --local '{ "prompt": "Based on my preferences, would I enjoy apple pie?" }'
The agent maintains context from the previous interaction and can reason about your preferences.
Deploying to Production
Once you're satisfied with local testing, deploying to AWS is simple:
agentcore launch
AgentCore Runtime handles all the complexity:
- Building and pushing container images
- Creating the AgentCore Runtime and its invocation endpoint
- Configuring IAM permissions
- Enabling CloudWatch logging
Check the deployment status:
agentcore status
This shows your endpoint ARN, CloudWatch logs location, and other deployment details.
Test the production deployment:
agentcore invoke '{ "prompt": "What did I say about fruit?" }'
The AgentCore starter toolkit automatically preserves the session, so further invocations continue the conversation:
agentcore invoke '{ "prompt": "Thanks for remembering that!" }'
Shared Memory Architecture
One key architectural decision I made was creating a unified memory management module that's identical across all framework implementations in this series. This memory.py module contains two classes and two standalone functions:

Classes:
- MemoryConfig: Manages centralized configuration
  - __init__() method: Loads the memory configuration from the JSON file
  - memory_id property: Returns the configured memory ID
- MemoryManager: High-level interface for all memory operations
  - get_memory_context() method: Retrieves both conversation history and relevant memories
  - store_conversation() method: Saves user input and agent responses
  - Additional helper methods for managing session state

Standalone Functions:
- retrieve_memories_for_actor(): Performs semantic search across memory namespaces for a specific actor
- format_memory_context(): Formats retrieved memories into consistent text for injection into prompts

By sharing this module across Strands Agents, CrewAI, Pydantic AI, LlamaIndex, and LangGraph implementations, I ensure consistency and portability. Memory created by one framework can be used by another, and improvements benefit all implementations. Each framework uses these components slightly differently—for example, Strands Agents primarily uses the standalone functions within its hooks, while other frameworks might instantiate the MemoryManager class directly.
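As one illustration, a minimal MemoryConfig could look like the sketch below, built only from the JSON file shown earlier (the module in the repository may differ):

```python
# Hypothetical sketch of MemoryConfig. The JSON file name and the
# "memory_id" key match the config shown earlier in this post; the
# rest is an illustrative assumption, not the repository's exact code.
import json
import os
import tempfile

class MemoryConfig:
    """Loads the AgentCore Memory ID from memory-config.json."""

    def __init__(self, config_path: str = "memory-config.json") -> None:
        with open(config_path, "r", encoding="utf-8") as f:
            self._config = json.load(f)

    @property
    def memory_id(self) -> str:
        return self._config["memory_id"]

# Demo with a throwaway config file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "memory-config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"memory_id": "mem-abc123"}, f)
    print(MemoryConfig(path).memory_id)
# mem-abc123
```

Centralizing this lookup means no framework implementation hardcodes a memory ID; they all read the file the create-memory script wrote.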
Production Considerations
The combination of Strands Agents and AgentCore provides several production-ready features:
Security: Each session runs in an isolated microVM, preventing data leakage between users. Session isolation at the infrastructure level strengthens data privacy.
Scalability: AgentCore Runtime automatically scales based on demand, handling everything from a few requests to thousands of concurrent sessions.
Observability: Built-in CloudWatch integration provides logs, metrics, and traces for monitoring agent behavior and debugging issues.
Memory Persistence: Conversations and extracted insights persist beyond session boundaries, enabling truly personalized experiences.
Framework Flexibility: The clean separation between agent logic and infrastructure means you can evolve your agent implementation without changing deployment configuration.
What's Next
This Strands Agents implementation demonstrates how to build a clean, maintainable agent with persistent memory and production-ready deployment. The hook-based architecture keeps concerns separated, making the code easy to test and evolve.
In the next article, I'll show how to build collaborative multi-agent systems with CrewAI, using the same AgentCore infrastructure and memory configuration. You'll see how different frameworks can leverage the same deployment patterns while bringing their unique strengths to the table.
The complete code is available on GitHub. I encourage you to explore the repository, experiment with the implementation, and see how AgentCore simplifies the journey from prototype to production.
Ready to build your own production AI agent? Clone the repo and start experimenting!