Elizabeth Fuentes L for AWS

Posted on Oct 22 • Originally published at builder.aws.com

Bring AI Agents with Long-Term Memory into Production in Minutes

#aws #tutorial #agentcore #bedrock

Give Your AI Agents Long-Term Memory: Amazon Bedrock AgentCore Memory in Action

My Amazon Bedrock AgentCore code sample

Part 2 of the AgentCore Series


Elizabeth Fuentes L

AWS Developer Advocate specializing in AI/ML and Generative AI. I simplify complex cloud concepts through hands-on tutorials and real-world examples.

🧠 The Cross-Session Memory Problem

You deployed your first production AI agent with AgentCore Runtime. It works perfectly within conversations.

AgentCore Runtime already provides short-term memory: your agent remembers context within the same session, which lasts up to 8 hours and ends after 15 minutes of inactivity.

But here's what happens when users return:

New session starts → Agent forgets everything 😤
User preferences lost → No personalization between visits 🤦‍♀️
Previous insights gone → Every interaction starts from zero 📝
No learning across sessions → Same questions answered repeatedly 🔄

Your agent has cross-session amnesia. Each new session ID means starting over, even for the same user ID.

Amazon Bedrock AgentCore Memory solves this. Your agents remember users across sessions, learn from past interactions, and provide personalized experiences that persist beyond session boundaries.

This tutorial shows you how to add long-term memory to your agents.

🧠 Memory Architecture Explained

AgentCore Runtime (Built-in)

Session memory - Context within conversations (automatic)
Container persistence - Up to 8 hours, or until 15 minutes of inactivity
Same session ID - Full conversation history maintained

AgentCore Memory (This Tutorial)

Cross-session persistence - Remember across different session IDs
Long-term extraction - Key insights stored automatically
User-centric storage - Same user ID, different sessions
Intelligent retrieval - Relevant context when needed
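
To make the two scopes concrete, here is a toy, in-memory sketch. It is not how AgentCore stores anything; the `ToyMemory` class and its method names are made up for illustration. It only shows the idea that session history is keyed by session ID, while long-term facts are keyed by user (actor) ID.

```python
from collections import defaultdict


class ToyMemory:
    """Illustrative model only; not the AgentCore implementation."""

    def __init__(self):
        # Short-term: conversation turns keyed by (actor_id, session_id)
        self.sessions = defaultdict(list)
        # Long-term: extracted facts keyed by actor_id only
        self.long_term = defaultdict(list)

    def add_turn(self, actor_id, session_id, text):
        self.sessions[(actor_id, session_id)].append(text)

    def extract(self, actor_id, fact):
        # Stand-in for the service's automatic extraction step
        self.long_term[actor_id].append(fact)

    def context_for(self, actor_id, session_id):
        return {
            "history": list(self.sessions[(actor_id, session_id)]),
            "facts": list(self.long_term[actor_id]),
        }


memory = ToyMemory()
memory.add_turn("sarah", "session-1", "I prefer Python over JavaScript.")
memory.extract("sarah", "Prefers Python over JavaScript")

# A new session ID has no history, but the actor-level facts are still there
new_session = memory.context_for("sarah", "session-2")
print(new_session)
```

The new session starts with an empty history, yet the facts tied to the user ID come along. That is the gap AgentCore Memory fills.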

AgentCore Services Overview

Service	Purpose	Memory Features
⭐ AgentCore Runtime	Serverless execution	Built-in session memory, Container isolation
⭐ AgentCore Memory	Cross-session persistence	Long-term insights, User preferences
AgentCore Identity	Credential management	API keys, OAuth tokens
AgentCore Code Interpreter	Code execution	Secure sandbox, Data analysis
AgentCore Browser	Web interaction	Cloud browser automation
AgentCore Gateway	API management	Tool discovery, Service integration
AgentCore Observability	Monitoring	Tracing, Dashboards, Debugging

⭐ This tutorial: Memory service adds cross-session intelligence to Runtime's built-in session memory.

Prerequisites

Before you begin, verify that you have:

AWS Account with appropriate permissions
Python 3.10+ environment
AWS CLI configured with aws configure

New AWS customers receive up to $200 in credits

Start at no cost with AWS Free Tier. Get $100 USD at sign-up plus $100 USD more exploring key services.

Let's add cross-session memory to your agents. 🧠

Tutorial Roadmap:

Setup ⚙️ → Code Agent 💻 → Test Memory ✅ → Deploy 🚀 → Validate 🔍

Estimated time: 15 minutes

Install Dependencies

pip install -r requirements.txt

Required packages:

bedrock-agentcore
strands-agents
strands-agents-tools

Agent Implementation with Cross-Session Memory

Memory Configuration and Agent Creation

"""
Production-Ready AI Agent with Memory
Remembers conversations and user preferences across sessions
"""
import os
from strands import Agent
from strands_tools import calculator
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig, RetrievalConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import AgentCoreMemorySessionManager

app = BedrockAgentCoreApp()

MEMORY_ID = os.getenv("BEDROCK_AGENTCORE_MEMORY_ID")
REGION = os.getenv("AWS_REGION", "us-west-2")
MODEL_ID = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# Global agent instance
_agent = None

def get_or_create_agent(actor_id: str, session_id: str) -> Agent:
    """
    Get existing agent or create new one with memory configuration.
    Since the container is pinned to the session ID, we only need one agent per container.
    """
    global _agent

    if _agent is None:
        # Configure memory with retrieval for user facts and preferences
        memory_config = AgentCoreMemoryConfig(
            memory_id=MEMORY_ID,
            session_id=session_id,
            actor_id=actor_id,
            retrieval_config={
                f"/users/{actor_id}/facts": RetrievalConfig(top_k=3, relevance_score=0.5),
                f"/users/{actor_id}/preferences": RetrievalConfig(top_k=3, relevance_score=0.5)
            }
        )

        # Create agent with memory session manager
        _agent = Agent(
            model=MODEL_ID,
            session_manager=AgentCoreMemorySessionManager(memory_config, REGION),
            system_prompt="You are a helpful assistant with memory. Remember user preferences and facts across conversations. Use the calculator tool for math problems.",
            tools=[calculator]
        )

    return _agent
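
If you are wondering what `top_k=3` and `relevance_score=0.5` do in `RetrievalConfig`, here is a small sketch of how I read them: keep only records at or above the score threshold, then return at most `top_k` of the best ones. The `ScoredRecord` type, the `select_records` function, and the scores are all invented for this example; the service does its own scoring and ranking.

```python
from dataclasses import dataclass


@dataclass
class ScoredRecord:
    text: str
    score: float  # similarity to the query, assumed to be in [0, 1]


def select_records(candidates, top_k=3, relevance_score=0.5):
    """Keep records scoring at least relevance_score, best first, at most top_k."""
    relevant = [r for r in candidates if r.score >= relevance_score]
    relevant.sort(key=lambda r: r.score, reverse=True)
    return relevant[:top_k]


candidates = [
    ScoredRecord("Prefers Python over JavaScript", 0.91),
    ScoredRecord("Is a software engineer", 0.74),
    ScoredRecord("Asked about the weather once", 0.32),
    ScoredRecord("Name is Sarah", 0.66),
    ScoredRecord("Likes chocolate ice cream", 0.58),
]

selected = select_records(candidates, top_k=3, relevance_score=0.5)
print([r.text for r in selected])
```

Raising `relevance_score` trades recall for precision, and raising `top_k` puts more context (and more tokens) into each prompt.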

AgentCore Entry Point

@app.entrypoint
def invoke(payload, context):
    """AgentCore Runtime entry point with lazy-loaded agent"""
    if not MEMORY_ID:
        return {"error": "Memory not configured. Set BEDROCK_AGENTCORE_MEMORY_ID environment variable."}

    # Extract session and actor information
    actor_id = context.request_headers.get('X-Amzn-Bedrock-AgentCore-Runtime-Custom-Actor-Id', 'user') if context.request_headers else 'user'
    session_id = context.session_id or 'default_session'

    # Get or create agent (lazy loading)
    agent = get_or_create_agent(actor_id, session_id)

    prompt = payload.get("prompt", "Hello!")
    result = agent(prompt)

    return {
        "response": result.message.get('content', [{}])[0].get('text', str(result))
    }

if __name__ == "__main__":
    app.run()
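
The entry point looks up the actor ID header with an exact-case key. HTTP header names are case-insensitive, and this tutorial does not say how Runtime normalizes them, so a defensive lookup could look like the sketch below. The `resolve_actor_id` helper is hypothetical, not part of the SDK.

```python
ACTOR_HEADER = "X-Amzn-Bedrock-AgentCore-Runtime-Custom-Actor-Id"


def resolve_actor_id(headers, default="user"):
    """Return the actor ID from the request headers, ignoring header-name case."""
    if not headers:
        return default
    wanted = ACTOR_HEADER.lower()
    for name, value in headers.items():
        if name.lower() == wanted and value:
            return value
    return default


print(resolve_actor_id({"x-amzn-bedrock-agentcore-runtime-custom-actor-id": "sarah"}))
print(resolve_actor_id(None))
```

With this helper, the long conditional in `invoke` could become `actor_id = resolve_actor_id(context.request_headers)`.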

Memory Strategies

AgentCore Memory provides three built-in strategies for different use cases:

User Preferences Strategy

Automatically identifies and extracts user preferences, choices, and styles from conversations. This creates a persistent profile of each user for personalized interactions.

Example: An e-commerce agent remembers a user's favorite brands and preferred size, offering tailored product recommendations in future sessions.

Semantic Strategy

Identifies and extracts key factual information and contextual knowledge from conversations. This builds a persistent knowledge base about important entities, events, and details.

Example: A customer support agent remembers that order #ABC-123 relates to a specific support ticket, so the customer doesn't need to provide the order number again.

Session Summaries Strategy

Creates condensed, running summaries of conversations within a single session. This captures key topics and decisions for quick context recall.

Example: After a 30-minute troubleshooting session, the agent accesses a summary: "User reported issue with software v2.1, attempted a restart, and was provided a link to the knowledge base article."

For advanced use cases, you can configure custom strategies with overrides to fine-tune memory extraction with your own prompts and foundation models.
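
In this tutorial the deployment flow creates the memory resource for you. If you ever define one yourself, the three built-in strategies could be described with a list like the sketch below. Treat the key names (`userPreferenceMemoryStrategy`, `semanticMemoryStrategy`, `summaryMemoryStrategy`), the `{actorId}` and `{sessionId}` placeholders, and the summary namespace as my assumptions about the API shape, and check them against the current AgentCore Memory documentation. The strategy names and the `build_strategies` helper are made up.

```python
def build_strategies():
    """Return one entry per built-in strategy (assumed API shape, see text)."""
    return [
        {
            "userPreferenceMemoryStrategy": {
                "name": "UserPreferences",  # hypothetical name
                "namespaces": ["/users/{actorId}/preferences"],
            }
        },
        {
            "semanticMemoryStrategy": {
                "name": "SemanticFacts",  # hypothetical name
                "namespaces": ["/users/{actorId}/facts"],
            }
        },
        {
            "summaryMemoryStrategy": {
                "name": "SessionSummaries",  # hypothetical name
                "namespaces": ["/summaries/{actorId}/{sessionId}"],  # assumed layout
            }
        },
    ]


strategies = build_strategies()
for strategy in strategies:
    kind, settings = next(iter(strategy.items()))
    print(kind, settings["namespaces"])
```

The preference and fact namespaces mirror the paths used in `retrieval_config` above, so whatever the strategies extract lands where the agent looks for it.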

Deploy with Cross-Session Memory

Deploy your memory-enabled agent:

Configure Agent

agentcore configure -e my_agent_memory.py

Select 'yes' for memory when prompted

Select 'yes' for long-term memory extraction

Launch to Production

agentcore launch

AgentCore automatically:

Creates cross-session memory store
Configures long-term extraction pipelines
Sets up user-centric persistence
Provides production endpoint

Test Cross-Session Memory in Production

Short-term Memory Test (Same Session)

import json
import uuid
import boto3
import os
import sys

def test_short_memory(agent_arn, region=None):
    """Test short-term memory within a single session"""

    # Extract region from ARN if not provided
    if not region:
        region = agent_arn.split(':')[3]

    # Initialize client
    client = boto3.client('bedrock-agentcore', region_name=region)

    # Generate session ID
    session_id = str(uuid.uuid4())

    print(f"Testing short-term memory in session: {session_id}")
    print(f"Region: {region}")
    print("-" * 50)

    try:
        # First message - establish context
        print("Message 1: Setting context...")
        payload1 = json.dumps({"prompt": "My name is Alice and I like chocolate ice cream"}).encode()

        response1 = client.invoke_agent_runtime(
            agentRuntimeArn=agent_arn,
            runtimeSessionId=session_id,
            payload=payload1,
            qualifier="DEFAULT"
        )

        content1 = []
        for chunk in response1.get("response", []):
            content1.append(chunk.decode('utf-8'))

        result1 = json.loads(''.join(content1))
        print(f"Agent: {result1.get('response', 'No response')}")
        print()

        # Second message - test memory recall
        print("Message 2: Testing memory recall...")
        payload2 = json.dumps({"prompt": "What is my name and what do I like?"}).encode()

        response2 = client.invoke_agent_runtime(
            agentRuntimeArn=agent_arn,
            runtimeSessionId=session_id,  # Same session
            payload=payload2,
            qualifier="DEFAULT"
        )

        content2 = []
        for chunk in response2.get("response", []):
            content2.append(chunk.decode('utf-8'))

        result2 = json.loads(''.join(content2))
        print(f"Agent: {result2.get('response', 'No response')}")

        print("\n✓ Short-term memory test completed")

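    except Exception as e:
        print(f"Error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    # Agent ARN comes from the first CLI argument, else the AGENT_ARN environment variable
    arn = sys.argv[1] if len(sys.argv) > 1 else os.getenv("AGENT_ARN")
    if not arn:
        sys.exit("Usage: python test_short_memory.py AGENT_ARN")
    test_short_memory(arn)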

Run the test:

python test_short_memory.py "AGENT_ARN"
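
The test scripts read the region with `agent_arn.split(':')[3]`, which raises an unhelpful `IndexError` if you forget to replace the `AGENT_ARN` placeholder. Here is a small sketch of a stricter parser. The `region_from_arn` helper and the example ARN are made up for illustration.

```python
def region_from_arn(arn):
    """Extract the region from an ARN of the form arn:partition:service:region:account:resource."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or not parts[3]:
        raise ValueError(f"Not a regional ARN: {arn!r}")
    return parts[3]


# Hypothetical ARN, for illustration only
example_arn = "arn:aws:bedrock-agentcore:us-west-2:123456789012:runtime/my_agent-abc123"
print(region_from_arn(example_arn))
```

Passing the literal string `"AGENT_ARN"` now fails with a `ValueError` that names the bad input.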

Long-term Memory Test (Different Sessions)

import json
import uuid
import boto3
import os
import sys
import time

def test_long_memory(agent_arn, region=None):
    """Test long-term memory across different sessions"""

    # Extract region from ARN if not provided
    if not region:
        region = agent_arn.split(':')[3]

    # Initialize client
    client = boto3.client('bedrock-agentcore', region_name=region)

    # Generate session IDs
    session_1 = str(uuid.uuid4())
    session_2 = str(uuid.uuid4())

    print(f"Testing long-term memory across sessions")
    print(f"Region: {region}")
    print("-" * 50)

    try:
        # Session 1: Store user information
        print(f"Session 1: {session_1}")
        print("Storing user preferences...")

        payload1 = json.dumps({"prompt": "My name is Sarah and I'm a software engineer. I prefer Python over JavaScript."}).encode()

        response1 = client.invoke_agent_runtime(
            agentRuntimeArn=agent_arn,
            runtimeSessionId=session_1,
            payload=payload1,
            qualifier="DEFAULT"
        )

        content1 = []
        for chunk in response1.get("response", []):
            content1.append(chunk.decode('utf-8'))

        result1 = json.loads(''.join(content1))
        print(f"Agent: {result1.get('response', 'No response')}")
        print()

        # Wait for long-term memory extraction
        print("Waiting 25 seconds for long-term memory extraction...")
        time.sleep(25)

        # Session 2: Test memory recall
        print(f"Session 2: {session_2}")
        print("Testing cross-session memory recall...")

        payload2 = json.dumps({"prompt": "What do you remember about me? What's my name and what do I prefer?"}).encode()

        response2 = client.invoke_agent_runtime(
            agentRuntimeArn=agent_arn,
            runtimeSessionId=session_2,  # Different session
            payload=payload2,
            qualifier="DEFAULT"
        )

        content2 = []
        for chunk in response2.get("response", []):
            content2.append(chunk.decode('utf-8'))

        result2 = json.loads(''.join(content2))
        print(f"Agent: {result2.get('response', 'No response')}")

        print("\n✓ Long-term memory test completed")

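    except Exception as e:
        print(f"Error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    # Agent ARN comes from the first CLI argument, else the AGENT_ARN environment variable
    arn = sys.argv[1] if len(sys.argv) > 1 else os.getenv("AGENT_ARN")
    if not arn:
        sys.exit("Usage: python test_long_memory.py AGENT_ARN")
    test_long_memory(arn)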

Run the test:

python test_long_memory.py "AGENT_ARN"

# Or using environment variables
export AGENT_ARN="your-agent-arn"
python test_long_memory.py
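
Both test scripts repeat the same loop to read the response, decoding each chunk on its own. If a multi-byte UTF-8 character happens to be split across two chunks, that per-chunk decode can fail. A possible refactor, sketched here with a hypothetical `read_json_response` helper, joins the bytes first and decodes once:

```python
import json


def read_json_response(chunks):
    """Join byte chunks from the streaming body, then decode and parse once."""
    raw = b"".join(chunks)
    return json.loads(raw.decode("utf-8"))


# Simulated chunks; the "é" in "café" is deliberately split across two chunks
chunks = [b'{"response": "Alice likes caf\xc3', b'\xa9 au lait"}']
result = read_json_response(chunks)
print(result["response"])
```

In the tests you could then write `result1 = read_json_response(response1.get("response", []))`.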

This test verifies AgentCore Memory's cross-session persistence:

Stores user preferences in session 1
Waits 25 seconds for long-term memory extraction
Tests recall in a different session (session 2) with the same user ID
Verifies cross-session memory works
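
The fixed 25-second wait is simple, but extraction time can vary. One alternative is to poll until some check succeeds or a timeout expires. The sketch below uses a hypothetical `wait_for` helper. What the check actually does (for example, looking for extracted records) is up to you, so it is faked here.

```python
import time


def wait_for(check, timeout=60.0, interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns something truthy or the timeout expires."""
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        if clock() >= deadline:
            return None
        sleep(interval)


# Fake check that succeeds on the third attempt
attempts = []


def fake_check():
    attempts.append(1)
    return "records found" if len(attempts) >= 3 else None


outcome = wait_for(fake_check, timeout=10, interval=0, sleep=lambda _: None)
print(outcome, len(attempts))
```

Injecting `sleep` and `clock` keeps the helper easy to test without real delays.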

Memory Management Operations

AgentCore Memory provides comprehensive management capabilities:

Save and retrieve insights - Store and access extracted knowledge
Retrieve memory records - Access specific memory entries
List memory records - Browse stored memories
Delete memory records - Remove outdated information
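
As a rough sketch of the retrieve operation, the function below runs a semantic search over one namespace. It assumes the `bedrock-agentcore` boto3 client exposes `retrieve_memory_records` with `memoryId`, `namespace`, and `searchCriteria` parameters and returns `memoryRecordSummaries`. Please verify those names against the boto3 reference before relying on them. The `search_memories` helper and the stub client are made up so the example runs without AWS credentials.

```python
def search_memories(client, memory_id, namespace, query, top_k=3):
    """Return the text of the records that best match `query` in one namespace."""
    response = client.retrieve_memory_records(
        memoryId=memory_id,
        namespace=namespace,
        searchCriteria={"searchQuery": query, "topK": top_k},
    )
    return [
        summary.get("content", {}).get("text", "")
        for summary in response.get("memoryRecordSummaries", [])
    ]


class StubClient:
    """Stand-in for boto3.client("bedrock-agentcore") so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def retrieve_memory_records(self, **kwargs):
        self.calls.append(kwargs)
        return {
            "memoryRecordSummaries": [
                {"memoryRecordId": "mem-1", "content": {"text": "Prefers Python over JavaScript"}}
            ]
        }


stub = StubClient()
found = search_memories(stub, "my-memory-id", "/users/user/preferences", "language preference")
print(found)
```

Against a real deployment you would pass `boto3.client("bedrock-agentcore", region_name=...)` instead of the stub, along with your own memory ID.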

Memory Architecture Benefits

Runtime Only	Runtime + AgentCore Memory
✅ Session context (up to 8 hours)	✅ Session context (up to 8 hours)
❌ Lost after session ends	✅ Persistent across sessions
❌ No user learning	✅ User preferences remembered
❌ Repeat information	✅ Intelligent context retrieval
❌ Generic responses	✅ Personalized interactions

Key Cross-Session Features

User persistence - The same user ID is remembered across different session IDs
Automatic extraction - Key insights stored during conversations
Intelligent retrieval - Relevant past context when needed
Preference learning - User choices persist across visits
Scalable architecture - Fully managed cross-session service

The invoke Function

The invoke function is the main entry point for your AgentCore agent. It:

Receives user prompts and context from AgentCore Runtime
Extracts session and actor IDs for memory management
Creates or retrieves the agent instance with memory configuration
Processes the user message and returns the response

@app.entrypoint
def invoke(payload, context):
    """AgentCore Runtime entry point with lazy-loaded agent"""
    # Extract user prompt
    prompt = payload.get("prompt", "Hello!")

    # Get session/actor info for memory
    actor_id = context.request_headers.get('X-Amzn-Bedrock-AgentCore-Runtime-Custom-Actor-Id', 'user') if context.request_headers else 'user'
    session_id = context.session_id or 'default_session'

    # Get agent with memory
    agent = get_or_create_agent(actor_id, session_id)

    # Process and return response
    result = agent(prompt)
    return {"response": result.message.get('content', [{}])[0].get('text', str(result))}
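
One detail in the return line: it reads only the first content block and raises an `IndexError` if the content list is empty. If you want something more forgiving, here is a sketch of a hypothetical `extract_text` helper. It assumes the message is a dict whose `content` is a list of blocks, some of which carry a `text` key, as the code above implies.

```python
def extract_text(message, fallback=""):
    """Join the text of every content block that has one; use fallback if none do."""
    content = message.get("content") or []
    texts = [
        block["text"]
        for block in content
        if isinstance(block, dict) and "text" in block
    ]
    return "\n".join(texts) if texts else fallback


# Example messages shaped like the one the entry point reads
with_tool = {"content": [{"toolUse": {"name": "calculator"}}, {"text": "2 + 2 = 4"}]}
empty = {"content": []}

print(extract_text(with_tool))
print(extract_text(empty, fallback="No response"))
```

The entry point could then return `{"response": extract_text(result.message, fallback=str(result))}`.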

Clean Up

Remove all resources:

agentcore destroy

This removes the AgentCore deployment, memory stores, ECR repository, IAM roles, and CloudWatch logs.

🎉 You Just Built Cross-Session Memory AI Agents!

Your agents now remember users across sessions, learn from past interactions, and provide personalized experiences that persist beyond session boundaries. What intelligent applications will you build? 🚀
