DEV Community

Sudarshan Gouda

AI Agent Memory: From Manual Implementation to Mem0 to AWS AgentCORE

Introduction

AI agents need memory to remember past conversations, user preferences, and learned information. Just like humans have different types of memory (short-term, long-term, episodic), AI agents use different memory systems to function effectively.

This guide explains memory in simple terms and shows you how to implement it both without external tools (using pure Python) and with external tools (using specialized services). We'll end with a complete end-to-end solution using Mem0 that combines all memory types.


Understanding Memory Types (Simple Explanation)

Think of AI agent memory like human memory:

| Memory Type | What It Does | Simple Example |
| --- | --- | --- |
| Short-term Memory | Remembers current conversation | "What did the user just say?" |
| Long-term Memory | Remembers across sessions | "User prefers dark mode" (even after days) |
| Episodic Memory | Remembers specific past events | "Last week, user asked about Python" |
| Semantic Memory | Remembers facts and knowledge | "User is a software developer" |

Part 1: Memory Without External Tools

When you don't want to use external databases or services, you can implement memory using pure Python. This is great for:

  • Learning and prototyping
  • Small applications
  • Full control over your data

1.1 Simple Short-Term Memory (Current Conversation)

What it does: Keeps track of the current conversation.

class SimpleShortTermMemory:
    """Remembers the current conversation"""

    def __init__(self, max_messages=10):
        self.messages = []
        self.max_messages = max_messages

    def add_message(self, role, content):
        """Add a message (user or assistant)"""
        self.messages.append({"role": role, "content": content})

        # Keep only recent messages
        if len(self.messages) > self.max_messages:
            self.messages.pop(0)  # Remove oldest

    def get_conversation(self):
        """Get all messages for the LLM"""
        return self.messages

# Usage
memory = SimpleShortTermMemory(max_messages=5)

memory.add_message("user", "Hi, I'm Alice")
memory.add_message("assistant", "Hello Alice! How can I help?")
memory.add_message("user", "What's my name?")

# Get conversation context
context = memory.get_conversation()
# LLM can now see: user said "Hi, I'm Alice" and assistant responded
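The sliding window above removes the oldest message with `pop(0)`, which shifts every remaining element. A minimal alternative sketch, assuming the same `add_message` / `get_conversation` interface, uses `collections.deque` with `maxlen` so the oldest entry is dropped automatically. The class name `DequeShortTermMemory` is made up for this illustration.

```python
from collections import deque

class DequeShortTermMemory:
    """Sliding-window conversation buffer backed by collections.deque."""

    def __init__(self, max_messages=10):
        # maxlen makes the deque discard the oldest item on overflow
        self.messages = deque(maxlen=max_messages)

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})

    def get_conversation(self):
        # Return a plain list so callers can index or serialize it
        return list(self.messages)

window = DequeShortTermMemory(max_messages=2)
window.add_message("user", "Hi, I'm Alice")
window.add_message("assistant", "Hello Alice! How can I help?")
window.add_message("user", "What's my name?")
print(window.get_conversation())  # only the two most recent messages remain
```

For a window of 10 messages the difference is negligible; the deque mainly removes the manual length check.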

1.2 Simple Long-Term Memory (User Preferences)

What it does: Remembers user preferences across sessions.

import json
import os

class SimpleLongTermMemory:
    """Remembers user preferences and facts"""

    def __init__(self, storage_file="memory.json"):
        self.storage_file = storage_file
        self.data = self._load()

    def _load(self):
        """Load from file"""
        if os.path.exists(self.storage_file):
            with open(self.storage_file, 'r') as f:
                return json.load(f)
        # "facts" must be a dict keyed by user_id, matching remember_fact/get_facts
        return {"preferences": {}, "facts": {}}

    def _save(self):
        """Save to file"""
        with open(self.storage_file, 'w') as f:
            json.dump(self.data, f, indent=2)

    def remember_preference(self, user_id, key, value):
        """Remember a user preference"""
        if user_id not in self.data["preferences"]:
            self.data["preferences"][user_id] = {}
        self.data["preferences"][user_id][key] = value
        self._save()

    def get_preference(self, user_id, key):
        """Get a user preference"""
        return self.data["preferences"].get(user_id, {}).get(key)

    def remember_fact(self, user_id, fact):
        """Remember a fact about the user"""
        if user_id not in self.data["facts"]:
            self.data["facts"][user_id] = []
        self.data["facts"][user_id].append(fact)
        self._save()

    def get_facts(self, user_id):
        """Get all facts about a user"""
        return self.data["facts"].get(user_id, [])

# Usage
ltm = SimpleLongTermMemory()

# Remember preferences
ltm.remember_preference("alice_123", "theme", "dark")
ltm.remember_preference("alice_123", "language", "Python")

# Remember facts
ltm.remember_fact("alice_123", "User is a software developer")
ltm.remember_fact("alice_123", "User works at TechCorp")

# Later, retrieve memories
theme = ltm.get_preference("alice_123", "theme")  # Returns "dark"
facts = ltm.get_facts("alice_123")  # Returns list of facts
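Because `_save` rewrites the whole JSON file on every change, a crash in the middle of a write could leave a truncated file behind. One possible hardening, sketched here under the assumption that the temp file and the target live on the same filesystem, is to write to a temporary file and swap it into place with `os.replace`. The helper name `atomic_write_json` is hypothetical and not part of the class above.

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    """Write JSON to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        # os.replace overwrites the target in a single rename step
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "memory.json")
atomic_write_json(demo_path, {"preferences": {"alice_123": {"theme": "dark"}}, "facts": {}})
with open(demo_path) as f:
    loaded = json.load(f)
print(loaded["preferences"]["alice_123"]["theme"])
```

Calling a helper like this from `_save` would keep the previous file intact until the new one is completely written.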

1.3 Simple Episodic Memory (Past Interactions)

What it does: Remembers specific past conversations to learn from them.

class SimpleEpisodicMemory:
    """Remembers past interactions"""

    def __init__(self, max_episodes=100):
        self.episodes = []
        self.max_episodes = max_episodes

    def add_episode(self, user_query, assistant_response, outcome="success"):
        """Store a past interaction"""
        episode = {
            "query": user_query,
            "response": assistant_response,
            "outcome": outcome
        }
        self.episodes.append(episode)

        # Keep only recent episodes
        if len(self.episodes) > self.max_episodes:
            self.episodes.pop(0)

    def find_similar(self, query, top_k=3):
        """Find similar past interactions"""
        # Simple keyword matching
        query_words = set(query.lower().split())
        scored = []

        for episode in self.episodes:
            episode_words = set(episode["query"].lower().split())
            # Count matching words
            matches = len(query_words.intersection(episode_words))
            if matches > 0:
                scored.append((matches, episode))

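        # Sort by match count only (tuples with equal counts would otherwise
        # fall back to comparing the episode dicts, which raises TypeError)
        scored.sort(key=lambda pair: pair[0], reverse=True)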
        return [ep for _, ep in scored[:top_k]]

# Usage
episodic = SimpleEpisodicMemory()

# Store past successful interactions
episodic.add_episode(
    "How do I create a Python virtual environment?",
    "Use: python -m venv myenv, then activate with: source myenv/bin/activate",
    outcome="success"
)

episodic.add_episode(
    "What's the best way to handle Python dependencies?",
    "Use requirements.txt or pyproject.toml with pip or poetry",
    outcome="success"
)

# Find similar past interactions
similar = episodic.find_similar("How do I set up a Python project?")
# Returns similar past episodes that can be used as examples
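The keyword matching above splits on whitespace, so "project?" and "project" count as different words, and long queries naturally collect more raw matches. A small sketch of one way to tighten this, using a regex tokenizer and Jaccard similarity (shared words divided by all distinct words). The function names `tokenize` and `jaccard` are illustrative and not part of the class above.

```python
import re

def tokenize(text):
    """Lowercase the text and keep alphanumeric runs (apostrophes allowed)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard(a, b):
    """Shared words divided by total distinct words, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

query = tokenize("How do I set up a Python project?")
past = tokenize("Python project setup: what do I need?")
score = jaccard(query, past)
print(sorted(query & past), round(score, 3))
```

A normalized score like this could replace the raw `matches` count in `find_similar` so that short and long queries are ranked on the same scale.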

1.4 Simple Semantic Memory (Knowledge Base)

What it does: Stores facts and knowledge that can be searched.

class SimpleSemanticMemory:
    """Stores and searches knowledge"""

    def __init__(self):
        self.knowledge = []

    def add_knowledge(self, content, category="general"):
        """Add a piece of knowledge"""
        self.knowledge.append({
            "content": content,
            "category": category
        })

    def search(self, query, top_k=3):
        """Search for relevant knowledge"""
        query_words = set(query.lower().split())
        scored = []

        for item in self.knowledge:
            content_words = set(item["content"].lower().split())
            matches = len(query_words.intersection(content_words))
            if matches > 0:
                scored.append((matches, item))

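        # Sort by match count only; ties would otherwise compare dicts and raise TypeError
        scored.sort(key=lambda pair: pair[0], reverse=True)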
        return [item for _, item in scored[:top_k]]

# Usage
semantic = SimpleSemanticMemory()

# Add knowledge
semantic.add_knowledge("Alice is a data scientist at TechCorp", "user_profile")
semantic.add_knowledge("Alice prefers detailed technical explanations", "preferences")
semantic.add_knowledge("Alice uses Python and scikit-learn", "tools")

# Search for relevant knowledge
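results = semantic.search("What tools does Alice use?")
# Results are dicts with "content" and "category" keys. With plain keyword
# matching, every stored item shares only the word "alice" with this query
# ("use?" does not match "uses"), so the tools entry is not singled out.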
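Each knowledge item above is stored with a `category`, but `search` never uses it. As a hedged sketch, the same keyword scoring can be restricted to one category. The standalone function `search_by_category` and its `category` parameter are assumptions added for illustration, operating on a list shaped like `self.knowledge`.

```python
def search_by_category(knowledge, query, category=None, top_k=3):
    """Keyword search over knowledge items, optionally limited to one category."""
    query_words = set(query.lower().split())
    scored = []
    for item in knowledge:
        # Skip items outside the requested category (None means "any")
        if category is not None and item["category"] != category:
            continue
        matches = len(query_words & set(item["content"].lower().split()))
        if matches > 0:
            scored.append((matches, item))
    # Sort on the score only so tied items never compare dicts
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]

knowledge = [
    {"content": "Alice is a data scientist at TechCorp", "category": "user_profile"},
    {"content": "Alice prefers detailed technical explanations", "category": "preferences"},
    {"content": "Alice uses Python and scikit-learn", "category": "tools"},
]
hits = search_by_category(knowledge, "What tools does Alice use?", category="tools")
print([h["content"] for h in hits])
```

Filtering first narrows the candidates, which helps when keyword overlap alone cannot separate the entries.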

1.5 Complete Example: All Memory Types Together

class SimpleMemoryAgent:
    """Agent with all memory types (no external tools)"""

    def __init__(self):
        self.short_term = SimpleShortTermMemory(max_messages=10)
        self.long_term = SimpleLongTermMemory()
        self.episodic = SimpleEpisodicMemory()
        self.semantic = SimpleSemanticMemory()

    def process_query(self, user_id, user_query):
        """Process a user query using all memory types"""

        # 1. Get long-term memories (preferences, facts)
        preferences = self.long_term.data.get("preferences", {}).get(user_id, {})
        facts = self.long_term.get_facts(user_id)

        # 2. Get similar past episodes (few-shot examples)
        similar_episodes = self.episodic.find_similar(user_query, top_k=2)

        # 3. Get relevant knowledge
        relevant_knowledge = self.semantic.search(user_query, top_k=2)

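        # 4. Build context for LLM. The joins happen outside the f-string because
        #    backslashes inside f-string expressions are a SyntaxError before Python 3.12
        episodes_text = "\n".join(
            f"Q: {e['query']}\nA: {e['response']}" for e in similar_episodes
        )
        knowledge_text = "\n".join(k["content"] for k in relevant_knowledge)
        context = f"""User Preferences: {preferences}
Known Facts: {facts}

Similar Past Interactions:
{episodes_text}

Relevant Knowledge:
{knowledge_text}

Current Conversation:
{self.short_term.get_conversation()}
"""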

        # 5. Add to short-term memory
        self.short_term.add_message("user", user_query)

        # 6. Generate response (placeholder - replace with an LLM call that receives `context`)
        response = f"Response to: {user_query}"

        # 7. Store in episodic memory
        self.episodic.add_episode(user_query, response, outcome="success")

        # 8. Add response to short-term memory
        self.short_term.add_message("assistant", response)

        return response

# Usage
agent = SimpleMemoryAgent()

# Set up some memories
agent.long_term.remember_preference("alice_123", "theme", "dark")
agent.semantic.add_knowledge("Alice is a Python developer", "profile")

# Process queries
response1 = agent.process_query("alice_123", "Hi, I'm Alice")
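response2 = agent.process_query("alice_123", "What's my favorite theme?")
# The stored preference ("dark") is included in the context built for the LLM;
# the placeholder response does not use it until a real LLM call is wired in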

Part 2: Memory With External Tools

External tools provide better scalability, persistence, and advanced features like semantic search. This is better for:

  • Production applications
  • Large-scale systems
  • Multiple users
  • Advanced search capabilities

2.1 LangGraph Checkpointer (Short-Term + Persistence)

What it does: Manages conversation state per thread through a checkpointer, which saves the state after each step.

from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages import HumanMessage, AIMessage
from langchain_openai import ChatOpenAI
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

# Define state
class ConversationState(TypedDict):
    messages: Annotated[list, add_messages]

# Initialize LLM
llm = ChatOpenAI(model="gpt-4")

# Create graph
def chat_node(state: ConversationState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

workflow = StateGraph(ConversationState)
workflow.add_node("chat", chat_node)
workflow.add_edge(START, "chat")
workflow.add_edge("chat", END)

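# Add a checkpointer. InMemorySaver keeps state in process memory only, so it is
# lost on restart; use a database-backed saver (e.g. SqliteSaver or PostgresSaver)
# when conversations must survive restarts
checkpointer = InMemorySaver()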
graph = workflow.compile(checkpointer=checkpointer)

# Usage - each thread_id maintains separate conversation
config = {"configurable": {"thread_id": "user_alice"}}

# First message
result = graph.invoke(
    {"messages": [HumanMessage(content="Hi, I'm Alice!")]},
    config
)

# Second message - remembers previous conversation
result = graph.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config
)
# LLM remembers: "Alice"
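To make the role of `thread_id` concrete, here is a toy stand-in written in plain Python. It is not how LangGraph implements checkpointing; it only illustrates the idea that state is looked up by the `thread_id` in the config, so two threads never see each other's messages. The class `ToyThreadStore` is invented for this sketch.

```python
from collections import defaultdict

class ToyThreadStore:
    """Toy illustration only: keeps one message list per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def invoke(self, new_messages, config):
        # Same config shape as in the example above
        thread_id = config["configurable"]["thread_id"]
        self._threads[thread_id].extend(new_messages)
        # Return a copy of everything this thread has seen so far
        return list(self._threads[thread_id])

store = ToyThreadStore()
alice = {"configurable": {"thread_id": "user_alice"}}
bob = {"configurable": {"thread_id": "user_bob"}}

store.invoke(["Hi, I'm Alice!"], alice)
alice_view = store.invoke(["What's my name?"], alice)
bob_view = store.invoke(["Hello"], bob)
print(alice_view)
print(bob_view)
```

Reusing a `thread_id` continues a conversation, while a new `thread_id` starts from an empty state.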

2.2 ChromaDB (Semantic Memory)

What it does: Vector database for semantic search over knowledge.

import os
import uuid

import chromadb
from chromadb.utils import embedding_functions

class ChromaSemanticMemory:
    """Semantic memory using ChromaDB"""

    def __init__(self, collection_name="knowledge"):
        self.client = chromadb.PersistentClient(path="./chroma_db")

        # Use OpenAI embeddings (API key read from the OPENAI_API_KEY environment variable)
        self.embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
            api_key=os.environ["OPENAI_API_KEY"],
            model_name="text-embedding-3-small"
        )

        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            embedding_function=self.embedding_fn
        )

    def add_knowledge(self, content, metadata=None):
        """Add knowledge to the database"""
        # Chroma needs a unique ID per document and rejects empty metadata dicts
        self.collection.add(
            ids=[str(uuid.uuid4())],
            documents=[content],
            metadatas=[metadata] if metadata else None
        )

    def search(self, query, n_results=3):
        """Search for semantically similar knowledge"""
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results
        )
        return results['documents'][0]  # Returns list of relevant content

# Usage
memory = ChromaSemanticMemory()

# Add knowledge
memory.add_knowledge("Alice prefers Python over JavaScript")
memory.add_knowledge("Alice is building a recommendation system")

# Search semantically
results = memory.search("What programming language does Alice like?")
# Returns the stored documents ranked by similarity (up to n_results);
# "Alice prefers Python over JavaScript" is expected to rank first.
# Even though the query doesn't match word for word, semantic search finds it

2.3 Pinecone (Episodic Memory at Scale)

What it does: Cloud vector database for storing millions of past interactions.

from pinecone import Pinecone, ServerlessSpec
from openai import OpenAI

class PineconeEpisodicMemory:
    """Episodic memory using Pinecone"""

    def __init__(self, index_name="episodes"):
        self.pc = Pinecone()    # reads PINECONE_API_KEY from the environment
        self.openai = OpenAI()  # reads OPENAI_API_KEY from the environment

        # Create index if needed
        if index_name not in [idx.name for idx in self.pc.list_indexes()]:
            self.pc.create_index(
                name=index_name,
                dimension=1536,  # OpenAI embedding dimension
                metric="cosine",
                spec=ServerlessSpec(cloud="aws", region="us-east-1")
            )

        self.index = self.pc.Index(index_name)

    def store_episode(self, episode_id, query, response, user_id):
        """Store a past interaction"""
        # Create embedding
        text = f"Query: {query}\nResponse: {response}"
        embedding = self.openai.embeddings.create(
            model="text-embedding-3-small",
            input=text
        ).data[0].embedding

        # Store in Pinecone
        self.index.upsert(vectors=[{
            "id": episode_id,
            "values": embedding,
            "metadata": {
                "query": query,
                "response": response,
                "user_id": user_id
            }
        }])

    def find_similar(self, query, user_id=None, top_k=3):
        """Find similar past interactions"""
        # Create query embedding
        embedding = self.openai.embeddings.create(
            model="text-embedding-3-small",
            input=query
        ).data[0].embedding

        # Search
        results = self.index.query(
            vector=embedding,
            top_k=top_k,
            include_metadata=True,
            filter={"user_id": user_id} if user_id else None
        )

        return [match.metadata for match in results.matches]

# Usage
episodic = PineconeEpisodicMemory()

# Store episodes
episodic.store_episode(
    "ep_001",
    "How do I optimize a database query?",
    "Add indexes, use EXPLAIN, and optimize WHERE clauses",
    user_id="alice_123"
)

# Find similar
similar = episodic.find_similar(
    "My database is slow, what should I do?",
    user_id="alice_123"
)
# Returns similar past interactions
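`store_episode` expects the caller to supply an `episode_id` such as "ep_001". One option, sketched here as an assumption rather than a Pinecone requirement, is to derive the ID from the episode's content. Since an upsert with an existing ID overwrites that vector, storing the same interaction twice would then not create a duplicate. The helper name `make_episode_id` is hypothetical.

```python
import hashlib

def make_episode_id(user_id, query, response):
    """Derive a stable ID from the episode's content."""
    # Join with a separator character that is unlikely to appear in the text
    payload = "\x1f".join([user_id, query, response]).encode("utf-8")
    return "ep_" + hashlib.sha256(payload).hexdigest()[:16]

id_a = make_episode_id(
    "alice_123",
    "How do I optimize a database query?",
    "Add indexes, use EXPLAIN, and optimize WHERE clauses",
)
id_b = make_episode_id(
    "alice_123",
    "How do I optimize a database query?",
    "Add indexes, use EXPLAIN, and optimize WHERE clauses",
)
id_c = make_episode_id("bob_456", "How do I optimize a database query?", "Add indexes")
print(id_a, id_a == id_b, id_a == id_c)
```

If every interaction should be kept even when repeated, a random ID such as `uuid.uuid4()` would be the simpler choice.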

Part 3: End-to-End Solution with Mem0 (All Memory Types)

Mem0 is a specialized service that handles all memory types automatically. It extracts, stores, and retrieves memories intelligently.

Complete Mem0 Implementation

from mem0 import MemoryClient
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from typing import List, Dict

class Mem0MemoryAgent:
    """Complete agent using Mem0 for all memory types"""

    def __init__(self):
        # Initialize Mem0 (requires MEM0_API_KEY environment variable)
        self.mem0 = MemoryClient()

        # Initialize LLM
        self.llm = ChatOpenAI(model="gpt-4")

        # Create prompt template with memory context
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a helpful personal assistant with memory.
Use the provided memories to personalize your responses.

Relevant Memories:
{memories}

Use these memories to provide personalized, context-aware responses."""),
            MessagesPlaceholder(variable_name="history"),
            ("user", "{input}")
        ])

    def get_memories(self, query: str, user_id: str) -> str:
        """Retrieve relevant memories for the current query"""
        try:
            results = self.mem0.search(query, user_id=user_id)
            # Depending on the client version, search may return a bare list
            # or a {"results": [...]} dict, so accept both shapes
            items = results.get("results", []) if isinstance(results, dict) else results
            if items:
                return "\n".join(f"- {mem['memory']}" for mem in items)
            return "No relevant memories found."
        except Exception as e:
            print(f"Memory retrieval error: {e}")
            return "No relevant memories found."

    def save_interaction(self, user_id: str, user_input: str, assistant_response: str):
        """Save interaction to Mem0 - it automatically extracts memories"""
        try:
            self.mem0.add(
                messages=[
                    {"role": "user", "content": user_input},
                    {"role": "assistant", "content": assistant_response}
                ],
                user_id=user_id
            )
        except Exception as e:
            print(f"Memory save error: {e}")

    def chat(self, user_input: str, user_id: str, history: List[Dict] = None) -> str:
        """Main chat function with full memory integration"""
        history = history or []

        # 1. Retrieve relevant memories (Mem0 handles all memory types)
        memories = self.get_memories(user_input, user_id)

        # 2. Generate response with memory context
        chain = self.prompt | self.llm
        response = chain.invoke({
            "memories": memories,
            "history": history,
            "input": user_input
        })

        # 3. Save interaction (Mem0 automatically extracts and stores memories)
        self.save_interaction(user_id, user_input, response.content)

        return response.content

    def get_all_memories(self, user_id: str) -> List[Dict]:
        """Get all memories for a user"""
        try:
            results = self.mem0.get_all(user_id=user_id)
            # Accept either a bare list or a {"results": [...]} payload
            return results.get("results", []) if isinstance(results, dict) else list(results)
        except Exception as e:
            print(f"Error retrieving memories: {e}")
            return []

    def delete_memory(self, memory_id: str):
        """Delete a specific memory"""
        try:
            self.mem0.delete(memory_id=memory_id)
        except Exception as e:
            print(f"Error deleting memory: {e}")

# Complete Usage Example
def main():
    """End-to-end example using Mem0"""

    # Initialize agent
    agent = Mem0MemoryAgent()
    user_id = "alice_123"
    conversation_history = []

    print("=== Conversation 1 ===")
    # First interaction
    user_input1 = "Hi! I'm Alice and I love hiking in the mountains. I'm a Python developer at TechCorp."
    response1 = agent.chat(user_input1, user_id, conversation_history)
    print(f"User: {user_input1}")
    print(f"Assistant: {response1}\n")

    # Update history
    conversation_history.append({"role": "user", "content": user_input1})
    conversation_history.append({"role": "assistant", "content": response1})

    print("=== Conversation 2 (Same Session) ===")
    # Second interaction - Mem0 remembers from first conversation
    user_input2 = "What outdoor activities would you recommend for this weekend?"
    response2 = agent.chat(user_input2, user_id, conversation_history)
    print(f"User: {user_input2}")
    print(f"Assistant: {response2}\n")
    # Mem0 recalls: Alice loves hiking → recommends hiking activities

    print("=== Conversation 3 (New Session - Days Later) ===")
    # New session - Mem0 still remembers!
    user_input3 = "What programming language should I use for my new project?"
    response3 = agent.chat(user_input3, user_id, [])  # Empty history, but Mem0 remembers
    print(f"User: {user_input3}")
    print(f"Assistant: {response3}\n")
    # Mem0 recalls: Alice is a Python developer → recommends Python

    print("=== All Memories for User ===")
    # View all stored memories
    all_memories = agent.get_all_memories(user_id)
    for i, mem in enumerate(all_memories, 1):
        print(f"{i}. {mem.get('memory', 'N/A')}")

    print("\n=== Memory Types Handled by Mem0 ===")
    print("""
    Mem0 automatically handles:
    - Short-term Memory: Current conversation context
    - Long-term Memory: User preferences and facts (persisted)
    - Episodic Memory: Past interactions and experiences
    - Semantic Memory: Knowledge about the user and domain

    All extracted automatically from conversations!
    """)

if __name__ == "__main__":
    main()

How Mem0 Handles All Memory Types

Mem0 automatically extracts and manages different memory types:

  1. Short-term Memory: Conversation context during the session (in the example above, the caller also passes the running history list into the prompt)
  2. Long-term Memory: Extracts user preferences and facts, stores them persistently
  3. Episodic Memory: Remembers specific past interactions and their outcomes
  4. Semantic Memory: Builds a knowledge base about users and topics

Key Benefits of Mem0:

  • ✅ Automatic memory extraction (no manual coding)
  • ✅ Intelligent retrieval (finds relevant memories)
  • ✅ Handles all memory types automatically
  • ✅ Production-ready and scalable
  • ✅ Simple API
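The agent above only calls `add`, `search`, and `get_all` on the memory client, so its flow can be tried offline with a stand-in object. The sketch below is a hypothetical in-memory fake, not the Mem0 client: it stores user messages verbatim and searches by keyword overlap, whereas the real service extracts and ranks memories for you. The `{"results": [...]}` return shape simply mirrors what the article's code reads.

```python
class FakeMemoryClient:
    """Hypothetical in-memory stand-in for experiments; NOT the Mem0 client."""

    def __init__(self):
        self._store = {}  # user_id -> list of memory dicts

    def add(self, messages, user_id):
        # Simplification: keep each user message verbatim as one "memory"
        bucket = self._store.setdefault(user_id, [])
        for msg in messages:
            if msg["role"] == "user":
                bucket.append({"id": f"{user_id}-{len(bucket)}", "memory": msg["content"]})

    def search(self, query, user_id):
        # Keyword overlap instead of semantic retrieval
        words = set(query.lower().split())
        hits = [m for m in self._store.get(user_id, [])
                if words & set(m["memory"].lower().split())]
        return {"results": hits}

    def get_all(self, user_id):
        return {"results": list(self._store.get(user_id, []))}

fake = FakeMemoryClient()
fake.add(
    [{"role": "user", "content": "I love hiking in the mountains"},
     {"role": "assistant", "content": "Noted!"}],
    user_id="alice_123",
)
found = fake.search("Any hiking ideas?", user_id="alice_123")
print(found["results"])
```

Assigning an instance of this fake to `agent.mem0` would let the retrieval and save steps run without an API key, although the LLM call still needs its own credentials.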

Part 4: AWS AgentCORE Memory (Alternative to Mem0)

Amazon Bedrock AgentCore Memory (called AWS AgentCORE Memory in this article) is a fully managed AWS service that provides capabilities similar to Mem0's. It is designed for applications already running on AWS and offers enterprise-grade features.

Can AWS AgentCORE Memory be Used Like Mem0?

Yes! AWS AgentCORE Memory can be used similarly to Mem0. Both provide:

  • Short-term and long-term memory
  • Automatic memory extraction
  • Context-aware retrieval
  • Multi-session persistence

Key Differences

| Feature | Mem0 | AWS AgentCORE Memory |
| --- | --- | --- |
| Deployment | Open-source + Managed | Fully managed AWS service |
| Integration | Works with any LLM | Optimized for AWS Bedrock |
| Setup | Simple API key | AWS account + IAM setup |
| Cost | Usage-based pricing | AWS pricing model |
| Customization | Open-source option available | AWS-managed (less customization) |
| Best For | Multi-cloud, flexibility | AWS-native applications |

AWS AgentCORE Memory Implementation

Note: This requires the bedrock-agentcore Python SDK. Install with:

pip install bedrock-agentcore
from bedrock_agentcore.memory import MemoryClient
from bedrock_agentcore.memory.session import MemorySessionManager
from bedrock_agentcore.memory.constants import ConversationalMessage, MessageRole
from typing import List, Dict, Optional
from datetime import datetime

class AWSAgentCOREMemory:
    """Agent using AWS Bedrock AgentCORE Memory"""

    def __init__(self, region_name="us-east-1", memory_name="AgentMemory"):
        # Initialize Memory Client
        self.memory_client = MemoryClient(region_name=region_name)

        # Create or get memory resource
        self.memory = self._get_or_create_memory(memory_name)
        self.memory_id = self.memory['id']

        # Initialize session manager
        self.session_manager = MemorySessionManager(
            memory_id=self.memory_id,
            region_name=region_name
        )

    def _get_or_create_memory(self, name: str) -> Dict:
        """Create or retrieve memory resource"""
        try:
            # Try to get existing memory
            memories = self.memory_client.list_memories()
            for mem in memories.get('memories', []):
                if mem.get('name') == name:
                    return mem

            # Create new memory if not found
            memory = self.memory_client.create_memory(
                name=name,
                description="Memory store for AI agent",
                eventExpiryDuration=30,  # Store events for 30 days
                memoryStrategies=[
                    {
                        "userPreferenceMemoryStrategy": {
                            "name": "UserPreferences",
                            "namespaces": ["agent/{actorId}/preferences"]
                        }
                    },
                    {
                        "semanticMemoryStrategy": {
                            "name": "SemanticKnowledge",
                            "namespaces": ["agent/{actorId}/knowledge"]
                        }
                    }
                ]
            )
            return memory
        except Exception as e:
            print(f"Error creating memory: {e}")
            raise

    def store_interaction(self, user_id: str, session_id: str, 
                         user_message: str, assistant_message: str):
        """Store interaction in short-term memory"""
        try:
            # Create or get session
            session = self.session_manager.create_memory_session(
                actor_id=user_id,
                session_id=session_id
            )

            # Add conversation turns
            session.add_turns(
                messages=[
                    ConversationalMessage(user_message, MessageRole.USER),
                    ConversationalMessage(assistant_message, MessageRole.ASSISTANT)
                ]
            )
        except Exception as e:
            print(f"Error storing interaction: {e}")

    def get_recent_events(self, user_id: str, session_id: str, max_results: int = 10) -> List[Dict]:
        """Get recent events from short-term memory"""
        try:
            events = self.memory_client.list_events(
                memory_id=self.memory_id,
                actor_id=user_id,
                session_id=session_id,
                max_results=max_results
            )
            return events.get('events', [])
        except Exception as e:
            print(f"Error retrieving events: {e}")
            return []

    def retrieve_long_term_memories(self, user_id: str, query: str, 
                                   top_k: int = 5) -> List[Dict]:
        """Retrieve long-term memories (preferences, facts)"""
        try:
            # Search in preferences namespace
            preferences = self.memory_client.retrieve_memory_records(
                memory_id=self.memory_id,
                namespace=f"agent/{user_id}/preferences",
                searchCriteria={
                    "searchQuery": query,
                    "topK": top_k
                }
            )

            # Search in knowledge namespace
            knowledge = self.memory_client.retrieve_memory_records(
                memory_id=self.memory_id,
                namespace=f"agent/{user_id}/knowledge",
                searchCriteria={
                    "searchQuery": query,
                    "topK": top_k
                }
            )

            # Combine results
            all_memories = (preferences.get('memoryRecords', []) + 
                          knowledge.get('memoryRecords', []))
            return all_memories[:top_k]

        except Exception as e:
            print(f"Error retrieving long-term memories: {e}")
            return []

    def get_all_long_term_memories(self, user_id: str) -> List[Dict]:
        """Get all long-term memories for a user"""
        try:
            session = self.session_manager.create_memory_session(
                actor_id=user_id,
                session_id="retrieval_session"
            )

            # List all memory records
            memory_records = session.list_long_term_memory_records(
                namespace_prefix=f"agent/{user_id}/"
            )

            return list(memory_records)
        except Exception as e:
            print(f"Error getting all memories: {e}")
            return []

    def chat(self, user_input: str, user_id: str, session_id: str, 
             llm_callback=None) -> str:
        """
        Main chat function with AgentCORE Memory integration

        Args:
            user_input: User's message
            user_id: Unique user identifier
            session_id: Session identifier
            llm_callback: Function to call LLM (you provide this)

        Returns:
            Assistant response
        """
        # 1. Get recent events (short-term memory)
        recent_events = self.get_recent_events(user_id, session_id, max_results=5)

        # 2. Get long-term memories (preferences, facts)
        long_term_memories = self.retrieve_long_term_memories(
            user_id, user_input, top_k=3
        )

        # 3. Build context from memories
        context = self._build_context(recent_events, long_term_memories)

        # 4. Generate response using LLM (you provide this function)
        if llm_callback:
            assistant_response = llm_callback(user_input, context)
        else:
            # Placeholder response
            assistant_response = f"Response to: {user_input}"

        # 5. Store interaction in memory
        self.store_interaction(user_id, session_id, user_input, assistant_response)

        return assistant_response

    def _build_context(self, recent_events: List[Dict], 
                      long_term_memories: List[Dict]) -> str:
        """Build context string from memories"""
        context_parts = []

        # Add recent conversation context
        if recent_events:
            context_parts.append("Recent Conversation:")
            for event in recent_events[-5:]:  # Last 5 events
                messages = event.get('messages', [])
                for msg in messages:
                    role = msg.get('role', '')
                    content = msg.get('content', '')
                    context_parts.append(f"{role}: {content}")

        # Add long-term memories
        if long_term_memories:
            context_parts.append("\nRelevant Memories:")
            for mem in long_term_memories:
                content = mem.get('content', {}).get('text', '')
                if content:
                    context_parts.append(f"- {content}")

        return "\n".join(context_parts)

# Usage Example
def llm_generate(user_input: str, context: str) -> str:
    """
    Example LLM callback function
    In production, replace with actual LLM call (Bedrock, OpenAI, etc.)
    """
    # This is a placeholder - replace with your LLM
    return f"Based on context: {context[:50]}... Response to: {user_input}"

def main_aws():
    """Example using AWS AgentCORE Memory"""

    # Initialize AgentCORE Memory
    agent = AWSAgentCOREMemory(region_name="us-east-1", memory_name="MyAgentMemory")
    user_id = "alice_123"
    session_id = f"session_{datetime.now().timestamp()}"

    print("=== AWS AgentCORE Memory Example ===\n")

    # First interaction
    print("--- Conversation 1 ---")
    user_input1 = "Hi! I'm Alice and I love hiking in the mountains. I'm a Python developer at TechCorp."
    response1 = agent.chat(
        user_input1,
        user_id,
        session_id,
        llm_callback=llm_generate
    )
    print(f"User: {user_input1}")
    print(f"Assistant: {response1}\n")
    # AgentCORE stores this in short-term memory and extracts long-term memories

    # Second interaction - AgentCORE remembers from short-term memory
    print("--- Conversation 2 (Same Session) ---")
    user_input2 = "What outdoor activities would you recommend for this weekend?"
    response2 = agent.chat(
        user_input2,
        user_id,
        session_id,
        llm_callback=llm_generate
    )
    print(f"User: {user_input2}")
    print(f"Assistant: {response2}\n")
    # AgentCORE recalls: Alice loves hiking (from short-term memory)

    # New session - long-term memory persists
    print("--- Conversation 3 (New Session - Days Later) ---")
    new_session_id = f"session_{datetime.now().timestamp()}"
    user_input3 = "What programming language should I use for my new project?"
    response3 = agent.chat(
        user_input3,
        user_id,
        new_session_id,
        llm_callback=llm_generate
    )
    print(f"User: {user_input3}")
    print(f"Assistant: {response3}\n")
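    # AgentCORE recalls from long-term memory: Alice is a Python developer
    # (long-term extraction runs in the background, so a fact stored moments ago may not be searchable yet)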

    # View all stored memories
    print("--- All Long-Term Memories for User ---")
    all_memories = agent.get_all_long_term_memories(user_id)
    for i, mem in enumerate(all_memories, 1):
        content = mem.get('content', {}).get('text', 'N/A')
        print(f"{i}. {content}")

if __name__ == "__main__":
    # Note: Requires:
    # 1. AWS credentials configured (aws configure)
    # 2. Bedrock AgentCORE access enabled
    # 3. Install: pip install bedrock-agentcore
    # 
    # Uncomment to run:
    # main_aws()
    pass
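To see what the context builder hands to the LLM without touching AWS, here is a minimal standalone sketch of the same logic. The function name `build_context` and the sample data are made up for illustration; the field names (`messages`, `role`, `content`, `text`) are the ones the method above reads, and real service responses may carry more fields.

```python
from typing import Any, Dict, List


def build_context(recent_events: List[Dict[str, Any]],
                  long_term_memories: List[Dict[str, Any]]) -> str:
    """Standalone copy of the context builder, for experimenting offline."""
    parts: List[str] = []

    # Short-term part: the last 5 events, one line per message
    if recent_events:
        parts.append("Recent Conversation:")
        for event in recent_events[-5:]:
            for msg in event.get("messages", []):
                parts.append(f"{msg.get('role', '')}: {msg.get('content', '')}")

    # Long-term part: one bullet per memory record that has text
    if long_term_memories:
        parts.append("\nRelevant Memories:")
        for mem in long_term_memories:
            text = mem.get("content", {}).get("text", "")
            if text:
                parts.append(f"- {text}")

    return "\n".join(parts)


# Hand-made sample data (hypothetical, not a real service response)
sample_events = [
    {"messages": [{"role": "USER", "content": "I love hiking."},
                  {"role": "ASSISTANT", "content": "Noted!"}]},
]
sample_memories = [
    {"content": {"text": "Alice is a Python developer"}},
    {"content": {}},  # records without text are skipped
]

context = build_context(sample_events, sample_memories)
print(context)
```

Printing the context like this is a quick way to check what your prompt will look like before you wire in a real LLM call.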

How AWS AgentCORE Memory Works

AWS AgentCORE Memory provides:

  1. Short-Term Memory:

    • Stores raw interaction events using create_event() or add_turns()
    • Events organized by actor (user) and session
    • Maintains chronological order for conversation flow
    • Configurable retention (up to 365 days)
  2. Long-Term Memory:

    • Uses Memory Strategies to extract insights from events
    • Built-in strategies: userPreferenceMemoryStrategy, semanticMemoryStrategy
    • Stores extracted memories in hierarchical namespaces
    • Persists across sessions automatically
    • Retrieved using retrieve_memory_records() with search queries
  3. Memory Strategies:

    • User Preference Strategy: Extracts user preferences and settings
    • Semantic Strategy: Extracts facts and knowledge
    • Custom strategies can be defined for specific needs
    • Strategies process events and create long-term memory records
  4. Security:

    • Data encrypted at rest and in transit
    • AWS-managed or customer-managed KMS keys
    • Fine-grained access control via namespaces
    • IAM-based authentication
  5. Scalability:

    • Fully managed service - no infrastructure to manage
    • Handles large volumes efficiently
    • Low latency retrieval
    • Built for production workloads
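The relationship between events, strategies, and long-term records is easier to see in a toy model. The sketch below is a local simulation only: `ToyAgentMemory`, the two rule functions, and the `/{strategy}/{actor}` namespace layout are all hypothetical, and the keyword rules stand in for the model-based extraction and semantic search the real service performs.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class ToyAgentMemory:
    """Local stand-in for the two memory tiers. This is NOT the AWS SDK."""

    def __init__(self, strategies: Dict[str, Callable[[str], List[str]]]):
        self.strategies = strategies
        # Short-term: raw events per (actor, session), in arrival order
        self.events: Dict[Tuple[str, str], List[dict]] = defaultdict(list)
        # Long-term: extracted records per namespace, shared across sessions
        self.records: Dict[str, List[str]] = defaultdict(list)

    def create_event(self, actor_id: str, session_id: str, role: str, text: str) -> None:
        self.events[(actor_id, session_id)].append({"role": role, "text": text})
        if role != "USER":
            return
        # Each strategy processes the event and may add long-term records
        for name, extract in self.strategies.items():
            namespace = f"/{name}/{actor_id}"  # hypothetical namespace layout
            for fact in extract(text):
                if fact not in self.records[namespace]:
                    self.records[namespace].append(fact)

    def session_events(self, actor_id: str, session_id: str) -> List[dict]:
        return list(self.events.get((actor_id, session_id), []))

    def retrieve(self, actor_id: str, query: str) -> List[str]:
        """Keyword overlap within this actor's namespaces (a real service searches semantically)."""
        words = set(query.lower().split())
        hits: List[str] = []
        for namespace, facts in self.records.items():
            if namespace.endswith(f"/{actor_id}"):
                hits.extend(f for f in facts if words & set(f.lower().split()))
        return hits


def preference_rule(text: str) -> List[str]:
    """Toy 'user preference' strategy: keep sentences mentioning love/prefer."""
    return [s.strip() for s in text.split(".")
            if "love" in s.lower() or "prefer" in s.lower()]


def fact_rule(text: str) -> List[str]:
    """Toy 'semantic' strategy: keep sentences of the form "I'm a ..."."""
    return [s.strip() for s in text.split(".")
            if "i'm a " in s.lower() or "i am a " in s.lower()]


memory = ToyAgentMemory({"preferences": preference_rule, "facts": fact_rule})
memory.create_event(
    "alice_123", "session_1", "USER",
    "Hi! I'm Alice and I love hiking in the mountains. I'm a Python developer at TechCorp.",
)

# A new session starts with no short-term events...
print(memory.session_events("alice_123", "session_2"))
# ...but long-term records are still found for the same actor
print(memory.retrieve("alice_123", "python project"))
```

The point of the model is the split: events belong to one actor and one session, while extracted records belong to the actor and outlive the session.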

When to Use AWS AgentCORE Memory vs Mem0

Choose AWS AgentCORE Memory if:

  • ✅ You're already using AWS services
  • ✅ You need enterprise-grade security and compliance
  • ✅ You want fully managed infrastructure
  • ✅ You're building AWS-native applications

Choose Mem0 if:

  • ✅ You want open-source flexibility
  • ✅ You're using multiple cloud providers
  • ✅ You need more customization
  • ✅ You want simpler setup (just an API key)

Setup Requirements for AWS AgentCORE Memory

  1. AWS Account: Active AWS account with Bedrock AgentCORE access
  2. Install SDK: pip install bedrock-agentcore
  3. AWS Credentials: Configure using aws configure or IAM roles
  4. IAM Permissions: Required permissions for Bedrock AgentCORE Memory
  5. Region: Available in specific AWS regions (e.g., us-east-1, us-west-2)
# Prerequisites setup (one-time)

"""
1. Install the SDK:
   pip install bedrock-agentcore

2. Configure AWS Credentials:
   aws configure
   # Or use IAM roles if running on EC2/Lambda

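3. Required IAM Permissions:
   (AgentCORE actions use the bedrock-agentcore: prefix, not bedrock:.
    Confirm the exact action names in the AWS Service Authorization Reference.)
   - bedrock-agentcore:CreateMemory
   - bedrock-agentcore:GetMemory
   - bedrock-agentcore:ListMemories
   - bedrock-agentcore:UpdateMemory
   - bedrock-agentcore:DeleteMemory
   - bedrock-agentcore:CreateEvent
   - bedrock-agentcore:ListEvents
   - bedrock-agentcore:RetrieveMemoryRecords
   - bedrock-agentcore:ListMemoryRecords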

4. Enable Bedrock AgentCORE in AWS Console:
   - Go to AWS Bedrock Console
   - Request access to AgentCORE features
   - Wait for approval (if required)

5. Create Memory Resource:
   - The code automatically creates a memory resource
   - Or create manually via AWS Console/CLI
"""
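If you prefer to generate the IAM policy rather than click through the console, the permissions from step 3 can be turned into a policy document with a few lines of Python. This is a sketch under assumptions: the action names are copied from the list above, the helper `build_memory_policy` is made up for this example, and the service prefix is a parameter because you should confirm it in the AWS Service Authorization Reference (to my knowledge AgentCORE actions are published under `bedrock-agentcore`).

```python
import json
from typing import List

# Action names taken from the prerequisites list above
MEMORY_ACTIONS: List[str] = [
    "CreateMemory", "GetMemory", "ListMemories", "UpdateMemory", "DeleteMemory",
    "CreateEvent", "ListEvents", "RetrieveMemoryRecords", "ListMemoryRecords",
]


def build_memory_policy(service_prefix: str, resource_arn: str = "*") -> dict:
    """Build an IAM policy document allowing the memory actions.

    service_prefix is an assumption to verify for your account;
    resource_arn defaults to "*" and should be narrowed in production.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AgentMemoryAccess",
            "Effect": "Allow",
            "Action": [f"{service_prefix}:{action}" for action in MEMORY_ACTIONS],
            "Resource": resource_arn,
        }],
    }


policy = build_memory_policy("bedrock-agentcore")
print(json.dumps(policy, indent=2))
```

You can paste the printed JSON into the IAM console or save it to a file for the CLI. Replace `"*"` with the ARN of your memory resource once you have one.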

Comparison: Memory Solutions

Quick Comparison Table

| Aspect | Manual (No Tools) | Mem0 | AWS AgentCORE Memory |
|---|---|---|---|
| Setup Complexity | Very Simple | Simple (API key) | Moderate (AWS setup) |
| Scalability | Single machine | High | Enterprise-scale |
| Search Quality | Keyword matching | Semantic search | Semantic search |
| Memory Extraction | Manual coding | Automatic | Automatic |
| Persistence | File-based | Database-backed | AWS-managed |
| Cost | Free | Usage-based | AWS pricing |
| Best For | Learning, prototyping | Production (flexible) | AWS-native apps |
| Open Source | Yes | Yes (option) | No (AWS managed) |
| Multi-Cloud | N/A | Yes | No (AWS only) |

Detailed Comparison

Manual Memory (No External Tools)

  • Pros: Free, full control, simple setup, no dependencies
  • Cons: Limited scalability, manual extraction, basic search
  • Use When: Learning, prototyping, small applications

Mem0

  • Pros: Automatic extraction, simple API, open-source option, multi-cloud
  • Cons: Requires API key, usage costs for managed version
  • Use When: Production apps needing flexibility, multi-cloud deployments

AWS AgentCORE Memory

  • Pros: Enterprise-grade, AWS integration, fully managed, high security
  • Cons: AWS-only, more complex setup, AWS account required
  • Use When: AWS-native applications, enterprise requirements, tight integration with other AWS services

Best Practices

1. Choose the Right Approach

  • Start Simple: Use manual memory for learning and prototyping
  • Scale Up: Move to external tools when you need production features
  • Consider Mem0: If you want automatic memory management with flexibility
  • Consider AWS AgentCORE Memory: If you're building AWS-native applications and need enterprise features

2. Memory Hygiene

  • Regular Cleanup: Remove old or irrelevant memories
  • Deduplication: Avoid storing duplicate information
  • Validation: Check memory quality before storing
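The three hygiene rules above can be combined into one cleanup pass. Here is a minimal sketch for the manual (pure Python) approach. The record shape (`text` plus `created_at`), the function names, and the thresholds are assumptions for illustration; managed services handle much of this for you.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical texts compare equal."""
    return " ".join(text.lower().split())


def clean_memories(memories: List[Dict], max_age_days: int = 90,
                   min_length: int = 5,
                   now: Optional[datetime] = None) -> List[Dict]:
    """Return memories that are recent enough, long enough, and not duplicates."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    seen = set()
    kept = []
    # Newest first, so the most recent copy of a duplicate is the one that survives
    for mem in sorted(memories, key=lambda m: m["created_at"], reverse=True):
        key = normalize(mem["text"])
        if mem["created_at"] < cutoff:   # regular cleanup: too old
            continue
        if len(key) < min_length:        # validation: too short to be useful
            continue
        if key in seen:                  # deduplication
            continue
        seen.add(key)
        kept.append(mem)
    return kept


# Sample data with a fixed "now" so the result is reproducible
now = datetime(2025, 1, 1)
memories = [
    {"text": "User prefers dark mode", "created_at": datetime(2024, 12, 20)},
    {"text": "user prefers  dark mode", "created_at": datetime(2024, 12, 30)},
    {"text": "ok", "created_at": datetime(2024, 12, 31)},
    {"text": "User asked about Python", "created_at": datetime(2024, 6, 1)},
]
cleaned = clean_memories(memories, now=now)
print([m["text"] for m in cleaned])
```

Exact-match deduplication only catches trivial repeats. Catching paraphrases ("likes dark mode" vs "prefers dark theme") needs similarity search, which is one reason to move to a semantic backend.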

3. Privacy and Security

  • Encrypt Sensitive Data: Protect user information
  • User Consent: Get permission before storing memories
  • User Control: Let users view and delete their memories

4. Performance

  • Batch Operations: Store multiple memories at once when possible
  • Caching: Cache frequently accessed memories
  • Indexing: Use proper indexes for fast retrieval
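Caching is the easiest of these to add yourself. The sketch below wraps any search function in a small time-based cache. `CachedRetriever` and `slow_backend` are hypothetical names; in practice `fetch` would be whatever retrieval call your memory backend provides.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedRetriever:
    """Caches search results per (user, query) for a limited time."""

    def __init__(self, fetch: Callable[[str, str], List[str]],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._cache: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}

    def search(self, user_id: str, query: str) -> List[str]:
        key = (user_id, query.strip().lower())
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                      # fresh cached result
        result = self._fetch(user_id, query)   # cache miss or expired entry
        self._cache[key] = (now, result)
        return result

    def invalidate(self, user_id: str) -> None:
        """Drop cached results for a user, e.g. right after storing a new memory."""
        for key in [k for k in self._cache if k[0] == user_id]:
            del self._cache[key]


# Fake backend that records how often it is called
calls = []


def slow_backend(user_id: str, query: str) -> List[str]:
    calls.append((user_id, query))
    return [f"memory about {query} for {user_id}"]


fake_time = [0.0]
retriever = CachedRetriever(slow_backend, ttl_seconds=60, clock=lambda: fake_time[0])

first = retriever.search("alice_123", "hiking")
second = retriever.search("alice_123", " Hiking ")  # same key after normalization
fake_time[0] = 61.0
third = retriever.search("alice_123", "hiking")     # entry expired, backend called again
print(f"backend calls: {len(calls)}")
```

Remember to call `invalidate()` whenever you store a new memory for a user. Otherwise the agent can keep answering from stale results until the TTL runs out.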

5. Memory Selection

  • Relevance First: Prioritize memories relevant to current context
  • Recency Matters: Give more weight to recent memories
  • Success Filtering: Prefer memories from successful interactions
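One simple way to combine these three signals is a weighted score. The sketch below is one possible formula, not a standard: the weights, the 30-day half-life, and the `Candidate` fields are all assumptions you would tune for your own agent. It assumes `relevance` is a 0 to 1 similarity score coming from your search backend.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    relevance: float   # similarity to the current query, 0..1
    age_days: float    # how long ago the memory was stored
    succeeded: bool    # did the interaction it came from go well?


def score(c: Candidate, half_life_days: float = 30.0,
          w_relevance: float = 0.6, w_recency: float = 0.3,
          w_success: float = 0.1) -> float:
    """Weighted sum of relevance, exponentially decayed recency, and success."""
    recency = 0.5 ** (c.age_days / half_life_days)  # 1.0 today, 0.5 after one half-life
    success = 1.0 if c.succeeded else 0.0
    return w_relevance * c.relevance + w_recency * recency + w_success * success


def select(candidates: List[Candidate], k: int = 3) -> List[Candidate]:
    """Return the k highest-scoring memories."""
    return sorted(candidates, key=score, reverse=True)[:k]


candidates = [
    Candidate("Alice is a Python developer", relevance=0.9, age_days=60, succeeded=True),
    Candidate("Alice said hello", relevance=0.5, age_days=0, succeeded=True),
    Candidate("Alice asked about Python typing", relevance=0.9, age_days=0, succeeded=False),
    Candidate("Alice likes tea", relevance=0.1, age_days=365, succeeded=True),
]

for c in select(candidates, k=2):
    print(f"{score(c):.3f}  {c.text}")
```

Relevance gets the largest weight here because of the "Relevance First" rule. A recent but unrelated memory should not push out an older one that actually answers the question.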

6. Testing

  • Test Memory Retrieval: Ensure relevant memories are found
  • Test Memory Persistence: Verify memories survive restarts
  • Test Memory Extraction: Confirm automatic extraction works correctly

Conclusion

Memory is essential for building intelligent AI agents. Whether you start with simple Python implementations or use advanced tools like Mem0 or AWS AgentCORE Memory, the key is understanding what each memory type does and when to use it.

Quick Decision Guide:

  • Learning/Prototyping: Use manual memory (Part 1)
  • Production App (Flexible): Use Mem0 (Part 3) - works with any cloud
  • Production App (AWS): Use AWS AgentCORE Memory (Part 4) - AWS-native
  • Custom Needs: Use individual tools like ChromaDB, Pinecone (Part 2)

Start simple, understand the concepts, then scale up as needed. The examples in this guide provide working code you can adapt to your needs.

Key Takeaway: Mem0 and AWS AgentCORE Memory fill the same role: both extract and manage memories automatically, so your agent code looks much the same with either one. Choose based on your infrastructure (multi-cloud vs. AWS-only).


Resources

