Introduction
In part 4 of this article series, we implemented our Custom Agent with the Strands Agents SDK and deployed it on Amazon Bedrock AgentCore Runtime. By default, agents are stateless, so they don't remember previous conversations. To change that, we can provide the agent with memory capabilities. This is what we did in part 6, where we gave our custom agent short-term memory to store conversations and keep track of immediate context.
In this article, we'll focus on another approach to providing our agent with memory: AgentCore long-term memory. In the same way, we can adjust the application introduced in the article Using Bedrock AgentCore Runtime Starter Toolkit with Strands Agents SDK to use AgentCore long-term memory.
The sample application, with the adjustments needed for our Custom Agent to use long-term AgentCore Memory, can be found in my bedrock-agentcore-custom-with-long-term-memory GitHub repository.
AgentCore (long-term) Memory
AgentCore Memory lets your AI agents deliver intelligent, context-aware, and personalized interactions by maintaining both immediate and long-term knowledge. AgentCore Memory offers two types of memory:
Short-term memory: We covered it in part 6.
Long-term memory: Stores extracted insights - such as user preferences, semantic facts, and summaries - for knowledge retention across sessions.
User Preferences – Think of your coding agent which uses AgentCore Memory as your long-time coding partner. Over many days, it notices you always write clean code with comments, prefer snake_case naming, use pandas for data analysis, and test functions before finalizing them. Next time, even after many sessions, when you ask it to write a data analysis function, it automatically follows these preferences stored in AgentCore Memory without you telling it again.
Semantic facts – The coding agent also remembers that “Pandas is a Python library for data analysis and handling tables”. When you ask, “Which library is best for table data?”, it immediately suggests Pandas because it understands what Pandas is from the semantic memory.
Summary – The coding agent generates session summaries such as “During this interaction, you created a data cleaning function, fixed two syntax errors, and tested your linear regression model.” These summaries both track completed work and compress conversation context, enabling efficient reference to past activities while optimizing context window usage.
Custom strategies - Let you append to the system prompt and choose the model. This lets you tailor the memory extraction and consolidation to your specific domain or use case.
Long-term memory generation is an asynchronous process that runs in the background and automatically extracts insights after raw conversation/context is stored in Short Term Memory via CreateEvent. This efficiently consolidates key information without interrupting live interactions. As part of the long-term memory generation, AgentCore Memory performs the following operations:
- Extraction: Extracts information from raw interactions with the agent.
- Consolidation: Consolidates newly extracted information with existing information in the AgentCore Memory.
Once long-term memory is generated, you can retrieve these extracted memories to enhance your agent's responses with persistent knowledge. Extracted memories are stored as memory records and can be accessed using the GetMemoryRecord, ListMemoryRecords, or RetrieveMemoryRecords operations. The RetrieveMemoryRecords operation is particularly powerful as it performs a semantic search to find memory records that are most relevant to the query. For example, when a customer asks about running shoes, the agent can use semantic search to retrieve related memory records, such as customer's preferred shoe size, favorite shoe brands, and previous shoe purchases. This lets the AI support agent provide highly personalized recommendations without requiring the customer to repeat information they've shared before.
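As an illustration of that retrieval flow, here is a minimal sketch. The helper name retrieve_customer_context is mine, not part of any SDK, and it assumes the retrieve_memories wrapper of the AgentCore MemoryClient that is used later in this article:

```python
def retrieve_customer_context(memory_client, memory_id, namespace, query, top_k=3):
    """Semantic search over long-term memory records, returning plain-text snippets."""
    memories = memory_client.retrieve_memories(
        memory_id=memory_id,
        namespace=namespace,
        query=query,
        top_k=top_k,
    )
    snippets = []
    for memory in memories:
        # Each record is expected to carry its text under content.text
        if isinstance(memory, dict):
            content = memory.get("content", {})
            if isinstance(content, dict):
                text = content.get("text", "").strip()
                if text:
                    snippets.append(text)
    return snippets
```

With a namespace resolved for the current customer, the returned snippets can then be injected into the agent's prompt, as we'll see below.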
Here is a diagram taken from this source.
How to create AgentCore (long-term) memory
Here is the script to create AgentCore long-term memory. Below is the most essential part of the code:
import os

from bedrock_agentcore.memory import MemoryClient
from bedrock_agentcore.memory.constants import StrategyType

client = MemoryClient(region_name=os.environ['AWS_DEFAULT_REGION'])

memory_name = "OrderStatisticsAgentMemoryWithStrategies"

strategies = [
    {
        StrategyType.SUMMARY.value: {
            "name": "CustomerSummary",
            "description": "Captures customer summary",
            "namespaces": ["support/customer/{actorId}/{sessionId}/summary"]
        }
    },
    {
        StrategyType.SEMANTIC.value: {
            "name": "CustomerSupportSemantic",
            "description": "Stores facts from conversations",
            "namespaces": ["support/customer/{actorId}/semantic"],
        }
    }
]

memory = client.create_memory_and_wait(
    name=memory_name,
    strategies=strategies,  # strategies for long-term memory
    description="Long-term memory for personal agent",
    event_expiry_days=7,  # retention period for long-term memory; can be up to 365 days
)
memory_id = memory['id']
First, we create an AgentCore memory client and define two strategies for the long-term memory: semantic facts and summary (see above). Then we invoke the create_memory_and_wait function on it, passing the memory name, description, and the number of event expiry days, which is the retention period for long-term memory and can be up to 365 days. After the memory is created, we retrieve its id, which we'll use later.
We can see the created AgentCore memory in the AgentCore console:
Here are the memory details including the strategies:
Creating AgentCore (long-term) Memory Hook with Strands Agents
Now let's explore how we can use AgentCore (long-term) Memory with Strands Agents. As explained in part 6, Strands Agents has a concept of hooks. It is especially important to understand the hook event lifecycle.
Here is the implementation of the Strands Agents long-term memory hook.
Our hook implementation, named LongTermMemoryHookProvider, inherits from strands.hooks.HookProvider. We pass it the memory client and the created memory id, and we extract the namespaces from the strategies by invoking get_memory_strategies on the memory client with the memory id:
class LongTermMemoryHookProvider(HookProvider):
    def __init__(self, memory_client: MemoryClient, memory_id: str):
        self.memory_id = memory_id
        self.memory_client = memory_client
        self.namespaces = self.get_namespaces()

    # Helper function to get namespaces from the memory strategies list
    def get_namespaces(self) -> dict:
        strategies = self.memory_client.get_memory_strategies(self.memory_id)
        return {i["type"]: i["namespaces"][0] for i in strategies}
Next, we register two hook events:
- MessageAddedEvent will be mapped to the invocation of the retrieve_context function
- AfterInvocationEvent will be mapped to the invocation of the save_event function
def register_hooks(self, registry: HookRegistry) -> None:
    """Register customer support memory hooks"""
    registry.add_callback(MessageAddedEvent, self.retrieve_context)
    registry.add_callback(AfterInvocationEvent, self.save_event)
Now let's explore what happens in those functions.
In the case of MessageAddedEvent, where we retrieve the customer context before processing the support query, the retrieve_context function executes the following code:
def retrieve_context(self, event: MessageAddedEvent):
    ...
    messages = event.agent.messages
    if messages[-1]["role"] == "user" and "toolResult" not in messages[-1]["content"][0]:
        user_query = messages[-1]["content"][0]["text"]

        # Retrieve customer context from all namespaces
        all_context = []

        # Get actor_id and session_id from the agent state
        actor_id = event.agent.state.get("actor_id")
        session_id = event.agent.state.get("session_id")

        for context_type, namespace in self.namespaces.items():
            namespace = namespace.format(actorId=actor_id, sessionId=session_id)
            memories = self.memory_client.retrieve_memories(
                memory_id=self.memory_id,
                namespace=namespace,
                query=user_query,
                top_k=3
            )
            for memory in memories:
                if isinstance(memory, dict):
                    content = memory.get('content', {})
                    if isinstance(content, dict):
                        text = content.get('text', '').strip()
                        if text:
                            all_context.append(f"[{context_type.upper()}] {text}")

        # Inject customer context into the query
        if all_context:
            context_text = "\n".join(all_context)
            original_text = messages[-1]["content"][0]["text"]
            messages[-1]["content"][0]["text"] = (
                f"Customer Context:\n{context_text}\n\n{original_text}"
            )
Let's break this down. First, we retrieve the agent messages (which include the user prompt).
For example, if our prompt is:
"input": {"prompt": "Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376"}
then event.agent.messages returns the following information for our application: [{'role': 'user', 'content': [{'text': 'Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376'}]}, {'role': 'assistant', 'content': [{'text': "Hello Vadym! It's nice to meet you, and I'm glad to hear you're enjoying sunny weather today! \n\nLet me get the information about order 12376 for you."}, {'toolUse': {'toolUseId': 'tooluse_07jBcXOVQQ6YtRzpLHMpwA', 'name': 'DemoOpenAPITargetS3OrderAPI___getOrderById', 'input': {'id': '12376'}}}]}]
from which we extract the text and role.
Then we extract user_query from it with user_query = messages[-1]["content"][0]["text"],
which yields: Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376
Next, we extract the session and actor ids (more about that later). Then we iterate over the namespaces (which correspond to our memory strategies), and for each context type (SEMANTIC and SUMMARIZATION in our case) we replace the actor and session id placeholders in the namespaces:
support/customer/{actorId}/semantic
support/customer/{actorId}/{sessionId}/summary
with real values, so the namespaces end up as:
support/customer/order_statistics_user_123/semantic
support/customer/order_statistics_user_123/order_statistics_session_001/summary
and then we invoke retrieve_memories on the memory client for each namespace to retrieve the memories relevant to our user query, limited to the top k (in our case 3) entries. Then we iterate over the retrieved memories, collect the text of each one into the all_context variable, and create a context_text variable containing the concatenation of all contexts.
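The namespace substitution step described above can be illustrated in isolation; the resolve_namespace helper below is my own name for it, not part of the SDK:

```python
def resolve_namespace(template: str, actor_id: str, session_id: str = "") -> str:
    """Fill the {actorId} and {sessionId} placeholders of a strategy namespace template."""
    return template.format(actorId=actor_id, sessionId=session_id)

# The semantic template only uses {actorId}; extra keyword arguments are ignored by str.format
semantic = resolve_namespace("support/customer/{actorId}/semantic",
                             actor_id="order_statistics_user_123")
# → "support/customer/order_statistics_user_123/semantic"
```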
For example, if our initial prompt was:
"input": {"prompt": "Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376"}
And it was saved in memory, then after starting the next conversation with the agent, the retrieved memory will contain something like this:
[SEMANTIC] Vadym is experiencing sunny weather today.
[SEMANTIC] Vadym has an existing order with Order ID 12376,
placed on August 9, 2025,
with a total value of $350
and containing two different products.
[SUMMARIZATION] <topic name="User Introduction">
User introduces himself as Vadym and mentions that it's sunny weather today.
</topic>
<topic name="Order Request">
User requests information about order ID 12376.
</topic>
<topic name="Order Details">
The order (ID: 12376) belongs to User ID 12345, has a total value of $350,
status is RECEIVED, and was placed on August 9, 2025 at 3:10:25 PM.
</topic>
<topic name="Order Items">
The order contains 2 different products with a total of 6 items:
- Item 1: Product ID 233, Quantity 3, Value $250
- Item 2: Product ID 230, Quantity 3, Value $100
</topic>
What we finally do is extract the original text into the variable original_text and then assign to messages[-1]["content"][0]["text"] (which holds the message text) the value of the context_text variable (the overall context retrieved from memory) concatenated with the value of the original_text variable, like this:
context_text = "\n".join(all_context)
original_text = messages[-1]["content"][0]["text"]
messages[-1]["content"][0]["text"] = (
    f"Customer Context:\n{context_text}\n\n{original_text}"
)
With that we enrich the prompt with the information extracted from the memory.
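This enrichment step is easy to isolate as a pure function; enrich_prompt is a hypothetical name used here for illustration:

```python
def enrich_prompt(context_items: list, original_text: str) -> str:
    """Prepend the memory-derived context to the user's original prompt."""
    if not context_items:
        return original_text  # nothing retrieved, leave the prompt untouched
    context_text = "\n".join(context_items)
    return f"Customer Context:\n{context_text}\n\n{original_text}"
```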
In the case of AfterInvocationEvent, which fires after the agent response, the save_event function executes the following code:
def save_event(self, event: AfterInvocationEvent):
    ...
    actor_id = event.agent.state.get("actor_id")
    session_id = event.agent.state.get("session_id")
    messages = event.agent.messages
    ...
    for msg in reversed(messages):
        if msg["role"] == "assistant" and not agent_response:
            agent_response = msg["content"][0]["text"]
        elif msg["role"] == "user" and not customer_query and "toolResult" not in msg["content"][0]:
            customer_query = msg["content"][0]["text"]

    self.memory_client.create_event(
        memory_id=self.memory_id,
        actor_id=actor_id,
        session_id=session_id,
        messages=[(customer_query, "USER"), (agent_response, "ASSISTANT")]
    )
What we do here is retrieve the session and actor ids (more about that later), find the latest customer query (prompt) and the assistant's response to it, and then invoke the create_event function on the memory client to save the event in memory. The event itself contains the customer query and the agent's response in the messages parameter.
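The selection of the latest user/assistant pair can be sketched as a standalone function; last_turn is my name for it, and it mirrors the loop above under the assumption that the first content block of each message carries the text:

```python
def last_turn(messages: list) -> tuple:
    """Walk the history newest-first and pick the most recent assistant reply and user query."""
    agent_response, customer_query = None, None
    for msg in reversed(messages):
        first_block = msg["content"][0]
        if msg["role"] == "assistant" and not agent_response:
            # An assistant block may hold a toolUse instead of text, hence .get
            agent_response = first_block.get("text")
        elif msg["role"] == "user" and not customer_query and "toolResult" not in first_block:
            customer_query = first_block["text"]
    return customer_query, agent_response
```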
We can also look at what is currently in memory, including the memory strategies (with namespaces), by executing the search_agentcore_memory script. Please don't forget to configure the following parameters:
ACTOR_ID = "order_statistics_user_123" # It can be any unique identifier (AgentID, User ID, etc.)
SESSION_ID = "order_statistics_session_001" # Unique session identifier
memory_id="{YOUR_MEMORY_ID}"
Using AgentCore (long-term) Memory Hook with Strands Agents
Now comes the final piece of the puzzle: how to use the created AgentCore (long-term) memory hook with Strands Agents. This happens in the agentcore_runtime_custom_agent_with_long_term_memory_demo script.
First, we do some initialization:
ACTOR_ID = "order_statistics_user_123" # It can be any unique identifier (AgentID, User ID, etc.)
SESSION_ID = "order_statistics_session_001" # Unique session identifier
client = MemoryClient(region_name=os.environ['AWS_DEFAULT_REGION'])
memory_id="{YOUR_MEMORY_ID}"
gateway_url = "${YOUR_GATEWAY_URL}"
We use static values for the actor and session ids; in other scenarios we might retrieve them from outside, for example using a different actor id for every user having a conversation with our agent. Then we create the memory client. Please don't forget to configure the created memory_id (see above) and gateway_url (see parts 2 and 4 for an explanation of the setup).
When creating the Strands Agent, we pass LongTermMemoryHookProvider as the hooks parameter and {"actor_id": ACTOR_ID, "session_id": SESSION_ID} as the state parameter:
agent = Agent(
    # model=model,  # use the default Bedrock model, which is Anthropic Claude Sonnet
    tools=tools,
    system_prompt="Please answer the questions about the order statistics. "
                  "If you received a personal information about the user you chat with "
                  "or this user told you during previous conversation some facts like about the weather or his mood, "
                  "feel free to also provide it in your answer. If you don't have the answer to such questions please tell me so.",
    hooks=[LongTermMemoryHookProvider(client, memory_id)],
    state={"actor_id": ACTOR_ID, "session_id": SESSION_ID}
)
One more important thing is to adjust our existing IAM role to grant it permissions for the required AgentCore memory operations. In part 2 we explained how to create such a role with the required permissions and gave the complete JSON as an example. Now we need to add the following to the IAM permissions:
{
    "Sid": "BedrockAgentCoreMemory",
    "Effect": "Allow",
    "Action": [
        "bedrock-agentcore:CreateEvent",
        "bedrock-agentcore:RetrieveMemoryRecords",
        "bedrock-agentcore:GetMemory"
    ],
    "Resource": [
        "arn:aws:bedrock-agentcore:${region}:${account_id}:memory/${memory_id}"
    ]
}
Please replace the ${region}, ${account_id}, and ${memory_id} variables with your values.
Deployment and invocation of our agent are described in part 4 of this article series.
Let's see long-term memory in action by having the following conversations with our agent (both orders are in the database, so their information will be provided):
"input": {"prompt": "Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376"}
"input": {"prompt": "Give me the information about order with id 12375"}
Now let's ask the agent a question whose answer is in the long-term memory:
"input": {"prompt": "Tell me please: what is my name and how is the weather today?"}
The agent's answer is:
{'output': {'message': {'role': 'assistant', 'content': [{'text': "Hi Vadym! Based on our previous conversations, your name is Vadym and you mentioned that it's sunny weather today. I hope you're continuing to enjoy the beautiful sunny day!\n\nIs there anything else about your orders that you'd like to know?"}]}
So, with the help of the long-term memory, the agent was able to remember our previous conversation and provide the right answer.
Let's ask our agent another question:
"input": {"prompt": "Please summarize all orders you already have information about"}
The agent's answer is:
{'output': {'message': {'role': 'assistant', 'content': [{'text': "Based on our conversation history, I have information about several orders for you, Vadym. Here's a comprehensive summary of all the orders we've discussed:\n\n## Order Summary for User ID 12345\n\n**Total Orders Reviewed: 2**\n\n### Order Details:\n1. **Order ID: 12376**\n - Status: RECEIVED\n - Total Value: $350\n - Date: August 9, 2025 at 3:10:25 PM\n - Items: 6 total items (2 products)\n - Product ID 233: 3 items, $250\n - Product ID 230: 3 items, $100\n\n2. **Order ID: 12375**\n - Status: RECEIVED\n - Total Value: $350\n - Date: August 9, 2025 at 3:10:25 PM\n - Items: 6 total items (2 products)\n - Product ID 233: 3 items, $250 (Item ID: 24719)\n - Product ID 230: 3 items, $100 (Item ID: 24718)\n\n3. **Order ID: 12360**\n - Status: RECEIVED\n - Total Value: $350\n - Date: August 9, 2025 at 3:10:25 PM\n - Items: 6 total items (2 products)\n - Product ID 233: 3 items, $250 (Item ID: 24721)\n - Product ID 230: 3 items, $100 (Item ID: 24720)\n\n4. Pattern Analysis:\n- All orders were placed on the same date and time\n- All have identical total values ($350)\n- All contain the same two products in the same quantities\n- All orders have RECEIVED status\n- Product ID 233 consistently costs $250 for 3 items\n- Product ID 230 consistently costs $100 for 3 items\n\nHope this summary helps on this sunny day, Vadym! Let me know if you need details about any other orders."}]}
The same holds true here: with the help of the long-term memory, the agent remembered our previous conversations and provided the right answer.
This was AgentCore long-term Memory in action.
We'll look into the AgentCore Memory observability feature of the short and long-term AgentCore Memory in the next part of this article series.
Conclusion
In this part of the series, we looked at how to implement the AgentCore long-term memory for our Custom Agent with Strands Agents to store extracted insights - such as user preferences, semantic facts, and summaries - for knowledge retention across sessions.
In the next part of the series, we'll look into the AgentCore Memory observability feature of the short and long-term AgentCore Memory.
Please also check out my Amazon Bedrock AgentCore Gateway article series.