Vadym Kazulkin for AWS Heroes

Posted on Sep 29 • Edited on Oct 5

Amazon Bedrock AgentCore Runtime - Part 6 Using AgentCore short-term Memory with Strands Agents SDK

#aws #agenticai #llm #mcp

Introduction

In the part 4 of this article, we implemented our Custom Agent with Strands SDK and deployed it on Amazon Bedrock AgentCore Runtime. By default, the agents are stateless, so they don't remember the previous conversations. To change that, we can provide the agent with memory capabilities.

For example, Strands offers two built-in session managers for persisting agent sessions:

FileSessionManager: Stores sessions in the local filesystem
S3SessionManager: Stores sessions in Amazon S3 buckets

You can read more about this in the article Strands Session Management.

In this article, we'll focus on another approach to provide our agent with the memory which is AgentCore Memory. We'll explain the concepts of the AgentCore memory and focus on the short-term memory in this and on the long-term memory in the next article. The same way we can adjust the application introduced in the article Using Bedrock AgentCore Runtime Starter Toolkit with Strands Agents SDK to use the AgentCore short-term Memory.

The sample application with some adjustments of our Custom Agent to use short-term AgentCore Memory can be found in my bedrock-agentcore-custom-with-short-term-memory) GitHub repository.

AgentCore (short-term) Memory

AgentCore Memory lets your AI agents deliver intelligent, context-aware, and personalized interactions by maintaining both immediate and long-term knowledge. AgentCore Memory offers two types of memory:

Short-term memory: Stores conversations to keep track of immediate context.

For example, imagine your coding agent is helping you debug. During the session, you ask it to check variable names, correct syntax errors, and find unused imports. The agent stores the interactions as short term events in AgentCore Memory. Later the agent can retrieve the events so that it can converse without you having to repeat previous information.

Short-term memory captures raw interaction events, maintains immediate context, powers real-time conversations, enriches long-term memory systems, and enables building advanced contextual solutions such as multi-step task completion, in-session knowledge accumulation, and context-aware decision making.

Here is a slightly modified diagram taken from this source.

Long-term memory: It will be the focus of the next article of this series.

How to create AgentCore (short-term) memory

Here is the script to create agentcore short-term memory. Below is the most essential part of the code:

client = MemoryClient(region_name=os.environ['AWS_DEFAULT_REGION'])
memory_name = "OrderStatisticsAgentMemory"

memory = client.create_memory_and_wait(
    name=memory_name,
    strategies=[],  # No strategies for short-term memory
    description="Short-term memory for personal agent",
    event_expiry_days=7, # Retention period for short-term memory. This can be up to 365 days.
    )
memory_id = memory['id']

First, we create AgentCore memory client and then invoke create_memory_and_wait function on it passing the memory name and description and the number of event expiry days which is a retention period (how long raw conversation data is retained in short-term memory) for short-term memory and can be upto 365 days. Data within AgentCore Memory is encrypted both at rest and in transit. By default, AWS managed keys are used for this encryption, but we can choose to enable encryption with our own customer managed KMS keys for greater control. By passing empty strategies, we indicate that we create a short-term memory object. After we created memory, we can retrieve its id which we'll be using later.

We can see the created AgentCore memory it the AgentCore console:

Here are the memory details:

Creating AgentCore (short-term) Memory Hook with Strands Agents

Now let's explore how we can use AgentCore (short-term) Memory with Strands Agents. Strands Agents has a concept of hooks. Especially important it is to understand hook event lifecycle.

Here is the implementation of the strands agents short-term memory hook.

Our hook implementation with the name ShortTermMemoryHookProvider inherits strands.hooks.HookProvider and we pass to it the memory client and created memory id:

class ShortTermMemoryHookProvider(HookProvider):
  def __init__(self, memory_client: MemoryClient, memory_id: str):
    self.memory_client = memory_client
    self.memory_id = memory_id

Next, we register 2 hook events:

MessageAddedEvent will be mapped to the invocation of the on_message_added function
AgentInitializedEvent will be mapped to the invocation of the on_agent_initialized function

  def register_hooks(self, registry: HookRegistry):
    # Register memory hooks
    registry.add_callback(MessageAddedEvent, self.on_message_added)
    registry.add_callback(AgentInitializedEvent, self.on_agent_initialized)

Now let's explore what happens in those functions.

In case of MessageAddedEvent which happens and the end of the conversation and the invocation of the on_message_added function the following code will be executed:

def on_message_added(self, event: MessageAddedEvent):
...
  messages = event.agent.messages
  actor_id = event.agent.state.get("actor_id")
  session_id = event.agent.state.get("session_id")

  if messages[-1]["content"][0].get("text"):
    self.memory_client.create_event(
       memory_id=self.memory_id,
       actor_id=actor_id,
       session_id=session_id,
       messages=[(messages[-1]["content"][0]["text"], messages[-1]["role"])]

What we are doing here is to retrieve the agent messages (which is the user prompt) and session and actor ids (more about it later) and then invoke create_event function on the memory client to store the conversation in the memory and pass the mentioned information along with the memory id to it as parameters.

For example, if our prompt is:
"input": {"prompt": "Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376"}

then event.agent.messages returns the following information for our application: [{'role': 'user', 'content': [{'text': 'Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376'}]}, {'role': 'assistant', 'content': [{'text': "Hello Vadym! It's nice to meet you, and I'm glad to hear you're enjoying sunny weather today! \n\nLet me get the information about order 12376 for you."}, {'toolUse': {'toolUseId': 'tooluse_07jBcXOVQQ6YtRzpLHMpwA', 'name': 'DemoOpenAPITargetS3OrderAPI___getOrderById', 'input': {'id': '12376'}}}]}] from which we extract text and role.

In case of AgentInitializedEvent which happens right away and the invocation of the on_agent_initialized function the following code will be executed:

def on_agent_initialized(self, event: AgentInitializedEvent):
 ...
  actor_id = event.agent.state.get("actor_id")
  session_id = event.agent.state.get("session_id")

  # Load the last 5 conversation turns from memory
  recent_turns = self.memory_client.get_last_k_turns(
          memory_id=self.memory_id,
          actor_id=actor_id,
          session_id=session_id,
          k=5
    )

  if recent_turns:
      # Format conversation history for context
      context_messages = []
      for turn in recent_turns:
           for message in turn:
                 role = message['role']
                 content = message['content']['text']
                 context_messages.append(f"{role}: {content}")

      context = "\n".join(context_messages)
      # Add context to agent's system prompt.
      event.agent.system_prompt += f"\n\nRecent conversation:\n{context}"

What we are doing here is to retrieve session and actor ids (more about it later) and then invoke get_last_k_turns function on the memory client to load the last k (in our case 5) conversation from memory and pass the mentioned information along with the memory id to it as parameters. When we iterate over the returned messages, concatenate them into one context string and finally add context to agent's system prompt.

We can also look into what is currently in memory by executing search_agentcore_memory script. Please don't forget to configure the following parameters:

ACTOR_ID = "order_statistics_user_123" # It can be any unique identifier (AgentID, User ID, etc.)
SESSION_ID = "order_statistics_session_001" # Unique session identifier
memory_id="{YOUR_MEMORY_ID}"

Using AgentCore (short-term) Memory Hook with Strands Agents

Now comes the final piece of puzzle - how to use the created AgentCore (short-term) Memory Hook with Strands Agents. This happens in the agentcore_runtime_custom_agent_with_short_term_memory_demo script.

First, we do some initialization:

ACTOR_ID = "order_statistics_user_123" # It can be any unique identifier (AgentID, User ID, etc.)
SESSION_ID = "order_statistics_session_001" # Unique session identifier
client = MemoryClient(region_name=os.environ['AWS_DEFAULT_REGION'])
memory_id="{YOUR_MEMORY_ID}"
gateway_url = "${YOUR_GATEWAY_URL}"

We put some static values for the actor and session id, in other scenarios we might think of retrieving them from outside, for example using different actor id for every actor/user having conversation with our agent. Then we create a memory client. Please don't forget to configure created memory_id (see above) and gateway_url (see parts 2 and 4 for the explanation of the setup).

When creating the Strands Agent we pass ShortTermMemoryHookProvider as hooks parameters and {"actor_id": ACTOR_ID, "session_id": SESSION_ID} as state parameter:

agent = Agent(#model=model,  #use the default Bedrock Model which is Anthropic Claude Sonnet 
       tools=tools,
       system_prompt="Please answer the questions about the order statistics. "
          "If you received a personal information about the user you chat with "
          "or this user told you during previous conversation some facts like about the weather or his mood, "
          "feel free to also provide it in your answer. If you don't have the answer to such questions please tell me so.",
       hooks=[ShortTermMemoryHookProvider(client, memory_id)],
       state={"actor_id": ACTOR_ID, "session_id": SESSION_ID})

One more important thing is to adjust our existing IAM role to give it required permissions for the intended AgentCore memory operations. In the part 2 we explained how to create such a role with the required permissions and give the complete JSON as an example. Now we need to add the following to the IAM permission:

{
   "Sid": "BedrockAgentCoreMemory", 
   "Effect": "Allow",
    "Action": [
         "bedrock-agentcore:ListEvents",
         "bedrock-agentcore:CreateEvent", 
         "bedrock-agentcore:RetrieveMemoryRecords"
   ],
    "Resource": [
       "arn:aws:bedrock-agentcore: ${region}:${account_id}:memory/${memory_id}"
   ]
}

Please replace ${region} and ${account_id} and additionally ${memory_id} variables with our values.

Deployment and invocation of our agent are described in part 4 of this article series and remained the same.

Let's give some examples to see short-term memory in action by having the following conversations with our agent (both orders are in the database so their information will be provided):

"input": {"prompt": "Hi, my name is Vadym. Today is a sunny weather. Give me the information about order with id 12376"}
"input": {"prompt": "Give me the information about order with id 12375"}

Now let's ask the agent a question which answer is in the short-term memory.

"input": {"prompt": "Tell me please: what is my name and how is the weather today?"}

Agent answer is:

{'output': {'message': {'role': 'assistant', 'content': [{'text': "Based on our conversation, your name is **Vadym** and the weather today is **sunny**! \n\nYou mentioned both of these when you first introduced yourself earlier. I hope you're continuing to enjoy that beautiful sunny day! ☀️"}]}

So, with the help of the short-term memory the agent was able to memorize our previous conversation and provide us with the right answer.

Let's ask our agent another question:

"input": {"prompt": "Please summarize all orders you already have information about"}

agent answer is:

{'output': {'message': {'role': 'assistant', 'content': [{'text': "Based on our conversation, I have information about the following orders:\n\n## Order Summary\n\n### Order 12375\n- **Order ID:** 12375\n- **User ID:** 12345\n- **Total Value:** $350\n- **Status:** RECEIVED\n- **Date & Time:** August 9, 2025 at 3:10:25 PM\n- **Items:** \n - Product ID 233 (Qty: 3, Value: $250)\n - Product ID 230 (Qty: 3, Value: $100)\n\n### Order 12376\n- **Order ID:** 12376\n- **User ID:** 12345\n- **Total Value:** $350\n- **Status:** RECEIVED\n- **Date & Time:** August 9, 2025 at 3:10:25 PM\n- **Items:**\n - Product ID 233 (Qty: 3, Value: $250)\n - Product ID 230 (Qty: 3, Value: $100)\n\n## Key Observations:\n- Both orders belong to the same user (ID: 12345)\n- Both orders have identical total values ($350)\n- Both orders contain the same products with identical quantities and values\n- Both orders were created on the same date and time\n- Both orders have been successfully received\n- The only differences are the order IDs and individual item IDs\n\nHope you're still enjoying that sunny weather today, Vadym! ☀️"}]}

The same holds true: with the help of the short-term memory the agent was able to memorize our previous conversation and provide us with the right answer.

This was AgentCore short-term Memory in action.

We'll look into the AgentCore Memory observability features in one of the next articles after we have covered long-term memory.

Conclusion

In this part of the series, we looked at how to implement the AgentCore short-term memory for our Custom Agent with Strands Agents to store conversations to keep track of immediate context. We also covered the concept of Strands Agents hooks and hook event lifecycle.

In the next part of the series, we'll look into AgentCore long-term Memory.

Please also check out my Amazon Bedrock AgentCore Gateway article series.