If you've read my previous posts, you know I've been writing about Agents a lot. Every post comes with a demo built on a different framework, explaining a specific case. You have read about running agents locally using Ollama, and with remote LLMs using OpenAI and Bedrock. You have also found multiple posts about agents running on Amazon Bedrock. This post introduces the new agent framework called Strands. You will read about the differences between the existing Bedrock Agents and Strands, and you will also see the code to run a Strands Agent. This time, you will learn how to run it locally and how to deploy it to production on AWS using the AWS CDK.
Recap on Bedrock Agents
When Amazon Bedrock introduced agents, it felt like a logical step. Amazon had the pieces of the puzzle needed to create an agent: the LLM for the brains, data stores for knowledge bases, and Lambdas for tools. By utilising IAM, you also have the power to secure your applications.
In September 2024, I wrote my first blog about Bedrock and Agents. As a demo, I created a character called Quippy and demonstrated how to deploy it to AWS using CDK. The most significant advantage of Bedrock Agents is that you do not need to maintain your own infrastructure. You can assemble almost the complete agent, including a Knowledge Base and various tools, without writing any code.
It does come at a price: Bedrock Agents are more challenging to debug and impossible to run on your local machine.
Build an agent using Amazon Bedrock
To give developers a better experience and faster debugging, the team introduced inline agents. There are no heavy deployments; instead, a Python SDK enables developers to run agents on their local machines.
I wrote a blog post about a multi-agent demo that utilises these new inline agents. This overcame the problem of the strict deployment mechanism, but it still did not offer the same level of flexibility as other frameworks out there.
Create a multi-agent system with Amazon Bedrock
Therefore, I was excited to learn that the Amazon team collaborated with other communities to create a new, open-source framework called Strands.
Some things about Strands
Strands is a code-first framework for building agents using a Python SDK. The framework is built with production features, including observability, tracing, and deployment options, to run at scale. The model used by the agent is not limited to the models provided by Bedrock. You can also use Ollama and OpenAI.
The Agent Loop
An agent is more than a function that performs an action and returns a response. An agent analyses the question or command it receives. It considers which tools to call, makes those calls, and handles the responses from the tools. After each tool call, it rethinks whether it is finished. Once all the information is available, an answer is constructed. In Strands, this is referred to as the Agent Loop.
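To make that loop a bit more concrete, here is a simplified, framework-agnostic sketch of the idea. This is not Strands' actual implementation; the `call_model` and `execute_tool` callables are hypothetical stand-ins.

```python
def agent_loop(user_message, call_model, execute_tool, max_cycles=10):
    """Conceptual agent loop: call the model, run any requested tools,
    feed the results back, and repeat until a final answer is produced."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_cycles):
        reply = call_model(messages)            # model analyses the conversation so far
        messages.append(reply)
        tool_calls = reply.get("tool_calls", [])
        if not tool_calls:                      # no tools needed: this is the final answer
            return reply["content"]
        for call in tool_calls:                 # run each requested tool and record the result
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "content": result})
    raise RuntimeError("Agent did not finish within the cycle limit")
```

Strands runs a loop like this for you, including the bookkeeping around messages, metrics, and traces.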
Sessions and state
Most of the agent frameworks I know use the history of messages as state. When interacting with an agent, you provide the history of messages back to the agent with each new request. With longer conversations, the history can become too big for the context window of the LLM. Strands provides multiple “Conversation Managers”. The default is a sliding window conversation manager, which keeps the recent messages and discards the older ones. Other conversation managers can create a summary to help prevent information loss from older messages.
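As a sketch of how you would plug one in: the import path and the `window_size` parameter below reflect the Strands documentation as I understand it, so treat them as assumptions.

```python
from strands import Agent
from strands.agent.conversation_manager import SlidingWindowConversationManager

# Keep only the most recent messages in the context; older ones are
# discarded when the window is exceeded.
conversation_manager = SlidingWindowConversationManager(window_size=20)

agent = Agent(conversation_manager=conversation_manager)
```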
I can imagine a conversation manager that is more domain-specific: provide a state description object that contains the information it needs to retain. Think of a shopping basket, a wish list, your name, whatever. Please let me know if this is something you'd like to have, and I'll do my best to create one.
If you prefer not to maintain the messages yourself and would rather use a session, you can keep the agent instance available; it will keep the session alive for you. You can also persist the session, or store it in a web framework session if you integrate the agent into a web application.
Prompting
An agent uses a Large Language Model (LLM) for reasoning and for writing responses to the user's questions. Interaction with the LLM is facilitated through a prompt. With the system prompt, you tell the agent how to behave during the conversation: you give it a role, establish boundaries, and explain its capabilities. Capabilities are the tools it can use.
Strands has defaults for everything, but you should provide your specific system prompt. When using a model, familiarise yourself with the best practices for that particular model. Each model has different mechanisms that work best to achieve your desired outcome.
When interacting with users, do not present the user input directly to the LLM or agent. The Strands team wrote a nice piece about security and prompt engineering. They discuss verifying the input, constraining the output, and providing clear boundaries on what to do and how far to trust user input.
Strands user guide for prompt engineering
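As a tiny illustration of the "verify the input" idea (my own sketch, not taken from the Strands guide; the length limit and tag names are arbitrary), you could gate the user message before it ever reaches the agent:

```python
MAX_MESSAGE_LENGTH = 500  # arbitrary limit for this sketch


def sanitize_user_message(raw: str) -> str:
    """Basic input verification before the message reaches the agent."""
    message = raw.strip()
    if not message:
        raise ValueError("Empty message")
    if len(message) > MAX_MESSAGE_LENGTH:
        raise ValueError("Message too long")
    # Wrap the untrusted input in explicit delimiters so the system prompt
    # can instruct the model to treat everything inside as data, not instructions.
    return f"<user_input>{message}</user_input>"
```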
Observe your Agent
Like with any system, it is essential to observe how well an agent is performing. Strands includes support in the SDK for traces, metrics, and logs.
Traces contain the system prompt and model parameters, the input and output messages, token usage, and tool invocations. They are ideal for understanding what the agent is doing.
The metrics include loop statistics, latency, memory, and CPU usage. These help you judge the performance of the agent and decide about the (virtual) hardware it needs.
Finally, there are the logs of the framework. You'll see a lot of them in the samples below; they help you as a developer to understand what is happening and to debug your agent.
Giving Strands a spin
Let's first give it a try with a local Ollama providing the models. I run everything on my Mac with an M3 Pro processor, with Ollama running locally, and I pulled the Gemma 3n model. My goal is not to select the best model; I want to demonstrate working with Strands.
The project uses Poetry to manage dependencies. We need to install the Strands package with the Ollama extra.
ollama pull gemma3n:latest
poetry add 'strands-agents[ollama]'
Next, we can code the initial agent that uses Ollama.
import logging

from strands import Agent
from strands.models.ollama import OllamaModel

if __name__ == "__main__":
    # Enables Strands debug log level
    logging.getLogger("strands").setLevel(logging.DEBUG)

    # Sets the logging format and streams logs to stderr
    logging.basicConfig(
        format="%(levelname)s | %(name)s | %(message)s",
        handlers=[logging.StreamHandler()]
    )

    # Create an Ollama model instance
    ollama_model = OllamaModel(
        host="http://localhost:11434",  # Ollama server address
        model_id="gemma3n"  # Specify which model to use
    )

    # Create an agent that uses the Ollama model
    agent = Agent(model=ollama_model)

    # Ask the agent a question
    message = """
    What is an AI Agent?
    """
    agent(message)
Strands comes with a lot of defaults. You do not need to provide a model; Bedrock is chosen by default. Likewise, you do not have to do anything with the agent's response; it is written to standard output by default.
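As an illustration of how far those defaults go, the smallest possible agent looks roughly like this. It assumes your environment has AWS credentials and access to a default Bedrock model.

```python
from strands import Agent

# With no arguments, the agent falls back to the defaults: a Bedrock model
# for reasoning, and the response printed to standard output.
agent = Agent()
agent("What is an AI Agent?")
```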
Below is the output of the Ollama run; focus on the DEBUG lines, as they contain interesting information. Note that the framework searches for tools in the tools directory. The logs also display metrics, the event loop, and the conversation manager. We will get to those later.
DEBUG | strands.models.ollama | config=<{'model_id': 'gemma3n'}> | initializing
DEBUG | strands.tools.registry | tools_dir=</Users/jettrocoenradie/Development/personal/strands-agent/src/strands_agent/tools> | tools directory not found
DEBUG | strands.tools.registry | tool_modules=<[]> | discovered
DEBUG | strands.tools.registry | tool_count=<0>, success_count=<0> | finished loading tools
DEBUG | strands.tools.registry | tools_dir=</Users/jettrocoenradie/Development/personal/strands-agent/src/strands_agent/tools> | tools directory not found
DEBUG | strands.tools.watcher | tool directory watching initialized
DEBUG | strands.tools.registry | getting tool configurations
DEBUG | strands.tools.registry | tool_count=<0> | tools configured
DEBUG | strands.tools.registry | getting tool configurations
DEBUG | strands.tools.registry | tool_count=<0> | tools configured
INFO | strands.telemetry.metrics | Creating Strands MetricsClient
DEBUG | strands.event_loop.streaming | model=<<strands.models.ollama.OllamaModel object at 0x106ab2b70>> | streaming messages
DEBUG | strands.types.models.model | formatting request
DEBUG | strands.types.models.model | formatted request=<{'messages': [{'role': 'user', 'content': '\n What is an AI Agent?\n '}], 'model': 'gemma3n', 'options': {}, 'stream': True, 'tools': []}>
DEBUG | strands.types.models.model | invoking model
DEBUG | strands.types.models.model | got response from model
## What is an AI Agent?
An AI Agent is essentially an autonomous entity that perceives its environment through sensors and acts upon that environment through actuators. Think of it as a software program designed to **think, reason, and act independently** to achieve specific goals.
Here is a more detailed breakdown:
[I removed this part to focus on the framework, not on the model.]
DEBUG | strands.types.models.model | finished streaming response from model
DEBUG | strands.agent.conversation_manager.sliding_window_conversation_manager | window_size=<2>, message_count=<40> | skipping context reduction
DEBUG | strands.agent.agent | thread pool executor shutdown complete
Adding tools
Strands supports multiple mechanisms to add tools. You can decorate a Python function with the @tool decorator and pass it to the Agent, or place it in the tools folder. If you're looking for more flexibility, check the documentation.
I have created a tool to search for books. In the demo, I search for a book using the first word of the title. I did have to change models; the Gemma 3n model does not support tools out of the box, so I moved to qwen3:8b. Below is the code for the tool.
@tool
def search_collection(book_title: str) -> dict:
    """
    Search for a book in the collection by title.

    Args:
        book_title (str): The title of the book to search for.

    Returns:
        dict: A dictionary containing the book details if found, otherwise an empty dictionary.
    """
    # Simulated book collection
    book_collection = {
        "Visionaries, Rebels and Machines": {"author": "Jamie Dobson", "year": 2025},
        "AI Snake Oil": {"author": "Arvind Narayanan & Sayash Kapoor", "year": 2024},
        "The Coming Wave": {"author": "Mustafa Suleyman & Michael Bhaskar", "year": 2024},
        "The Wolf Is at the Door": {"author": "Ben Angel", "year": 2024},
        "Life in the Pitlane": {"author": "Calum Nicholas", "year": 2025},
    }

    search = book_title.strip().lower()
    for title, details in book_collection.items():
        if title.lower().startswith(search):
            result = details.copy()
            result["title"] = title
            return result

    return {}
I did not change the agent, except for the model, as discussed. When asking the question “Find a book with visionaries in the title”, I got the following response.
DEBUG | strands.models.ollama | config=<{'model_id': 'qwen3:8b'}> | initializing
DEBUG | strands.tools.registry | tools_dir=</Users/jettrocoenradie/Development/personal/strands-agent/src/strands_agent/tools> | found tools directory
DEBUG | strands.tools.registry | tools_dir=</Users/jettrocoenradie/Development/personal/strands-agent/src/strands_agent/tools> | scanning
DEBUG | strands.tools.registry | tool_modules=<['book_collection']> | discovered
DEBUG | strands.tools.registry | tool_name=<search_collection>, tool_type=<function>, is_dynamic=<False> | registering tool
DEBUG | strands.tools.registry | tool_count=<1>, success_count=<1> | finished loading tools
DEBUG | strands.tools.registry | tools_dir=</Users/jettrocoenradie/Development/personal/strands-agent/src/strands_agent/tools> | found tools directory
DEBUG | strands.tools.watcher | tools_dir=</Users/jettrocoenradie/Development/personal/strands-agent/src/strands_agent/tools> | started watching tools directory
DEBUG | strands.tools.watcher | tool directory watching initialized
DEBUG | strands.tools.registry | getting tool configurations
DEBUG | strands.tools.registry | tool_name=<search_collection> | loaded tool config
DEBUG | strands.tools.registry | tool_count=<1> | tools configured
DEBUG | strands.tools.registry | getting tool configurations
DEBUG | strands.tools.registry | tool_name=<search_collection> | loaded tool config
DEBUG | strands.tools.registry | tool_count=<1> | tools configured
INFO | strands.telemetry.metrics | Creating Strands MetricsClient
DEBUG | strands.event_loop.streaming | model=<<strands.models.ollama.OllamaModel object at 0x10562eed0>> | streaming messages
DEBUG | strands.types.models.model | formatting request
DEBUG | strands.types.models.model | formatted request=<{'messages': [{'role': 'user', 'content': '\n Find a book with visionaries in the title\n '}], 'model': 'qwen3:8b', 'options': {}, 'stream': True, 'tools': [{'type': 'function', 'function': {'name': 'search_collection', 'description': 'Search for a book in the collection by title.\n\nArgs:\n book_title (str): The title of the book to search for.\n\nReturns:\n dict: A dictionary containing the book details if found, otherwise an empty dictionary.', 'parameters': {'properties': {'book_title': {'description': 'The title of the book to search for.', 'type': 'string'}}, 'required': ['book_title'], 'type': 'object'}}}]}>
DEBUG | strands.types.models.model | invoking model
DEBUG | strands.types.models.model | got response from model
<think>
Okay, the user wants to find a book with "visionaries" in the title. Let me check the tools available. There's a function called search_collection that takes a book_title as a parameter. The description says it searches for a book by title. So I should call that function with "visionaries" as the book_title argument. I need to make sure the JSON is correctly formatted with the name of the function and the arguments. Let me structure the tool_call accordingly.
</think>
Tool #1: search_collection
DEBUG | strands.types.models.model | finished streaming response from model
DEBUG | strands.tools.executor | tool_count=<1>, tool_executor=<ThreadPoolExecutorWrapper> | executing tools in parallel
DEBUG | strands.tools.executor | tool_count=<1> | submitted tasks to parallel executor
DEBUG | strands.handlers.tool_handler | tool=<{'toolUseId': 'search_collection', 'name': 'search_collection', 'input': {'book_title': 'visionaries'}}> | invoking
DEBUG | strands.event_loop.streaming | model=<<strands.models.ollama.OllamaModel object at 0x10562eed0>> | streaming messages
DEBUG | strands.types.models.model | formatting request
DEBUG | strands.types.models.model | formatted request=<{'messages': [{'role': 'user', 'content': '\n Find a book with visionaries in the title\n '}, {'role': 'assistant', 'tool_calls': [{'function': {'name': 'search_collection', 'arguments': {'book_title': 'visionaries'}}}]}, {'role': 'assistant', 'content': '<think>\nOkay, the user wants to find a book with "visionaries" in the title. Let me check the tools available. There\'s a function called search_collection that takes a book_title as a parameter. The description says it searches for a book by title. So I should call that function with "visionaries" as the book_title argument. I need to make sure the JSON is correctly formatted with the name of the function and the arguments. Let me structure the tool_call accordingly.\n</think>\n\n'}, {'role': 'tool', 'content': "{'author': 'Jamie Dobson', 'year': 2025, 'title': 'Visionaries, Rebels and Machines'}"}], 'model': 'qwen3:8b', 'options': {}, 'stream': True, 'tools': [{'type': 'function', 'function': {'name': 'search_collection', 'description': 'Search for a book in the collection by title.\n\nArgs:\n book_title (str): The title of the book to search for.\n\nReturns:\n dict: A dictionary containing the book details if found, otherwise an empty dictionary.', 'parameters': {'properties': {'book_title': {'description': 'The title of the book to search for.', 'type': 'string'}}, 'required': ['book_title'], 'type': 'object'}}}]}>
DEBUG | strands.types.models.model | invoking model
DEBUG | strands.types.models.model | got response from model
<think>
Okay, the user asked for a book with "visionaries" in the title. I called the search_collection function with "visionaries" as the argument. The response came back with a book titled "Visionaries, Rebels and Machines" by Jamie Dobson from 2025. Now I need to present this information clearly. Let me check if there's more than one book, but the response seems to be a single entry. I should mention the title, author, and year. Maybe the user is looking for a specific book, so providing the details directly makes sense. I'll format it in a way that's easy to read, using bullet points or a simple list. Also, I should confirm if they need more information or if they were looking for something else. Let me make sure the response is friendly and helpful.
</think>
Here is the book you requested:
**Title:** Visionaries, Rebels and Machines
**Author:** Jamie Dobson
**Year:** 2025
Let me know if you'd like further details!DEBUG | strands.types.models.model | finished streaming response from model
DEBUG | strands.agent.conversation_manager.sliding_window_conversation_manager | window_size=<4>, message_count=<40> | skipping context reduction
DEBUG | strands.agent.agent | thread pool executor shutdown complete
Process finished with exit code 0
Follow the logs and note that the tool is found. Read the model's thinking and pay attention to its decision to call the tool.
Showing metrics
Strands gathers a lot of information during each run. What better way to understand what you get than showing you? Below is the metrics output of the previous program; after the JSON document, I'll point out the different parts.
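For reference, this is roughly how you can dump that summary yourself. It assumes the result returned by calling the agent exposes a metrics object with a `get_summary()` method; that is how it behaved in the SDK version I used, but treat it as an assumption.

```python
import json

result = agent("Find a book with visionaries in the title")

# Dump the accumulated metrics (cycles, tool usage, traces, token counts)
# as a JSON document.
print(json.dumps(result.metrics.get_summary(), indent=2))
```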
{
"total_cycles": 2,
"total_duration": 8.350327014923096,
"average_cycle_time": 4.175163507461548,
"tool_usage": {
"search_collection": {
"tool_info": {
"tool_use_id": "search_collection",
"name": "search_collection",
"input_params": {
"book_title": "visionaries"
}
},
"execution_stats": {
"call_count": 1,
"success_count": 1,
"error_count": 0,
"total_time": 0.006476163864135742,
"average_time": 0.006476163864135742,
"success_rate": 1.0
}
}
},
"traces": [
{
"id": "4ba1536a-04b6-4c57-b711-e57fe9ce6e91",
"name": "Cycle 1",
"raw_name": null,
"parent_id": null,
"start_time": 1751636006.148543,
"end_time": null,
"duration": null,
"children": [
{
"id": "cf8e1166-ae6f-42b3-bc98-fcc6303c155e",
"name": "stream_messages",
"raw_name": null,
"parent_id": "4ba1536a-04b6-4c57-b711-e57fe9ce6e91",
"start_time": 1751636006.148577,
"end_time": 1751636010.890854,
"duration": 4.742276906967163,
"children": [],
"metadata": {},
"message": {
"role": "assistant",
"content": [
{
"toolUse": {
"toolUseId": "search_collection",
"name": "search_collection",
"input": {
"book_title": "visionaries"
}
}
},
{
"text": "<think>\nOkay, the user wants to find a book with \"visionaries\" in the title. Let me check the tools available. There's the search_collection function that takes a book_title as a parameter. Since the user mentioned \"visionaries\" in the title, I should use that as the argument. I'll call the function with book_title set to \"visionaries\" and see if it returns any results.\n</think>\n\n"
}
]
}
},
{
"id": "f314827c-a29e-453b-93f4-711587218e4d",
"name": "Tool: search_collection",
"raw_name": "search_collection - search_collection",
"parent_id": "4ba1536a-04b6-4c57-b711-e57fe9ce6e91",
"start_time": 1751636010.8984928,
"end_time": 1751636010.9050298,
"duration": 0.006536960601806641,
"children": [],
"metadata": {
"toolUseId": "search_collection",
"tool_name": "search_collection"
},
"message": {
"role": "user",
"content": [
{
"toolResult": {
"toolUseId": "search_collection",
"status": "success",
"content": [
{
"text": "{'author': 'Jamie Dobson', 'year': 2025, 'title': 'Visionaries, Rebels and Machines'}"
}
]
}
}
]
}
},
{
"id": "d6510a4c-5849-450e-90b5-6f09b110645b",
"name": "Recursive call",
"raw_name": null,
"parent_id": "4ba1536a-04b6-4c57-b711-e57fe9ce6e91",
"start_time": 1751636010.905327,
"end_time": 1751636019.255786,
"duration": 8.350458860397339,
"children": [],
"metadata": {},
"message": null
}
],
"metadata": {},
"message": null
},
{
"id": "72d57790-f2dd-49f4-83db-4c7575dbe921",
"name": "Cycle 2",
"raw_name": null,
"parent_id": null,
"start_time": 1751636010.905353,
"end_time": 1751636019.25568,
"duration": 8.350327014923096,
"children": [
{
"id": "3fb73b5e-1f8e-4737-a826-306437fa76c3",
"name": "stream_messages",
"raw_name": null,
"parent_id": "72d57790-f2dd-49f4-83db-4c7575dbe921",
"start_time": 1751636010.905442,
"end_time": 1751636019.255627,
"duration": 8.350184917449951,
"children": [],
"metadata": {},
"message": {
"role": "assistant",
"content": [
{
"text": "<think>\nOkay, the user asked for a book with \"visionaries\" in the title. I called the search_collection function with \"visionaries\" as the argument. The response came back with a book titled \"Visionaries, Rebels and Machines\" by Jamie Dobson from 2025. Now I need to present this information clearly. Let me check if there's more than one result, but the tool response shows one book. I'll format the answer with the title, author, and year, making sure to mention that it's the found book. Since the user might be looking for specific details, I'll include all the provided information and offer further help if needed.\n</think>\n\nHere is the book you're looking for:\n\n**Title:** Visionaries, Rebels and Machines \n**Author:** Jamie Dobson \n**Year:** 2025 \n\nLet me know if you'd like more details!"
}
]
}
}
],
"metadata": {},
"message": null
}
],
"accumulated_usage": {
"inputTokens": 294,
"outputTokens": 537,
"totalTokens": 831
},
"accumulated_metrics": {
"latencyMs": 13085.612251
}
}
It starts with information about the cycles used by this agent run; in our case, there are two cycles. Next comes the tool usage. Note that you can see the tools that were called, including their arguments, followed by the statistics for the tools: how many calls were made, how many failed, and how many succeeded.
The traces section is a bit longer. You see all the calls to the LLM, including the responses it provides, along with the start times, end times, and durations.
Finally, the accumulated input, output, total tokens, and latency are provided.
When running this application in an AWS environment, it is easy to use AWS products for logging and tracing.
Deploying Strands on AWS
Agents are fun when running on your laptop. They become even more enjoyable when running on AWS infrastructure. First, we refactor the code to use Amazon Bedrock for the LLM. We do not want to scan for available tools; instead, we tell the agent explicitly which tool to use. Next, we use CDK to deploy the agent as a Lambda behind API Gateway. I'll reuse my Amazon Bedrock Agent example to deploy the Strands agent.
First, the imports and the system prompt.
import json
import logging
import os
from strands import Agent
from strands import tool
from strands.models import BedrockModel
# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logging.getLogger("strands").setLevel(logging.DEBUG)
logger.info("Starting the book search agent lambda ...")
# define the system prompt for the agent
SYSTEM_PROMPT = """You are a helpful agent that can search for books in a collection.
1. You can search for books by title.
2. Process the response to provide the user with information about the book, including the title, author, and year of publication.
3. If the book is not found, inform the user that the book is not available.
When displaying the book information, format it as follows:
Title: {title}
Author: {author}
Year: {year}
If the book is not found, respond with:
Book not found.
"""
I skipped the tool; it is precisely the same as before. The next part is the Lambda handler.
def handler(event, context):
    region = os.getenv('REGION')
    bedrock_model_id = os.getenv('BEDROCK_MODEL_ID')

    if not region or not bedrock_model_id:
        raise ValueError("Missing mandatory environment variables: REGION, BEDROCK_MODEL_ID")

    body = json.loads(event.get('body', '{}'))
    logger.info(f"The body of the request: {body}")
    message = body.get('message', '')

    bedrock_model = BedrockModel(
        model_id=bedrock_model_id,
        region_name=region,
        temperature=0.3,
    )

    book_agent = Agent(
        model=bedrock_model,
        system_prompt=SYSTEM_PROMPT,
        tools=[search_collection],
    )

    response = book_agent(message)
    logger.info("The response to the message")
    logger.info(response)

    return {
        'statusCode': 200,
        'headers': {
            'Access-Control-Allow-Origin': '*',
            'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
            'Access-Control-Allow-Headers': 'Content-Type, Authorization'
        },
        'body': json.dumps({"message": str(response)})
    }
Note how the environment variables provide the Bedrock model ID and the region. The message is obtained from the body of the request. The response format is critical; any mistake here will cause CORS problems or other unexpected issues with API Gateway.
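To see the expected request shape, here is a minimal local smoke test for the handler. It is a hypothetical sketch: it assumes the Lambda code is importable as the module `app` (as configured in the CDK construct below), that `REGION` and `BEDROCK_MODEL_ID` are set, and that valid AWS credentials with Bedrock access are available.

```python
import json
import os

os.environ.setdefault("REGION", "eu-west-1")
os.environ.setdefault("BEDROCK_MODEL_ID", "eu.anthropic.claude-3-7-sonnet-20250219-v1:0")

from app import handler  # the Lambda module shown above (assumed module name)

# API Gateway proxy events carry the JSON payload as a string in 'body'
event = {"body": json.dumps({"message": "Find a book with visionaries in the title"})}

response = handler(event, None)
print(response["statusCode"])
print(json.loads(response["body"])["message"])
```

This Lambda is integrated with API Gateway, as I have done before in the Bedrock Agent post; the Quippy sample is deployed using CDK. Below is the code to create the construct for the Lambda.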
import * as cdk from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as iam from "aws-cdk-lib/aws-iam";
import {Construct} from "constructs";

export interface SearchBooksLambdaConstructProps extends cdk.StackProps {
    dependenciesLayer: lambda.ILayerVersion;
}

export class SearchBooksLambdaConstruct extends Construct {
    public readonly searchBooksLambda: lambda.Function;

    constructor(scope: Construct, id: string, props: SearchBooksLambdaConstructProps) {
        super(scope, id);

        this.searchBooksLambda = new lambda.Function(this, 'SearchBooksLambda', {
            runtime: lambda.Runtime.PYTHON_3_12,
            handler: 'app.handler',
            code: lambda.Code.fromAsset('lambda/app_search_books_package'),
            layers: [props.dependenciesLayer],
            environment: {
                'REGION': 'eu-west-1',
                'BEDROCK_MODEL_ID': 'eu.anthropic.claude-3-7-sonnet-20250219-v1:0',
            },
            timeout: cdk.Duration.seconds(120),
            architecture: lambda.Architecture.ARM_64,
        });

        this.searchBooksLambda.addToRolePolicy(new iam.PolicyStatement({
            actions: ['bedrock:InvokeModel', 'bedrock:InvokeModelWithResponseStream'],
            resources: ['*'], // Replace with more specific resource ARNs if possible
        }));
    }
}
Finally, we have to construct the API Gateway. The code below gives the essential part. For the complete code, look at the GitHub project.
const api = new apigateway.RestApi(this, 'MultiLambdaApi', {
    restApiName: 'MultiLambdaService',
    description: 'This service serves multiple Lambda functions.',
    deployOptions: {
        stageName: 'prod',
        loggingLevel: apigateway.MethodLoggingLevel.INFO, // Set the logging level to INFO
        dataTraceEnabled: true, // Enable logging of full request/response data
        metricsEnabled: true, // Enable CloudWatch metrics
    },
    defaultCorsPreflightOptions: {
        allowOrigins: ['https://www.quippy.online'], // Your website's origin
        allowMethods: ['GET', 'POST', 'OPTIONS'], // Allowed HTTP methods
        allowHeaders: ['Content-Type', 'Authorization'], // Allowed headers
    },
    defaultMethodOptions: {
        authorizationType: apigateway.AuthorizationType.NONE,
        methodResponses: [{
            statusCode: '200',
            responseParameters: {
                'method.response.header.Access-Control-Allow-Origin': true,
                'method.response.header.Access-Control-Allow-Methods': true,
                'method.response.header.Access-Control-Allow-Headers': true,
            }
        }]
    }
});

// Create a Lambda Layer for dependencies
const dependenciesLayer = new lambda.LayerVersion(this, 'DependenciesLayer', {
    code: lambda.Code.fromAsset('lambda/dependencies_layer'),
    compatibleRuntimes: [lambda.Runtime.PYTHON_3_12],
    description: 'A layer to include dependencies for the Lambda functions',
});

const searchBooksLambdaConstruct = new SearchBooksLambdaConstruct(this, 'SearchBooksLambdaConstruct', {
    dependenciesLayer: dependenciesLayer,
});

const searchBooks = api.root.addResource('search-books');
searchBooks.addMethod('POST', new apigateway.LambdaIntegration(searchBooksLambdaConstruct.searchBooksLambda), {
    authorizer,
    authorizationType: apigateway.AuthorizationType.COGNITO,
});
Below is a screenshot of the result.
The sample is missing a few essential components: it does not utilise a session or store the messages in the client and pass them to the agent, and it does not store metrics and traces, which are essential to improving your agent. For now, we stick to CloudWatch for all the logs, and we store the traces there as well. Later, I will have a look at OpenTelemetry and the AWS platform.
Adding state to the Agent
State management is often placed outside the agent: the caller keeps the messages and provides them with each request. A Strands Agent keeps track of the messages itself, but the issue with a Lambda is that you do not retain the agent instance between invocations. Because the Strands agent gives easy access to its messages and tool data, it is not too hard to implement state management yourself. In the sample, the code wraps the agent to store a session state and retrieve it again before each call. The save and restore functions are presented below.
import boto3
from botocore.exceptions import ClientError

s3_client = boto3.client('s3')


# Save agent state
def save_agent_state(bucket_name, agent, session_id):
    state = {
        "messages": agent.messages,
        "system_prompt": agent.system_prompt
    }

    # Store state in S3
    s3_client.put_object(
        Bucket=bucket_name,
        Key=session_id,
        Body=json.dumps(state).encode('utf-8'),
        ContentType='application/json'
    )


# Restore agent state
def restore_agent_state(bucket_name, session_id, bedrock_model_id, region):
    bedrock_model = BedrockModel(
        model_id=bedrock_model_id,
        region_name=region,
        temperature=0.3,
    )

    # Retrieve state from S3
    try:
        response = s3_client.get_object(
            Bucket=bucket_name,
            Key=session_id
        )
    except ClientError as e:
        if e.response['Error']['Code'] == 'NoSuchKey':
            logger.info(f"No state found for session {session_id}, creating a new agent.")
            return Agent(
                system_prompt=SYSTEM_PROMPT,
                model=bedrock_model,
                tools=[search_collection],
            )
        else:
            raise

    # Parse the state from the S3 object body
    state = json.loads(response['Body'].read().decode('utf-8'))

    # Create agent with restored state
    return Agent(
        messages=state["messages"],
        system_prompt=state["system_prompt"],
        model=bedrock_model,
        tools=[search_collection],
    )
With this code in place, we can have a conversation like the following.
User: I like to have more information about a book starting with AI
Assistant: I found a book in our collection that starts with "AI":
Title: AI Snake Oil Author: Arvind Narayanan & Sayash Kapoor Year: 2024
User: Another book starts with Life in the
Assistant: I found a book in our collection that starts with "Life in the":
Title: Life in the Pitlane Author: Calum Nicholas Year: 2025
User: Which books did you find for me?
Assistant:
Based on our previous searches, I've found two books for you:
Title: AI Snake Oil Author: Arvind Narayanan & Sayash Kapoor Year: 2024
Title: Life in the Pitlane Author: Calum Nicholas Year: 2025
These are the books from our collection that matched your search criteria of
titles starting with "AI" and "Life in the" respectively. Would you like more
information about either of these books, or would you like to search for a
different book?
We can continue like this, but the number of messages could make the context too big for the LLM used by the agent. To manage that situation, Strands provides Conversation Managers. One implementation is the SlidingWindowConversationManager, which deletes old messages to keep the focus on the latest ones; this is often a good approach. An alternative is the SummarizingConversationManager, which creates a summary of the older messages and keeps that around. I like this idea, so I added it to the sample.
I extracted the code to construct an agent into its own function.
def construct_agent(messages=None, system_prompt=SYSTEM_PROMPT):
    region = os.getenv('REGION')
    bedrock_model_id = os.getenv('BEDROCK_MODEL_ID')

    # Create the summarizing conversation manager with default settings
    conversation_manager = SummarizingConversationManager(
        summary_ratio=0.3,  # Summarize 30% of messages when context reduction is needed
        preserve_recent_messages=10,  # Always keep 10 most recent messages
    )

    bedrock_model = BedrockModel(
        model_id=bedrock_model_id,
        region_name=region,
        temperature=0.3,
    )

    return Agent(
        conversation_manager=conversation_manager,
        messages=messages,
        system_prompt=system_prompt,
        model=bedrock_model,
        tools=[search_collection],
    )
Concluding
My conclusion on Strands so far: it is an exciting framework for creating agents. It is especially well integrated with the AWS platform, but also very easy to run locally using a tool like Ollama. How best to get the traces and metrics into AWS is not yet clear to me; I need more time to investigate.
Have a look at the sample code. Please ask me questions if anything is unclear.