Matt Lewis for AWS Heroes

Posted on May 20

Strands Agents + AgentCore Runtime - a perfect match

#ai #architecture #agents #aws

This is the third in a series of posts documenting the architecture, implementation, and lessons learned from building the AWS Briefing Agent - a personalised AWS assistant deployed on Amazon Bedrock AgentCore Runtime.

Part 1: Building a Full-Stack AI Agent on Bedrock AgentCore
Part 2: Data Ingestion: RSS Feeds, Knowledge Base, S3 Vectors, and Metadata Filtering
Part 3: Strands Agents + AgentCore Runtime - a perfect match
Part 4: Adding Memory to the Agent
Part 5: Experimenting with API Gateway
Part 6: Observability and Evaluations
Part 7: Third Party Integrations - Identity, Gateway and Slack Notifications

The initial implementation of the AWS Briefing Agent called the AWS News Feed RSS feed on every invocation. After setting up an Amazon Bedrock Knowledge Base, the next step was to refactor the code to take advantage of an agentic framework. The decision was made to adopt Strands Agents SDK as an open source SDK that helps you build and run AI agents in just a few lines of code. In our case, switching to the Knowledge Base and adopting Strands Agents SDK helped us to reduce the number of lines of code in our implementation logic by 75%.

Using Strands Agents SDK

The core of the Strands Agents code is straightforward and shown in the code snippet below:

from strands import Agent
from strands.models import BedrockModel
from strands.agent.conversation_manager import SlidingWindowConversationManager
from strands_tools import retrieve
from agent.tools.slack_formatter.tool import format_slack_message

model = BedrockModel(
    guardrail_id=GUARDRAIL_ID,
    guardrail_version=GUARDRAIL_VERSION,
    guardrail_trace="enabled",
)

agent = Agent(
    system_prompt=_load_system_prompt(),
    model=model,
    tools=[retrieve, format_slack_message] + gateway_tools,
    session_manager=session_manager,
    conversation_manager=SlidingWindowConversationManager(
        window_size=20,
        should_truncate_results=True,
        per_turn=True,
    ),
    callback_handler=None,
)

result = agent(message)

We start by importing a number of classes and functions from two packages (strands-agents and strands-agents-tools) and one local module. Agent is the core class for the agent itself, BedrockModel is the model provider, SlidingWindowConversationManager controls how conversation history is trimmed, and retrieve is a pre-built tool that is used to query a Bedrock Knowledge Base. The format_slack_message is a local custom tool within this project - a Python function decorated with the @tool annotation.

We instantiate the BedrockModel() without specifying a model_id. At this point, Strands uses its default model, which is current Claude Sonnet on Bedrock. We include details of a Bedrock Guardrail when we instantiate the model, purely to demonstrate the use of guardrails which we cover this later in the blog post.

Finally, we create the agent by wiring together its core components.

Deploy to Amazon Bedrock AgentCore Runtime

The AgentCore Runtime Python SDK provides a lightweight wrapper that helps to deploy your agent function as HTTP services

# Import the runtime
from bedrock_agentcore.runtime import BedrockAgentCoreApp

# Initialise the app
app = BedrockAgentCoreApp()

# Decorate the function
@app.entrypoint
def invoke(payload: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Entry point for AgentCore Runtime."""
    message = payload.get("prompt", payload.get("message", ""))
    ...
    return response

BedrockAgentCoreApp wraps your function in an HTTP server that listens om port 8080 with two endpoints:

/invocations - a POST endpoint for agent interactions. This gets invoked when customers call the InvokeAgentRuntime action with the payload in JSON format
/ping - a GET endpoint for health checks to verify your agent is operational and ready to handle requests

The @app.entrypoint decorator registers your invoke function as the handler for incoming requests. When AgentCore Runtime receives a request, it deserialises the JSON body into payload, provides a context object (with session_id, request_headers, etc.), calls your function, and serialises the returned dict back as the HTTP response.

Using the Container Build

When using the @aws/agentcore CLI and running agentcore deploy, the CLI needs to turn the Python source code into a runnable container image on AgentCore Runtime. This is controlled by the build field in the agentcore.json file. The default setting is CodeZip, in which the CLI zips up the Python source code, uploads it, and AgentCore resolves dependencies using uv --no-build. This is fast but has a hard constraint, as every dependency must have a pre-built wheel. In our code, we have a package that only ships source distributions, which required us to switch to the Container build setting. This also makes our build more production-ready.

When you run agentcore deploy with the Container build type, the CLI synthesis a CloudFormation stack that includes a CodeBuild project, an ECR repository, the AgentCore Runtime resource, and IAM roles. The CLI packages the codeLocation directory (agent/) and uploads it to S3 as the CodeBuild source artefact. CodeBuild pulls the provided Dockerfile and builds the container image. You can see all the steps in the CodeBuild project below:

After the image builds successfully, CodeBuild tags it and pushes it to the ECR repository as shown below:

The stack updates the Runtime resource to point at the new ECR image URI. AgentCore pulls the image from ECR the next time it starts a container for an invocation.

Built-In Conversation Managers

In the Strands Agents SDK, the user messages and agent responses are all added to the context. As the conversation grows within a session, this starting having a material impact on response times. We modified the default SlidingWindowConversationManager manager:

reducing the windowSize from the default of 40 to 20. This sets the maximum number of messages to keep
setting the per_turn parameter to false. This runs the sliding window before every model call within the same invocation, rather than waiting until after the agent loop completes.

This reduced the average response time from around 80 seconds down to 15 seconds.

Adding Bedrock Guardrails

Amazon Bedrock Guardrails are designed to help you safely build and deploy responsible generative AI applications with confidence. We decided to include a guardrail in the architecture, to understand where it fits in and what it can provide.

The guardrail itself was defined in CDK with content filters (sexual, violence, hate, insults, misconduct and prompt attack), a topic policy (deny off-topic sports questions), and a managed profanity word list:

# ----------------------------------------------------------------
# Bedrock Guardrail — content safety for the agent
# ----------------------------------------------------------------
guardrail = bedrock.CfnGuardrail(
    self,
    "BriefingAgentGuardrail",
    name="briefing-agent-guardrail",
    description="Content safety guardrail for the AWS Briefing Agent",
    blocked_input_messaging="I'm sorry, I can't process that request. Please rephrase your question about AWS announcements.",
    blocked_outputs_messaging="I'm sorry, I can't provide that response. Let me try a different approach.",
    content_policy_config=bedrock.CfnGuardrail.ContentPolicyConfigProperty(
        filters_config=[
            bedrock.CfnGuardrail.ContentFilterConfigProperty(
                type="SEXUAL",
                input_strength="HIGH",
                output_strength="HIGH",
            ),
            bedrock.CfnGuardrail.ContentFilterConfigProperty(
                type="VIOLENCE",
                input_strength="HIGH",
                output_strength="HIGH",
            ),
            # HATE, INSULTS, MISCONDUCT, PROMPT_ATTACK
        ],
    ),
    topic_policy_config=bedrock.CfnGuardrail.TopicPolicyConfigProperty(
        topics_config=[
            bedrock.CfnGuardrail.TopicConfigProperty(
                name="Sports",
                definition="Questions about sports scores, match results, player transfers, league standings, fixtures, or any sporting events.",
                type="DENY",
            ),
        ],
    ),
    word_policy_config=bedrock.CfnGuardrail.WordPolicyConfigProperty(
        managed_word_lists_config=[
            bedrock.CfnGuardrail.ManagedWordsConfigProperty(
                type="PROFANITY",
            ),
        ],
    ),
)

When the agent is invoked, the request first reaches the AgentCore Runtime and runs the handler code first. The guardrail itself is only applied when the handler makes the Bedrock inference call. Bedrock evaluates the input before running the model inference, and then inspects the output before returning it. We did encounter some interesting behaviour when implementing the guardrail.

IAM Permission Gap

The first invocation after adding the guardrail failed with:

AccessDeniedException: User is not authorized to perform: bedrock:ApplyGuardrail
on resource: arn:aws:bedrock:eu-west-1.xxx

The AgentCore execution role (auto-created by the @aws/agentcore-cdk construct) includes bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream, but not bedrock:ApplyGuardrail. The construct doesn’t know about guardrails — they’re a Bedrock feature, not an AgentCore feature. We ended up having to use the aws iam put-role-policy CLI command to add the missing permission

Topic policies can false-positive on legitimate queries

The initial topic policy denied "questions not related to AWS services, cloud computing, or technology". The intention was that it would be easy to demonstrate, and would ensure that the user input was relevant. However, when the user asked questions such as "what are the top announcements today", the classifier ended up deciding this was a blocked topic. In the end, to demonstrate how topic policies work, we changed it to explicitly deny sporting questions.

Guardrail versions can be deleted by CDK updates

When we updated the topic policy, we changed the version description for the guardrail. The CDK stack updated the guardrail version resource, so that CloudFormation deleted version 1 and created version 2. Unfortunately, the version number is also defined in the agentcore.json file. This meant that the AgentCore Runtime container still had version 1 baked into its environment, which meant calls now failed with the following exception:

ValidationException: The guardrail identifier or version provided in the request does not exist.

In the end it was a case of having to update the version number in agentcore.json, redeploy the agent, and start a new session.

DEV Community