
Vadym Kazulkin for AWS Heroes


Amazon Bedrock AgentCore Runtime - Part 4 Using Custom Agent with Strands Agents SDK

Introduction

In part 2 of our article series, we implemented our agent with the Strands Agents SDK and the Amazon Bedrock AgentCore Runtime Starter Toolkit and hosted it on the AgentCore Runtime.

In this part of the series, we'll rewrite our implementation to use a Custom Agent instead of the AgentCore Starter Toolkit, which gives us full control over our agent's HTTP interface, and deploy it to the Amazon Bedrock AgentCore Runtime.

The precondition is to complete the setup described, for example, in the article Exposing existing Amazon API Gateway REST API via MCP and Gateway endpoint. This setup includes creating a Cognito User Pool, a Cognito Resource Server, and a Cognito User Pool Client, and finally obtaining the AgentCore Gateway URL.

Developing the Custom Agent

I have provided the full source code in my amazon-agentcore-runtime-to-gateway-custom-agent-demo GitHub repository.

This approach demonstrates how to deploy a custom agent using FastAPI and Docker, following AgentCore Runtime requirements.

Those requirements are:

  • FastAPI Server: Web server framework for handling requests
  • /invocations Endpoint: POST endpoint for agent interactions
  • /ping Endpoint: GET endpoint for health checks
  • Docker Container: ARM64 containerized deployment package

Here is how the architecture looks:

First, we define our dependencies. They include the AWS Distro for OpenTelemetry (ADOT) SDK (see my AgentCore Observability article):

fastapi
uvicorn[standard]
pydantic
httpx
strands-agents
requests
aws-opentelemetry-distro>=0.10.1
boto3

Our custom agent implementation looks very similar to the implementation of our agent with the AgentCore Starter Toolkit from part 2.

We first create the FastAPI instance and invocation request and response models:

app = FastAPI(title="Custom Strands Agent Server", version="1.0.0")

class InvocationRequest(BaseModel):
    input: Dict[str, Any]

class InvocationResponse(BaseModel):
    output: Dict[str, Any]

The business logic, which was previously part of the invoke function with the Starter Toolkit

@app.entrypoint
def invoke(payload):
    ...

has now become part of the invoke_agent function, which is exposed as the POST /invocations HTTP endpoint:

@app.post("/invocations", response_model=InvocationResponse)
async def invoke_agent(request: InvocationRequest):
    ...

There are only small changes to define the invocation response and error handling.
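
To make the response shaping and error handling more concrete, here is a minimal, framework-independent sketch. The helper names extract_prompt and build_output, as well as the message and timestamp keys, are assumptions for illustration and not necessarily what the repository uses. Inside the FastAPI handler, a ValueError like the one below would typically be translated into an HTTPException with status code 400, and an unexpected agent failure into status code 500.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def extract_prompt(request_input: Dict[str, Any]) -> str:
    """Return the user prompt, or raise ValueError if it is missing or empty."""
    prompt = request_input.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("No prompt found in input. Please provide a 'prompt' key.")
    return prompt.strip()


def build_output(message: Any) -> Dict[str, Any]:
    """Wrap the agent result in the dictionary returned under 'output' (hypothetical shape)."""
    return {
        "message": message,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

With helpers like these, the handler body reduces to extracting the prompt, calling the agent, and returning InvocationResponse(output=build_output(result)).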

We also define our ping GET endpoint for the health checks:

@app.get("/ping")
async def ping():
    return {"status": "healthy"}

and run our FastAPI web server:

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

For a detailed explanation of all functions (get_full_tools_list, get_auth_token, and others), please refer to part 2, as they remain the same.

Don't forget to replace the value of the gateway_url variable in the line gateway_url = "${YOUR_GATEWAY_URL}" with our AgentCore Gateway URL (see part 2 for further details).
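
Because a forgotten placeholder only shows up once the agent tries to reach the gateway, a small guard can fail fast at startup. The sketch below is one possible approach, not part of the repository: the function name resolve_gateway_url and the environment variable AGENTCORE_GATEWAY_URL are hypothetical, and the check assumes the gateway URL uses HTTPS.

```python
import os

# The placeholder string used in the source file before it is replaced.
PLACEHOLDER = "${YOUR_GATEWAY_URL}"


def resolve_gateway_url(configured: str = PLACEHOLDER) -> str:
    """Prefer the (hypothetical) AGENTCORE_GATEWAY_URL variable, else the configured value."""
    url = os.environ.get("AGENTCORE_GATEWAY_URL", configured)
    if url == PLACEHOLDER or not url.startswith("https://"):
        raise RuntimeError("Set the AgentCore Gateway URL before starting the agent.")
    return url
```

The agent module could then use gateway_url = resolve_gateway_url() instead of a hard-coded string.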

Now we can start our agent locally and test it with:

curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "Give me the information about the order with id 12345"}}'

Windows users can alternatively use, for example, HTTPie and test locally with:

http POST http://localhost:8080/invocations input[prompt]="Give me the information about order with id 12345"
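
The same local test can also be written in Python with only the standard library. This is a sketch under the assumption that the agent is listening on localhost:8080 as configured above; the function names build_invocation_request and invoke_local_agent are made up for this example.

```python
import json
import urllib.request


def build_invocation_request(
    prompt: str, base_url: str = "http://localhost:8080"
) -> urllib.request.Request:
    """Build the same POST /invocations request that the curl command sends."""
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_local_agent(prompt: str) -> dict:
    """Send the request to the locally running agent and decode the JSON reply."""
    request = build_invocation_request(prompt)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling invoke_local_agent("Give me the information about the order with id 12345") returns the decoded response dictionary.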

Those who use the very popular Python package and project manager uv can perform the following steps:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv init 
uv add fastapi uvicorn[standard] pydantic httpx strands-agents requests

Here, we download and install uv, initialize our project, and add the listed dependencies (aws-opentelemetry-distro and boto3 from the dependency list above can be added in the same way).

Dockerizing the Custom Agent

Now we need to build an ARM64 Docker image for our application. We can, for example, use an ARM-based t4g.small EC2 instance powered by AWS Graviton2 for this. Our Dockerfile looks like this:

# Use uv's ARM64 Python base image
FROM --platform=linux/arm64 ghcr.io/astral-sh/uv:python3.13-bookworm-slim

WORKDIR /app

COPY requirements.txt requirements.txt
# Install from requirements file
RUN pip install -r requirements.txt

# Copy agent file
COPY agentcore_runtime_custom_agent_demo.py ./
COPY agent_core_utils.py ./


# Expose port
EXPOSE 8080

# Run application
CMD ["opentelemetry-instrument", "uvicorn", "agentcore_runtime_custom_agent_demo:app", "--host", "0.0.0.0", "--port", "8080"]

We install all required dependencies, copy the application files to the /app working directory, expose port 8080, and start the FastAPI web application server, instrumenting our agent with opentelemetry-instrument.

Here is a Dockerfile version if we work with uv.

Now we need to build the Docker image and upload it to the ECR repository with the name agentcore-runtime-custom-agent-demo, which we create below:

# build the Docker image
sudo docker build --no-cache -t agentcore-runtime-custom-agent-demo:v1 .

# Login to ECR
aws ecr get-login-password --region {region} | sudo docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com

# Create ECR repository
aws ecr create-repository --repository-name agentcore-runtime-custom-agent-demo --image-scanning-configuration scanOnPush=true --region {region}  

# Tag the Docker image
sudo docker tag agentcore-runtime-custom-agent-demo:v1 {account_id}.dkr.ecr.{region}.amazonaws.com/agentcore-runtime-custom-agent-demo:v1

# Push the Docker Image to the ECR repository
sudo docker push {account_id}.dkr.ecr.{region}.amazonaws.com/agentcore-runtime-custom-agent-demo:v1 

Please replace {account_id} and {region} with our own AWS account ID and region.
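
Since the same image URI appears in the tag and push commands and is needed again for deployment, it can help to compose it in one place. The helper below is a small illustrative sketch; the function name ecr_image_uri is an assumption and not part of the repository.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Compose the ECR image URI used in the docker tag and docker push commands."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs consist of exactly 12 digits.")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"
```

For example, ecr_image_uri("123456789012", "us-east-1", "agentcore-runtime-custom-agent-demo", "v1") yields the value to use wherever the commands above contain the placeholders.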

Deploying the Custom Agent

Now we need to deploy our custom agent to the AgentCore Runtime. For this, I created the deploy_custom_agent script:

import boto3
import os

os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'
client = boto3.client('bedrock-agentcore-control')

response = client.create_agent_runtime(
    agentRuntimeName='strands_custom_agent',
    agentRuntimeArtifact={
        'containerConfiguration': {
            'containerUri': '{YOUR_ECR_REPO_URI}'
        }
    },
    networkConfiguration={"networkMode": "PUBLIC"},
    roleArn='{YOUR_IAM_ROLE_ARN}'
)

Please replace {YOUR_ECR_REPO_URI} with the ECR container URI of the image we just pushed above, and {YOUR_IAM_ROLE_ARN} with the ARN of our IAM role, which has all the required permissions. In part 2, I gave a full explanation and provided the code for such an IAM role and the attached execution policy. If we use the same ECR repository as in part 2, we can re-use the execution policy completely. Otherwise, we need to provide the correct ECR ARN there, like this:

{
    "Sid": "ECRImageAccess",
    "Effect": "Allow",
    "Action": [
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
    ],
    "Resource": [
        "arn:aws:ecr:${region}:${account_id}:repository/${agentcore-runtime-ecr-repo}"
    ]
},
....
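
The response returned by create_agent_runtime in the deployment script above carries the runtime ARN that is needed for the invocation step. The following sketch assumes the response contains the agentRuntimeArn and status keys, as in the AWS examples for this API; the helper name summarize_runtime is hypothetical.

```python
from typing import Any, Dict


def summarize_runtime(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the fields of the create_agent_runtime response needed for later steps."""
    return {
        # Assumed key names; adjust them if the actual response differs.
        "arn": response["agentRuntimeArn"],
        "status": response.get("status", "UNKNOWN"),
    }
```

Printing summarize_runtime(response) at the end of the deployment script gives us the ARN to copy into the invocation script.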

Invoking the Custom Agent

We invoke our custom agent with the AWS SDK for Python, as in the invoke_agent.py example below:

import boto3
import json
import os

agent_core_client = boto3.client('bedrock-agentcore', region_name=os.environ['AWS_DEFAULT_REGION'])

payload = json.dumps({
     "input": {"prompt": "Give me the information about the order with id 1"}
})

#payload = json.dumps({
#    "input": {"prompt": "Can you list orders created between 1 of August 2025 5 am and 7 of August 2025 3 am. "
#                       "Please use the following date format, for example: 2025-08-02T19:50:55"}
#})

response = agent_core_client.invoke_agent_runtime(
    agentRuntimeArn="{YOUR_RUNTIME_ARN}",
    qualifier="DEFAULT",
    payload=payload
)
response_body = response['response'].read()
response_data = json.loads(response_body)
print("Agent Response:", response_data)

Please replace the value of the agentRuntimeArn parameter with the ARN of our deployed AgentCore Runtime.
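
The invocation script above assumes that the response body is always valid JSON. A slightly more defensive variant, sketched here with a hypothetical helper name parse_agent_response, falls back to the raw text if decoding fails, which can make error responses easier to inspect.

```python
import json
from typing import Any


def parse_agent_response(raw_body: bytes) -> Any:
    """Decode the response body; fall back to plain text if it is not JSON."""
    text = raw_body.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text
```

In the script, this would replace the json.loads call: response_data = parse_agent_response(response['response'].read()).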

Of course, we can ask our agent other questions. Some examples can be found in my article Exposing existing Amazon API Gateway REST API via MCP and Gateway endpoint, because here we used the same AgentCore Gateway, which exposes the same Amazon API Gateway Order API as MCP tools.

Custom Agent Observability

As we instrumented our code with the AWS Distro for OpenTelemetry (ADOT) SDK, the Agents, Sessions, and Traces views are provided out of the box in the CloudWatch GenAI Observability service, along with model invocation logging and metrics. Please refer to my AgentCore Observability article for detailed explanations.

Conclusion

In this article, we rewrote our implementation to use a Custom Agent instead of the Bedrock AgentCore Starter Toolkit, which gave us full control over our agent's HTTP interface, and deployed it on the Amazon Bedrock AgentCore Runtime.

In the next article of the series, we'll look into how to implement the same Custom Agent in the Java programming language with the Spring AI framework. Then we'll explore other AgentCore services like Memory, starting with short-term memory.

Please also check out my Amazon Bedrock AgentCore Gateway article series.
