Morgan Willis
Deploying AI Agents on AWS Without Creating a Security Mess

Most useful agents need access to private data.

They need to query internal databases, call internal systems, or read data that was never intended to be public. These requirements immediately raise questions about network exposure, credential handling, and compliance.

How does the agent connect to a private database? Where does it run? How do you handle multiple users without sharing execution state? How do you grant access to private systems without hardcoding credentials or widening network access?

This post walks through an example of how I answered those questions for an agent I deployed to AWS.

You can find the full-length video where I build this solution end-to-end here.

The running example

I built a simple logistics helper agent using Strands Agents SDK and an OpenAI model. It answers questions about shipments by querying a live PostgreSQL database running on Amazon Relational Database Service (RDS) inside an Amazon Virtual Private Cloud (VPC) on AWS.

The easy part was building the agent logic. I got it running locally using mocked tools for early testing.

The hard part was deploying the agent in a way that:

  • does not expose the database publicly
  • does not embed credentials in code
  • does not punch unnecessary holes in the network
  • properly isolates user sessions

AWS provides the building blocks to solve these problems, but you still need to make deliberate choices about how they fit together.

This post uses the logistics agent as a running example. Each snippet is either from the agent code or the infrastructure files that deploy it.


Amazon Bedrock AgentCore primer

In this example, AgentCore Runtime is the hosting environment for the logistics agent.

AgentCore Runtime is a managed, serverless hosting environment that runs agents in isolated sessions and handles authentication, scaling, and lifecycle management without requiring you to rewrite your agent for integration. It is framework and model agnostic, and supports multiple protocols, including HTTP, MCP, and A2A.

Amazon Bedrock AgentCore Runtime Overview

You can read more about Amazon Bedrock AgentCore Runtime here.


The architecture at a glance

Architecture Diagram for example

The diagram above shows the architecture for the backend of the logistics agent, including how it connects to the private database and external model provider, OpenAI.

  • The agent runs on Amazon Bedrock AgentCore Runtime.
  • The AgentCore Runtime deploys Elastic Network Interfaces (ENIs) into private subnets inside a VPC to allow connectivity with private resources.
  • The database runs on a private RDS instance in the same VPC.
  • The agent reads database connection information from AWS Systems Manager Parameter Store.
  • The agent reads secrets from AWS Secrets Manager (database credentials and the OpenAI key for the model provider).
  • VPC endpoints keep calls to AWS services on the AWS network, including calls to AgentCore, AWS Systems Manager, and AWS Secrets Manager.
  • A NAT Gateway provides outbound internet access so the agent can call OpenAI for inference.
  • IAM controls:
    • who can invoke the agent
    • what AWS APIs the agent can call once invoked

If you want to see the full code or AWS Cloud Development Kit (CDK) stack, the step-by-step guide can be found on GitHub here.

A quick map of the security concerns and supporting AWS features

| Security concern | AWS primitive | Where it shows up in this example |
| --- | --- | --- |
| Inbound authentication for invocations | AgentCore Runtime support for IAM SigV4 or OAuth (JWT) | The caller invoking `InvokeAgentRuntime` |
| Session isolation | AgentCore Runtime sessions | Runtime behavior (no shared process across users) |
| Secrets | AWS Secrets Manager | Agent loads DB credentials and OpenAI key at runtime |
| Non-secret config | AWS SSM Parameter Store | Agent loads endpoint, DB name, and secret ARNs |
| AgentCore Runtime agent permissions | IAM execution role | Role associated with the agent in AgentCore Runtime |
| Private connectivity to AWS services | VPC endpoints | Interface endpoints for SSM, Secrets Manager, AgentCore, and CloudWatch |
| Private connectivity to RDS | VPC networking and security groups | Runtime ENIs in private subnets and security group rules |
| Egress-only internet access from private subnets | NAT gateway | NAT Gateway in a public subnet with private subnet route tables for 0.0.0.0/0 |

Keep this table in mind, as this post will dive deeper into each row.


The agent code, trimmed to the parts that matter

This is the logistics helper agent written in Python using Strands Agents SDK, with some details removed for brevity. The full sample can be found here.

import json
import boto3
import logging
import pg8000.native
from strands import Agent, tool
from strands.models.openai import OpenAIModel
from bedrock_agentcore import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

# Cached within a single runtime session
_db_config = None
_db_credentials = None
_db_connection = None

def _load_db_config():
    ...

def get_db_connection():
    ...

@tool
def get_shipment_status(reference_no: str) -> str:
    ...

@tool
def find_delayed_shipments() -> str:
    ...

SYSTEM_PROMPT = """You are a logistics tracking assistant with access to a real-time shipment database.
..."""

_agent = None
_openai_model = None

def _get_openai_model():
    ...

def _initialize_agent():
    ...

@app.entrypoint
def logistics_query(payload):
    user_query = payload.get("query")
    if not user_query:
        return "Please provide a query in the format: {\"query\": \"your question here\"}"

    agent = _initialize_agent()
    result = agent(user_query)
    return result.message["content"][0]["text"]

if __name__ == "__main__":
    app.run()

How to make the agent AgentCore Runtime compatible

Before we dive into the specific AWS features used for security in this example, let's first review how to make an agent AgentCore Runtime compatible. The code needed for integration in your agent file is minimal.

from bedrock_agentcore import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@app.entrypoint
def logistics_query(payload):
    ...


The @app.entrypoint decorator marks the handler, or entrypoint, for your agent. AgentCore Runtime calls that function with a payload whenever an invocation hits the agent.

Behind the scenes, this implements the AgentCore Runtime service contract for HTTP, which you can read more about here.

The important part is that it implements the /invocations endpoint on port 8080, which allows us to invoke the agent once it's deployed to the runtime.
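To sanity-check that contract before deploying, you can hit the local server directly. Here is a minimal sketch using only the Python standard library, assuming the agent file is running locally via `python agent.py` and listening on port 8080:

```python
import json
from urllib import request


def build_invocation_request(
    query: str, base_url: str = "http://localhost:8080"
) -> request.Request:
    # Build the POST that AgentCore Runtime (or you, locally) sends to the
    # /invocations endpoint served by BedrockAgentCoreApp.
    return request.Request(
        f"{base_url}/invocations",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def local_invoke(query: str) -> str:
    # Assumes `python agent.py` is running in another terminal.
    with request.urlopen(build_invocation_request(query)) as resp:
        return resp.read().decode("utf-8")
```

This is purely a local smoke test; once deployed, the endpoint is only reachable through the `InvokeAgentRuntime` API, not over raw HTTP.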

This example uses Strands Agents SDK; you can find code snippets supporting other frameworks here.


Deploying the agent to AgentCore Runtime

Once the agent is wired up, you have options for deployment:

  • AgentCore CLI starter toolkit: fast iteration, good for development and early testing. You can use the command line to run agentcore configure to configure your agent, then agentcore deploy to deploy it to runtime. Read more about this here.
  • Infrastructure as code (AWS CloudFormation or AWS CDK): best for production deployments. You can find the AgentCore Construct Library for AWS CDK here.

I’ll be using snippets from the AWS CDK template I created to deploy the agent in the following sections.


Inbound authentication and authorization

For the logistics agent, the first security boundary is deciding who is allowed to invoke the agent.

That means you need an inbound authentication mechanism, and I don’t know about you, but I am not rolling my own auth.

AgentCore Runtime supports two inbound authentication options:

  • IAM (SigV4): the caller signs the request with AWS credentials. An IAM policy on the caller determines whether they’re allowed to invoke the agent runtime, the same way authorization works for other AWS APIs.
  • OAuth 2.0 (JWT bearer tokens): the caller authenticates with an identity provider and sends a JWT bearer token. The agent runtime validates that token (via your configured IdP).

The logistics helper agent uses IAM for inbound authentication. When the agent is invoked, AgentCore Runtime validates the incoming request. There is no code related to authentication in the actual agent itself. AgentCore Runtime handles that for you.

What IAM permissions look like for the invoker

When invoking the agent, the invoker needs permission to call the bedrock-agentcore:InvokeAgentRuntime API on the runtime ARN.

Example policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowInvokeAgentRuntime",
            "Effect": "Allow",
            "Action": "bedrock-agentcore:InvokeAgentRuntime",
            "Resource": "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/logistics_agent"
        }
    ]
}


Important distinction:

  • This is the invoker’s permission (who can call the agent).
  • Later we’ll define the agent IAM execution role (what the agent can do once it starts running).

NOTE: In this example, I’m invoking the runtime directly using local IAM credentials, the AgentCore CLI, and the AWS SDK as a proof of concept. In a real-world system, I would place an API hosted using a service like Amazon API Gateway in front of the agent as a proxy. API Gateway would handle end-user authentication and request validation, then it can use its own IAM role to call the InvokeAgentRuntime API for the agent. I wrote another blog about why you should have at least a proxy component sitting in front of your agents here.
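For a concrete picture of that direct invocation, here is a minimal sketch using boto3. The client name, `invoke_agent_runtime` parameters, and the streaming response shape reflect my reading of the current SDK and should be verified against your boto3 version; the ARN is the placeholder from the policy above.

```python
import json
import uuid

# Placeholder ARN -- substitute your deployed runtime's ARN.
RUNTIME_ARN = (
    "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/logistics_agent"
)


def build_payload(query: str) -> bytes:
    # Serialize the body the agent's entrypoint expects: {"query": "..."}
    return json.dumps({"query": query}).encode("utf-8")


def invoke_logistics_agent(query: str) -> str:
    import boto3  # imported here so the helpers above carry no AWS dependency

    # boto3 signs the request with SigV4 using the caller's credentials;
    # IAM then authorizes bedrock-agentcore:InvokeAgentRuntime on the ARN.
    client = boto3.client("bedrock-agentcore", region_name="us-east-1")
    response = client.invoke_agent_runtime(
        agentRuntimeArn=RUNTIME_ARN,
        runtimeSessionId=str(uuid.uuid4()),  # fresh session for this call
        payload=build_payload(query),
    )
    # Assumption: the response body is returned as a streaming object.
    return response["response"].read().decode("utf-8")
```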


Isolating users and execution state

If multiple users hit the same agent at the same time, you don't want a process with data and state shared across users. You also don't need to reinvent the wheel by creating a multi-tenant isolation mechanism yourself.

AgentCore Runtime runs agents in isolated environments, called sessions.

Each time someone invokes your agent, AgentCore either creates a new session or routes the request to an existing session (if you supply a session ID). Agent sessions run in a dedicated microVM with isolated CPU, memory, and filesystem resources.

AgentCore Runtime Session Overview
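From the caller's side, session reuse comes down to the session ID you supply. A sketch, assuming a pre-built `bedrock-agentcore` client and runtime ARN (the ID format requirement is my assumption; only the `runtimeSessionId` routing behavior is the point):

```python
import json
import uuid


def make_session_id() -> str:
    # A UUID4 string is unique and long enough to serve as a caller-supplied
    # session ID (assumption: the service accepts any sufficiently long
    # unique string).
    return str(uuid.uuid4())


def invoke_in_session(client, runtime_arn: str, session_id: str, query: str):
    # Reusing the same runtimeSessionId routes follow-up requests to the same
    # isolated microVM, so per-session caches (like the _db_connection global
    # in the agent) survive between turns without leaking across users.
    return client.invoke_agent_runtime(
        agentRuntimeArn=runtime_arn,
        runtimeSessionId=session_id,
        payload=json.dumps({"query": query}).encode("utf-8"),
    )
```

Two users calling with different session IDs never share a process; one user calling twice with the same ID gets continuity.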


Storing and accessing credentials and connection details securely

Now that we know how to invoke the agent and how the agent runs in isolated sessions, the next question is: how does the logistics agent gain access to private systems without baking sensitive data into the code or environment variables?

There are two different kinds of data the agent needs in order to connect to the Amazon RDS database and OpenAI model:

  1. Secrets, like database credentials and the OpenAI API key
  2. Configuration, like hostnames, database names, and the ARNs of the secrets themselves so they can be retrieved programmatically

What to store where

  • You should use AWS Secrets Manager to store sensitive values like:
    • The RDS username and password
    • The OpenAI API key
  • You should use AWS Systems Manager Parameter Store to store non-secret configuration data like:
    • The RDS endpoint
    • The database name
    • The ARNs of the secrets that contain credentials

This split gives you a few practical benefits:

  • Secrets can be rotated independently
  • You can audit secret access
  • You avoid the temptation to pass credentials around “just to make it work”

Configuration lookup using AWS SSM Parameter Store

This code snippet allows the agent to read three parameters from AWS SSM Parameter Store:

  • The RDS endpoint
  • The database name
  • The secret ARN for the DB credentials

Example Python code using boto3 to access AWS SSM Parameter Store:

ssm_client = boto3.client("ssm", region_name=AWS_REGION)
response = ssm_client.get_parameters(
    Names=[
        "/agentcore/rds/endpoint",
        "/agentcore/rds/database",
        "/agentcore/rds/secret-arn",
    ]
)

params = {p["Name"]: p["Value"] for p in response["Parameters"]}
_db_config = {
    "endpoint": params["/agentcore/rds/endpoint"],
    "database": params["/agentcore/rds/database"],
    "secret_arn": params["/agentcore/rds/secret-arn"],
}

This pattern keeps configuration out of code and out of deployment artifacts, and access can be tightly scoped to only the specific parameter paths the agent needs.

From a security standpoint, this also creates a clear separation of responsibilities: Parameter Store answers where the database is and which secret to use to connect, while Secrets Manager controls what the credentials actually are.

If configuration details need to change, you update it centrally without redeploying code, and if access needs to be revoked or audited, it’s handled through IAM rather than application logic.

This keeps configuration flexible, secrets isolated, and permissions explicit.

Fetching credentials from AWS Secrets Manager

Once the logistics agent knows which secret to retrieve, it fetches the credentials from AWS Secrets Manager.

Example Python code using boto3 to access AWS Secrets Manager:

secrets_client = boto3.client('secretsmanager', region_name=AWS_REGION)
secret_response = secrets_client.get_secret_value(
    SecretId=_db_config['secret_arn']
)

_db_credentials = json.loads(secret_response['SecretString'])

The same pattern is used to retrieve the OpenAI API key. The agent never reads secrets from disk, environment variables, or configuration files. Everything comes from managed services at runtime.

When you're working with secrets in code, be careful not to log the secret payload or connection strings; treat exceptions as potentially sensitive and sanitize logs.


Granting the agent permission to call AWS APIs

Every AWS API call the logistics agent makes (AWS SSM, AWS Secrets Manager, Amazon CloudWatch) is authorized through the IAM execution role attached to the agent runtime.

This is distinct from the invoker permissions described earlier. It defines what the runtime can do after an invocation starts.

In this example, the role itself was created using the AWS CDK, and that role is assumed by the AgentCore Runtime service principal and granted least-privilege access to:

  • read specific SSM parameters
  • read specific AWS Secrets Manager secrets
  • write logs/traces/metrics to Amazon CloudWatch

Example code snippet from the AWS CDK stack that defines the IAM permissions for the parameters and secrets the agent needs:

runtime_role.add_to_policy(
    iam.PolicyStatement(
        actions=["ssm:GetParameter", "ssm:GetParameters"],
        resources=[
            f"arn:aws:ssm:{self.region}:{self.account}:parameter/agentcore/rds/endpoint",
            f"arn:aws:ssm:{self.region}:{self.account}:parameter/agentcore/rds/database",
            f"arn:aws:ssm:{self.region}:{self.account}:parameter/agentcore/rds/secret-arn",
        ],
    )
)

runtime_role.add_to_policy(
    iam.PolicyStatement(
        actions=["secretsmanager:GetSecretValue"],
        resources=[db_secret_arn, openai_secret_arn],
    )
)


If the agent tries to fetch a secret it is not allowed to read, it fails.

Additionally, the AgentCore Runtime IAM execution role is the only principal allowed to read these secrets, scoped to specific secret ARNs, and secret encryption is handled by Secrets Manager (optionally with a customer-managed KMS key if you need tighter controls).


Allowing the agent to access private resources inside an Amazon VPC

Because the logistics agent queries a private RDS instance, the runtime itself should run inside the same VPC.

To achieve this, the agent runtime is deployed using the VPC network mode configuration.

The AgentCore Runtime VPC network mode configuration

By default, AgentCore Runtime does not deploy agents to a VPC. Deploying agents to a VPC using VPC network mode enables you to have an agent that connects to other resources within that VPC without opening up any network security holes. This makes it easier to allow your agent to work with private databases, call internal APIs, or integrate with other existing systems running in a VPC.

Example code snippet from the AWS CDK stack that defines the AgentCore Runtime resource using VPC network mode:

runtime = CfnResource(
    self,
    "AgentCoreRuntime",
    type="AWS::BedrockAgentCore::Runtime",
    properties={
        "AgentRuntimeName": "logistics_agent_cdk",
        "Description": "Runtime for logistics Strands agent with RDS backed tools",
        "RoleArn": runtime_role.role_arn,
        "NetworkConfiguration": {
            "NetworkMode": "VPC",
            "NetworkModeConfig": {
                "Subnets": list(private_subnet_ids),
                "SecurityGroups": [runtime_sg.security_group_id],
            },
        },
        "AgentRuntimeArtifact": {
            "CodeConfiguration": {
                "Code": {
                    "S3": {
                        "Bucket": asset_bucket_name,
                        "Prefix": asset_object_key,
                    }
                },
                "EntryPoint": ["agent.py"],
                "Runtime": "PYTHON_3_12",
            }
        },
    },
)

When an agent is invoked with VPC network mode configured, elastic network interfaces, or ENIs, are created in the configured private subnets. This gives each runtime session private IP addresses and allows it to connect to resources inside the VPC, like the logistics RDS database, over internal VPC networking.

VPC endpoints for accessing AWS services

Once the runtime is configured to run inside private subnets, the next issue pops up: the logistics agent still needs to call AWS APIs to work.

  • SSM and Secrets Manager are AWS services.
  • AgentCore itself is an AWS service.
  • CloudWatch Logs is an AWS service.

Without VPC endpoints, these API calls would typically route through public AWS service endpoints via a NAT gateway and traverse the public internet. In many environments, that pattern is not acceptable for compliance or security reasons.

VPC endpoints allow those calls to stay entirely on the AWS private network. By deploying endpoints for services like AWS Systems Manager Parameter Store, AWS Secrets Manager, Amazon Bedrock AgentCore Runtime, and Amazon CloudWatch, API traffic is routed privately within the VPC, reducing reliance on NAT gateways and eliminating exposure to the public internet.
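In CDK terms, wiring those interface endpoints can look like the sketch below. It assumes a `vpc` construct from the same stack; the AgentCore endpoint service name is an assumption you should verify for your region, since the construct library may not yet ship a named constant for it.

```python
from aws_cdk import aws_ec2 as ec2

# Interface endpoints keep these AWS API calls on the private AWS network
# instead of routing them out through the NAT gateway.
vpc.add_interface_endpoint(
    "SsmEndpoint", service=ec2.InterfaceVpcEndpointAwsService.SSM
)
vpc.add_interface_endpoint(
    "SecretsManagerEndpoint",
    service=ec2.InterfaceVpcEndpointAwsService.SECRETS_MANAGER,
)
vpc.add_interface_endpoint(
    "CloudWatchLogsEndpoint",
    service=ec2.InterfaceVpcEndpointAwsService.CLOUDWATCH_LOGS,
)
# Assumed service name for AgentCore -- confirm it for your region.
vpc.add_interface_endpoint(
    "AgentCoreEndpoint",
    service=ec2.InterfaceVpcEndpointService(
        "com.amazonaws.us-east-1.bedrock-agentcore"
    ),
)
```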


Restricting database access to only the agent runtime

The VPC connectivity feature puts the agent in the right network. It does not actually allow the agent to communicate with the RDS database. That comes from security groups.

In this setup:

  • The agent has a security group.
  • The RDS instance has a security group.
  • The RDS security group allows inbound PostgreSQL traffic only from the agent security group.

This pattern has one security group reference another (sometimes called security group chaining), so you don't have to allow an entire CIDR range or open access broadly within the VPC.

What the RDS security group rule should look like conceptually

  • Inbound:
    • Protocol: TCP
    • Port: 5432 (Postgres)
    • Source: runtime security group ID

By default, security groups allow all outbound network traffic. You can also restrict the allowed outbound traffic to only what’s required (RDS port, VPC endpoints, and approved egress).
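In the CDK stack, that chaining comes down to a single ingress rule that references the runtime's security group by ID rather than a CIDR range. A sketch, with illustrative construct names:

```python
from aws_cdk import aws_ec2 as ec2

# Assumes `vpc` is the stack's ec2.Vpc. Names are illustrative.
runtime_sg = ec2.SecurityGroup(
    self, "RuntimeSg", vpc=vpc, allow_all_outbound=True
)
rds_sg = ec2.SecurityGroup(
    self, "RdsSg", vpc=vpc, allow_all_outbound=False
)

rds_sg.add_ingress_rule(
    peer=runtime_sg,                 # only traffic from the runtime SG...
    connection=ec2.Port.tcp(5432),   # ...on the Postgres port
    description="Postgres from AgentCore runtime only",
)
```

Setting `allow_all_outbound=True` on the runtime security group is the permissive default; you can tighten it to only the RDS port, VPC endpoints, and approved egress as noted above.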

This example also assumes TLS is enforced for the RDS connection so database traffic is encrypted in transit.

A note on database access and query scope

A key design choice in this example that has not been covered yet is that the logistics agent never generates SQL directly.

It can only invoke prewritten tools that execute parameterized queries defined in code. This design avoids letting the model construct arbitrary queries against the database, which introduces risks ranging from accidental data exposure to destructive operations.

The agent can choose which tool to call and which parameters to supply, but it cannot change the shape of the query, the tables involved, or the operations being performed. That keeps the database interaction predictable and reviewable, even as agent behavior evolves.
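To make that concrete, here is a sketch of one fixed-shape tool body. The `shipments` table, column names, and helper functions are illustrative rather than taken from the original code; the point is that the SQL string is a constant and only the bound parameter value comes from the model.

```python
# Assumes a pg8000.native.Connection, whose run() binds :named parameters.
FIXED_QUERY = (
    "SELECT reference_no, status, eta "
    "FROM shipments WHERE reference_no = :ref"
)


def format_shipment_row(row) -> str:
    reference, status, eta = row
    return f"Shipment {reference}: {status}, ETA {eta}"


def get_shipment_status_impl(conn, reference_no: str) -> str:
    # The SQL text is a constant defined in code; the model only supplies
    # the :ref value, which the driver binds as a parameter -- it is never
    # interpolated into the query string.
    rows = conn.run(FIXED_QUERY, ref=reference_no)
    if not rows:
        return f"No shipment found for reference {reference_no}"
    return format_shipment_row(rows[0])
```

The `@tool`-decorated function in the agent file would simply wrap a helper like this, so the reviewable query surface stays in one place.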

Even with tool-based access, the database credentials used by the agent are scoped to read-only access on the required schema and views. Database permissions remain the final layer of protection if a tool is misconfigured, expanded later, or reused in ways that were not originally anticipated. It's important to have a layered approach to security.


Allowing outbound internet access from a private subnet

At this point, the logistics agent can talk to:

  • RDS privately inside the VPC
  • AWS services privately through VPC endpoints

But the OpenAI model is not an AWS service. So, the agent also needs outbound internet access.

Because the agent runs in private subnets, it cannot reach the internet directly.

The pattern that allows egress-only traffic is:

  • A NAT gateway or NAT instance deployed to a public subnet
  • Private subnet route tables to direct internet bound traffic (0.0.0.0/0) to the NAT gateway

That gives you controlled egress without giving the runtime a public IP. This also keeps your AWS service calls private through VPC endpoints while still enabling external calls for OpenAI model invocation.
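In CDK, this egress pattern is mostly one `Vpc` declaration: the `PRIVATE_WITH_EGRESS` subnets automatically get a 0.0.0.0/0 route to the NAT gateway. A sketch, with illustrative names:

```python
from aws_cdk import aws_ec2 as ec2

# One NAT gateway in a public subnet serves the private subnets' egress.
vpc = ec2.Vpc(
    self,
    "AgentVpc",
    max_azs=2,
    nat_gateways=1,
    subnet_configuration=[
        ec2.SubnetConfiguration(
            name="public", subnet_type=ec2.SubnetType.PUBLIC, cidr_mask=24
        ),
        ec2.SubnetConfiguration(
            name="private",
            subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS,
            cidr_mask=24,
        ),
    ],
)
```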


Putting all the pieces together

By the time everything is wired up, the security model is pretty straightforward.

This post has focused on the foundational security and deployment mechanics required to run an agent against private systems. It does not represent a complete architecture and intentionally does not cover application-level authorization, data encryption, fine-grained data access controls, model-specific safety techniques, or cost optimization strategies, all of which depend heavily on the specific use case. Those pieces build on top of the patterns shown here rather than replacing them.

Within this scope, the security model comes down to a few clear responsibilities. None of these controls are unique to agents, but skipping them is how agent deployments turn into security problems.

1) Lock down who can invoke the agent

Inbound access is handled by AgentCore Runtime.

  • If you use IAM, invokers need bedrock-agentcore:InvokeAgentRuntime permission on the runtime ARN.
  • If you use OAuth, your callers authenticate with your IdP and present a JWT, and the runtime validates it.

Either way, you have a clear, externalized answer to “who can call this thing.”

2) Do not share execution state across users

  • AgentCore Runtime sessions give you per-session isolation.
  • Your agent isn’t running as one long-lived server process that every user shares.
  • Within a session, you can cache data. Across sessions, state is isolated.

3) Authorize agents to make AWS API calls via an IAM execution role

Once invoked, the runtime assumes an IAM role that defines exactly what AWS API calls it can make. No static credentials are needed, and if the role doesn't allow it, the agent can't do it.

4) Allow secure access to private resources inside a VPC

  • Use VPC Network Mode in your AgentCore Runtime configuration
  • AgentCore Runtime deploys ENIs to selected private subnets
  • Use VPC endpoints for communication with AgentCore and other AWS services
  • VPC endpoints keep AWS service traffic on the private AWS network

5) Only allow appropriate database access from the agent

Security groups provide an instance level firewall:

  • RDS inbound traffic is limited to traffic coming from the runtime security group on the necessary database port
  • no CIDR-based broad rules
  • no “anything in the VPC can connect”

6) Provide narrow egress internet access for OpenAI

  • NAT gateway gives the runtime outbound access from private subnets
  • Route tables send internet-bound traffic to NAT
  • AWS service calls can still stay private via VPC endpoints

That’s, at a minimum, what it takes to deploy an agent that accesses private systems without creating a security mess.

If you want to follow this example or adapt this architecture, the full repo includes the infrastructure and the deployment steps, plus cleanup.

And check out the video where I walk you through building the whole solution end-to-end here.
