Dennis Traub for AWS

From Local MCP Server to AWS Deployment in Two Commands

“What's the simplest way to deploy an MCP server on AWS?”

Last week a friend asked me this, and I realized: if you already have a local MCP server, it's really just two CLI commands - agentcore configure and agentcore launch. The entire workflow - from local development to production - takes just a few minutes.

Let me show you how.

To get started, we'll prepare a small example MCP server and test it locally to make sure it works. Then we'll deploy it directly to Bedrock AgentCore Runtime, the serverless hosting environment for agents and MCP servers on AWS.

In just a few minutes you'll have a working, scalable, and secure MCP server on AWS, directly accessible with valid IAM credentials - no complicated OAuth setup needed.

Why AgentCore Runtime?

Before we get started, let's have a quick look at AgentCore Runtime and why it is such an obvious choice.

Many MCP servers need to maintain session state between requests while keeping the individual sessions strictly isolated. Traditional serverless options like AWS Lambda provide session isolation, but they are stateless, which means you'd have to manage sessions yourself (typically via DynamoDB or similar). Container services like ECS can maintain state, but you'd have to implement session isolation yourself or run a separate container for every single session - which can become quite costly over time.

AgentCore Runtime solves both problems: it's serverless (you pay only for processing time), but unlike Lambda, it maintains session state across multiple calls. Every session runs in a completely isolated execution environment. It's specifically designed for agent workloads and handles the infrastructure automatically.

And with that, let's get started!

Prerequisites

To follow this tutorial, you'll need:

  • An AWS account with credentials configured locally
  • Python 3.10 or later and the uv package manager
  • The AWS CLI installed (used in Step 5 to test the deployed server)

Step 1: Create Your MCP Server

Let's start with a simple dice-rolling server to demonstrate the deployment process. We'll use FastMCP, which provides a decorator-based API for building MCP servers quickly.

# Setup your project directory ...
mkdir my-project 

# ... and a subdirectory for the server
mkdir my-project/mcp-server

# Then navigate to the subdirectory and initialize with uv
cd my-project/mcp-server
uv init --bare
uv add mcp

The uv init --bare command creates a minimal project structure without interactive prompts. This gives us a clean pyproject.toml for dependency management.
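After uv add mcp, the generated pyproject.toml looks roughly like this (an illustrative sketch - the exact name, versions, and Python constraint on your machine will vary):

```toml
[project]
name = "mcp-server"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "mcp>=1.0.0",
]
```

uv records the dependency here and pins exact versions in an accompanying uv.lock file.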

Here's what we're building: a simple dice-rolling MCP server that demonstrates the deployment workflow.

Create a new file called server.py inside the mcp-server directory:

# mcp-server/server.py
import random
from mcp.server.fastmcp import FastMCP

mcp = FastMCP(host="0.0.0.0", stateless_http=True)

@mcp.tool()
def roll_d20(number_of_dice: int = 1) -> dict:
    """Rolls one or more 20-sided dice (d20)"""

    if number_of_dice < 1:
        return {
            "error": f"number_of_dice must be at least 1, got: {number_of_dice}"
        }

    rolls = [random.randint(1, 20) for _ in range(number_of_dice)]
    total = sum(rolls)

    return {
        "number_of_dice": number_of_dice,
        "rolls": rolls,
        "total": total
    }


def main():
    mcp.run(transport="streamable-http")

if __name__ == "__main__":
    main()

What's happening here:

FastMCP initialization: The host="0.0.0.0" binds to all network interfaces, which is necessary for containerized deployments. The stateless_http=True flag is critical - it tells FastMCP to handle each request independently instead of keeping persistent session state in the server process. That fits AgentCore Runtime's model, where the runtime itself manages sessions.

Tool registration: The @mcp.tool() decorator automatically registers your function as an MCP tool. The function's docstring becomes the tool description, and type hints define the input schema. MCP clients will discover this tool and can invoke it with structured arguments.

Transport: mcp.run(transport="streamable-http") starts the server using the streamable HTTP transport. This is the standard MCP transport for HTTP-based deployments and is compatible with AgentCore Runtime.

Step 2: Test Locally

Before deploying, let's verify the server works. Testing locally helps catch issues early and ensures your server logic is correct before dealing with deployment complexities.

Run the MCP server

uv run server.py

You should see:

INFO:     Started server process [55952]
INFO:     Waiting for application startup.
StreamableHTTP session manager started
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

FastMCP uses Uvicorn under the hood, which is why you see Uvicorn logs. The server is now listening on port 8000 and ready to accept MCP requests.

List the available tools

In another terminal, test the server using the MCP protocol. MCP uses JSON-RPC 2.0 over HTTP, so we can test it without an MCP client by sending JSON-RPC requests directly using curl.

The tools/list method is a standard MCP method that returns all available tools. The heredoc syntax (<< 'EOF') lets us inline the JSON payload without creating separate files:

curl -X POST http://127.0.0.1:8000/mcp \
     -H "Content-Type: application/json" \
     -H "Accept: application/json, text/event-stream" \
     -d @- << 'EOF'
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list"
}
EOF

You should see the roll_d20 tool in the response with its description and input schema.
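Abbreviated, the result looks something like the sketch below (exact schema details may differ between MCP SDK versions). Note how the function's docstring surfaces as the description and the type hint as the inputSchema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "roll_d20",
        "description": "Rolls one or more 20-sided dice (d20)",
        "inputSchema": {
          "type": "object",
          "properties": {
            "number_of_dice": { "type": "integer", "default": 1 }
          }
        }
      }
    ]
  }
}
```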

Send a tool call request

Now let's call the tool. The tools/call method invokes a specific tool with the provided arguments:

curl -X POST http://127.0.0.1:8000/mcp \
     -H "Content-Type: application/json" \
     -H "Accept: application/json, text/event-stream" \
     -d @- << 'EOF'
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "roll_d20",
    "arguments": {
      "number_of_dice": 2
    }
  }
}
EOF

The response will contain the tool's return value - in this case, the dice rolls and total.
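For reference, a hypothetical result for two rolls looks roughly like this - FastMCP serializes the dict return value into MCP's content format:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "{\"number_of_dice\": 2, \"rolls\": [7, 18], \"total\": 25}"
      }
    ],
    "isError": false
  }
}
```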

Stop the local server with CTRL+C when done. This local testing confirms your server works correctly before we package it for deployment.

Step 3: Configure for AgentCore Runtime

The Bedrock AgentCore Starter Toolkit handles the deployment configuration. It generates Dockerfiles, IAM roles, and all the infrastructure code needed to deploy your server. Install it in your project's root directory:

# Navigate to the project root
cd ..

# Initialize a new Python project
uv init --bare
uv add bedrock-agentcore-starter-toolkit

We're creating a new project at the project root because the toolkit needs to manage the deployment configuration separately from the MCP server. This separation keeps the MCP server code clean and makes it easier to manage multiple deployments.

Configure your agent:

uv run agentcore configure \
   --entrypoint mcp-server/server.py \
   --requirements-file mcp-server/pyproject.toml \
   --disable-memory --disable-otel \
   --non-interactive \
   --deployment-type container \
   --protocol MCP \
   --name my_mcp_server

Let's understand each parameter:

  • --entrypoint points to your server's main file. AgentCore runs this inside the container.
  • --requirements-file tells the toolkit where to find dependencies.
  • --disable-memory --disable-otel disables optional features. Memory persists state across sessions (not needed for MCP servers). OTEL enables observability. We're disabling both to keep this simple.
  • --non-interactive skips interactive CLI prompts and uses defaults.
  • --deployment-type container packages your server as a container image using CodeBuild - no local Docker required.
  • --protocol MCP tells AgentCore this is an MCP server.
  • --name sets the MCP server name.

Note: In some cases, you may see warnings about container engines or platform mismatches. These are safe to ignore - when using --deployment-type container, AgentCore will use CodeBuild for cloud-based builds. CodeBuild can run on ARM64 (the architecture AgentCore Runtime uses), so even if your local machine is x86_64, the cloud build will work correctly.

The configuration creates:

  • .bedrock_agentcore.yaml - The main configuration file that stores all deployment settings. Be very careful when editing this directly.
  • .bedrock_agentcore/my_mcp_server/Dockerfile - The container definition that packages your server. The toolkit generates this based on your entrypoint and dependencies.
  • mcp-server/.dockerignore - Build exclusions to keep the container image small.

Step 4: Deploy to AgentCore Runtime

Now let's deploy the server to AWS:

uv run agentcore launch --agent my_mcp_server

This single command orchestrates the entire deployment process.

Here's what happens under the hood:

1: ECR Repository Creation - Amazon Elastic Container Registry (ECR) stores your container images. The toolkit automatically creates a new repository.

2: IAM Role Setup - The toolkit creates two IAM roles:

  • Execution Role - Grants permissions for the runtime to pull images from ECR, execute your container, and write logs to CloudWatch.
  • CodeBuild Role - Allows CodeBuild to build your container and push it to ECR.

3: Container Build - CodeBuild builds your container in the cloud. This is why you don't need Docker locally - CodeBuild handles the build process.

The build:

  • Uses the generated Dockerfile
  • Installs dependencies from your pyproject.toml
  • Packages your server code
  • Creates an ARM64 image (AgentCore Runtime's architecture)
  • Pushes the image to your ECR repository

4: Runtime Deployment - AgentCore Runtime creates a new runtime instance using your container image.

The runtime is now ready to accept MCP requests.

The build takes about 25-30 seconds. When complete, you'll see:

╭────────────────────────────── Deployment Success ───────────────────────────────╮
│ Agent Details:                                                                  │
│ Agent Name: my_mcp_server                                                       │
│ Agent ARN: arn:aws:bedrock-agentcore:REGION:ACCOUNT_ID:runtime/RUNTIME_ID       │
│ ...                                                                             │
╰─────────────────────────────────────────────────────────────────────────────────╯

Important:

Note down the following information - you'll need it to connect to your server:

  • Agent ARN - The full ARN of your deployed server. This uniquely identifies your runtime across all AWS accounts and regions.

Then extract the following information from the ARN:

  • AWS Region - Directly after arn:aws:bedrock-agentcore:, e.g. us-west-2.
  • AWS Account ID - The 12-digit number immediately after the region. Region and account ID are required for IAM-based authentication when connecting clients.
  • Runtime ID - The last part of the ARN, directly following runtime/, e.g., my_mcp_server-abcde12345.

The ARN format is: arn:aws:bedrock-agentcore:{region}:{account-id}:runtime/{runtime-id}
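If you prefer to script the extraction, standard shell tools can split the ARN into its parts. A quick sketch using a hypothetical ARN - substitute your own:

```shell
# Hypothetical example ARN
ARN="arn:aws:bedrock-agentcore:us-west-2:123456789012:runtime/my_mcp_server-abcde12345"

# Colon-separated fields: arn:aws:service:region:account:resource
REGION=$(echo "$ARN" | cut -d: -f4)
ACCOUNT_ID=$(echo "$ARN" | cut -d: -f5)

# The runtime ID is everything after the last "/"
RUNTIME_ID=${ARN##*/}

echo "Region:     $REGION"
echo "Account ID: $ACCOUNT_ID"
echo "Runtime ID: $RUNTIME_ID"
```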

Step 5: Test the Deployed Server

Test your deployed server using the AWS CLI. The invoke-agent-runtime command sends requests to your deployed server, similar to how we tested locally with curl.

We'll use process substitution (<(...)) to inline the JSON payloads without creating separate files:

# Set your server ARN
export SERVER_ARN="arn:aws:bedrock-agentcore:REGION:ACCOUNT:runtime/RUNTIME_ID"

# List available tools
aws bedrock-agentcore invoke-agent-runtime \
    --agent-runtime-arn $SERVER_ARN \
    --content-type "application/json" \
    --accept "application/json, text/event-stream" \
    --payload fileb://<(cat <<'EOF'
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list"
}
EOF
) list-tools.txt

# Call the tool
aws bedrock-agentcore invoke-agent-runtime \
    --agent-runtime-arn $SERVER_ARN \
    --content-type "application/json" \
    --accept "application/json, text/event-stream" \
    --payload fileb://<(cat <<'EOF'
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "roll_d20",
    "arguments": {
      "number_of_dice": 2
    }
  }
}
EOF
) call-tool.txt

What you'll see: The AWS CLI opens a pager showing HTTP response metadata (status codes, headers, including the MCP session ID). Press q to exit. The actual JSON-RPC response is written to the output files - check list-tools.txt for the tool list and call-tool.txt for the dice roll results.

Notice that the requests are identical to your local tests - the MCP request is the same whether running locally or on AgentCore Runtime. The only difference is the endpoint and authentication method.
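Because the Accept header permits text/event-stream, the output files may contain SSE-formatted data: lines rather than plain JSON. A small helper - an illustrative sketch, assuming that line format - can extract the JSON-RPC payloads:

```python
import json

def parse_sse_file(path):
    """Extract JSON-RPC payloads from 'data:' lines in an SSE response file.

    If the file turns out to contain plain JSON instead,
    fall back to json.load directly.
    """
    payloads = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("data:"):
                # Everything after the "data:" prefix is a JSON document
                payloads.append(json.loads(line[len("data:"):].strip()))
    return payloads
```

For example, parse_sse_file("call-tool.txt") would return the JSON-RPC result objects from the tool call in Step 5.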

Step 6: Connect an Agent (Optional)

Now we can connect an AI agent to the deployed server using the official IAM MCP client, which handles MCP request creation and IAM authentication automatically. This demonstrates how real-world agents would connect to your server.

We're going to use the Strands Agents SDK as an example framework, but this pattern works with any agent framework that supports MCP. Check out this post, where I show you how to use the IAM MCP client with LangChain, LlamaIndex, and Microsoft's Agent Framework.

While still in the project root, add the following packages:

uv add mcp-proxy-for-aws strands-agents

The mcp-proxy-for-aws package is an official AWS wrapper around the standard MCP client. The standard client expects OAuth, but this one signs requests with AWS credentials using SigV4, making it compatible with IAM authentication on AWS. The strands-agents package provides the agent framework we'll use for this example.

Create test_agent.py and add the runtime ID, AWS account ID, and region:

# Using Strands Agents SDK for this example
from strands import Agent
from strands.tools.mcp import MCPClient
from mcp_proxy_for_aws.client import aws_iam_streamablehttp_client

# Set your MCP server details from Step 4
RUNTIME_ID = "[YOUR RUNTIME ID]"
ACCOUNT_ID = "[YOUR ACCOUNT ID]"
REGION = "[YOUR AWS REGION]"

def main():
    # Build the MCP server URL
    url = f"https://bedrock-agentcore.{REGION}.amazonaws.com/runtimes/{RUNTIME_ID}/invocations?qualifier=DEFAULT&accountId={ACCOUNT_ID}"

    print(f"\nInitializing MCP client with IAM-based auth for:\n{url}")

    mcp_client_factory = lambda: aws_iam_streamablehttp_client(
        aws_service="bedrock-agentcore",
        aws_region=REGION,
        endpoint=url,
        terminate_on_close=False
    )

    with MCPClient(mcp_client_factory) as mcp_client:
        mcp_tools = mcp_client.list_tools_sync()
        agent = Agent(tools=mcp_tools)

        query_1 = "What tools do you have available?"
        print(f"\nQ: {query_1}")
        agent(query_1)

        query_2 = "Roll three dice"
        print(f"\nQ: {query_2}")
        agent(query_2)

if __name__ == "__main__":
    main()

Let's understand the key parts:

Endpoint URL construction: The URL follows AgentCore's invocation endpoint pattern. The qualifier=DEFAULT parameter specifies which endpoint version to use (AgentCore supports multiple versions for gradual rollouts).

Client factory pattern: The Strands Agents SDK's MCPClient expects a factory function (a lambda that returns a client). This allows the framework to manage connection lifecycle - creating new connections when needed and cleaning them up properly.

The aws_iam_streamablehttp_client function creates an MCP client that automatically signs every request with your AWS credentials, using the AWS service and region the server is hosted in as SigV4 signing parameters.

Tool discovery: mcp_client.list_tools_sync() fetches all available tools from your server. The agent then receives these tools and can invoke them based on user queries. When the agent sees "Roll three dice," it recognizes this matches the roll_d20 tool and calls it with number_of_dice=3.

Session management: The terminate_on_close=False parameter tells the client not to explicitly terminate the MCP session when closing. AgentCore Runtime manages session lifecycle automatically, so this isn't necessary.

Run it:

uv run test_agent.py

Note: You may see a deprecation warning: DeprecationWarning: Use 'streamable_http_client' instead. The official MCP client library recently renamed their HTTP client, and the IAM client is being updated to match. This warning can be safely ignored. By the time you read this, it may already be fixed.

The agent will connect using your AWS credentials and use the deployed MCP server's tools.

This demonstrates the complete flow: your server is deployed, accessible via IAM, and integrated with an AI agent framework.

The same pattern works with other frameworks like LangChain or LlamaIndex - just swap the agent framework while keeping the IAM MCP client.

Step 7: Cleanup

When you're done experimenting, destroy the deployment to avoid ongoing costs:

uv run agentcore destroy --agent my_mcp_server

This command removes all resources created during deployment:

  • AgentCore runtime - The running MCP server instance
  • ECR repository and images - Container images stored in ECR
  • CodeBuild project - The build configuration (builds are ephemeral, but the project definition persists)
  • IAM roles - Both the execution role and CodeBuild role
  • S3 artifacts - Build artifacts stored in S3 buckets created by CodeBuild

The toolkit is careful about cleanup - it won't delete resources that were created outside the toolkit. You'll be prompted to confirm before deletion, as this operation is irreversible.

What We've Learned

This tutorial covered the complete deployment workflow:

  1. Local Development - Build and test MCP servers locally using FastMCP. Testing locally helps catch issues before deployment.

  2. Configuration - The AgentCore Starter Toolkit generates all infrastructure code (Dockerfiles, IAM roles, etc.) from the configuration. This abstraction lets you focus on your server logic rather than deployment details.

  3. Cloud Deployment - AgentCore Runtime uses CodeBuild for container builds, so you don't need Docker locally. The runtime automatically manages state, session isolation, and infrastructure - you just provide the container image.

  4. Testing - The AWS CLI's invoke-agent-runtime command lets you test deployed servers using the same MCP protocol as local testing. IAM authentication happens automatically via your AWS credentials.

  5. Integration - The IAM MCP client enables any agent framework to connect to AgentCore-hosted MCP servers using AWS credentials, eliminating the need for OAuth infrastructure.

Next Steps

  • Integrate with other frameworks - Check out No OAuth Required: An MCP Client For AWS IAM to learn how to connect LangChain, LlamaIndex, or other agent frameworks
  • Deploy via AgentCore Gateway - Use Bedrock AgentCore Gateway to turn existing APIs into MCP servers
  • Explore more complex MCP servers - Add multiple tools, resources, and prompts to your server

Key takeaway: Bedrock AgentCore Runtime provides serverless execution with stateful sessions. You pay only for processing time, even though your server maintains its state between requests. This session-based model makes it a perfect fit for agentic workloads, including MCP servers.
