Introduction:
The Model Context Protocol (MCP) is rapidly becoming the standard for connecting AI agents to external tools and data sources. But building the server is only half the battle. The real challenge is deploying it securely to the cloud and connecting it to a frontend AI platform like AgentCore without triggering a cascade of browser security errors.
Recently, I went through the process of containerizing a Python-based MCP server that talks to AWS Bedrock Knowledge Bases and deploying it to AWS ECS Fargate. It was a journey filled with "504 Gateway Timeouts," "Mixed Content Blocking," and mysterious IAM failures.
This post is the guide I wish I had. It details the exact architecture and the critical, minute configuration steps required to make the connection stable and secure.
The Architecture:
We aren't just running a container; we are building a secure networking chain. The specific challenge here is connecting the AgentCore Gateway - which requires a secure HTTPS endpoint - to our backend Docker container running on AWS ECS Fargate, which natively listens on insecure HTTP.
Here is the winning flow: AgentCore Gateway (HTTPS) → CloudFront (SSL termination and CORS) → ALB (HTTP on port 80) → ECS Fargate container (HTTP on port 8080).
Step 1: The MCP Server Code & Docker
For the server implementation, I utilized the FastMCP Python library, which simplifies the creation of Model Context Protocol servers.
Crucial Configuration (Transport Mode): When initializing the MCP server, it is critical to use the streamable-http transport mode. This mode is essential for avoiding "Mixed Content" security blocking when connecting a secure browser client (like AgentCore) to a backend container. Unlike other transport modes (like SSE) that can trigger browser security blocks if the handshake redirects to an insecure internal link, streamable-http provides a compatible, stateless endpoint that works seamlessly behind a proxy.
Here is the concise code snippet for the simpleadd tool and the main execution block. You can append this to the bottom of your main.py file.
This includes the critical transport='streamable-http' setting required for your CloudFront/ALB architecture.
```python
import os

import boto3
from fastmcp import FastMCP

# These imports and the server object may already exist at the top of your main.py
mcp = FastMCP("knowledge-mcp-server")

# Initialize the boto3 client for Bedrock Knowledge Base calls
client = boto3.client(
    'bedrock-agent-runtime',
    region_name='us-east-2'
)

@mcp.tool()
def simpleadd(a: int, b: int) -> int:
    """A simple tool to add two numbers. Useful for testing connectivity."""
    return a + b

def main():
    host = os.environ.get("HOST", "0.0.0.0")
    port = int(os.environ.get("PORT", 8080))
    print(f"Starting FastMCP server on http://{host}:{port}")
    print("Transport: streamable-http")
    mcp.run(
        transport="streamable-http",
        host=host,
        port=port,
    )

if __name__ == "__main__":
    main()
```
Step 2: Containerization & Pushing to ECR
Before touching the infrastructure, we need to package our code into a Docker container and upload it to AWS Elastic Container Registry (ECR).
The Dockerfile: Create a file named Dockerfile in your project root. We keep it simple, but we must ensure we expose the port that matches our Python code (8080).
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
ENV HOST=0.0.0.0
ENV PORT=8080
CMD ["python", "main.py"]
```
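The Dockerfile copies a requirements.txt. Assuming the server only needs FastMCP and boto3 (pin exact versions as appropriate for your project), a minimal one might look like this:

```text
fastmcp
boto3
```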
- Pushing to AWS ECR Run these commands in your terminal, replacing us-east-2 and the 123456789012 account ID with your own region and AWS Account ID.
A. Create the Repository:
```shell
aws ecr create-repository --repository-name knowledge-mcp-server --region us-east-2
```
B. Authenticate Docker:
```shell
aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-2.amazonaws.com
```
C. Build the Image (The Crucial "Minute Detail"): If you are building this on a Mac with Apple Silicon (M1/M2/M3), Docker will default to arm64 architecture. However, AWS Fargate usually defaults to linux/amd64. If these don't match, your task will crash instantly with an obscure "Exec Format Error."
```shell
docker build --platform linux/amd64 -t knowledge-mcp-server .
```
D. Tag and Push the Image:
```shell
# Tag the image
docker tag knowledge-mcp-server:latest 123456789012.dkr.ecr.us-east-2.amazonaws.com/knowledge-mcp-server:latest

# Push to AWS
docker push 123456789012.dkr.ecr.us-east-2.amazonaws.com/knowledge-mcp-server:latest
```
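The account-specific URIs in the commands above all follow one fixed pattern. If you script your deployments, a small helper (account ID, region, and repository name are placeholders) can build them consistently:

```python
def ecr_image_uri(account_id: str, region: str, repo: str, tag: str = "latest") -> str:
    """Build the fully qualified ECR image URI used in the tag/push commands."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"

# Example with the placeholder values used throughout this post
print(ecr_image_uri("123456789012", "us-east-2", "knowledge-mcp-server"))
# → 123456789012.dkr.ecr.us-east-2.amazonaws.com/knowledge-mcp-server:latest
```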
Now your code is safely stored in ECR, ready for deployment.
Step 3: The AWS Foundation (ALB & Networking)
With the container in ECR, we need to build the networking path. We typically default to HTTPS everywhere, but because we are using CloudFront as our "SSL Wrapper" later (in Step 5), we intentionally keep the internal AWS networking simple to avoid protocol mismatches.
- The Application Load Balancer (ALB) Create an ALB in your public subnets.
Listener: Configure the listener to use HTTP on Port 80.
Why not HTTPS? We will let CloudFront handle the SSL termination. Configuring the ALB for HTTP avoids the complexity of managing internal certificates between AWS services and prevents the "504 Gateway Timeout" errors caused by CloudFront trying to speak HTTPS to a container that only speaks HTTP.
- The Target Group (Crucial Detail) Create a Target Group that points to your Fargate instances.
Target Type: IP Addresses (required for Fargate).
Protocol: HTTP on Port 8080 (matching your container).
Health Check Path: By default, AWS checks the root path /. However, because we switched our fastmcp server to streamable-http, the server listens specifically on /mcp.
Action: You must change the Health Check path to /mcp.
The Consequence: If you leave it as /, the health check will fail (404 Not Found), and ECS will continually kill and restart your task, leaving you with a "Zombie" service that never stabilizes.
- Security Groups (The Chain of Trust) We need to configure two security groups to ensure traffic flows correctly without exposing the container to the open internet.
ALB Security Group: Allow Inbound traffic on Port 80 from Anywhere (0.0.0.0/0). This allows CloudFront to reach the balancer.
ECS Task Security Group: Allow Inbound traffic on Port 8080, but for the Source, select Custom and paste the Security Group ID of your ALB.
Why? This ensures no one can bypass the load balancer to hit your container directly.
Step 4: ECS Fargate & The "Two Roles" Trap
Now we deploy the container. This step contains the most common pitfall in AWS ECS: confusing the Execution Role with the Task Role.
- Create the Cluster Go to ECS -> Create Cluster.
Choose Fargate (Serverless).
Name it (e.g., agent-cluster).
- Create the Task Definition (The Blueprint) Create a new Task Definition with the following settings:
Launch Type: Fargate.
OS/Architecture: Linux / X86_64.
CPU/Memory: .25 vCPU / .5 GB (MCP servers are lightweight).
Container Details: Image URI: Paste the ECR URI from Step 2.
Port Mappings: 8080 (TCP).
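For reference, the console settings above map to a task definition JSON roughly like this sketch (family and container names are placeholders; role ARNs are covered next):

```json
{
  "family": "knowledge-mcp-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "runtimePlatform": {
    "cpuArchitecture": "X86_64",
    "operatingSystemFamily": "LINUX"
  },
  "containerDefinitions": [
    {
      "name": "knowledge-mcp-server",
      "image": "123456789012.dkr.ecr.us-east-2.amazonaws.com/knowledge-mcp-server:latest",
      "portMappings": [{ "containerPort": 8080, "protocol": "tcp" }]
    }
  ]
}
```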
- IAM Roles (CRITICAL) ECS asks for two different roles. If you swap them, your code will crash with "Access Denied" or "Credentials Not Found."
Role A: Task EXECUTION Role (For AWS)
Purpose: Lets ECS pull your Docker image from ECR and push logs to CloudWatch.
Permissions: AmazonECSTaskExecutionRolePolicy.
Role B: Task Role (For Your Code)
Purpose: This is the identity assumed by your running container. If your Python code calls boto3.client('bedrock'), it uses this role.
Action: You must create a custom IAM Role (e.g., McpTaskRole) and attach the necessary permissions (e.g., bedrock:Retrieve, s3:GetObject).
Crucially, ensure the "Trust Relationship" policy allows ecs-tasks.amazonaws.com, not just EC2.
The Trap: If you leave "Task Role" as "None" or use the Execution Role, your fastmcp server will start, but every time it tries to search the Knowledge Base, it will crash.
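The trust relationship mentioned above is a short JSON policy. A sketch of what the custom McpTaskRole's trust policy should contain:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```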
- Create the Service Go to your Cluster -> Create Service.
Launch Type: Fargate.
Task Definition: Select the one you just made.
Service Name: mcp-service.
Network Configuration (Crucial):
VPC: Select the same VPC where you built your ALB.
Subnets: Select your subnets.
Security Group: Select the "ECS Task Security Group" created in Step 3. (Do not let AWS create a new default group, or it will block the ALB).
Auto-assign Public IP: ENABLED (Required for Fargate to pull Docker images from ECR unless you have a NAT Gateway).
Load Balancing: Select "Application Load Balancer".
Container to Load Balance: Select your container:8080.
Target Group: Select the "Existing Target Group" created in Step 3.
Step 5: The Magic Layer: AWS CloudFront
This is the most critical step. If you try to connect AgentCore directly to your ALB via HTTP, the browser will block it (Secured sites cannot call unsecured APIs). If you try to set up SSL directly on the container, it becomes an operational nightmare.
CloudFront acts as our smart, secure proxy that handles encryption, protocol translation, and browser security rules so your Python code doesn't have to.
- Create the Distribution
Go to CloudFront -> Create Distribution.
Origin Domain: Select your ALB from the dropdown list.
- Protocol Policy (Fixing the 504 Error)
Look for the Origin Protocol Policy setting.
Action: Select HTTP Only.
Why? Your ALB is listening on HTTP (Port 80). If you select "Match Viewer" or "HTTPS Only," CloudFront will try to speak HTTPS to the ALB. Since we didn't install certificates on the ALB, the connection will fail, resulting in the dreaded 504 Gateway Timeout. By selecting "HTTP Only," CloudFront handles the secure HTTPS connection with the user but speaks plain HTTP to your backend.
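If you manage the distribution via the CLI rather than the console, this setting lives under the origin's CustomOriginConfig. A fragment (not a complete distribution config) showing the relevant keys:

```json
"CustomOriginConfig": {
  "HTTPPort": 80,
  "HTTPSPort": 443,
  "OriginProtocolPolicy": "http-only"
}
```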
- CORS Headers (Fixing Browser Blocks) Browsers block cross-origin requests by default. We need to tell the browser it's safe to talk to our API.
Go to Default Cache Behavior.
Viewer Protocol Policy: Redirect HTTP to HTTPS.
Allowed HTTP Methods: Select GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE. (You need POST for JSON-RPC).
Response Headers Policy: Search for and select CORS-with-preflight.
Why? This managed policy automatically adds the Access-Control-Allow-Origin: * header to every response. Without this, AgentCore will see a "Network Error" even if your server is working perfectly.
- The Final Connection (AgentCore Gateway): Once the distribution is deployed (it takes a few minutes), copy your Distribution Domain Name (e.g., d12345abcdef.cloudfront.net). Now, configure your AgentCore environment:
Create a Gateway: In your AgentCore Gateway dashboard, create a new Gateway instance.
Add a Target: Inside that Gateway, create a new Target.
The URL: Paste your CloudFront URL and append the endpoint path you defined in your Python code: https://d12345abcdef.cloudfront.net/mcp
Status: Because CloudFront is handling SSL and CORS, the connection status should turn "Ready" immediately. You now have a secure, serverless AI tool backend running on AWS!
- Verification: Testing with Postman Before you even use the AI agent, you can verify the entire pipeline is working using Postman. This confirms that CloudFront, the ALB, and your Fargate task are all talking to each other.
Method: POST
URL: https://d12345abcdef.cloudfront.net/mcp
Body (Raw JSON):
```json
{
  "jsonrpc": "2.0",
  "id": "test-1",
  "method": "tools/call",
  "params": {
    "name": "simpleadd",
    "arguments": { "a": 10, "b": 20 }
  }
}
```
Expected Result: You should receive a 200 OK response with the result 30. If you see this, your secure, serverless AI backend is live and ready for production!
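The same check can be scripted. Here is a hedged Python sketch mirroring the Postman request, assuming the third-party requests library and using the placeholder CloudFront domain from above:

```python
# Placeholder domain: replace with your actual CloudFront distribution
MCP_URL = "https://d12345abcdef.cloudfront.net/mcp"

# JSON-RPC 2.0 request body mirroring the Postman test above
payload = {
    "jsonrpc": "2.0",
    "id": "test-1",
    "method": "tools/call",
    "params": {
        "name": "simpleadd",
        "arguments": {"a": 10, "b": 20},
    },
}

def call_mcp(url: str, body: dict) -> dict:
    """POST the JSON-RPC body to the MCP endpoint and return the parsed response."""
    import requests  # third-party; assumed installed (pip install requests)

    resp = requests.post(
        url,
        json=body,
        # Streamable HTTP servers generally expect both content types in Accept
        headers={"Accept": "application/json, text/event-stream"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Usage once the stack is deployed:
# print(call_mcp(MCP_URL, payload))
```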
