
Modern applications must be fast, highly available, and resilient across regions.
If a user opens your app from India, you want it to load instantly.
If one region fails, another should take over automatically.
This post explains how to deploy a production-grade Flask API across two AWS regions, while keeping the explanation simple enough for beginners.
We will use:
- Route 53 → Global DNS + failover
- CloudFront → Global CDN
- ALB (Application Load Balancer) → Routing + health checks
- Auto Scaling Group → EC2 automation
- SSM Parameter Store → Store environment variables securely
- Flask + Gunicorn → Application server
🚀 Why Multi-Region Architecture?
Because single-region apps fail.
Real-world issues include:
- Region-wide outages
- Network failures
- Slow latency for users far away
- High traffic volume
- Scaling limitations
A multi-region setup addresses all of these:
- Users are routed to the nearest region → faster API responses
- If a region fails, traffic automatically fails over
- Load spreads across regions
- Downtime drops to nearly zero
🧠 AWS Architecture Overview (Simple Explanation)
AWS uses separate services to achieve global performance:
User
↓
Route 53 (DNS — resolves api.example.com)
↓
CloudFront (Global CDN)
↓
Region A (e.g., us-east-1)          Region B (e.g., eu-west-1)
↓                                   ↓
Application Load Balancer (ALB)     Application Load Balancer (ALB)
↓                                   ↓
Auto Scaling Group (EC2 instances)  Auto Scaling Group (EC2 instances)
↓                                   ↓
Flask Application                   Flask Application
(DNS resolution via Route 53 happens first; the request then flows through CloudFront to the nearest region.)
🌐 Understanding the Components (Beginner Friendly)
CloudFront – The Global Accelerator
This is AWS’s CDN.
It stores cached API responses in edge locations around the world.
Benefits:
- Faster responses
- Lower load on origin
- Global reach
- Shield Standard (free DDoS protection)
Route 53 – Smart Traffic Director
Sends users to the nearest AWS region using Latency-Based Routing.
Application Load Balancer (ALB)
ALB is the “brain” handling:
- Routing requests
- Internal health checks
- HTTP → HTTPS redirects
- Distributing traffic across EC2
- Integration with Auto Scaling
EC2 Auto Scaling Group
Automatically:
- Creates more EC2 machines when traffic increases
- Removes machines when traffic drops
- Ensures minimum healthy instances
SSM Parameter Store
Stores environment variables securely.
This avoids .env files and keeps credentials safe.
🏗️ Step 1 — Store Environment Variables in Parameter Store
Do this per region. Replace values as needed.
aws ssm put-parameter --name "/app/FLASK_ENV" --value "production" --type String
aws ssm put-parameter --name "/app/SECRET_KEY" --value "xyz123" --type SecureString
aws ssm put-parameter --name "/app/API_URL" --value "https://api.example.com" --type String
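Inside the app itself, the same parameters can be read with boto3. A minimal sketch: the helper takes an SSM client as an argument (so the same code works against any region's client), and the parameter names match Step 1. The helper name `load_app_config` is an assumption, not part of any AWS API.

```python
def load_app_config(ssm):
    """Fetch the app's parameters from SSM Parameter Store.

    `ssm` is a boto3 SSM client, e.g. boto3.client("ssm", region_name="us-east-1").
    WithDecryption=True makes SecureString values (like SECRET_KEY)
    come back in plaintext; String values are unaffected.
    """
    names = ["/app/FLASK_ENV", "/app/SECRET_KEY", "/app/API_URL"]
    config = {}
    for name in names:
        resp = ssm.get_parameter(Name=name, WithDecryption=True)
        # Key the config by the last path segment, e.g. "SECRET_KEY"
        config[name.rsplit("/", 1)[-1]] = resp["Parameter"]["Value"]
    return config

# Usage in a real app:
# import boto3
# config = load_app_config(boto3.client("ssm", region_name="us-east-1"))
```

The instance role must allow `ssm:GetParameter` (and `kms:Decrypt` for SecureString values) for this to work on EC2.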
🏗️ Step 2 — Create a Launch Template for EC2
This defines how every EC2 instance will be created.
Include this User Data script:
#!/bin/bash
yum update -y
yum install -y python3 git
pip3 install flask gunicorn boto3
# The AWS CLI needs a region; read it from instance metadata (IMDSv2)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
export AWS_DEFAULT_REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/region)
# Fetch ENV from Parameter Store (the instance role must allow ssm:GetParameter)
export FLASK_ENV=$(aws ssm get-parameter --name "/app/FLASK_ENV" --query 'Parameter.Value' --output text)
export SECRET_KEY=$(aws ssm get-parameter --name "/app/SECRET_KEY" --with-decryption --query 'Parameter.Value' --output text)
# Pull your app code
git clone https://github.com/your/repo.git /opt/app
cd /opt/app
pip3 install -r requirements.txt  # if your repo pins extra dependencies
# Run Flask with Gunicorn
gunicorn app:app -b 0.0.0.0:5000 --daemon
This ensures that when the Auto Scaling Group spins up more EC2 instances, they automatically:
- Install dependencies
- Pull your application code
- Fetch environment variables securely
- Start Gunicorn
🏗️ Step 3 — Create Auto Scaling Group
Use the AWS console.
Set:
- Minimum instances: 2
- Maximum instances: 5
- Desired: 2
- Target tracking: 70% CPU
- Subnets: Two AZs per region
This guarantees fault tolerance within each region.
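The target-tracking setting above can also be expressed as the payload that boto3's Auto Scaling `put_scaling_policy()` accepts. A sketch — the group name "flask-api-asg" is a placeholder:

```python
# Target-tracking scaling policy matching Step 3: keep average CPU at 70%.
scaling_policy = {
    "AutoScalingGroupName": "flask-api-asg",   # placeholder group name
    "PolicyName": "cpu-70-target-tracking",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 70.0,  # the 70% CPU target from above
    },
}

# Applied with a real client:
# import boto3
# boto3.client("autoscaling", region_name="us-east-1").put_scaling_policy(**scaling_policy)
```

Target tracking creates the scale-out and scale-in CloudWatch alarms for you, which is why the console only asks for a single target value.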
🏗️ Step 4 — Create ALB (Application Load Balancer)
Listener Rules
- Port 80 → Redirect to HTTPS
- Port 443 → Forward to Target Group
Target Group Settings
- Protocol: HTTP
- Port: 5000 (your Flask app port)
Health Check:
Path: /health
Success Codes: 200-399
Interval: 10s
Timeout: 5s
Unhealthy threshold: 2
Healthy threshold: 2
This is critical for Auto Scaling + failover.
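The health check above expects `/health` to answer quickly. A minimal sketch of that endpoint, assuming the module is `app.py` (matching the `gunicorn app:app` command from Step 2):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Keep this cheap: the ALB calls it every 10 seconds per target.
    # Return 200 only when the process can actually serve traffic.
    return jsonify(status="ok"), 200

if __name__ == "__main__":
    # Local development only; in production Gunicorn serves the app.
    app.run(host="0.0.0.0", port=5000)
```

If the app depends on a database, you can extend this to ping it — but be careful: a failing dependency check will mark every instance unhealthy at once.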
🏗️ Step 5 — Create CloudFront Distribution
Origin
- Origin = ALB DNS name
- Origin Protocol = HTTPS
- Cache policy = CachingDisabled (for APIs)
Multi-Origin Failover
Add 2 origins:
- Origin A: ALB in primary region
- Origin B: ALB in secondary region
Failover rule:
If Origin A returns 500, 502, 503, or 504 (or the connection fails or times out) → CloudFront retries the request against Origin B
Note: origin failover only applies to GET, HEAD, and OPTIONS requests.
This gives automatic regional failover.
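The failover rule above corresponds to an "origin group" in the distribution config. A sketch of just that fragment, in the shape boto3's CloudFront `create_distribution()` expects — the origin IDs are placeholders for the two ALB origins:

```python
# Origin group fragment: CloudFront retries failed requests from the
# primary origin (us-east-1 ALB) against the secondary (eu-west-1 ALB).
origin_group = {
    "Id": "api-failover-group",            # placeholder group id
    "FailoverCriteria": {
        "StatusCodes": {"Quantity": 4, "Items": [500, 502, 503, 504]},
    },
    "Members": {
        "Quantity": 2,
        "Items": [
            {"OriginId": "alb-us-east-1"},  # primary (placeholder id)
            {"OriginId": "alb-eu-west-1"},  # failover (placeholder id)
        ],
    },
}
```

In the full distribution config, the cache behavior's `TargetOriginId` then points at the origin group's `Id` instead of a single origin.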
🏗️ Step 6 — Configure Route 53 (Multi-Region Routing)
Create two records with the same hostname. Note: latency routing needs one regional endpoint per record, so each record points at that region's ALB. (A CloudFront distribution is a single global endpoint — if everything goes through CloudFront, you instead create one alias record to the distribution and let the origin failover from Step 5 handle the regions.)
Record 1
Name: api.example.com
Value: ALB DNS name (us-east-1)
Region: us-east-1
Routing Policy: Latency
Record 2
Name: api.example.com
Value: ALB DNS name (eu-west-1)
Region: eu-west-1
Routing Policy: Latency
Now users are routed to the nearest AWS region.
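The two latency records can be sketched as a Route 53 change batch for boto3's `change_resource_record_sets`. Here the records alias the regional ALBs (latency routing needs one regional endpoint per record); the ALB DNS names and hosted zone IDs are placeholders, and each record needs a distinct SetIdentifier:

```python
def latency_alias(set_id, region, alb_zone_id, alb_dns):
    """One latency-routed alias record for api.example.com."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": set_id,     # must be unique per record
            "Region": region,            # region used for latency routing
            "AliasTarget": {
                # Look up the per-region ELB hosted zone ID; placeholder here.
                "HostedZoneId": alb_zone_id,
                "DNSName": alb_dns,      # placeholder ALB DNS name
                "EvaluateTargetHealth": True,  # respect the ALB health checks
            },
        },
    }

change_batch = {
    "Changes": [
        latency_alias("us-east-1", "us-east-1", "Z_ALB_ZONE_USE1",
                      "api-alb.us-east-1.elb.amazonaws.com"),
        latency_alias("eu-west-1", "eu-west-1", "Z_ALB_ZONE_EUW1",
                      "api-alb.eu-west-1.elb.amazonaws.com"),
    ]
}

# Applied with a real client:
# import boto3
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your-hosted-zone-id>", ChangeBatch=change_batch)
```

`EvaluateTargetHealth=True` is what makes Route 53 itself stop answering with a region whose ALB has no healthy targets.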
🧪 Step 7 — End-to-End Testing
1️⃣ Test ALB
curl http://<ALB-DNS>/health
2️⃣ Test CloudFront
curl https://<cloudfront-url>/health
3️⃣ Test Route 53
curl https://api.example.com/health
4️⃣ Failover test
Stop instances in Region A → traffic moves to Region B.
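The failover test is easier to observe with a small poller that hits `/health` once per second and records the status codes, so you can watch responses continue while Region A's instances are stopped. A stdlib-only sketch — the URL is the Route 53 name from Step 6, and the attempt count and interval are arbitrary:

```python
import time
import urllib.error
import urllib.request

def poll(url, attempts=10, interval=1.0):
    """Poll `url` and return the list of observed HTTP status codes.

    None marks a connection failure (e.g. during the failover window).
    """
    codes = []
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                codes.append(resp.status)
        except urllib.error.URLError:
            codes.append(None)
        time.sleep(interval)
    return codes

# During the test:
# print(poll("https://api.example.com/health"))
```

A successful failover shows an unbroken (or briefly interrupted) run of 200s even while one region is down.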
🔐 Security Best Practices
- Enable WAF on CloudFront
- Use AWS Shield Standard (automatic)
- Allow inbound traffic to EC2 only from the ALB's security group
- No public IPs on EC2
- Enable VPC Flow Logs
- Rotate secrets regularly
🎯 Conclusion
A multi-region architecture transforms your application into a globally available, fault-tolerant, and high-performing system.
This AWS setup gives you:
- Global acceleration via CloudFront
- Smart routing and failover via Route 53
- Reliable request handling via ALB
- Automatic scaling via EC2 Auto Scaling
- Secure environment handling via Parameter Store
- Near-zero downtime across regions