
Modern applications must be fast, highly available, and resilient across regions.
If a user opens your app from India, you want it to load instantly.
If one region fails, another should take over automatically.
This post explains how to deploy a production-grade Flask API across two AWS regions, while keeping the explanation simple enough for beginners.
We will use:
- Route 53 → Global DNS + failover
- CloudFront → Global CDN
- ALB (Application Load Balancer) → Routing + health checks
- Auto Scaling Group → EC2 automation
- SSM Parameter Store → Store environment variables securely
- Flask + Gunicorn → Application server
🚀 Why Multi-Region Architecture?
Because single-region apps fail.
Real-world issues include:
- Region-wide outages
- Network failures
- Slow latency for users far away
- High traffic volume
- Scaling limitations
A multi-region setup addresses all of these:
- Users are routed to the nearest region → faster API responses
- If a region fails, traffic automatically fails over
- Load spreads across regions
- Downtime drops to nearly zero
🧠 AWS Architecture Overview (Simple Explanation)
AWS uses separate services to achieve global performance:
User
↓
Route 53 (DNS — resolves api.example.com)
↓
CloudFront (Global CDN)
↓
Region A (e.g., us-east-1)          Region B (e.g., eu-west-1)
↓                                   ↓
Application Load Balancer (ALB)     Application Load Balancer (ALB)
↓                                   ↓
Auto Scaling Group (EC2 instances)  Auto Scaling Group (EC2 instances)
↓                                   ↓
Flask Application                   Flask Application
(DNS resolution via Route 53 happens first; the request then flows through CloudFront to the nearest region.)
🌐 Understanding the Components (Beginner Friendly)
CloudFront – The Global Accelerator
This is AWS’s CDN.
It stores cached API responses in edge locations around the world.
Benefits:
- Faster responses
- Lower load on origin
- Global reach
- Shield Standard (free DDoS protection)
Route 53 – Smart Traffic Director
Sends users to the nearest AWS region using Latency-Based Routing.
Application Load Balancer (ALB)
ALB is the “brain” handling:
- Routing requests
- Internal health checks
- HTTP → HTTPS redirects
- Distributing traffic across EC2
- Integration with Auto Scaling
EC2 Auto Scaling Group
Automatically:
- Creates more EC2 machines when traffic increases
- Removes machines when traffic drops
- Ensures minimum healthy instances
SSM Parameter Store
Stores environment variables securely.
This avoids .env files and keeps credentials safe.
🏗️ Step 1 — Store Environment Variables in Parameter Store
Do this per region. Replace values as needed.
aws ssm put-parameter --name "/app/FLASK_ENV" --value "production" --type String
aws ssm put-parameter --name "/app/SECRET_KEY" --value "xyz123" --type SecureString
aws ssm put-parameter --name "/app/API_URL" --value "https://api.example.com" --type String
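Inside the app itself, the same parameters can be read with boto3. A minimal sketch: the helper takes an SSM client as an argument (so the same code works against any region's client), and the parameter names match Step 1. The helper name `load_app_config` is an assumption, not part of any AWS API.

```python
def load_app_config(ssm):
    """Fetch the app's parameters from SSM Parameter Store.

    `ssm` is a boto3 SSM client, e.g. boto3.client("ssm", region_name="us-east-1").
    WithDecryption=True makes SecureString values (like SECRET_KEY)
    come back in plaintext; String values are unaffected.
    """
    names = ["/app/FLASK_ENV", "/app/SECRET_KEY", "/app/API_URL"]
    config = {}
    for name in names:
        resp = ssm.get_parameter(Name=name, WithDecryption=True)
        # Key the config by the last path segment, e.g. "SECRET_KEY"
        config[name.rsplit("/", 1)[-1]] = resp["Parameter"]["Value"]
    return config

# Usage in a real app:
# import boto3
# config = load_app_config(boto3.client("ssm", region_name="us-east-1"))
```

The instance role must allow `ssm:GetParameter` (and `kms:Decrypt` for SecureString values) for this to work on EC2.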
🏗️ Step 2 — Create a Launch Template for EC2
This defines how every EC2 instance will be created.
Include this User Data script:
#!/bin/bash
yum update -y
yum install -y python3 git
pip3 install flask gunicorn boto3
# The AWS CLI needs a region; read it from instance metadata (IMDSv2)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
export AWS_DEFAULT_REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/region)
# Fetch ENV from Parameter Store (the instance role must allow ssm:GetParameter)
export FLASK_ENV=$(aws ssm get-parameter --name "/app/FLASK_ENV" --query 'Parameter.Value' --output text)
export SECRET_KEY=$(aws ssm get-parameter --name "/app/SECRET_KEY" --with-decryption --query 'Parameter.Value' --output text)
# Pull your app code
git clone https://github.com/your/repo.git /opt/app
cd /opt/app
pip3 install -r requirements.txt  # if your repo pins extra dependencies
# Run Flask with Gunicorn
gunicorn app:app -b 0.0.0.0:5000 --daemon
This ensures that when the Auto Scaling Group spins up more EC2 instances, they automatically:
- Install dependencies
- Pull your application code
- Fetch environment variables securely
- Start Gunicorn
🏗️ Step 3 — Create Auto Scaling Group
Use the AWS console.
Set:
- Minimum instances: 2
- Maximum instances: 5
- Desired: 2
- Target tracking: 70% CPU
- Subnets: Two AZs per region
This guarantees fault tolerance within each region.
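The target-tracking setting above can also be expressed as the payload that boto3's Auto Scaling `put_scaling_policy()` accepts. A sketch — the group name "flask-api-asg" is a placeholder:

```python
# Target-tracking scaling policy matching Step 3: keep average CPU at 70%.
scaling_policy = {
    "AutoScalingGroupName": "flask-api-asg",   # placeholder group name
    "PolicyName": "cpu-70-target-tracking",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 70.0,  # the 70% CPU target from above
    },
}

# Applied with a real client:
# import boto3
# boto3.client("autoscaling", region_name="us-east-1").put_scaling_policy(**scaling_policy)
```

Target tracking creates the scale-out and scale-in CloudWatch alarms for you, which is why the console only asks for a single target value.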
🏗️ Step 4 — Create ALB (Application Load Balancer)
Listener Rules
- Port 80 → Redirect to HTTPS
- Port 443 → Forward to Target Group
Target Group Settings
- Protocol: HTTP
- Port: 5000 (your Flask app port)
Health Check:
Path: /health
Success Codes: 200-399
Interval: 10s
Timeout: 5s
Unhealthy threshold: 2
Healthy threshold: 2
This is critical for Auto Scaling + failover.
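The health check above expects `/health` to answer quickly. A minimal sketch of that endpoint, assuming the module is `app.py` (matching the `gunicorn app:app` command from Step 2):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Keep this cheap: the ALB calls it every 10 seconds per target.
    # Return 200 only when the process can actually serve traffic.
    return jsonify(status="ok"), 200

if __name__ == "__main__":
    # Local development only; in production Gunicorn serves the app.
    app.run(host="0.0.0.0", port=5000)
```

If the app depends on a database, you can extend this to ping it — but be careful: a failing dependency check will mark every instance unhealthy at once.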
🏗️ Step 5 — Create CloudFront Distribution
Origin
- Origin = ALB DNS name
- Origin Protocol = HTTPS
- Cache policy = CachingDisabled (for APIs)
Multi-Origin Failover
Add 2 origins:
- Origin A: ALB in primary region
- Origin B: ALB in secondary region
Failover rule:
If Origin A returns 500, 502, 503, or 504 (or the connection fails or times out) → CloudFront retries the request against Origin B
Note: origin failover only applies to GET, HEAD, and OPTIONS requests.
This gives automatic regional failover.
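The failover rule above corresponds to an "origin group" in the distribution config. A sketch of just that fragment, in the shape boto3's CloudFront `create_distribution()` expects — the origin IDs are placeholders for the two ALB origins:

```python
# Origin group fragment: CloudFront retries failed requests from the
# primary origin (us-east-1 ALB) against the secondary (eu-west-1 ALB).
origin_group = {
    "Id": "api-failover-group",            # placeholder group id
    "FailoverCriteria": {
        "StatusCodes": {"Quantity": 4, "Items": [500, 502, 503, 504]},
    },
    "Members": {
        "Quantity": 2,
        "Items": [
            {"OriginId": "alb-us-east-1"},  # primary (placeholder id)
            {"OriginId": "alb-eu-west-1"},  # failover (placeholder id)
        ],
    },
}
```

In the full distribution config, the cache behavior's `TargetOriginId` then points at the origin group's `Id` instead of a single origin.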
🏗️ Step 6 — Configure Route 53 (Multi-Region Routing)
Create two records with the same hostname. Note: latency routing needs one regional endpoint per record, so each record points at that region's ALB. (A CloudFront distribution is a single global endpoint — if everything goes through CloudFront, you instead create one alias record to the distribution and let the origin failover from Step 5 handle the regions.)
Record 1
Name: api.example.com
Value: ALB DNS name (us-east-1)
Region: us-east-1
Routing Policy: Latency
Record 2
Name: api.example.com
Value: ALB DNS name (eu-west-1)
Region: eu-west-1
Routing Policy: Latency
Now users are routed to the nearest AWS region.
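The two latency records can be sketched as a Route 53 change batch for boto3's `change_resource_record_sets`. Here the records alias the regional ALBs (latency routing needs one regional endpoint per record); the ALB DNS names and hosted zone IDs are placeholders, and each record needs a distinct SetIdentifier:

```python
def latency_alias(set_id, region, alb_zone_id, alb_dns):
    """One latency-routed alias record for api.example.com."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": set_id,     # must be unique per record
            "Region": region,            # region used for latency routing
            "AliasTarget": {
                # Look up the per-region ELB hosted zone ID; placeholder here.
                "HostedZoneId": alb_zone_id,
                "DNSName": alb_dns,      # placeholder ALB DNS name
                "EvaluateTargetHealth": True,  # respect the ALB health checks
            },
        },
    }

change_batch = {
    "Changes": [
        latency_alias("us-east-1", "us-east-1", "Z_ALB_ZONE_USE1",
                      "api-alb.us-east-1.elb.amazonaws.com"),
        latency_alias("eu-west-1", "eu-west-1", "Z_ALB_ZONE_EUW1",
                      "api-alb.eu-west-1.elb.amazonaws.com"),
    ]
}

# Applied with a real client:
# import boto3
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your-hosted-zone-id>", ChangeBatch=change_batch)
```

`EvaluateTargetHealth=True` is what makes Route 53 itself stop answering with a region whose ALB has no healthy targets.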
🧪 Step 7 — End-to-End Testing
1️⃣ Test ALB
curl http://<ALB-DNS>/health
2️⃣ Test CloudFront
curl https://<cloudfront-url>/health
3️⃣ Test Route 53
curl https://api.example.com/health
4️⃣ Failover test
Stop instances in Region A → traffic moves to Region B.
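The failover test is easier to observe with a small poller that hits `/health` once per second and records the status codes, so you can watch responses continue while Region A's instances are stopped. A stdlib-only sketch — the URL is the Route 53 name from Step 6, and the attempt count and interval are arbitrary:

```python
import time
import urllib.error
import urllib.request

def poll(url, attempts=10, interval=1.0):
    """Poll `url` and return the list of observed HTTP status codes.

    None marks a connection failure (e.g. during the failover window).
    """
    codes = []
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                codes.append(resp.status)
        except urllib.error.URLError:
            codes.append(None)
        time.sleep(interval)
    return codes

# During the test:
# print(poll("https://api.example.com/health"))
```

A successful failover shows an unbroken (or briefly interrupted) run of 200s even while one region is down.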
🔐 Security Best Practices
- Enable WAF on CloudFront
- Use AWS Shield Standard (automatic)
- Allow inbound traffic to EC2 only from the ALB's security group
- No public IPs on EC2
- Enable VPC Flow Logs
- Rotate secrets regularly
🎯 Conclusion
A multi-region architecture transforms your application into a globally available, fault-tolerant, and high-performing system.
This AWS setup gives you:
- Global acceleration via CloudFront
- Smart routing and failover via Route 53
- Reliable request handling via ALB
- Automatic scaling via EC2 Auto Scaling
- Secure environment handling via Parameter Store
- Near-zero downtime across regions