Day 27 of my Terraform journey moved from a single-region scalable app to a multi-region high-availability design.
Yesterday, I built a scalable web application in one AWS region. Today, I expanded that pattern into a 3-tier architecture spread across two regions using:
- a VPC per region
- an ALB per region
- an Auto Scaling Group per region
- a primary Multi-AZ RDS instance
- a cross-region read replica
- optional Route53 failover DNS
- reusable Terraform modules
- remote state with S3 and DynamoDB
GitHub reference:
https://github.com/mary20205090/30-day-Terraform-Challenge/tree/main/day_27
Project Structure
For Day 27, I separated the infrastructure into five focused modules:
day27-multi-region-ha/
├── modules/
│ ├── vpc/
│ ├── alb/
│ ├── asg/
│ ├── rds/
│ └── route53/
├── envs/
│ └── prod/
├── bootstrap/
├── backend.tf
└── provider.tf
The goal was not just to make the stack work.
The goal was to make the design reusable, understandable, and safe to change across regions.
Why Five Modules Instead of One?
I split the project into five modules because each part has a different responsibility.
The vpc module owns networking:
- VPC
- public subnets
- private subnets
- internet gateway
- NAT gateways
- route tables
The alb module owns traffic entry:
- Application Load Balancer
- target group
- listener
- ALB security group
The asg module owns compute and scaling:
- launch template
- instance security group
- Auto Scaling Group
- scaling policies
- CloudWatch CPU alarms
The rds module owns the database layer:
- DB subnet group
- RDS security group
- primary RDS instance
- cross-region replica logic
The route53 module owns failover DNS:
- health checks
- primary failover record
- secondary failover record
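As one concrete example of what a module encapsulates, the asg module's scale-out policy and CPU alarm could be sketched like this (a simplified illustration, assuming the module defines an `aws_autoscaling_group` resource named `app`; names and thresholds here are placeholders, not the exact values in the repo):

```hcl
# Simple-scaling policy: add one instance when triggered
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "cpu-scale-out"
  autoscaling_group_name = aws_autoscaling_group.app.name
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = 1
  cooldown               = 300
}

# CloudWatch alarm that fires the policy when average CPU stays high
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "asg-cpu-high"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  comparison_operator = "GreaterThanThreshold"
  threshold           = 70
  evaluation_periods  = 2
  period              = 120

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.app.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_out.arn]
}
```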
If everything lived in one large file, it would still work, but it would be harder to reuse and harder to reason about.
Modules make the boundaries clear.
How the Modules Connect
The most important part of today was understanding the data flow between modules.
The VPC modules create the base networking:
module.vpc_primary.vpc_id
module.vpc_primary.public_subnet_ids
module.vpc_primary.private_subnet_ids
module.vpc_secondary.vpc_id
module.vpc_secondary.public_subnet_ids
module.vpc_secondary.private_subnet_ids
Those outputs feed the rest of the stack.
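The two VPC instances above come from the same module, applied once per region via provider aliases. A minimal sketch (the secondary region, CIDRs, and variable names are illustrative assumptions, not necessarily what the repo uses):

```hcl
# One provider configuration per region
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "secondary"
  region = "us-west-2" # placeholder secondary region
}

# Same vpc module, instantiated once per region
module "vpc_primary" {
  source   = "./modules/vpc"
  vpc_cidr = "10.0.0.0/16"
}

module "vpc_secondary" {
  source   = "./modules/vpc"
  vpc_cidr = "10.1.0.0/16"

  providers = {
    aws = aws.secondary
  }
}
```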
The ALB module in the primary region creates a target group:
module.alb_primary.target_group_arn
That output flows into the primary ASG module:
target_group_arns = [module.alb_primary.target_group_arn]
That tells the Auto Scaling Group where its EC2 instances should register.
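Wired together, that ALB-to-ASG handoff might look like this (module inputs are illustrative; the repo's variable names may differ):

```hcl
# The ALB module creates the target group and exposes its ARN
module "alb_primary" {
  source            = "./modules/alb"
  vpc_id            = module.vpc_primary.vpc_id
  public_subnet_ids = module.vpc_primary.public_subnet_ids
}

# The ASG module consumes that ARN so new instances register automatically
module "asg_primary" {
  source             = "./modules/asg"
  private_subnet_ids = module.vpc_primary.private_subnet_ids
  target_group_arns  = [module.alb_primary.target_group_arn]
}
```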
Then the RDS primary module creates the main database and outputs:
module.rds_primary.db_instance_arn
That output flows into the replica module:
replicate_source_db = module.rds_primary.db_instance_arn
That tells AWS to create the secondary database as a cross-region read replica of the primary database.
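The primary-to-replica wiring could be sketched as two instances of the same rds module, with the replica pinned to the secondary provider (module inputs here are assumptions for illustration):

```hcl
# Primary database: Multi-AZ within the primary region
module "rds_primary" {
  source     = "./modules/rds"
  subnet_ids = module.vpc_primary.private_subnet_ids
  multi_az   = true
}

# Cross-region read replica: passing the primary's ARN as the
# replication source makes AWS build this as a replica
module "rds_replica" {
  source              = "./modules/rds"
  subnet_ids          = module.vpc_secondary.private_subnet_ids
  replicate_source_db = module.rds_primary.db_instance_arn

  providers = {
    aws = aws.secondary
  }
}
```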
This closes the loop:
VPC → ALB → Target Group → ASG → Primary RDS → Cross-Region Replica
That is what made the architecture feel like one connected system instead of separate AWS resources.
Deployment Output
After running terraform apply, the outputs included the ALB DNS name for each region.
I verified the primary ALB in the browser and the app responded with:
Region: us-east-1 | AZ: us-east-1b | Environment: prod
That confirmed the ALB, target group, Auto Scaling Group, and EC2 user data were all working together correctly.
Route53 Failover Design
The Route53 module was included in the project design to support DNS failover between the two regions.
In my actual lab run, I left Route53 disabled because I did not have a hosted zone and domain ready in the account. So I verified the stack through the ALB DNS names directly instead.
Still, the failover behavior is important to understand.
If the primary region fails, the sequence looks like this:
- the primary Route53 health check fails
- Route53 stops returning the primary record
- clients continue using cached DNS until TTL expires
- after TTL expiry, new DNS lookups resolve to the secondary record
- traffic shifts to the secondary ALB
- the secondary ALB continues serving the application tier in the backup region
That means failover is not instant in the same way an internal service failover might be. It depends on:
- health check detection
- Route53 failover policy
- DNS cache expiry
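The failover records described above can be sketched with raw Route53 resources. This is a minimal illustration, not the repo's exact module: the hosted zone ID, ALB DNS names, and `app.example.com` are placeholders, and the low TTL is what bounds the DNS cache expiry step in the sequence above:

```hcl
# Health check against the primary ALB
resource "aws_route53_health_check" "primary" {
  fqdn              = var.primary_alb_dns_name
  type              = "HTTP"
  port              = 80
  failure_threshold = 3
  request_interval  = 30
}

# Primary record: served while the health check passes
resource "aws_route53_record" "primary" {
  zone_id         = var.hosted_zone_id
  name            = "app.example.com"
  type            = "CNAME"
  ttl             = 60
  records         = [var.primary_alb_dns_name]
  set_identifier  = "primary"
  health_check_id = aws_route53_health_check.primary.id

  failover_routing_policy {
    type = "PRIMARY"
  }
}

# Secondary record: served only when the primary is unhealthy
resource "aws_route53_record" "secondary" {
  zone_id        = var.hosted_zone_id
  name           = "app.example.com"
  type           = "CNAME"
  ttl            = 60
  records        = [var.secondary_alb_dns_name]
  set_identifier = "secondary"

  failover_routing_policy {
    type = "SECONDARY"
  }
}
```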
Multi-AZ vs Cross-Region Read Replicas
One of the most useful lessons today was understanding that these solve different problems.
Multi-AZ protects against Availability Zone failure within one region.
If one AZ in us-east-1 fails, AWS can fail the database over to another AZ in that same region.
Cross-region read replicas protect against a full regional outage.
If the primary region has a larger failure, the replica already exists in the secondary region and can become part of the recovery plan.
So the difference is:
- Multi-AZ = resilience within a region
- cross-region replica = resilience across regions
You need both ideas to think clearly about high availability.
A Useful Debugging Lesson
Day 27 had a few real-world AWS constraints that made the project more realistic.
A few examples:
- ALB naming had to be shortened to stay within AWS limits
- the RDS module needed the application security group ID, not the ASG name
- the cross-region replica needed proper encryption handling to be created from an encrypted source
- the backend had to be bootstrapped before the main environment could use remote state
That was a good reminder that Terraform can describe the architecture, but AWS service rules still matter.
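The encryption constraint in particular is worth showing. KMS keys are region-scoped, so a cross-region replica of an encrypted source needs its own key in the destination region. A sketch of that rule (assuming an `aws_db_instance.primary` defined elsewhere; identifiers are placeholders):

```hcl
# A KMS key in the destination region, because the source
# region's key cannot encrypt the replica's storage
resource "aws_kms_key" "replica" {
  provider    = aws.secondary
  description = "Encryption key for the cross-region RDS replica"
}

resource "aws_db_instance" "replica" {
  provider       = aws.secondary
  identifier     = "app-db-replica"
  instance_class = "db.t3.micro"

  # Cross-region replication requires the source ARN, not its name
  replicate_source_db = aws_db_instance.primary.arn

  # Required when the source instance is encrypted
  kms_key_id = aws_kms_key.replica.arn
}
```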
Remote State and Bootstrap
Like previous days, I used a remote backend with:
- S3 for Terraform state
- DynamoDB for state locking
For this project, I also used a separate bootstrap stack to create the backend resources first.
That mattered because Terraform cannot use a backend bucket and lock table until they already exist.
Remote state keeps the stack safer and more realistic, especially once infrastructure starts growing beyond one simple environment.
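Once the bootstrap stack has created the bucket and table, the main environment points at them. A sketch of that backend configuration (bucket, key, and table names are illustrative):

```hcl
# backend.tf — usable only after the bootstrap stack has
# created the bucket and lock table it references
terraform {
  backend "s3" {
    bucket         = "day27-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "day27-terraform-locks"
    encrypt        = true
  }
}
```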
Cleanup
After verifying the app worked, I destroyed both:
- the Day 27 production stack
- the bootstrap backend resources
This matters because NAT Gateways, ALBs, EC2 instances, and RDS resources can keep generating cost even after the learning task is complete.
Final Takeaway
Day 27 helped me connect several Terraform lessons into one practical multi-region system.
A high-availability architecture is not just “more resources.” It is the relationship between networking, load balancing, scaling, database topology, failover strategy, and state management.
The biggest lesson:
Terraform modules are not just for organizing files. They help define the boundaries of responsibility in infrastructure.
That is what makes the system easier to understand, reuse, and safely change.
Follow My Journey
This is Day 27 of my 30-Day Terraform Challenge.
See you on Day 28.