
Mary Mutua

Building a 3-Tier Multi-Region High Availability Architecture with Terraform

Day 27 of my Terraform journey moved from a single-region scalable app to a multi-region high-availability design.

Yesterday, I built a scalable web application in one AWS region. Today, I expanded that pattern into a 3-tier architecture spread across two regions using:

  • a VPC per region
  • an ALB per region
  • an Auto Scaling Group per region
  • a primary Multi-AZ RDS instance
  • a cross-region read replica
  • optional Route53 failover DNS
  • reusable Terraform modules
  • remote state with S3 and DynamoDB

GitHub reference:

https://github.com/mary20205090/30-day-Terraform-Challenge/tree/main/day_27

Project Structure

For Day 27, I separated the infrastructure into five focused modules:

day27-multi-region-ha/
├── modules/
│   ├── vpc/
│   ├── alb/
│   ├── asg/
│   ├── rds/
│   └── route53/
├── envs/
│   └── prod/
├── bootstrap/
├── backend.tf
└── provider.tf

The goal was not just to make the stack work.

The goal was to make the design reusable, understandable, and safe to change across regions.
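To deploy the same modules into two regions, the root configuration needs two AWS provider configurations. Here is a minimal sketch of provider.tf, assuming us-west-2 as the secondary region (us-east-1 is the primary region I verified later):

provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "secondary"
  region = "us-west-2" # assumed secondary region
}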

Why Five Modules Instead of One?

I split the project into five modules because each part has a different responsibility.

The vpc module owns networking:

  • VPC
  • public subnets
  • private subnets
  • internet gateway
  • NAT gateways
  • route tables

The alb module owns traffic entry:

  • Application Load Balancer
  • target group
  • listener
  • ALB security group

The asg module owns compute and scaling:

  • launch template
  • instance security group
  • Auto Scaling Group
  • scaling policies
  • CloudWatch CPU alarms

The rds module owns the database layer:

  • DB subnet group
  • RDS security group
  • primary RDS instance
  • cross-region replica logic

The route53 module owns failover DNS:

  • health checks
  • primary failover record
  • secondary failover record

If everything lived in one large file, it would still work, but it would be harder to reuse and harder to reason about.

Modules make the boundaries clear.
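Concretely, the same vpc module is instantiated once per region, with the secondary copy pointed at the aliased provider. A sketch, with input names like cidr_block assumed for illustration:

module "vpc_primary" {
  source     = "./modules/vpc"
  name       = "prod-primary"
  cidr_block = "10.0.0.0/16"
}

module "vpc_secondary" {
  source = "./modules/vpc"
  providers = {
    aws = aws.secondary # route this copy to the secondary region
  }
  name       = "prod-secondary"
  cidr_block = "10.1.0.0/16"
}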

How the Modules Connect

The most important part of today was understanding the data flow between modules.

The VPC modules create the base networking:

module.vpc_primary.vpc_id
module.vpc_primary.public_subnet_ids
module.vpc_primary.private_subnet_ids

module.vpc_secondary.vpc_id
module.vpc_secondary.public_subnet_ids
module.vpc_secondary.private_subnet_ids

Those outputs feed the rest of the stack.
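For those values to exist, the vpc module has to export them. A sketch of modules/vpc/outputs.tf, assuming the module names its resources aws_vpc.this, aws_subnet.public, and aws_subnet.private:

output "vpc_id" {
  value = aws_vpc.this.id
}

output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}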

The ALB module in the primary region creates a target group:

module.alb_primary.target_group_arn

That output flows into the primary ASG module:

target_group_arns = [module.alb_primary.target_group_arn]

That tells the Auto Scaling Group where its EC2 instances should register.
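Put together, the primary ASG module call pulls in both the VPC outputs and the ALB target group. A sketch, with the input variable names and sizing assumed:

module "asg_primary" {
  source = "./modules/asg"

  vpc_id            = module.vpc_primary.vpc_id
  subnet_ids        = module.vpc_primary.private_subnet_ids   # instances stay in private subnets
  target_group_arns = [module.alb_primary.target_group_arn]   # register instances with the ALB
  min_size          = 2
  max_size          = 4
}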

Then the RDS primary module creates the main database and outputs:

module.rds_primary.db_instance_arn

That output flows into the replica module:

replicate_source_db = module.rds_primary.db_instance_arn

That tells AWS to create the secondary database as a cross-region read replica of the primary database.
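Inside the rds module, the replica path looks roughly like this. This is a sketch with assumed variable names; the kms_key_id matters because a cross-region replica of an encrypted primary has to be encrypted with a key that lives in the destination region:

resource "aws_db_instance" "replica" {
  count = var.replicate_source_db != null ? 1 : 0

  identifier          = "prod-db-replica"
  replicate_source_db = var.replicate_source_db  # ARN of the primary in the other region
  instance_class      = var.instance_class
  storage_encrypted   = true
  kms_key_id          = var.replica_kms_key_id   # key created in the secondary region
  skip_final_snapshot = true
}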

This closes the loop:

VPC → ALB → Target Group → ASG → Primary RDS → Cross-Region Replica

That is what made the architecture feel like one connected system instead of separate AWS resources.

Deployment Output

After the apply completed, Terraform returned the regional ALB DNS names as outputs.

I verified the primary ALB in the browser and the app responded with:

Region: us-east-1 | AZ: us-east-1b | Environment: prod

That confirmed the ALB, target group, Auto Scaling Group, and EC2 user data were all working together correctly.
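That response comes from the instance user data rendering the metadata into a page. A sketch of the launch template in the asg module, assuming an Amazon Linux AMI with httpd and instance metadata reachable without a session token:

resource "aws_launch_template" "app" {
  name_prefix   = "prod-app-"
  image_id      = var.ami_id
  instance_type = "t3.micro"

  user_data = base64encode(<<-EOF
    #!/bin/bash
    yum install -y httpd
    # read placement details from instance metadata
    AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
    REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/region)
    echo "Region: $REGION | AZ: $AZ | Environment: prod" > /var/www/html/index.html
    systemctl enable --now httpd
  EOF
  )
}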

Route53 Failover Design

The Route53 module was included in the project design to support DNS failover between the two regions.

In my actual lab run, I left Route53 disabled because I did not have a hosted zone and domain ready in the account. So I verified the stack through the ALB DNS names directly instead.

Still, the failover behavior is important to understand.

If the primary region fails, the sequence looks like this:

  1. the primary Route53 health check fails
  2. Route53 stops returning the primary record
  3. clients continue using cached DNS until TTL expires
  4. after TTL expiry, new DNS lookups resolve to the secondary record
  5. traffic shifts to the secondary ALB
  6. the secondary ALB continues serving the application tier in the backup region

That means failover is not instant in the same way an internal service failover might be. It depends on:

  • health check detection
  • Route53 failover policy
  • DNS cache expiry
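For reference, the failover records would look roughly like this (shown at the root for readability; in the real layout these values would arrive as module inputs). The domain is hypothetical and the ALB output names dns_name and zone_id are assumed:

resource "aws_route53_health_check" "primary" {
  fqdn              = module.alb_primary.dns_name
  port              = 80
  type              = "HTTP"
  resource_path     = "/"
  failure_threshold = 3
  request_interval  = 30
}

resource "aws_route53_record" "primary" {
  zone_id         = var.zone_id
  name            = "app.example.com"
  type            = "A"
  set_identifier  = "primary"
  health_check_id = aws_route53_health_check.primary.id

  failover_routing_policy {
    type = "PRIMARY"
  }

  alias {
    name                   = module.alb_primary.dns_name
    zone_id                = module.alb_primary.zone_id
    evaluate_target_health = true
  }
}

resource "aws_route53_record" "secondary" {
  zone_id        = var.zone_id
  name           = "app.example.com"
  type           = "A"
  set_identifier = "secondary"

  failover_routing_policy {
    type = "SECONDARY"
  }

  alias {
    name                   = module.alb_secondary.dns_name
    zone_id                = module.alb_secondary.zone_id
    evaluate_target_health = true
  }
}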

Multi-AZ vs Cross-Region Read Replicas

One of the most useful lessons today was understanding that these solve different problems.

Multi-AZ protects against Availability Zone failure within one region.

If one AZ in us-east-1 fails, AWS can fail the database over to another AZ in that same region.

Cross-region read replicas protect against a full regional outage.

If the primary region has a larger failure, the replica already exists in the secondary region and can become part of the recovery plan.

So the difference is:

  • Multi-AZ = resilience within a region
  • cross-region replica = resilience across regions

You need both ideas to think clearly about high availability.
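In Terraform terms, the first is a single argument on the primary instance, while the second is the replicate_source_db wiring shown earlier. A sketch of the primary, with the engine, sizing, and resource names inside the module all assumed:

resource "aws_db_instance" "primary" {
  identifier        = "prod-db-primary"
  engine            = "mysql"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = var.db_username
  password          = var.db_password
  multi_az          = true   # standby in another AZ of the same region

  db_subnet_group_name   = aws_db_subnet_group.this.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  storage_encrypted      = true
  skip_final_snapshot    = true
}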

A Useful Debugging Lesson

Day 27 had a few real-world AWS constraints that made the project more realistic.

A few examples:

  • ALB naming had to be shortened to stay within AWS limits
  • the RDS module needed the application security group ID, not the ASG name
  • the cross-region replica needed a KMS key in the destination region, since it is created from an encrypted source
  • the backend had to be bootstrapped first before the main environment could use remote state

That was a good reminder that Terraform can describe the architecture, but AWS service rules still matter.
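The security group mix-up, for example, comes down to passing the right ID into the rds module. A sketch of the ingress rule inside that module, assuming a MySQL port, an internal security group named aws_security_group.rds, and an input variable called app_security_group_id:

resource "aws_security_group_rule" "db_ingress_from_app" {
  type                     = "ingress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = aws_security_group.rds.id      # the database security group
  source_security_group_id = var.app_security_group_id      # the app instances' SG ID, not the ASG name
}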

Remote State and Bootstrap

Like previous days, I used a remote backend with:

  • S3 for Terraform state
  • DynamoDB for state locking

For this project, I also used a separate bootstrap stack to create the backend resources first.

That mattered because Terraform cannot use a backend bucket and lock table until they already exist.

Remote state keeps the stack safer and more realistic, especially once infrastructure starts growing beyond one simple environment.
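The backend configuration itself is small. A sketch of backend.tf, with placeholder bucket and table names standing in for whatever the bootstrap stack actually created:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "day27/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}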

Cleanup

After verifying the app worked, I destroyed both:

  • the Day 27 production stack
  • the bootstrap backend resources

This matters because NAT Gateways, ALBs, EC2 instances, and RDS resources can keep generating cost even after the learning task is complete.

Final Takeaway

Day 27 helped me connect several Terraform lessons into one practical multi-region system.

A high-availability architecture is not just “more resources.” It is the relationship between networking, load balancing, scaling, database topology, failover strategy, and state management.

The biggest lesson:

Terraform modules are not just for organizing files. They help define the boundaries of responsibility in infrastructure.

That is what makes the system easier to understand, reuse, and safely change.

Follow My Journey

This is Day 27 of my 30-Day Terraform Challenge.

See you on Day 28.
