Day 12 of my Terraform journey focused on one of the most practical infrastructure problems: how to deploy changes without taking an application offline.
This is the kind of Terraform topic that matters immediately in real systems. It is one thing to provision infrastructure. It is another thing entirely to update that infrastructure safely while users are still depending on it.
Today I worked through:
- why Terraform's default replacement behavior can cause downtime
- how `create_before_destroy` changes the order of operations
- why Auto Scaling Group naming becomes a problem
- how blue/green deployment works with a load balancer
- how to test the difference between a rolling replacement and an atomic traffic switch
## Why Default Terraform Can Cause Downtime
By default, if Terraform needs to replace a resource that cannot be updated in place, it often destroys the old resource before creating the new one.
For a live service, that is risky.
In a setup using an Auto Scaling Group and a load balancer, the default replacement flow can look like this:
- old ASG is destroyed
- instances are terminated
- traffic has nowhere healthy to go
- new ASG is created
- new instances boot and pass health checks
- the app finally comes back
That gap between old capacity disappearing and new capacity becoming healthy is the downtime window.
For production workloads, that is not acceptable.
## The Fix: `create_before_destroy`
Terraform provides a lifecycle rule for this exact problem:
```hcl
lifecycle {
  create_before_destroy = true
}
```
This changes the replacement order:
- create the new resource first
- let the replacement become ready
- destroy the old resource only after that
In my Day 12 lab, I used this on both:
- launch templates
- Auto Scaling Groups
That let Terraform attempt a safer rolling replacement instead of immediately dropping the old infrastructure.
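As a sketch, the rule sits inside each resource block. The resource names and values below are illustrative, not taken from my lab code:

```hcl
resource "aws_launch_template" "rolling" {
  # name_prefix (rather than a fixed name) lets AWS generate a unique
  # suffix, so old and new templates can coexist during replacement.
  name_prefix   = "app-"
  image_id      = var.ami_id
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true
  }
}
```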
## The ASG Naming Problem
There is an important catch.
When `create_before_destroy = true` is enabled, Terraform needs the old and new versions of the resource to exist at the same time for a short period.
That becomes a problem with Auto Scaling Groups because AWS does not allow two ASGs with the same name to exist at once.
If the name is hardcoded, the deployment fails even though the lifecycle rule is correct.
## Solving It with `random_id`
The solution is to give the replacement ASG a unique name.
I used a `random_id` resource tied to the application version:
```hcl
resource "random_id" "rolling" {
  keepers = {
    app_version = var.app_version
  }

  byte_length = 4
}
```
Then I used that value in the rolling launch template and ASG naming.
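A sketch of how that value can feed the ASG name (the `hex` attribute is real provider output; the surrounding resource and variable names are my assumptions):

```hcl
resource "aws_autoscaling_group" "rolling" {
  # random_id.rolling.hex changes whenever var.app_version changes,
  # so each replacement ASG gets a fresh, non-colliding name.
  name                = "app-asg-${random_id.rolling.hex}"
  min_size            = 2
  max_size            = 4
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = aws_launch_template.rolling.id
    version = aws_launch_template.rolling.latest_version
  }

  lifecycle {
    create_before_destroy = true
  }
}
```

Because the `keepers` map pins the id to `var.app_version`, the name only rotates when the version actually changes, not on every apply.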
That solved a real deployment problem:
- the old ASG could stay alive
- the new ASG could be created beside it
- Terraform could shift traffic without a name collision
This was one of the most important practical lessons of the day.
## Rolling Zero-Downtime Deployment
To test the rolling replacement flow, I deployed a versioned response behind an Application Load Balancer.
The app initially returned `Hello World v1`.
Then I changed the Terraform input to `Hello World v2` and ran `terraform apply` again while continuously hitting the ALB.
The goal was simple:
- traffic should continue working throughout the apply
- at some point the response should switch from `v1` to `v2`
- there should be no outage during the transition
That is the essence of a zero-downtime rolling update.
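One simple way to watch for that switch is a polling loop from a second terminal. This is an illustrative fragment, not from the lab repo; `ALB_DNS` is a placeholder for the load balancer's DNS name:

```
# Poll the ALB once a second; the body should flip from v1 to v2
# with no failed requests in between.
while true; do
  curl -s --max-time 2 "http://${ALB_DNS}" || echo "REQUEST FAILED"
  sleep 1
done
```

Any `REQUEST FAILED` line during the apply would indicate a real downtime window.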
## What I Observed During Testing
One important real-world lesson came up during testing: zero-downtime often requires temporary extra capacity.
Because the old and new ASGs briefly coexist, AWS may need to run both old and new instances at the same time. On a small AWS quota, that can hit vCPU limits.
That happened in my lab, so I had to test the rolling and blue/green parts separately instead of running everything at once.
That is actually a useful lesson in itself:
- zero-downtime is safer
- but it can temporarily cost more capacity
## Blue/Green Deployment
I also implemented a blue/green deployment pattern.
Instead of gradually replacing one environment, blue/green keeps two separate environments:
- blue = currently live
- green = next version
Traffic is controlled by the load balancer listener rule:
```hcl
action {
  type             = "forward"
  target_group_arn = var.active_environment == "blue" ? aws_lb_target_group.blue.arn : aws_lb_target_group.green.arn
}
```
That means traffic switching is not done by rebuilding the environment live. It is done by changing a single routing decision.
In practice:
- blue served traffic first
- I changed `active_environment` to `green`
- ran `terraform apply`
- traffic switched to green
This made the cutover feel much cleaner and more atomic than a normal replacement.
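The switch input can be modeled as a validated variable. This is a sketch: the variable name matches the listener rule above, but the validation block is my addition, not confirmed from the lab code:

```hcl
variable "active_environment" {
  description = "Which target group the ALB listener forwards to"
  type        = string
  default     = "blue"

  validation {
    condition     = contains(["blue", "green"], var.active_environment)
    error_message = "active_environment must be either blue or green."
  }
}
```

The cutover is then a one-line change, e.g. `terraform apply -var="active_environment=green"`, and the validation rejects typos before any infrastructure is touched.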
## One Testing Detail That Matters
During the blue/green test, I initially thought the switch had not worked because the browser kept showing the old environment.
The real issue was browser caching.
Using curl or a hard refresh made it clear that Terraform had updated the listener rule correctly and the green environment was serving traffic.
That was a small but very real operational lesson:
- always verify deployment changes with tools that are less likely to hide the truth behind cached responses
## Rolling vs Blue/Green
The two patterns solve similar problems, but in different ways.
Rolling zero-downtime:
- replaces the existing infrastructure safely
- uses `create_before_destroy`
- depends on overlap and health checks
Blue/green:
- keeps two environments ready
- switches traffic at the load balancer
- makes the cutover more explicit and atomic
Both are valuable. Blue/green feels cleaner, but it also requires more duplicate infrastructure.
## My Main Takeaway
Day 12 made one thing very clear: zero-downtime is not just about writing Terraform syntax correctly.
It depends on:
- resource lifecycle order
- naming strategy
- healthy load balancer targets
- enough temporary capacity
- good testing discipline
Terraform gives you the tools, but understanding how those tools behave during replacement is what makes a production deployment safe.
## Full Code
GitHub reference:
GitHub Link
## Terminal Transition
Example output from my terminal testing:
```
Hello World v1
Hello World v1
Hello World v1
Hello World v2
Hello World v2
Hello World v2
```
And for blue/green:
```
Blue environment v1
Blue environment v1
Green environment v2
Green environment v2
```
## Follow My Journey
This is Day 12 of my 30-Day Terraform Challenge.
See you on Day 13!