It happened while I was working on a production environment.
I was managing a DV (a data volume, a database-related storage resource) using Terraform. Like most production systems, the infrastructure was defined as code, and the DV held real data: not configuration, not metadata, but actual business data.
As part of a migration and cleanup activity, I refactored my Terraform configuration. The change was simple: I renamed a few resource blocks to improve readability and structure. No values were changed. No destructive action was intended.
I ran terraform apply.
The DV was deleted and recreated.
Why this hit differently in production
In dev or test environments, losing a DV is inconvenient but acceptable.
In production, it’s a different story.
A DV is not “just infrastructure”.
It holds the state of the system. Losing it means losing trust, data, and sometimes the business itself.
What bothered me the most was not the deletion itself; it was why it happened.
I didn’t:
- delete the resource intentionally
- change its configuration
- modify its size or type
I only changed the Terraform code structure.
Terraform interpreted that as:
“The old resource no longer exists. Create a new one.”
And from Terraform’s perspective, that interpretation was correct.
The uncomfortable realization
Terraform doesn’t understand intent.
It doesn’t know which resources are “safe to recreate” and which ones must never be touched.
Terraform only understands:
- configuration
- state
- and lifecycle rules
If we don’t explicitly define lifecycle behavior, Terraform will apply its default logic even in production.
That realization is what pushed me to deeply understand Terraform lifecycle management.
Not as a feature.
But as a production safety mechanism.
What is Terraform lifecycle management (really)?
Terraform lifecycle management controls how Terraform behaves when something changes.
It answers questions like:
- Should this resource ever be destroyed?
- Should certain changes be ignored?
- When is replacement unavoidable?
- How do we refactor Terraform code safely?
Lifecycle rules are defined using the lifecycle block inside a resource.
```hcl
resource "example_resource" "demo" {
  lifecycle {
    # behavior rules
  }
}
```
This block doesn’t create infrastructure.
It controls Terraform’s reactions to change.
1. Replacement — When Terraform Must Rebuild a Resource
What replacement actually means
Replacement means Terraform must delete the existing resource and create a new one.
This happens when a property:
- cannot be changed in place
- is marked immutable by the cloud provider
Terraform has no workaround here.
Real production example (DV / disk)
You create a DV with:
- a specific disk type
- attached to a VM
Later, you change:
- disk type
- encryption setting
- attachment configuration
The cloud provider does not allow this change in place.
Terraform plan shows:

```
-/+ resource will be replaced
```

Which means:

- destroy the old DV
- create a new DV
If this DV holds production data, the data is gone.
Key rule to remember
Ask one question:
Can the cloud provider update this property without deleting the resource?
- Yes → update
- No → replacement
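To make this concrete, here is a minimal sketch using the AWS provider (an assumption; the article names no provider, and the "DV" is mapped to an EBS volume here). Some properties can change in place, while others force replacement:

```hcl
resource "aws_ebs_volume" "data" {
  availability_zone = "eu-west-1a" # immutable: changing this forces replacement
  size              = 100          # mutable: can be increased in place
  type              = "gp3"        # mutable: can be changed in place

  tags = {
    Name = "prod-data-volume"
  }
}
```

Running `terraform plan` after editing `availability_zone` shows `-/+` (replace); editing `size` shows `~` (update in place).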
2. replace_triggered_by — When You Want Replacement
Sometimes replacement is not required, but desired.
Real scenario: security or immutability
- A DV or VM depends on a secret
- The secret changes
- The infrastructure technically still works
- But you want a clean rebuild
You can explicitly tell Terraform:
```hcl
lifecycle {
  replace_triggered_by = [
    some_secret_resource.demo
  ]
}
```
Meaning:
“If this dependency changes, rebuild this resource.”
This is commonly used in:
- immutable infrastructure
- security-sensitive systems
- controlled rebuild workflows
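A fuller sketch of the pattern, with hypothetical AWS resource names and variables (none of these come from the article): an instance is rebuilt whenever the database secret rotates.

```hcl
resource "aws_secretsmanager_secret" "db" {
  name = "prod/db-password"
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db.id
  secret_string = var.db_password
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  lifecycle {
    # Any new secret version forces a clean replacement of this instance.
    replace_triggered_by = [
      aws_secretsmanager_secret_version.db_password.version_id
    ]
  }
}
```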
3. ignore_changes — Avoid Fighting External Systems
Terraform expects to be the single source of truth.
In production, this is rarely true.
Real production example
A DV or storage resource has:
- tags added by policy
- metadata updated by another team
- monitoring tools injecting values
Terraform sees this as drift and wants to revert it.
Plan shows constant diffs, even though nothing is broken.
Solution: ignore_changes
```hcl
lifecycle {
  ignore_changes = [tags]
}
```
This tells Terraform:
“I still manage this resource, but I don’t care about these fields.”
Terraform will:
- stop showing noisy plans
- stop overwriting external changes
- keep CI/CD stable
Important caution
Do not ignore critical configuration.
Bad:

```hcl
lifecycle {
  ignore_changes = all
}
```
This removes Terraform’s control entirely.
Use ignore_changes only when:
- changes are automatic
- another system is the owner
- reverting is unnecessary or harmful
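A safer middle ground is to ignore only the map keys that other systems own, rather than the whole attribute. A sketch (AWS provider and tag names are assumptions):

```hcl
resource "aws_ebs_volume" "data" {
  availability_zone = "eu-west-1a"
  size              = 100

  lifecycle {
    # Ignore only the keys other systems own; Terraform still manages the rest.
    ignore_changes = [
      tags["CostCenter"],  # set by a billing policy engine
      tags["LastScanned"], # updated by a monitoring agent
    ]
  }
}
```

This keeps plans quiet without surrendering control of tags you define yourself.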
4. Refactoring Terraform Code — The Hidden Production Risk
This is where many production incidents happen.
What looks harmless
Renaming a resource block for readability:

```hcl
resource "example_dv" "old_name" { }
```

to:

```hcl
resource "example_dv" "new_name" { }
```
No values changed.
Same DV name.
Same configuration.
What Terraform thinks
Terraform identifies resources by their address:

```
resource_type.resource_name
```

So it sees:

- `old_name` → removed → destroy
- `new_name` → new → create
Terraform does not know this was a refactor.
Result:
Destroy DV
Create new DV
In production, this means data loss.
5. moved Blocks — Safe Refactoring
To refactor safely, you must update Terraform state awareness.
```hcl
moved {
  from = example_dv.old_name
  to   = example_dv.new_name
}
```
This tells Terraform:
“Same resource. New logical name.”
Terraform:
- updates state
- does NOT destroy
- does NOT recreate
This is essential during:
- refactoring
- module restructuring
- account migrations
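The same mechanism covers module restructuring, not just renames. A sketch (module and resource names are hypothetical): after moving the DV definition into a `storage` module, a `moved` block maps the old address to the new one so state follows the code.

```hcl
moved {
  from = aws_ebs_volume.data
  to   = module.storage.aws_ebs_volume.data
}
```

After `terraform apply` records the move, the `moved` block can be deleted.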
6. prevent_destroy — Protect Data at All Costs
For data-holding resources, deletion should never be accidental.
```hcl
lifecycle {
  prevent_destroy = true
}
```

Terraform will now:

- refuse `terraform destroy`
- fail any plan that attempts deletion
- force conscious intervention
This should be used for:
- DVs
- databases
- state storage
- critical backups
7. create_before_destroy — Reduce Downtime (When Replacement Is Needed)
When replacement is unavoidable, this helps reduce impact.
```hcl
lifecycle {
  create_before_destroy = true
}
```
Terraform will:
- create new resource
- switch dependencies
- destroy old resource
Useful for:
- stateless services
- load-balanced workloads
⚠️ Not always possible for data resources due to naming or attachment limits.
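One common workaround for the naming limit is a generated name, so the old and new resources can briefly coexist. A sketch (AWS security group and variable names are assumptions):

```hcl
resource "aws_security_group" "app" {
  # name_prefix lets AWS generate a unique suffix, avoiding a name
  # collision while the replacement and the original coexist.
  name_prefix = "app-sg-"
  vpc_id      = var.vpc_id

  lifecycle {
    create_before_destroy = true
  }
}
```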
Putting It All Together — Mental Model
Terraform lifecycle management answers one question:
“How should Terraform behave when change happens?”
| Scenario | Lifecycle tool |
|---|---|
| Immutable change | Replacement |
| Forced rebuild | `replace_triggered_by` |
| External drift | `ignore_changes` |
| Refactor rename | `moved` |
| Critical data | `prevent_destroy` |
| Downtime risk | `create_before_destroy` |
Final Thoughts
Terraform is extremely powerful, but also extremely literal.
It will not protect your data unless you explicitly tell it how.
Lifecycle management is not an advanced feature.
It is a production requirement, especially for resources that hold data.
My production incident didn't happen because Terraform failed; it happened because I didn't fully control the lifecycle.
Hopefully, this breakdown helps you avoid learning the same lesson the hard way.