DEV Community

Haripriya Veluchamy


A Terraform Rename That Deleted Production Data Taught Me About Lifecycle Management

It happened while I was working on a production environment.

I was managing a DV (data volume / database-related resource) using Terraform. Like most production systems, the infrastructure was defined as code, and the DV was holding real data: not configuration, not metadata, but actual business data.

As part of a migration and cleanup activity, I refactored my Terraform configuration. The change was simple: I renamed a few resource blocks to improve readability and structure. No values were changed. No destructive action was intended.

I ran terraform apply.

The DV was deleted and recreated.


Why this hit differently in production

In dev or test environments, losing a DV is inconvenient but acceptable.
In production, it’s a different story.

A DV is not “just infrastructure”.
It holds the state of the system. Losing it means losing trust, data, and sometimes the business itself.

What bothered me the most was not the deletion; it was why it happened.

I didn’t:

  • delete the resource intentionally
  • change its configuration
  • modify its size or type

I only changed the Terraform code structure.

Terraform interpreted that as:

“The old resource no longer exists. Create a new one.”

And from Terraform’s perspective, that interpretation was correct.


The uncomfortable realization

Terraform doesn’t understand intent.
It doesn’t know which resources are “safe to recreate” and which ones must never be touched.

Terraform only understands:

  • configuration
  • state
  • and lifecycle rules

If we don’t explicitly define lifecycle behavior, Terraform will apply its default logic, even in production.

That realization is what pushed me to deeply understand Terraform lifecycle management.

Not as a feature.
But as a production safety mechanism.


What is Terraform lifecycle management (really)?

Terraform lifecycle management controls how Terraform behaves when something changes.

It answers questions like:

  • Should this resource ever be destroyed?
  • Should certain changes be ignored?
  • When is replacement unavoidable?
  • How do we refactor Terraform code safely?

Lifecycle rules are defined using the lifecycle block inside a resource.

resource "example_resource" "demo" {
  lifecycle {
    # behavior rules
  }
}

This block doesn’t create infrastructure.
It controls Terraform’s reactions to change.


1. Replacement — When Terraform Must Rebuild a Resource

What replacement actually means

Replacement means Terraform must delete the existing resource and create a new one.

This happens when a property:

  • cannot be changed in place
  • is marked immutable by the cloud provider

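Terraform has no workaround here: if the provider's API cannot update the property in place, the only way to reach the new configuration is to destroy and recreate.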


Real production example (DV / disk)

You create a DV with:

  • a specific disk type
  • attached to a VM

Later, you change:

  • disk type
  • encryption setting
  • attachment configuration

The cloud provider does not allow this change in place.

Terraform plan shows:

-/+ resource will be replaced

Which means:

  • - destroy old DV
  • + create new DV

If this DV holds production data, the data is gone.


Key rule to remember

Ask one question:

Can the cloud provider update this property without deleting the resource?

  • Yes → update
  • No → replacement
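
To see replacement in a plan without touching a real disk, here is a minimal sketch using Terraform's built-in terraform_data resource (available from Terraform 1.4). The names disk_profile and disk_type are made up for illustration; terraform_data only stands in for the DV, and its triggers_replace argument plays the role of a property the provider cannot change in place.

```hcl
terraform {
  required_version = ">= 1.4.0"
}

# Hypothetical variable standing in for an immutable disk property.
variable "disk_type" {
  type    = string
  default = "standard"
}

# terraform_data is a stand-in for the DV. A change to triggers_replace
# cannot be applied in place, so Terraform plans a destroy and a create.
resource "terraform_data" "disk_profile" {
  triggers_replace = var.disk_type
}
```

After a first apply, running terraform plan -var="disk_type=premium" should mark terraform_data.disk_profile with -/+, the same marker a real DV would get.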

2. replace_triggered_by — When You Want Replacement

Sometimes replacement is not required, but desired.

Real scenario: security or immutability

  • A DV or VM depends on a secret
  • The secret changes
  • The infrastructure technically still works
  • But you want a clean rebuild

You can explicitly tell Terraform:

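lifecycle {
  # Entries must reference a managed resource or one of its attributes
  # (Terraform 1.2+). example_secret.db_password is a placeholder address.
  replace_triggered_by = [
    example_secret.db_password
  ]
}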

Meaning:

“If this dependency changes, rebuild this resource.”

This is commonly used in:

  • immutable infrastructure
  • security-sensitive systems
  • controlled rebuild workflows
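
One detail worth knowing: replace_triggered_by only accepts references to managed resources, not variables or locals. A common workaround is to wrap the value in a terraform_data resource and reference that. The sketch below assumes Terraform 1.4 or later; secret_version and app_server are hypothetical names, and terraform_data stands in for the real VM or DV.

```hcl
terraform {
  required_version = ">= 1.4.0"
}

# Hypothetical counter that is bumped whenever the secret is rotated.
variable "secret_version" {
  type    = number
  default = 1
}

# Wraps the plain value in a managed resource so it can be referenced
# from replace_triggered_by. Changing input updates this resource in place.
resource "terraform_data" "secret_version" {
  input = var.secret_version
}

# Stand-in for the VM or DV that should be rebuilt on rotation.
resource "terraform_data" "app_server" {
  input = "app-server"

  lifecycle {
    # Any planned change to terraform_data.secret_version forces
    # replacement of this resource, even though its own arguments
    # did not change.
    replace_triggered_by = [terraform_data.secret_version]
  }
}
```

Bumping secret_version from 1 to 2 should then plan an in-place update of the wrapper and a replacement of app_server.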

3. ignore_changes — Avoid Fighting External Systems

Terraform expects to be the single source of truth.
In production, this is rarely true.


Real production example

A DV or storage resource has:

  • tags added by policy
  • metadata updated by another team
  • monitoring tools injecting values

Terraform sees this as drift and wants to revert it.

Plan shows constant diffs, even though nothing is broken.


Solution: ignore_changes

lifecycle {
  ignore_changes = [tags]
}

This tells Terraform:

“I still manage this resource, but I don’t care about these fields.”

Terraform will:

  • stop showing noisy plans
  • stop overwriting external changes
  • keep CI/CD stable

Important caution

Do not ignore critical configuration.

Bad:

ignore_changes = all

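This makes Terraform ignore every attribute once the resource exists: it can still create and destroy the resource, but it will never update it.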

Use ignore_changes only when:

  • changes are automatic
  • another system is the owner
  • reverting is unnecessary or harmful
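
A narrower alternative to ignoring all tags is to ignore a single key. The sketch below assumes AWS and the hashicorp/aws provider purely for illustration (the incident above is not tied to any cloud); the region, zone, size, and the CostCenter tag key are all made-up values.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # illustrative region
}

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 100
  type              = "gp3"

  # Tags that Terraform owns and should keep enforcing.
  tags = {
    Name = "prod-data"
  }

  lifecycle {
    # Only the CostCenter key (assumed to be added by an external
    # policy) is ignored. Drift on Name, size, or type still shows
    # up in the plan.
    ignore_changes = [
      tags["CostCenter"],
    ]
  }
}
```

Scoping the rule to one map key keeps Terraform in charge of everything else on the volume.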

4. Refactoring Terraform Code — The Hidden Production Risk

This is where many production incidents happen.


What looks harmless

Renaming a resource block for readability:

resource "example_dv" "old_name" { }

to:

resource "example_dv" "new_name" { }

No values changed.
Same DV name.
Same configuration.


What Terraform thinks

Terraform identifies resources by:

resource_type.resource_name

So it sees:

  • old_name → removed → destroy
  • new_name → new → create

Terraform does not know this was a refactor.

Result:

Destroy DV
Create new DV

In production, this means data loss.


5. moved Blocks — Safe Refactoring

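To refactor safely, tell Terraform that the resource's address has changed, so it updates the state instead of planning a destroy and a create. A moved block (Terraform 1.1 and later) does exactly that: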

moved {
  from = example_dv.old_name
  to   = example_dv.new_name
}

This tells Terraform:

“Same resource. New logical name.”

Terraform:

  • updates state
  • does NOT destroy
  • does NOT recreate

This is essential during:

  • refactoring
  • module restructuring
  • migrations that rename or regroup resources within the same state
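
The same mechanism covers more than a plain rename. As a sketch, suppose a single resource is refactored to use for_each; its address changes from terraform_data.volume to terraform_data.volume["primary"], which Terraform would otherwise treat as destroy plus create. The resource and key names here are hypothetical, and terraform_data (Terraform 1.4+) stands in for the DV.

```hcl
terraform {
  required_version = ">= 1.4.0"
}

# Before the refactor this was a single instance:
#   resource "terraform_data" "volume" { input = "primary" }

# After the refactor the same resource uses for_each.
resource "terraform_data" "volume" {
  for_each = toset(["primary"])
  input    = each.key
}

# Map the old single-instance address to the new keyed address so the
# existing object is kept and only the state entry is renamed.
moved {
  from = terraform_data.volume
  to   = terraform_data.volume["primary"]
}
```

With the moved block in place, the plan should report the address change and no destroy or create actions.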

6. prevent_destroy — Protect Data at All Costs

For data-holding resources, deletion should never be accidental.

lifecycle {
  prevent_destroy = true
}

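Terraform will now:

  • refuse terraform destroy for this resource
  • fail any plan that would delete or replace it
  • force conscious intervention

One caveat: prevent_destroy lives inside the resource block, so it only protects the resource while that block is still in the configuration. If the block is removed, or renamed without a moved block, the setting disappears along with it and Terraform will still plan the destroy.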

This should be used for:

  • DVs
  • databases
  • state storage
  • critical backups
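
Here is a small sketch of how this behaves when a change would force replacement. It assumes Terraform 1.4 or later; customer_data and storage_class are hypothetical names, and terraform_data stands in for a real data-holding resource.

```hcl
terraform {
  required_version = ">= 1.4.0"
}

# Hypothetical setting whose change forces replacement.
variable "storage_class" {
  type    = string
  default = "standard"
}

# Stand-in for a data-holding resource.
resource "terraform_data" "customer_data" {
  triggers_replace = var.storage_class

  lifecycle {
    # Any plan that needs to destroy this resource, including a
    # replacement, stops with an error instead of proceeding.
    prevent_destroy = true
  }
}
```

After a first apply, planning with a different storage_class should fail with an error instead of showing a -/+ replacement.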

7. create_before_destroy — Reduce Downtime (When Replacement Is Needed)

When replacement is unavoidable, this helps reduce impact.

lifecycle {
  create_before_destroy = true
}

Terraform will:

  1. create new resource
  2. switch dependencies
  3. destroy old resource

Useful for:

  • stateless services
  • load-balanced workloads

⚠️ Not always possible for data resources due to naming or attachment limits.
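
The ordering can be observed with a provider-free sketch, assuming Terraform 1.4 or later. The names web and image_version are made up, and terraform_data stands in for a stateless service instance.

```hcl
terraform {
  required_version = ">= 1.4.0"
}

# Hypothetical version string; changing it forces a rebuild.
variable "image_version" {
  type    = string
  default = "v1"
}

# Stand-in for a stateless, replaceable workload.
resource "terraform_data" "web" {
  triggers_replace = var.image_version

  lifecycle {
    # The replacement is created first and the old object is destroyed
    # afterwards, instead of the default destroy-then-create order.
    create_before_destroy = true
  }
}
```

When image_version changes, the plan should mark the resource with +/- (create, then destroy) rather than the default -/+. Note that this only changes the ordering; it does not copy any data from the old object to the new one.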


Putting It All Together — Mental Model

Terraform lifecycle management answers one question:

“How should Terraform behave when change happens?”

| Scenario         | Lifecycle tool        |
| ---------------- | --------------------- |
| Immutable change | Replacement           |
| Forced rebuild   | replace_triggered_by  |
| External drift   | ignore_changes        |
| Refactor rename  | moved                 |
| Critical data    | prevent_destroy       |
| Downtime risk    | create_before_destroy |

Final Thoughts

Terraform is extremely powerful but also extremely literal.

It will not protect your data unless you explicitly tell it how.

Lifecycle management is not an advanced feature.
It is a production requirement, especially for resources that hold data.

My production incident didn’t happen because Terraform failed. It happened because I didn’t fully control the lifecycle.

Hopefully, this breakdown helps you avoid learning the same lesson the hard way.

