Eunice js

Posted on Mar 29

The Terraform Mistakes Survival Guide: How I Migrated a Monolith State Without Destroying a Single Resource

#terraform #devops #infrastructure #tutorial

I migrated a monolith Terraform state without destroying a single resource.

Here is how I approached it. There might be better ways to do this, but this worked for me.

The Problem

We had one massive state file managing all our GitHub resources. Teams. Members. Admins. Permissions. Everything in one place.

Every change touched everything. Risky. Slow. Hard to review.

If someone needed to add a new team member, the plan would show changes across the entire state. One wrong move and you could accidentally destroy resources that had nothing to do with your change.

I was asked to break it into smaller modules. Teams in one state file. Members in another. Each piece moving independently.

Sounds simple, right?

It was not.

The Danger: State Drift During Refactor

Here is the problem with splitting state:

When you move resources to a new module with its own state file, Terraform does not automatically know those resources already exist.

So this happens:

New module tries to CREATE the resources (because they are not in its state yet)
Old root tries to DESTROY them (because you removed the code from there)

This is classic state drift during refactor.

If you run terraform apply on both without handling this properly, you could end up with:

Duplicate resources (if creation succeeds before destruction)
Deleted resources (if destruction runs first)
Failed applies with conflicts
A very bad day

I was not about to let that happen.

Prerequisites

Before attempting this migration, make sure you have:

Terraform 1.7 or later (import blocks arrived in 1.5, removed blocks in 1.7)
Backend access to both state files (old root and new module)
Resource IDs for everything you are migrating (you will need these for imports)
A backup of your current state file (run terraform state pull > backup.tfstate)
Time and patience (do not rush this)
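The version requirement can be checked from a script before anything else runs. This is a minimal sketch, assuming the first line of terraform version has the usual "Terraform vX.Y.Z" form. The helper names version_at_least and require_terraform are made up for this example, and 1.7.0 is used as the minimum because that is the release where removed blocks became available.

```shell
# Hypothetical helpers, not part of Terraform.

# Succeeds when version $1 is at least version $2.
# Compares MAJOR.MINOR.PATCH numerically; pre-release suffixes are ignored.
version_at_least() {
  awk -v have="$1" -v need="$2" 'BEGIN {
    split(have, h, "."); split(need, n, ".")
    for (i = 1; i <= 3; i++) {
      if ((h[i] + 0) > (n[i] + 0)) exit 0
      if ((h[i] + 0) < (n[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Reads the installed version and fails if it is older than the minimum in $1.
require_terraform() {
  current=$(terraform version | head -n 1 | sed 's/^Terraform v//')
  if ! version_at_least "$current" "$1"; then
    echo "Terraform $current is older than required $1" >&2
    return 1
  fi
  echo "Terraform $current OK"
}

# Usage:
# require_terraform 1.7.0
```

The comparison is numeric per component, so 1.10.0 correctly counts as newer than 1.7.0, which a plain string comparison would get wrong.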

The Solution: Step by Step

I did it in five steps. Each one is critical. Do not skip any.

Step 1: Create the New Module

First, I created a new directory for the teams module with its own state file.

github-management/
  main.tf
  terraform.tfstate        # old monolith state
  teams/
    main.tf
    backend.tf             # points to new state file
    terraform.tfstate      # new isolated state

I moved the github_team and github_team_members resources into the new teams/main.tf file.

# teams/main.tf

resource "github_team" "teams" {
  for_each = var.teams

  name        = each.value.name
  description = each.value.description
  privacy     = each.value.privacy
}

resource "github_team_members" "members" {
  for_each = var.teams

  team_id = github_team.teams[each.key].id

  dynamic "members" {
    for_each = each.value.members
    content {
      username = members.value.username
      role     = members.value.role
    }
  }
}

At this point, if I ran terraform plan in the new module, it would try to create all the teams. That is expected. We fix that next.

Step 2: Import Existing Resources into the New State

This is where the magic happens.

I created an import.tf file in the new teams module:

# teams/import.tf

import {
  to = github_team.teams["devops"]
  id = "1234567"
}

import {
  to = github_team.teams["backend"]
  id = "2345678"
}

import {
  to = github_team.teams["frontend"]
  id = "3456789"
}

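# Repeat for every team you are migrating.
# The github_team_members resources need their own import blocks too;
# check the GitHub provider docs for the import ID format they expect.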
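Writing dozens of import blocks by hand invites typos, so they can be generated from a list instead. This is a minimal sketch, assuming input lines of the form "key id", where key is the for_each key used in var.teams. The function name gen_import_blocks is invented for this example.

```shell
# Hypothetical helper: turns "key id" lines on stdin into import blocks on stdout.
gen_import_blocks() {
  while read -r key id; do
    # Skip blank lines and comment lines.
    case "$key" in ''|'#'*) continue ;; esac
    printf 'import {\n  to = github_team.teams["%s"]\n  id = "%s"\n}\n\n' "$key" "$id"
  done
}

# Usage:
# printf 'devops 1234567\nbackend 2345678\n' | gen_import_blocks > teams/import.tf
```

Review the generated file before applying it; the script only formats what it is given and cannot tell whether an ID belongs to the right team.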

How to find the resource IDs:

For GitHub teams, you can get the team ID from:

The GitHub API: GET /orgs/{org}/teams/{team_slug}
Your existing state file: terraform state show 'github_team.teams["devops"]' (the single quotes stop the shell from eating the brackets and double quotes)
The GitHub web UI (inspect network requests when viewing the team)
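The API lookup can be scripted with the GitHub CLI. This is a minimal sketch, assuming gh is installed and authenticated. my-org is a placeholder, and team_id_pairs is a name chosen for this example.

```shell
# Hypothetical helper: prints "slug id" for each team slug given after the org.
# Each lookup calls GET /orgs/{org}/teams/{team_slug} and extracts the "id" field.
team_id_pairs() {
  org=$1
  shift
  for slug in "$@"; do
    id=$(gh api "orgs/$org/teams/$slug" --jq '.id') || return 1
    printf '%s %s\n' "$slug" "$id"
  done
}

# Usage (my-org is a placeholder):
# team_id_pairs my-org devops backend frontend
```

The function stops at the first failed lookup, so a mistyped slug surfaces immediately instead of producing an incomplete list.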

What this does:

The import block tells Terraform:

"These resources already exist in the real world. Do not create them. Just attach them to this state file."

When you run terraform plan after adding imports, the summary should report imports and nothing else:

Plan: 3 to import, 0 to add, 0 to change, 0 to destroy.

If you see adds, changes, or destroys, review them carefully. Minor drift is normal (like description formatting), but structural changes mean something is wrong.
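The check on the plan summary can be automated so that a stray create or destroy stops the run. This is a minimal sketch, assuming the plan is run with -no-color and that its summary line starts with "Plan:" and lists counts as "N to add", "N to change", "N to destroy". The function name plan_is_import_only is made up here.

```shell
# Hypothetical helper: reads plan output on stdin and fails when the summary
# line reports anything to add, change, or destroy.
plan_is_import_only() {
  summary=$(grep '^Plan:' | tail -n 1)
  if [ -z "$summary" ]; then
    echo "no plan summary found" >&2
    return 2
  fi
  for action in add change destroy; do
    # Pull the number that precedes "to <action>" out of the summary line.
    n=$(printf '%s\n' "$summary" | sed -n "s/.* \([0-9][0-9]*\) to $action.*/\1/p")
    if [ "${n:-0}" -ne 0 ]; then
      echo "unexpected: $n to $action" >&2
      return 1
    fi
  done
  return 0
}

# Usage:
# terraform plan -no-color | plan_is_import_only
```

Parsing human-readable output is fragile, so treat this as a guard rail on top of reading the plan yourself, not a replacement for it.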

Step 3: Remove Resources from the Old Root Safely

Now we need to tell the old root module to stop managing these resources without destroying them.

I created a remove.tf file in the old root:

# remove.tf (in old root)

removed {
  from = github_team.teams

  lifecycle {
    destroy = false
  }
}

removed {
  from = github_team_members.members

  lifecycle {
    destroy = false
  }
}

What this does:

The removed block with destroy = false tells Terraform:

"Stop tracking these resources in this state file. But do NOT delete them from the real world."

This is the critical piece. With destroy = true, or with the resource blocks simply deleted and no removed block in place, Terraform would delete your teams when you apply.

Step 4: Apply the Migration

Now we apply in the correct order.

First, apply the new module:

cd teams/
terraform plan    # Should show imports, no creates
terraform apply   # Imports resources into new state

Then, apply the old root:

cd ..
terraform plan    # Should show removals, no destroys
terraform apply   # Removes resources from old state

The result:

Old state: resources removed (not destroyed)
New state: resources now tracked
Real world: nothing changed
No downtime. No recreation. No deletion.
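The result can be confirmed from the state files themselves rather than taken on trust. This is a minimal sketch built on terraform state list. The function name verify_split and its arguments are assumptions made for this example.

```shell
# Hypothetical helper: checks that addresses containing $3 are gone from the
# state of the configuration in directory $1 and present in the one in $2.
verify_split() {
  old_dir=$1
  new_dir=$2
  address=$3
  if terraform -chdir="$old_dir" state list | grep -F -- "$address" >/dev/null; then
    echo "$address is still tracked in $old_dir" >&2
    return 1
  fi
  if ! terraform -chdir="$new_dir" state list | grep -F -- "$address" >/dev/null; then
    echo "$address is missing from $new_dir" >&2
    return 1
  fi
  echo "$address moved from $old_dir to $new_dir"
}

# Usage, run from github-management/:
# verify_split . teams github_team.teams
# verify_split . teams github_team_members.members
```

This only proves which state tracks the resources. Checking that the teams still exist in GitHub is a separate step.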

Step 5: Clean Up

After successful migration, delete the temporary files:

rm teams/import.tf
rm remove.tf

Why clean up?

Import blocks are one-time operations. Once the resource is in state, the import block does nothing.
Removed blocks are only needed during transition. Keeping them adds confusion.

Your final structure should look like:

github-management/
  main.tf                  # remaining resources only
  terraform.tfstate        # smaller, focused state
  teams/
    main.tf                # team resources
    terraform.tfstate      # isolated teams state
  members/
    main.tf                # future migration
    terraform.tfstate      # isolated members state

Common Pitfalls to Avoid

1. Applying in the wrong order

If you apply the old root removal before importing into the new module, the resources are tracked by no state at all until the import succeeds. Always import first.

2. Forgetting `destroy = false`

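This is the most dangerous mistake. The destroy argument decides what happens to the real resource, and deleting the resource block with no removed block at all is just as destructive, because Terraform then plans a destroy.

# DANGEROUS - will delete resources
removed {
  from = github_team.teams

  lifecycle {
    destroy = true
  }
}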

# SAFE - keeps resources alive
removed {
  from = github_team.teams

  lifecycle {
    destroy = false
  }
}

3. Missing resource IDs

If you import with the wrong ID, Terraform will either fail or attach to the wrong resource. Double check every ID before applying.

4. Not backing up state

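Always run terraform state pull > backup.tfstate before starting. If something goes wrong, you can restore with terraform state push backup.tfstate. Terraform may refuse a push whose serial is older than the current state; the -force flag overrides that check, so use it deliberately.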
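A scripted backup also makes it easy to confirm that the file actually contains something. This is a minimal sketch; backup_state and the file naming are choices made for this example.

```shell
# Hypothetical helper: pulls the state of the configuration in directory $1
# into file $2 (default: a timestamped name in the current directory).
backup_state() {
  dir=$1
  out=${2:-backup-$(date +%Y%m%d-%H%M%S).tfstate}
  terraform -chdir="$dir" state pull > "$out" || return 1
  # An empty file is not a backup; fail loudly instead.
  if [ ! -s "$out" ]; then
    echo "backup $out is empty" >&2
    return 1
  fi
  echo "$out"
}

# Usage, run from github-management/ before touching anything:
# backup_state . root-backup.tfstate
```

The output path is resolved by the shell, not by Terraform, so a relative name lands in the directory you ran the function from.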

5. Rushing the migration

This is not a task to do on a Friday afternoon. Take your time. Verify each step. Run terraform plan obsessively.

Troubleshooting

"Resource already exists" error

This means you tried to create without importing first. Add the import block and try again.

Plan shows unexpected changes after import

Some drift is normal. Review carefully:

Safe drift: formatting differences, computed defaults
Dangerous drift: structural changes, missing attributes

If you see dangerous drift, investigate before applying.

"Resource not found" during import

The resource ID is wrong or the resource was deleted. Verify the ID exists in your provider (GitHub, AWS, etc.) before importing.

State file locked

Someone else is running Terraform, or a previous run crashed. Wait for the lock to release or manually unlock (carefully) with terraform force-unlock <LOCK_ID>.

Key Takeaways

Never split state without a migration plan. The import and removed blocks are your safety net.
Import first, remove second. Order matters.
Always use destroy = false in removed blocks. Unless you actually want to delete resources.
Back up your state before starting. Every time.
Take your time. A careful migration takes hours. Fixing a broken one takes days.

Final Thoughts

This approach took time. But now changes are cleaner and safer. Each module can be updated independently. Reviews are focused. Risk is contained.

I am sure there are other ways to handle this. The terraform state mv command can also work. Some teams use Terragrunt for state management. Others use workspaces.

If you have done something similar, I would love to hear how you approached it.

What is your go-to method for splitting Terraform state?

Save this before your next Terraform refactor.
