Bringing unmanaged AWS infrastructure under Terraform control—the classic 'Brownfield Migration'—is one of the most deceptive challenges in DevOps.
On the surface, it looks like a simple scripting task: just wrap terraform import and loop through resources. However, based on my experience navigating these migrations in complex environments, this naive approach almost always fails.
The impedance mismatch between AWS and Terraform creates two distinct classes of problems that standard tools miss: Hidden API Defaults and Graph Cycles.
Here is a technical breakdown of what went wrong and how we solved it using graph theory and strict schema mapping.
Trap 1: The root_block_device & The "Silent" Replacement
The task was simple: Import 34 production EC2 instances.
After generating the HCL code and running terraform plan, I expected a clean state.
Instead, I got this:
Plan: 34 to add, 0 to change, 34 to destroy.
# aws_instance.prod_web_01 must be replaced
Every single production instance was flagged for replacement.
The Investigation
Rule #1 of IaC is "Always read the plan." But when the plan says "replace," you need to know why.
The diff looked like this:
-/+ resource "aws_instance" "prod_web_01" {
~ id = "i-0abc123def456" -> (known after apply)
- root_block_device {
- volume_size = 100 # Actual Prod State
- volume_type = "gp3"
- device_name = "/dev/xvda"
}
+ root_block_device {
+ volume_size = 8 # AMI Default
+ volume_type = "gp2"
}
}
The Root Cause: Read-Only vs. Writable Attributes
The issue wasn't just "I forgot to declare values." It was a conflict between the AWS API and the Terraform Schema.
- AWS API Reality: When you query an instance, AWS returns everything, including the DeviceName (e.g., /dev/xvda).
- Terraform Schema: The device_name inside root_block_device is a Computed (Read-Only) attribute. You cannot set it.
If you blindly map the API response to HCL, Terraform errors out because you're trying to set a read-only field.
If you omit the block entirely (thinking "it already exists"), Terraform assumes you want the AMI defaults (often 8GB gp2).
Because AWS cannot shrink a 100GB volume to 8GB in-place, Terraform's only option is to destroy and recreate the instance.
The Fix: Surgical Mapping
You can't just dump the API response. You have to filter it through a logic layer that understands the Terraform provider's quirks:
# Pseudo-code for the fix
def transform_root_block_device(api_response):
    ebs = api_response.get('Ebs', {})
    volume_type = ebs.get('VolumeType', 'gp2')
    result = {
        # Keep writable attributes
        'volume_size': ebs.get('VolumeSize'),
        'volume_type': volume_type,
        'delete_on_termination': ebs.get('DeleteOnTermination'),
        'encrypted': ebs.get('Encrypted'),
    }
    # Filter out Read-Only attributes that cause errors:
    #   - device_name
    #   - volume_id
    return result
This ensures the generated code matches the actual state of the disk without triggering schema violations.
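As a quick sanity check, here is the filter applied to a hypothetical payload shaped the way the pseudo-code above expects (the values are illustrative, not from a real account):

api_response = {
    'DeviceName': '/dev/xvda',                    # read-only -> dropped
    'Ebs': {
        'VolumeId': 'vol-0abc123def4567890',      # read-only -> dropped
        'VolumeSize': 100,
        'VolumeType': 'gp3',
        'DeleteOnTermination': True,
        'Encrypted': True,
    },
}

print(transform_root_block_device(api_response))
# {'volume_size': 100, 'volume_type': 'gp3',
#  'delete_on_termination': True, 'encrypted': True}

The generated root_block_device block then carries the real 100GB gp3 settings, and the plan stops trying to replace the instance.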
Trap 2: The Cycle (Graph Theory vs. AWS Reality)
If the first trap was a configuration error, the second was a fundamental structural conflict.
Terraform requires a Directed Acyclic Graph (DAG). AWS allows cycles.
The Deadlock
The most common culprit is Security Groups. Imagine two microservices:
- SG-App allows outbound traffic to SG-DB
- SG-DB allows inbound traffic from SG-App
If you write this with inline rules (which is what most import and config-generation tools emit by default), you create a cycle:
resource "aws_security_group" "app" {
egress {
security_groups = [aws_security_group.db.id] # Needs DB's ID
}
}
resource "aws_security_group" "db" {
ingress {
security_groups = [aws_security_group.app.id] # Needs App's ID
}
}
Terraform cannot apply this. It can't create app without db's ID, and vice versa.
Visualizing the Problem
In a healthy Terraform config, dependencies flow one way:
[VPC] --> [Subnet] --> [EC2]
But Security Groups often form cycles (Strongly Connected Components):
   ┌────────────────┐
   ▼                │
[SG-App]         [SG-DB]
   │                ▲
   └────────────────┘
The Solution: Tarjan's Algorithm & "Shell & Fill"
When building RepliMap (the tool I wrote to automate this), I realized we couldn't just export resources one by one. We had to model the entire AWS account as a graph using NetworkX.
We use Tarjan's algorithm to detect Strongly Connected Components (SCCs)—the "knots" in the graph.
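Here is a minimal sketch of that detection step, assuming the account has already been flattened into a NetworkX DiGraph where every node is a Terraform address and every edge is a reference (the resource names are illustrative):

import networkx as nx

# Illustrative dependency graph: nodes are Terraform addresses,
# edges point from a resource to something it references.
graph = nx.DiGraph()
graph.add_edge("aws_security_group.app", "aws_security_group.db")  # app egress -> db
graph.add_edge("aws_security_group.db", "aws_security_group.app")  # db ingress <- app
graph.add_edge("aws_instance.web", "aws_security_group.app")       # instance uses app SG

# strongly_connected_components() is a Tarjan-based SCC search.
# Any component with more than one node is a knot Terraform cannot order.
knots = [scc for scc in nx.strongly_connected_components(graph) if len(scc) > 1]
print([sorted(scc) for scc in knots])
# [['aws_security_group.app', 'aws_security_group.db']]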
Once a cycle is detected, we use a "Shell & Fill" strategy to break it:
1. Create Empty Shells: Generate the Security Groups with no rules. Terraform can create these instantly because they have no dependencies.
2. Fill with Rules: Extract the rules into separate aws_security_group_rule resources. These reference the IDs of the shells created in step 1.
Step 1: Create Shells (No Dependencies)
[SG-App (empty)]        [SG-DB (empty)]

Step 2: Create Rules (Reference Shells)
[SG-App (empty)]        [SG-DB (empty)]
        ▲                      ▲
        │                      │
[Rule: egress->DB]    [Rule: ingress<-App]
The graph is now acyclic, and Terraform is happy.
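For illustration, a generator for this split could look roughly like the sketch below. It is not RepliMap's actual code, and var.vpc_id plus the rule fields are assumptions about what the scanner would have collected; it only shows the shape of the output: a bare aws_security_group shell plus standalone aws_security_group_rule resources that reference shells by ID.

# Illustrative "Shell & Fill" emitter (not RepliMap's real implementation).
def emit_shell_and_fill(sg_name, rules):
    # Step 1: the shell -- a security group with no inline rules.
    blocks = [
        f'resource "aws_security_group" "{sg_name}" {{\n'
        f'  name   = "{sg_name}"\n'
        f'  vpc_id = var.vpc_id  # assumed variable\n'
        f'}}\n'
    ]
    # Step 2: every rule becomes a standalone aws_security_group_rule,
    # so no security group block references another security group directly.
    for i, rule in enumerate(rules):
        blocks.append(
            f'resource "aws_security_group_rule" "{sg_name}_{i}" {{\n'
            f'  type                     = "{rule["type"]}"\n'
            f'  from_port                = {rule["from_port"]}\n'
            f'  to_port                  = {rule["to_port"]}\n'
            f'  protocol                 = "{rule["protocol"]}"\n'
            f'  security_group_id        = aws_security_group.{sg_name}.id\n'
            f'  source_security_group_id = aws_security_group.{rule["peer"]}.id\n'
            f'}}\n'
        )
    return "\n".join(blocks)

print(emit_shell_and_fill("app", [
    {"type": "egress", "from_port": 5432, "to_port": 5432, "protocol": "tcp", "peer": "db"},
]))

Running the same emitter for the db group produces its shell and the matching ingress rule; both rule resources depend on both shells, but neither shell depends on the other, so the dependency graph stays acyclic.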
Conclusion
Tools like terraform import or Terraformer are great starting points, but they often act as simple API-to-HCL dumpers. They don't always account for:
- Implicit Defaults: Where missing config != existing state.
- Graph Topology: Where valid AWS states are invalid Terraform states.
For small projects, you can fix these manually. For brownfield migrations with 2,000+ resources, you need a deterministic engine to handle the translation.
I've open-sourced the documentation and the read-only IAM policies for the engine we built to solve this. If you're interested in the edge cases of AWS imports, check it out:
RepliMap / replimap-community: Reverse-engineer AWS infrastructure into production-ready Terraform. Visualize dependencies, detect drift, estimate costs.
💬 Join the discussion
Interested in the graph theory aspect? We're discussing the Tarjan implementation and edge cases over on Hacker News.