Reverse Engineering Existing Cloud Infrastructure into Terraform
You inherit a cloud environment with dozens of manually created resources. Or your team started with the console and ClickOps before adopting Infrastructure as Code. Or an acquisition brings in infrastructure that nobody documented. The resources exist and are running production workloads, but there is no Terraform, no CloudFormation, no code at all. Just resources in the cloud and tribal knowledge about how they connect.
This is one of the most common challenges in infrastructure management, and solving it systematically is the difference between months of painful manual work and a structured approach that gets you to a maintainable state.
Why This Matters
Unmanaged infrastructure is a liability. Without code, you cannot reliably reproduce environments, track changes, review modifications before they happen, or recover from disasters. Every change is a manual operation that might work or might break something unexpected.
The goal is not just to generate Terraform files. The goal is to bring existing infrastructure under management so that future changes go through a proper workflow: code review, plan, apply.
The Manual Approach
The straightforward method is to write Terraform configurations by hand based on what you see in the console, then use terraform import to bring each resource under Terraform management.
For a single EC2 instance:
resource "aws_instance" "web_server" {
ami = "ami-0123456789abcdef0"
instance_type = "t3.medium"
subnet_id = "subnet-0123456789abcdef0"
tags = {
Name = "web-server"
}
}
terraform import aws_instance.web_server i-0123456789abcdef0
After import, run terraform plan to see what differs between your written configuration and the actual resource. Adjust your configuration until the plan shows no changes.
This works for small environments. For anything beyond a handful of resources, it becomes impractical. An environment with 200 resources means 200 import commands and 200 configuration blocks written by hand, each requiring iteration until the plan is clean.
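Recent Terraform versions reduce some of this toil. Since Terraform 1.5, import blocks let you declare imports in code, and Terraform can draft matching configuration during a plan. A minimal sketch, using the same example instance:
# imports.tf (Terraform 1.5+): declare the import instead of running a command
import {
  to = aws_instance.web_server
  id = "i-0123456789abcdef0"
}

# Then ask Terraform to generate a first-pass configuration:
#   terraform plan -generate-config-out=generated.tf
The generated file needs the same cleanup as hand-written configuration, but it removes the one-command-per-resource import step.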
Automating Discovery
Cloud providers offer APIs to list and describe resources. The first step in any automated approach is discovering what exists.
For AWS, you can use the CLI to enumerate resources:
# List all EC2 instances
aws ec2 describe-instances --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,Tags[?Key==`Name`].Value|[0]]' --output table
# List all S3 buckets
aws s3api list-buckets --query 'Buckets[*].Name' --output table
# List all RDS instances
aws rds describe-db-instances --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceClass,Engine]' --output table
For Azure:
# List all resources in a subscription
az resource list --output table
# List all VMs
az vm list --output table
This gives you an inventory but not Terraform code. The next step is translating resource descriptions into HCL.
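A small script can bridge part of that gap by turning inventory output into import commands. A rough sketch, assuming every instance carries a Name tag that maps cleanly onto a Terraform resource name (the hyphen-to-underscore rule is illustrative):
#!/usr/bin/env bash
# Emit one terraform import command per EC2 instance,
# deriving the Terraform resource name from the Name tag.
aws ec2 describe-instances \
  --query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value|[0]]' \
  --output text |
while read -r instance_id name; do
  echo "terraform import aws_instance.${name//-/_} ${instance_id}"
done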
Existing Tools
Several tools attempt to automate Terraform generation from existing infrastructure.
Terraformer (by Google) supports multiple providers and can generate both Terraform configurations and state files:
terraformer import aws --resources=ec2_instance,s3,rds --regions=eu-west-1
It produces working code but often with hardcoded values and minimal structure. You will spend time refactoring the output into something maintainable.
Former2 is a web-based tool for AWS that generates CloudFormation or Terraform from your existing resources. It runs in the browser and uses your AWS credentials to scan resources.
Azure Export for Terraform is Microsoft's official tool for Azure:
aztfexport resource-group my-resource-group
These tools help but have limitations. They generate code that technically works but rarely matches how you would structure a production codebase. Resources are often exported with hardcoded literal IDs rather than references between resources, which makes the code brittle.
A Practical Workflow
Rather than expecting tools to produce perfect output, treat them as accelerators in a multi-step process.
Step 1: Inventory and categorize
List everything in the environment and categorize by type and criticality. Not everything needs to be in Terraform immediately. Start with the core infrastructure: networking, compute, databases. Leave ephemeral resources for later; instances launched by an auto-scaling group, for example, belong to the group rather than to Terraform directly.
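On AWS, the Resource Groups Tagging API gives a quick cross-service starting point for this inventory, since it returns most taggable resources in a single call:
# List ARNs for all taggable resources in the current region
aws resourcegroupstaggingapi get-resources \
  --query 'ResourceTagMappingList[*].ResourceARN' \
  --output table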
Step 2: Generate initial code
Use automated tools to generate a first pass at the Terraform code. Do not expect this to be production-ready.
Step 3: Refactor and structure
Reorganize the generated code into a logical module structure. Replace hardcoded IDs with references where resources depend on each other. Extract common patterns into reusable modules.
Before:
resource "aws_instance" "web" {
subnet_id = "subnet-0123456789abcdef0"
vpc_security_group_ids = ["sg-0123456789abcdef0"]
}
After:
resource "aws_instance" "web" {
subnet_id = aws_subnet.public.id
vpc_security_group_ids = [aws_security_group.web.id]
}
Step 4: Import and validate
Import resources into Terraform state and run plans until they show no changes. This validates that your code accurately represents the existing infrastructure.
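Batches can be driven from a simple mapping of Terraform addresses to cloud IDs. A minimal sketch, assuming a hypothetical imports.csv with one address,id pair per line:
#!/usr/bin/env bash
# imports.csv (hypothetical), e.g.:
#   aws_instance.web,i-0123456789abcdef0
#   aws_s3_bucket.assets,my-assets-bucket
while IFS=, read -r address id; do
  terraform import "$address" "$id"
done < imports.csv

# The batch is validated when this shows no changes
terraform plan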
Step 5: Iterate
Bring more resources under management in batches. Each batch goes through the same generate-refactor-import-validate cycle.
Handling State
Terraform state is where imported resources are tracked. For team environments, state must be stored remotely:
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "infrastructure/terraform.tfstate"
    region = "eu-west-1"
  }
}
When importing into existing state, be careful not to overwrite resources that are already managed. Use terraform state list to see what is currently tracked before importing new resources.
For large environments, consider splitting state into multiple workspaces or separate state files per component (networking, compute, databases). This reduces blast radius and improves plan performance.
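Splitting by component can be as simple as separate root modules, each pointing at its own state key in the same bucket. A sketch following the backend above:
# networking/backend.tf
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "networking/terraform.tfstate"
    region = "eu-west-1"
  }
}

# compute/ and databases/ use their own keys in the same bucket
Components can then consume each other's outputs through the terraform_remote_state data source instead of duplicating literal IDs.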
Common Challenges
Circular dependencies: Generated code often has circular references that Terraform cannot resolve. You may need to break these by using data sources for initial reads:
data "aws_security_group" "existing" {
id = "sg-0123456789abcdef0"
}
resource "aws_instance" "web" {
vpc_security_group_ids = [data.aws_security_group.existing.id]
}
Later, once both resources are under management, refactor to direct references.
Drift detection: The real infrastructure may have drifted from what was originally created. Document these differences and decide whether to accept the current state or plan changes to align with intended configuration.
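Terraform can surface drift directly: in current versions, a refresh-only plan compares recorded state against real infrastructure without proposing any configuration changes:
# Show drift between state and reality without planning changes
terraform plan -refresh-only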
Sensitive values: Some resources contain sensitive data (database passwords, API keys) that you do not want in plain text in your Terraform code. Use variables with sensitive flags and pull values from secret managers:
variable "db_password" {
type = string
sensitive = true
}
data "aws_secretsmanager_secret_version" "db_password" {
secret_id = "prod/database/password"
}
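The fetched value can then replace any literal in the resource. A sketch, assuming a hypothetical RDS instance that this secret belongs to:
resource "aws_db_instance" "main" {
  # ... engine, instance_class, and other arguments omitted ...
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}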
Provider version compatibility: Generated code may use syntax or resources that differ across provider versions. Pin your provider versions and validate generated code against those versions.
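Pinning lives in the terraform block; the versions below are illustrative:
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}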
Building a Sustainable Practice
Reverse engineering is a one-time effort to get existing infrastructure under management. The larger goal is ensuring new infrastructure is created through Terraform from the start.
Establish guardrails:
- CI/CD pipelines that run terraform plan on pull requests
- Policy checks (using tools like OPA or Sentinel) that enforce standards
- Tagging requirements that identify resource ownership
- Regular drift detection to catch manual changes (see the sketch below)
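For the drift detection guardrail, terraform plan's detailed exit code makes a scheduled check straightforward. A minimal sketch for a cron job or CI step:
#!/usr/bin/env bash
# -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending (drift)
terraform plan -detailed-exitcode -input=false
status=$?
if [ "$status" -eq 2 ]; then
  echo "Drift detected: infrastructure no longer matches code"
  # hook in your notification of choice here (Slack webhook, ticket, etc.)
fi
exit "$status"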
The effort invested in bringing legacy infrastructure under code management pays dividends in reliability, auditability, and team velocity. The first import is always the hardest. Each subsequent batch gets easier as patterns emerge and tooling improves.
Conclusion
Inheriting unmanaged cloud infrastructure is frustrating but solvable. Use automated tools to accelerate discovery and initial code generation, but expect to invest time in refactoring and validation. Approach the work in batches, starting with critical infrastructure and expanding coverage over time.
The end state is an infrastructure codebase that accurately represents what is running, enables safe changes through standard workflows, and serves as living documentation of your environment. That is worth the effort.