Mary Mutua
The Importance of Manual Testing in Terraform

Day 17 of my Terraform journey focused on something that sounds simple, but is actually a big part of real infrastructure work: manual testing.

In Chapter 9 of Terraform: Up & Running, Yevgeniy Brikman makes a strong case that manual testing is not something you outgrow once automated testing enters the picture. Instead, it is often the step that teaches you what should be automated, what success really looks like, and what can go wrong in the real world.

The biggest lesson from today was this:

manual testing is not the opposite of automated testing.

It is often the step that makes good automated testing possible.

Before you can automate a test well, you need to know:

  • what you are testing
  • what success looks like
  • what failure looks like
  • how the infrastructure behaves in real life
  • how to clean it up safely afterward

For Day 17, I built a structured manual testing process for my Terraform webserver cluster, ran it against both dev and production environments, documented the results, and treated cleanup as part of the test itself.

GitHub reference:
👉 Github Link

Why Manual Testing Still Matters

As Yevgeniy Brikman explains in Chapter 9, automated tests are powerful, but they do not replace the need to first understand the behavior of your infrastructure. Manual testing helps you build that understanding.

Automated tests are excellent for:

  • repeatability
  • regression detection
  • CI/CD confidence
  • faster feedback on future changes

But manual testing is still valuable because it helps you discover things automated tests do not always reveal immediately, such as:

  • confusing outputs
  • weak defaults
  • cloud-provider quirks
  • slow readiness times
  • cleanup surprises
  • differences between environments
  • gaps in documentation

Manual testing teaches you what actually matters to verify.

That is why I now see it as the foundation for better automation, not as something temporary or optional.

What a Structured Manual Test Should Cover

A manual test should be more than just “run terraform apply and click around.”

For Day 17, I organized my checklist into these categories.

1. Provisioning Verification

This is the Terraform layer.

Questions:

  • Did terraform init complete without errors?
  • Did terraform validate pass cleanly?
  • Did terraform plan show the expected resources?
  • Did terraform apply complete successfully?

This confirms that the configuration is valid and deployable.
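The four checks above can be run as one sequence that stops at the first failure. This is a minimal sketch, assuming it is run from one of the environment directories; the flag choices (non-interactive input, a saved plan file) are my own habits, not something the chapter prescribes:

```shell
#!/usr/bin/env bash
# Provisioning verification: run each Terraform step, stopping at the first failure.
# Run from an environment directory, e.g. day_17/environments/dev (path assumed).
provision_check() {
  terraform init -input=false &&
  terraform validate &&
  terraform plan -input=false -out=tfplan &&   # save the plan so apply matches it exactly
  terraform apply -input=false tfplan          # apply exactly what the plan showed
}
```

Saving the plan to a file and applying that file means the apply cannot drift from what you reviewed.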

2. Resource Correctness

This is the “did Terraform create what I expected?” layer.

Questions:

  • Are the expected AWS resources present?
  • Do names, tags, and regions match the variables?
  • Are security group rules exactly what I intended?

This is important because infrastructure can be “working” while still being incorrectly shaped.
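To answer those questions without clicking through the console, a couple of AWS CLI spot checks can help. The tag keys and values below (`Environment`, `ManagedBy`) are assumptions for illustration; match them to whatever tags your modules actually set:

```shell
#!/usr/bin/env bash
# Resource-correctness spot checks (tag names are assumptions, adjust to your modules).
verify_resources() {
  # Running instances that carry the expected environment tag, with their tags visible
  aws ec2 describe-instances \
    --filters "Name=tag:Environment,Values=dev" \
              "Name=instance-state-name,Values=running" \
    --query "Reservations[*].Instances[*].[InstanceId,Tags]" \
    --output table

  # The actual ingress rules on the Terraform-managed security groups
  aws ec2 describe-security-groups \
    --filters "Name=tag:ManagedBy,Values=terraform" \
    --query "SecurityGroups[*].[GroupName,IpPermissions]" \
    --output json
}
```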

3. Functional Verification

This is the behavior layer.

Questions:

  • Does the ALB DNS name resolve?
  • Does curl return the expected response?
  • Are instances healthy behind the load balancer?
  • Does the deployed environment behave the way the service is supposed to?

This is the part that tells you whether the deployed system is actually usable.
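A sketch of that behavior check, assuming the module exposes the load balancer address as an output named `alb_dns_name` (adjust to your own output names). The retry loop matters because ALB targets often need a minute or two to pass health checks after apply:

```shell
#!/usr/bin/env bash
# Functional verification: hit the ALB until it answers, or give up after ~5 minutes.
# Output name (alb_dns_name) and response text are assumptions from this setup.
check_alb() {
  local alb_dns
  alb_dns="$(terraform output -raw alb_dns_name)"

  local i
  for i in $(seq 1 30); do
    if curl -fsS --max-time 10 "http://${alb_dns}/" | grep -q "Hello from Day 17"; then
      echo "Functional check passed"
      return 0
    fi
    sleep 10    # targets may still be registering with the load balancer
  done
  echo "Functional check failed after retries" >&2
  return 1
}
```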

4. State Consistency

This checks whether Terraform and reality still match.

Questions:

  • Does terraform plan return “No changes” after apply?
  • Does the state reflect what exists in AWS?

This is important because drift or hidden mismatches are easy to miss otherwise.
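Terraform has a flag made for exactly this check: `terraform plan -detailed-exitcode` reports the result in the exit code itself, which is handy whether you run it by hand or later automate it. A sketch:

```shell
#!/usr/bin/env bash
# State-consistency check using plan's exit code rather than eyeballing the output.
check_no_drift() {
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending
  terraform plan -detailed-exitcode -input=false >/dev/null
  local rc=$?
  case "$rc" in
    0) echo "State matches reality: no changes" ;;
    2) echo "Drift detected: plan is not clean" >&2; return 1 ;;
    *) echo "terraform plan itself failed" >&2; return 1 ;;
  esac
}
```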

5. Regression Check

This tests whether a small change behaves predictably.

Questions:

  • If I change one small thing, does Terraform show only that change?
  • After applying it, does terraform plan return clean again?

This helps prove the configuration is stable and understandable.
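As a sketch of that loop, using a hypothetical `server_text` variable (any single small input works): change one value, confirm the plan touches only the resources that depend on it, apply, and confirm the plan goes clean again:

```shell
#!/usr/bin/env bash
# Regression check: one small change should produce one small, predictable diff.
# The server_text variable name is a hypothetical stand-in for any small input.
regression_check() {
  # 1. Review: the plan should show ONLY the change driven by server_text
  terraform plan -var "server_text=Hello v2" -input=false &&

  # 2. Apply the small change
  terraform apply -var "server_text=Hello v2" -input=false -auto-approve &&

  # 3. The plan should now be clean again (exit 0 with -detailed-exitcode)
  terraform plan -var "server_text=Hello v2" -detailed-exitcode -input=false >/dev/null &&
  echo "Regression check passed: plan is clean again"
}
```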

6. Cleanup

This is the most overlooked part of manual testing.

Questions:

  • Did terraform plan -destroy look correct?
  • Did terraform destroy succeed?
  • Were any active resources left behind afterward?

Cleanup is not just cost control.

It is part of whether the test process itself is trustworthy.
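A sketch of cleanup as its own verified step: review the destroy plan, destroy, then query for leftovers, filtering to active instance states only (the `ManagedBy` tag is an assumption; use whatever tag your modules apply):

```shell
#!/usr/bin/env bash
# Cleanup plus verification: the test is not done until this returns empty.
cleanup_and_verify() {
  terraform plan -destroy -input=false &&        # review what will be removed first
  terraform destroy -input=false -auto-approve &&

  # Query only ACTIVE states; terminated instances linger in unfiltered results
  aws ec2 describe-instances \
    --filters "Name=tag:ManagedBy,Values=terraform" \
              "Name=instance-state-name,Values=pending,running,stopping,stopped" \
    --query "Reservations[*].Instances[*].InstanceId" \
    --output text
}
```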

Provisioning Verification vs Functional Verification

This difference became much clearer to me today.

Provisioning verification asks:

“Did Terraform successfully create the infrastructure?”

Examples:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply

Functional verification asks:

“Does the infrastructure actually do what it is supposed to do?”

Examples:

  • opening the ALB URL
  • running curl
  • checking the expected app response
  • checking whether health checks pass

You need both.

A Terraform apply can succeed while the application is still broken.

And an app can look reachable while the Terraform state or resource shape is still wrong.

That is why a complete manual test must cover both layers.

The Day 17 Test Environments

To make the tests more realistic, I ran the manual test process against two environments:

  • day_17/environments/dev
  • day_17/environments/production

Both used the Day 17 reusable modules, but with different environment settings.

That was useful because it let me compare whether behavior changed between:

  • dev
  • production

That kind of comparison matters, because many real issues only show up when environments differ in:

  • instance count
  • naming
  • defaults
  • scaling assumptions

My Actual Day 17 Results

Here is what I found.

Dev environment

Passes:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply
  • terraform output
  • browser check showed Hello from Day 17 Dev
  • terraform plan returned clean after apply
  • terraform plan -destroy showed the expected destroy plan
  • terraform destroy completed successfully

One useful cleanup nuance:
after destroy, a broad EC2 query still showed an instance ID. At first glance, that looked like a failed cleanup.

But the real issue was the query, not Terraform: terminated instances remain visible in describe-instances results for a while after they are destroyed.

A better command filtered to active instance states only:

```shell
aws ec2 describe-instances \
  --filters "Name=tag:ManagedBy,Values=terraform" \
            "Name=instance-state-name,Values=pending,running,stopping,stopped" \
  --query "Reservations[*].Instances[*].InstanceId"
```

That returned:

```
[]
```

So destroy had actually worked.

That was a great reminder that verifying cleanup also needs the right query logic.

Production environment

Passes:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply
  • terraform output
  • browser and curl checks returned Hello from Day 17 Production
  • terraform plan returned clean after apply
  • terraform plan -destroy looked correct
  • terraform destroy completed successfully
  • post-destroy verification returned clean results for active resources

That told me the same module structure behaved correctly in both environments.

Why Cleanup Discipline Matters So Much

This was one of the biggest takeaways of the day.

As Brikman emphasizes in this chapter, a test is not really complete if you do not know how to tear the infrastructure down and verify that cleanup worked.

If you test infrastructure in AWS without strong cleanup habits, you can end up with:

  • running EC2 instances
  • load balancers
  • security groups
  • alarms
  • orphaned resources
  • unexpected costs

So a manual test is not complete when the app loads once in the browser.

A manual test is complete when:

  • the infrastructure worked
  • the results were recorded
  • the cleanup was run
  • the cleanup was verified

That is the standard I want to keep carrying forward.

What Manual Testing Gives You That Automation Alone Cannot

Day 17 helped me see that manual testing gives a different kind of value than automation.

Manual testing gives you:

  • context
  • observation
  • understanding
  • discovery

Automation gives you:

  • repetition
  • speed
  • consistency
  • regression protection

That means manual testing helps answer:

  • what should I automate next?
  • what actually matters to check?
  • what can go wrong in the real cloud environment?

That is why manual testing is still important, even when Terratest and other automation exist.

My Main Takeaway

The biggest lesson from Day 17 was this:

good infrastructure testing is not just about proving that Terraform runs.

It is about proving that the system behaves correctly, stays consistent, and can be cleaned up safely.

Manual testing helped me define that process clearly.

And once that process is clear, automated testing becomes much more meaningful.

Full Code

GitHub reference:
👉 Github Link

Follow My Journey

This is Day 17 of my 30-Day Terraform Challenge.

See you on Day 18 🚀
