Mary Mutua
The Importance of Manual Testing in Terraform

Day 17 of my Terraform journey focused on something that sounds simple, but is actually a big part of real infrastructure work: manual testing.

In Chapter 9 of Terraform: Up & Running, Yevgeniy Brikman makes a strong case that manual testing is not something you outgrow once automated testing enters the picture. Instead, it is often the step that teaches you what should be automated, what success really looks like, and what can go wrong in the real world.

The biggest lesson from today was this:

manual testing is not the opposite of automated testing.

It is often the step that makes good automated testing possible.

Before you can automate a test well, you need to know:

  • what you are testing
  • what success looks like
  • what failure looks like
  • how the infrastructure behaves in real life
  • how to clean it up safely afterward

For Day 17, I built a structured manual testing process for my Terraform webserver cluster, ran it against both dev and production environments, documented the results, and treated cleanup as part of the test itself.

GitHub reference:
👉 Github Link

Why Manual Testing Still Matters

As Yevgeniy Brikman explains in Chapter 9, automated tests are powerful, but they do not replace the need to first understand the behavior of your infrastructure. Manual testing helps you build that understanding.

Automated tests are excellent for:

  • repeatability
  • regression detection
  • CI/CD confidence
  • faster feedback on future changes

But manual testing is still valuable because it helps you discover things automated tests do not always reveal immediately, such as:

  • confusing outputs
  • weak defaults
  • cloud-provider quirks
  • slow readiness times
  • cleanup surprises
  • differences between environments
  • gaps in documentation

Manual testing teaches you what actually matters to verify.

That is why I now see it as the foundation for better automation, not as something temporary or optional.

What a Structured Manual Test Should Cover

A manual test should be more than just “run terraform apply and click around.”

For Day 17, I organized my checklist into these categories.

1. Provisioning Verification

This is the Terraform layer.

Questions:

  • Did terraform init complete without errors?
  • Did terraform validate pass cleanly?
  • Did terraform plan show the expected resources?
  • Did terraform apply complete successfully?

This confirms that the configuration is valid and deployable.
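The four checks above can be run as one sequence that stops at the first failure. This is a minimal sketch, assuming it is run from one of the environment directories; the flag choices (non-interactive input, a saved plan file) are my own habits, not something the chapter prescribes:

```shell
#!/usr/bin/env bash
# Provisioning verification: run each Terraform step, stopping at the first failure.
# Run from an environment directory, e.g. day_17/environments/dev (path assumed).
provision_check() {
  terraform init -input=false &&
  terraform validate &&
  terraform plan -input=false -out=tfplan &&   # save the plan so apply matches it exactly
  terraform apply -input=false tfplan          # apply exactly what the plan showed
}
```

Saving the plan to a file and applying that file means the apply cannot drift from what you reviewed.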

2. Resource Correctness

This is the “did Terraform create what I expected?” layer.

Questions:

  • Are the expected AWS resources present?
  • Do names, tags, and regions match the variables?
  • Are security group rules exactly what I intended?

This is important because infrastructure can be “working” while still being incorrectly shaped.
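To answer those questions without clicking through the console, a couple of AWS CLI spot checks can help. The tag keys and values below (`Environment`, `ManagedBy`) are assumptions for illustration; match them to whatever tags your modules actually set:

```shell
#!/usr/bin/env bash
# Resource-correctness spot checks (tag names are assumptions, adjust to your modules).
verify_resources() {
  # Running instances that carry the expected environment tag, with their tags visible
  aws ec2 describe-instances \
    --filters "Name=tag:Environment,Values=dev" \
              "Name=instance-state-name,Values=running" \
    --query "Reservations[*].Instances[*].[InstanceId,Tags]" \
    --output table

  # The actual ingress rules on the Terraform-managed security groups
  aws ec2 describe-security-groups \
    --filters "Name=tag:ManagedBy,Values=terraform" \
    --query "SecurityGroups[*].[GroupName,IpPermissions]" \
    --output json
}
```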

3. Functional Verification

This is the behavior layer.

Questions:

  • Does the ALB DNS name resolve?
  • Does curl return the expected response?
  • Are instances healthy behind the load balancer?
  • Does the deployed environment behave the way the service is supposed to?

This is the part that tells you whether the deployed system is actually usable.
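A sketch of that behavior check, assuming the module exposes the load balancer address as an output named `alb_dns_name` (adjust to your own output names). The retry loop matters because ALB targets often need a minute or two to pass health checks after apply:

```shell
#!/usr/bin/env bash
# Functional verification: hit the ALB until it answers, or give up after ~5 minutes.
# Output name (alb_dns_name) and response text are assumptions from this setup.
check_alb() {
  local alb_dns
  alb_dns="$(terraform output -raw alb_dns_name)"

  local i
  for i in $(seq 1 30); do
    if curl -fsS --max-time 10 "http://${alb_dns}/" | grep -q "Hello from Day 17"; then
      echo "Functional check passed"
      return 0
    fi
    sleep 10    # targets may still be registering with the load balancer
  done
  echo "Functional check failed after retries" >&2
  return 1
}
```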

4. State Consistency

This checks whether Terraform and reality still match.

Questions:

  • Does terraform plan return “No changes” after apply?
  • Does the state reflect what exists in AWS?

This is important because drift or hidden mismatches are easy to miss otherwise.
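Terraform has a flag made for exactly this check: `terraform plan -detailed-exitcode` reports the result in the exit code itself, which is handy whether you run it by hand or later automate it. A sketch:

```shell
#!/usr/bin/env bash
# State-consistency check using plan's exit code rather than eyeballing the output.
check_no_drift() {
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending
  terraform plan -detailed-exitcode -input=false >/dev/null
  local rc=$?
  case "$rc" in
    0) echo "State matches reality: no changes" ;;
    2) echo "Drift detected: plan is not clean" >&2; return 1 ;;
    *) echo "terraform plan itself failed" >&2; return 1 ;;
  esac
}
```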

5. Regression Check

This tests whether a small change behaves predictably.

Questions:

  • If I change one small thing, does Terraform show only that change?
  • After applying it, does terraform plan return clean again?

This helps prove the configuration is stable and understandable.
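As a sketch of that loop, using a hypothetical `server_text` variable (any single small input works): change one value, confirm the plan touches only the resources that depend on it, apply, and confirm the plan goes clean again:

```shell
#!/usr/bin/env bash
# Regression check: one small change should produce one small, predictable diff.
# The server_text variable name is a hypothetical stand-in for any small input.
regression_check() {
  # 1. Review: the plan should show ONLY the change driven by server_text
  terraform plan -var "server_text=Hello v2" -input=false &&

  # 2. Apply the small change
  terraform apply -var "server_text=Hello v2" -input=false -auto-approve &&

  # 3. The plan should now be clean again (exit 0 with -detailed-exitcode)
  terraform plan -var "server_text=Hello v2" -detailed-exitcode -input=false >/dev/null &&
  echo "Regression check passed: plan is clean again"
}
```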

6. Cleanup

This is the most overlooked part of manual testing.

Questions:

  • Did terraform plan -destroy look correct?
  • Did terraform destroy succeed?
  • Were any active resources left behind afterward?

Cleanup is not just cost control.

It is part of whether the test process itself is trustworthy.
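A sketch of cleanup as its own verified step: review the destroy plan, destroy, then query for leftovers, filtering to active instance states only (the `ManagedBy` tag is an assumption; use whatever tag your modules apply):

```shell
#!/usr/bin/env bash
# Cleanup plus verification: the test is not done until this returns empty.
cleanup_and_verify() {
  terraform plan -destroy -input=false &&        # review what will be removed first
  terraform destroy -input=false -auto-approve &&

  # Query only ACTIVE states; terminated instances linger in unfiltered results
  aws ec2 describe-instances \
    --filters "Name=tag:ManagedBy,Values=terraform" \
              "Name=instance-state-name,Values=pending,running,stopping,stopped" \
    --query "Reservations[*].Instances[*].InstanceId" \
    --output text
}
```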

Provisioning Verification vs Functional Verification

This difference became much clearer to me today.

Provisioning verification asks:

“Did Terraform successfully create the infrastructure?”

Examples:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply

Functional verification asks:

“Does the infrastructure actually do what it is supposed to do?”

Examples:

  • opening the ALB URL
  • running curl
  • checking the expected app response
  • checking whether health checks pass

You need both.

A Terraform apply can succeed while the application is still broken.

And an app can look reachable while the Terraform state or resource shape is still wrong.

That is why a complete manual test must cover both layers.

The Day 17 Test Environments

To make the tests more realistic, I ran the manual test process against two environments:

  • day_17/environments/dev
  • day_17/environments/production

Both used the Day 17 reusable modules, but with different environment settings.

That was useful because it let me compare whether behavior changed between:

  • dev
  • production

That kind of comparison matters, because many real issues only show up when environments differ in:

  • instance count
  • naming
  • defaults
  • scaling assumptions

My Actual Day 17 Results

Here is what I found.

Dev environment

Passes:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply
  • terraform output
  • browser check showed Hello from Day 17 Dev
  • terraform plan returned clean after apply
  • terraform plan -destroy showed the expected destroy plan
  • terraform destroy completed successfully

One useful cleanup nuance:
after destroy, a broad EC2 query still showed an instance ID. At first glance, that looked like a failed cleanup.

But the real issue was the query, not Terraform: terminated instances remain visible in describe-instances results for a while after they are destroyed.

A better command filtered to active instance states only:

```shell
aws ec2 describe-instances \
  --filters "Name=tag:ManagedBy,Values=terraform" \
            "Name=instance-state-name,Values=pending,running,stopping,stopped" \
  --query "Reservations[*].Instances[*].InstanceId"
```

That returned:

```
[]
```

So destroy had actually worked.

That was a great reminder that verifying cleanup also needs the right query logic.

Production environment

Passes:

  • terraform init
  • terraform validate
  • terraform plan
  • terraform apply
  • terraform output
  • browser and curl checks returned Hello from Day 17 Production
  • terraform plan returned clean after apply
  • terraform plan -destroy looked correct
  • terraform destroy completed successfully
  • post-destroy verification returned clean results for active resources

That told me the same module structure behaved correctly in both environments.

Why Cleanup Discipline Matters So Much

This was one of the biggest takeaways of the day.

As Brikman emphasizes in this chapter, a test is not really complete if you do not know how to tear the infrastructure down and verify that cleanup worked.

If you test infrastructure in AWS without strong cleanup habits, you can end up with:

  • running EC2 instances
  • load balancers
  • security groups
  • alarms
  • orphaned resources
  • unexpected costs

So a manual test is not complete when the app loads once in the browser.

A manual test is complete when:

  • the infrastructure worked
  • the results were recorded
  • the cleanup was run
  • the cleanup was verified

That is the standard I want to keep carrying forward.

What Manual Testing Gives You That Automation Alone Cannot

Day 17 helped me see that manual testing gives a different kind of value than automation.

Manual testing gives you:

  • context
  • observation
  • understanding
  • discovery

Automation gives you:

  • repetition
  • speed
  • consistency
  • regression protection

That means manual testing helps answer:

  • what should I automate next?
  • what actually matters to check?
  • what can go wrong in the real cloud environment?

That is why manual testing is still important, even when Terratest and other automation exist.

My Main Takeaway

The biggest lesson from Day 17 was this:

good infrastructure testing is not just about proving that Terraform runs.

It is about proving that the system behaves correctly, stays consistent, and can be cleaned up safely.

Manual testing helped me define that process clearly.

And once that process is clear, automated testing becomes much more meaningful.

Full Code

GitHub reference:
👉 Github Link

Follow My Journey

This is Day 17 of my 30-Day Terraform Challenge.

See you on Day 18 🚀
