DEV Community

Arbythecoder

My First Week with Terraform: The Struggles Nobody Talks About (And How I Overcame Them)

I still remember my first job application that required Terraform experience. I thought I had it figured out—after all, Infrastructure as Code sounded straightforward enough: write code, deploy infrastructure, done.

The reality? I couldn't complete the take-home assignment. The state file corrupted. My resources kept getting destroyed accidentally. Error messages made no sense. I didn't get that job.

But that failure became my best teacher. During my #90DaysOfDevOps challenge, I forced myself to work with Terraform daily. I broke things. I fixed them. I broke them again. Today, after managing infrastructure for multiple projects and writing extensively about DevOps, I want to share the real struggles beginners face with Terraform—the ones tutorials gloss over—and the practical solutions that actually work.


The State File Mystery (And Why 30% of Developers Get It Wrong)

What I Thought vs. Reality

My naive assumption: "State file? Must be like a capital of a state, right? Just a log of what happened." (Yes, I actually thought this—I even made jokes about it.)

The reality: The state file is Terraform's memory—it tracks the current status of your infrastructure, mapping what exists in the cloud to what's in your code. Without it, Terraform is completely blind.

The Painful Lesson

I was working on that job assignment, confidently running terraform apply from different machines. Suddenly, Terraform wanted to recreate everything. My heart sank. I had ignored the state file: by default it is saved locally as terraform.tfstate, so each machine had its own copy, and the second machine had no record of what the first had built. Terraform had amnesia.

According to a 2024 Stack Overflow survey, 30% of developers reported state file breaches when configurations weren't properly secured. I was firmly in that 30%.

What Actually Causes State File Problems

The usual causes of state file problems are network failures during updates, manual changes to infrastructure that Terraform is managing, and software bugs. Interrupted writes and bugs can corrupt the file itself; manual changes leave the file intact but out of sync with reality (drift). I learned this the hard way when I manually edited resources in the AWS console while Terraform was managing them. Big mistake.

During my 90 days of DevOps, I probably corrupted my state file at least 5 times in the first two weeks. Each time taught me something new about what NOT to do.

The Fix That Saved Me

  1. Always use remote state (S3 + DynamoDB for AWS, Azure Storage for Azure, GCS for Google Cloud)
  2. Never manually edit infrastructure that Terraform is managing—not even "just this once"
  3. Enable state file versioning to roll back when things break
  4. Use state locking to prevent concurrent modifications (this saved me during team projects)

# This configuration saved my sanity
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "project/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"  # DynamoDB table used for state locking
    encrypt        = true                    # Encrypt the state file at rest
  }
}
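
The backend block above assumes the bucket and the lock table already exist. Here is a minimal sketch of creating them, typically from a separate small configuration that is applied once. The resource labels (tf_state, tf_lock) are placeholders, the bucket and table names simply mirror the backend block, and PAY_PER_REQUEST billing is an assumption. The S3 backend expects the lock table's partition key to be a string attribute named LockID.

```hcl
# Bucket that holds the state file (name mirrors the backend block).
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state"
}

# Versioning keeps older copies of the state so a bad write can be rolled back.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Lock table: the S3 backend writes a lock item keyed by LockID during apply.
resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST" # assumption: on-demand billing for a tiny table
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Recent Terraform releases also offer S3-native locking through the backend's use_lockfile argument, so it is worth checking the backend documentation for your version before adding the table.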

Running apply Without plan (The Lesson That Cost Me Sleep)

The Anxiety is Real

Every beginner feels that pit in their stomach the first time they see:

Plan: 0 to add, 0 to change, 8 to destroy.

Your hands shake. You double-check. You triple-check. Is this right? What if I press yes?

I remember the first time I saw terraform destroy in a tutorial. My immediate thought was: "This command should require a password, a fingerprint scan, AND a written letter from your mother." The fear was REAL.

What Actually Breaks

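When you skip reviewing the plan (for example by running terraform apply -auto-approve, or by typing yes without reading the output), you're essentially deploying blind. Here's what typically goes wrong: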

  1. Resource conflicts: Dependencies aren't met, causing cascading failures
  2. Accidental deletions: That one typo destroys your test database (happened to me on day 23 of my challenge)
  3. Network misconfigurations: Servers get created without proper network access
  4. Cost explosions: Wrong instance types get deployed

During my DevOps challenge, I once accidentally deployed resources in the wrong AWS region because I didn't carefully read the plan output. It took me 3 hours to figure out why nothing was working.
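
Wrong regions and wrong instance types can also be caught by the configuration itself, before any plan is produced. The sketch below is one possible guard rail, not part of the original setup: the variable names and the allowed values are illustrative assumptions. A validation block makes Terraform stop with an error when a value falls outside the list.

```hcl
variable "aws_region" {
  type        = string
  description = "Region every resource in this configuration is deployed to"
  default     = "us-east-1"

  validation {
    # Assumed allow-list; adjust to the regions you actually use.
    condition     = contains(["us-east-1", "us-west-2"], var.aws_region)
    error_message = "The aws_region value must be us-east-1 or us-west-2."
  }
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type for learning environments"
  default     = "t3.micro"

  validation {
    # Assumed allow-list of small, cheap instance types.
    condition     = contains(["t3.micro", "t3.small"], var.instance_type)
    error_message = "The instance_type value must be t3.micro or t3.small."
  }
}

# The region is set in exactly one place instead of being left to the shell environment.
provider "aws" {
  region = var.aws_region
}
```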

The Safe Workflow (That I Now Use Religiously)

# ALWAYS this sequence—no exceptions
terraform plan -out=tfplan

# Read EVERY line of the output
# Ask yourself: "Does this make sense?"
# Check: What's being added? Changed? DESTROYED?

terraform apply tfplan

Running apply blindly can cause unintended deletions or recreations. The 30 seconds you "save" by skipping plan can cost hours—sometimes days—in recovery.


Resource Dependencies (When Terraform Isn't Psychic)

The Error Message That Confused Me for Hours

Error: Error creating EC2 instance: VPCIdNotSpecified

"But I DEFINED the VPC! It's right there in my code! Why can't Terraform see it?"

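This was during week 2 of my 90-day challenge. I spent 4 hours debugging before I understood that just because resources are in the same file doesn't mean Terraform knows how they relate. Terraform works out creation order from the references between resources, not from where they appear in the file. This particular error usually means the instance was never pointed at a subnet, so AWS fell back to looking for a default VPC.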

What Beginners Try to Create (And What Goes Wrong)

Most beginners start with:

  • A virtual machine (EC2, Azure VM, GCP Compute)
  • A database (RDS, Cloud SQL)
  • Some networking (VPC, subnets, security groups)

The problem? They forget the creation order matters in the real cloud, even if it doesn't matter in your .tf file.

The Solution: Explicit Dependencies

Sometimes Terraform can't infer relationships from your code. You need to be explicit:

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.main.id  # Implicit dependency - Terraform sees this

  depends_on = [aws_internet_gateway.gw]  # Explicit dependency - you tell Terraform
}
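
The snippet above references aws_subnet.main and aws_internet_gateway.gw without showing them. Here is a minimal sketch of those supporting resources, assuming a VPC labeled main and made-up CIDR ranges. Each reference such as vpc_id = aws_vpc.main.id gives Terraform an ordering edge: the VPC first, then the subnet and gateway, then the instance.

```hcl
# Created first: nothing else here can exist without it.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # assumed address range
}

# Implicit dependency on the VPC through vpc_id.
resource "aws_subnet" "main" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24" # assumed range inside the VPC
}

# Also depends on the VPC. The instance never references the gateway
# directly, which is why it needs depends_on to wait for it.
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.main.id
}
```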

Pro Tip That Changed Everything for Me

Use terraform graph | dot -Tsvg > graph.svg to visualize your dependencies (the dot command comes with Graphviz, which you need to install separately). This command generates a visual representation of how your resources relate to each other.

When I discovered this during my challenge, it was like turning on the lights in a dark room. Suddenly I could SEE why things were failing.

For more interactive visualizations, tools like Blast Radius, Inframap, Rover, or Terraform Visual can provide better clarity than the raw graph command.


The Fear of terraform destroy (It's Healthy Fear)

The Nightmare Scenario

You're tired. It's late. You're on day 47 of your DevOps challenge. You type terraform destroy in what you THINK is your dev environment.

It's production.

You type "yes."

🚨💀🔥

I never did this (thank God), but the FEAR of doing it kept me up at night during my learning phase. That fear is actually good—it keeps you careful.

How to Protect Yourself

  1. Use workspaces to separate environments (terraform workspace new dev)
  2. Use the -target flag when you only mean to destroy specific resources
  3. Implement destroy protection for critical resources
  4. Always verify with terraform workspace show before running destroy
  5. Check your cloud console to confirm which account/region you're in

resource "aws_db_instance" "production" {
  # ... other configuration ...

  lifecycle {
    prevent_destroy = true  # Can't destroy without removing this first
  }
}

With this setting, any plan that would destroy the resource fails with an error. It has saved countless production databases from accidental destruction.
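
prevent_destroy protects one resource at a time. For the wrong-account scenario, the AWS provider can also refuse to run at all. Here is a sketch, assuming a configuration that should only ever touch a single dev account; the account ID below is a made-up placeholder.

```hcl
provider "aws" {
  region = "us-east-1"

  # Terraform stops with an error if the active credentials belong to
  # an account that is not listed here. Replace the placeholder with
  # your real dev account ID.
  allowed_account_ids = ["111111111111"]
}
```

With this in place, running terraform destroy while production credentials are loaded fails before anything is touched.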


What the Tutorials Don't Tell You (The Real Learning)

1. Start Ridiculously Small

Don't try to Terraform your entire infrastructure in week one. During my challenge, I started with:

  • Day 1-3: One EC2 instance
  • Day 4-7: One S3 bucket with versioning
  • Day 8-14: One VPC with subnets
  • Day 15+: Started combining them

Master basics before complexity. Your future self will thank you.

2. Error Messages Are Cryptic (And That's Normal)

Terraform errors often come from the underlying cloud provider, not Terraform itself. That confusing AWS error about IAM permissions? That's AWS talking, not Terraform.

Your debugging workflow:

  1. Copy the entire error message
  2. Google: [error message] + [cloud provider] + terraform
  3. Someone on Stack Overflow has had your exact problem
  4. Read the provider documentation carefully

I probably Googled 200+ different Terraform errors during my 90-day challenge. It gets easier.

3. Version EVERYTHING

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"  # Pin this! Allows any 5.x release, never 6.0
    }
  }
}

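Allowing Terraform to always pull the latest provider version can cause sudden failures when a provider update changes behavior. Pin your versions, and commit the .terraform.lock.hcl file that terraform init generates so everyone on the team resolves the same provider builds. Trust me on this.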

4. The Documentation Is Your Best Friend

I spent week 1 fighting Terraform. Week 2, I spent half my time reading the AWS provider documentation. Week 2 was infinitely more productive.

The HashiCorp documentation is actually excellent—you just need to give it time.

5. Modules Come Later

Beginners think they should use modules immediately. Don't. Copy-paste is okay when you're learning. I didn't touch modules until day 40 of my challenge, and that timing was perfect.

Understand what the code does before you abstract it.


Real Companies Using Terraform Successfully

Let me share some examples of how real companies are using Terraform, based on publicly available case studies and documentation:

Example 1: GitLab's Infrastructure Approach

GitLab uses Terraform for infrastructure automation while GitLab serves as the single source of truth for version control, ensuring the same infrastructure environment deploys each time and eliminating inconsistent configurations.

Key lesson for beginners: Treat your infrastructure code like application code—version control, code reviews, and CI/CD pipelines apply here too.

Example 2: Financial Services at Scale

A top-5 financial services company standardized on HashiCorp Terraform for all self-service provisioning needs, allowing developers to deploy infrastructure on whatever cloud vendor they need.

Key lesson for beginners: Terraform's multi-cloud capability isn't just marketing—companies actually use it to avoid vendor lock-in.

Example 3: Startups Moving Fast

Many early-stage startups use Terraform to:

  • Spin up entire environments in 15 minutes instead of 2 days
  • Clone production to staging with a single command
  • Tear down dev environments overnight to save costs
  • Onboard new developers faster with documented infrastructure

Key lesson for beginners: Infrastructure as Code means Infrastructure as an ON/OFF switch. This is especially valuable when you're watching every dollar.


My Honest Assessment After the Struggle

90 Days Later

I'm not a Terraform expert. I still Google things daily. I still make mistakes. But I can now:

  • Deploy multi-tier infrastructure in minutes instead of hours
  • Collaborate with teams without stepping on each other's toes
  • Reproduce environments consistently across AWS, Azure, and GCP
  • Sleep at night knowing I can roll back any change
  • Confidently take on DevOps projects that require IaC

Would I Recommend Terraform?

Yes, but with realistic expectations:

Do use Terraform if:

  • You deploy infrastructure more than once
  • You work with a team (or plan to)
  • You want reproducible environments
  • You're tired of clicking through cloud consoles
  • You need to document your infrastructure decisions

Don't start with Terraform if:

  • You're deploying a single static website (overkill)
  • You have zero command-line experience (learn bash basics first)
  • You're in a rush (the learning curve is real—budget 2-4 weeks)
  • You expect it to be "easy" (it's not, but it's worth it)

The Transformation

That job I didn't get because of Terraform? It pushed me to start my 90 Days of DevOps challenge. I studied. I broke things in dev environments. I fixed them. I wrote about my experiences.

Six months later, I landed a better role specifically because of my documented Terraform journey on Dev.to and my GitHub projects. The failure was the beginning, not the end.

Your struggles with Terraform right now? They're not signs you should quit. They're signs you're learning something valuable that most people give up on.


🎁 FREE RESOURCE: Don't Get Destroyed by Cloud Costs

Speaking of Infrastructure as Code and cloud infrastructure—one of the biggest shocks for beginners isn't just learning Terraform, it's the AWS bill at the end of the month.

I created a free checklist based on mistakes I've seen (and made myself): "Don't Get Financially Destroyed by the Cloud" Checklist

It covers:

  • ✅ Cost optimization settings you should enable on Day 1
  • ✅ The 5 cloud resources that silently drain your budget
  • ✅ Monitoring alerts that actually matter

Download it free here — it's the resource I wish I had when I started.


Practical Resources That Actually Helped Me

  1. HashiCorp's Official Tutorials - Start here, not YouTube (seriously)
  2. Terraform Registry - Real module examples from real companies
  3. Your Cloud Provider's Terraform Documentation - AWS/Azure/GCP have excellent Terraform docs
  4. Dev.to DevOps Community - Where I share my journey and get amazing feedback
  5. r/Terraform on Reddit - Quick answers to specific questions
  6. The Terraform Discord - Real-time help when you're stuck

Your Turn: Let's Learn Together

If you're learning Terraform right now, I want to hear from you:

  • What's your biggest confusion right now?
  • What error message is driving you crazy?
  • What concept do the tutorials keep skipping over?
  • Are you doing your own 90-day challenge?

Drop it in the comments. I probably made the same mistake two months ago. Let's figure it out together.

And if you're past the beginner stage: What do you wish someone had told you in week one? Share your wisdom. Help the next person who's struggling with their first terraform apply.


Key Takeaways (TL;DR)

  1. State files are Terraform's memory - protect them like production data, use remote state from day one
  2. Always run plan before apply - 30 seconds of review saves hours of recovery
  3. Dependencies aren't always obvious - use terraform graph to visualize, use explicit depends_on when needed
  4. Fear of destroy is healthy - implement safeguards, use workspaces, verify before confirming
  5. Start small, master basics - don't Terraform everything on day one
  6. Error messages come from providers - Google the cloud provider name with the error
  7. Documentation beats tutorials - spend time in HashiCorp and provider docs
  8. Real companies use this successfully - from startups to Fortune 500 financial institutions
