Mukami

Putting It All Together: My 22-Day Terraform Journey

From "What's a provider?" to a Complete CI/CD Pipeline


Day 22 of the 30-Day Terraform Challenge — and I can't believe how far I've come.

Twenty-two days ago, I didn't know what a provider alias was. I hardcoded instance types. I thought terraform destroy was the scariest command in the world.

Today, I have a complete integrated pipeline. And I'm about to finish the book.

Here's what I built. What I learned. And what surprised me most.


The Journey in 22 Days

| Days | What I Built |
|-------|--------------|
| 1 | Introduction to the challenge |
| 2 | Setting up AWS CLI, credentials, Terraform |
| 3 | First EC2 instance with user data script |
| 4 | Configurable web server with variables |
| 5-6 | Auto Scaling Group + Application Load Balancer |
| 7 | Remote state with S3 + DynamoDB locking |
| 8-9 | Reusable modules (webserver cluster) |
| 10 | Loops (count, for_each) and conditionals |
| 11 | Environment-aware module (dev vs prod) |
| 12 | Zero-downtime deployments |
| 13 | Secrets management with AWS Secrets Manager |
| 14-15 | Multiple providers (multi-region, EKS, Docker) |
| 16 | Production-grade refactor (tags, alarms, validation) |
| 17 | Manual testing |
| 18 | Automated testing (terraform test + Terratest) |
| 19 | IaC adoption strategy for teams |
| 20-21 | Application + infrastructure deployment workflows |
| 22 | Integrated CI/CD pipeline + Sentinel policies |

That's a complete infrastructure platform. Most engineers don't build this much in their first year.


What I Built (The Highlights)

Week 1 (Days 1-7): I went from a single hardcoded EC2 instance to a configurable, load-balanced, auto-scaling cluster with remote state. The jump from Day 3 to Day 5 was the biggest learning curve — understanding ASGs and ALBs took hours.

Week 2 (Days 8-14): I built reusable modules, added conditionals for multi-environment deployments, learned zero-downtime with create_before_destroy, and deployed an EKS cluster with Kubernetes. The EKS section was the most complex — 68 resources, 15 minutes of waiting, and a lot of coffee.
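
The zero-downtime trick from Day 12 comes down to one lifecycle setting. A minimal sketch, with illustrative resource and variable names rather than my actual module:

```hcl
resource "aws_launch_template" "web" {
  name_prefix   = "web-"            # name_prefix lets Terraform generate unique names
  image_id      = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    create_before_destroy = true    # build the replacement before destroying the old one
  }
}
```

Because the replacement is created first, the ASG can attach new instances and drain the old ones without the load balancer ever pointing at nothing.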

Week 3 (Days 15-22): I refactored everything to production-grade standards (tags, alarms, validation), wrote manual and automated tests, designed an adoption strategy for teams, and finally built an integrated CI/CD pipeline with Sentinel policies.


The Integrated Pipeline

My final workflow combines everything:

```yaml
name: Infrastructure CI

on:
  pull_request:
    branches: [main]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform init -backend=false
      - run: terraform validate
      - run: terraform test

  plan:
    runs-on: ubuntu-latest
    needs: validate
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      # AWS credentials and backend access must be configured for init and plan
      - run: terraform init
      - run: terraform plan -out=ci.tfplan
      - uses: actions/upload-artifact@v4
        with:
          name: terraform-plan
          path: ci.tfplan
```

What this does:

  • Format check on every PR
  • Validation and unit tests
  • Plan generation saved as immutable artifact

The same plan can be promoted from dev to prod — no regeneration, no drift.
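
Promotion is one more job in the same style. This is a hedged outline rather than my exact pipeline: it assumes a GitHub environment named production as an approval gate, and in practice it would trigger on merge to main rather than on the PR itself:

```yaml
  apply:
    runs-on: ubuntu-latest
    needs: plan
    environment: production          # manual approval gate (hypothetical setup)
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - uses: actions/download-artifact@v4
        with:
          name: terraform-plan
      - run: terraform init
      - run: terraform apply ci.tfplan   # apply the exact reviewed plan, no re-plan
```

The key line is the last one: apply takes the saved plan file, so what was reviewed is exactly what runs.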


Sentinel Policies

I wrote two policies to enforce standards:

Policy 1: Require ManagedBy tag

```sentinel
import "tfplan/v2" as tfplan

main = rule {
  all tfplan.resource_changes as _, rc {
    rc.change.after.tags["ManagedBy"] is "terraform"
  }
}
```

Policy 2: Allow only approved instance types

```sentinel
allowed_types = ["t3.micro", "t3.small", "t3.medium"]
```
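
Fleshed out, that policy might look like the following. The rule body here is a reconstruction for illustration, not the exact code I ran:

```sentinel
import "tfplan/v2" as tfplan

allowed_types = ["t3.micro", "t3.small", "t3.medium"]

main = rule {
  all tfplan.resource_changes as _, rc {
    # only constrain EC2 instances; other resource types pass through
    rc.type is not "aws_instance" or
      rc.change.after.instance_type in allowed_types
  }
}
```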

These caught several violations in my early code — security groups without tags, t2.micro instances — and forced me to fix them.


The Side-by-Side Comparison

| Component | Application Code | Infrastructure Code |
|-----------|------------------|---------------------|
| Source of truth | Git repository | Git repository |
| Local run | npm start | terraform plan |
| Artifact | Docker image | Saved .tfplan file |
| Versioning | Semantic tag | Semantic tag |
| Automated tests | Unit + integration | terraform test + Terratest |
| Policy enforcement | Linting | Sentinel policies |
| Deployment | CI/CD pipeline | terraform apply <plan> |
| Rollback | Redeploy previous image | terraform apply <previous plan> |

The parallel is striking. Infrastructure code follows the same discipline as application code.


What Changed in How I Think

The single biggest mental shift: Infrastructure is not a one-time setup. It's a software product.

Before: "I'll set up the servers once and never touch them again."

After: "Every infrastructure change goes through version control, review, testing, and CI/CD — just like application code."

Infrastructure isn't a project. It's a product that needs continuous improvement.


What Was Harder Than Expected

State management. Not the technical part — the understanding that state is the source of truth. A corrupted state file is worse than corrupted code.

Testing. Unit tests were easy. Integration tests (Terratest) were hard — writing Go code, handling timeouts, waiting for ALBs to provision.

Multi-region deployments. I thought adding a second region would be trivial. Provider aliases, module design, remote state — everything got more complicated.

EKS. The cluster took 15 minutes to provision, and kubectl authentication was a nightmare. But it worked.


What I Would Do Differently

Week 1: Create a .gitignore before writing any code. I committed state files and provider binaries. Embarrassing.

Week 2: Write unit tests earlier. I waited until Day 18. Testing from Day 1 would have caught bugs earlier.

Week 3: Separate state buckets by environment earlier. One bucket for everything was fine for learning but dangerous for production.
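
What that separation looks like in a backend block, roughly. Bucket, key, and table names here are made up for illustration:

```hcl
terraform {
  backend "s3" {
    bucket         = "tfstate-prod"          # one bucket per environment
    key            = "webserver/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tfstate-locks-prod"    # separate lock table per environment
    encrypt        = true
  }
}
```

A typo in a workspace name can't cross environments when the buckets themselves are separate.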

Overall: Read the book before starting the challenge, not during. But that's cheating.


What Comes Next

Terraform Associate Certification. I've been studying the exam objectives. My weak areas are Terraform Cloud features (I used S3 backend) and Sentinel policies.

First real project: Refactor my team's development environment. Five developers, each with a manually configured AWS sandbox. I'll write a module that provisions identical sandboxes for everyone.
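
The sandbox module call could look something like this. The module path, variable names, and developer names are all hypothetical:

```hcl
# One identical sandbox per developer, driven by for_each
variable "developers" {
  type    = set(string)
  default = ["dev1", "dev2", "dev3", "dev4", "dev5"]
}

module "sandbox" {
  source   = "./modules/dev-sandbox"   # illustrative module path
  for_each = var.developers

  owner         = each.key
  instance_type = "t3.micro"
}
```

Adding a sixth developer becomes a one-line change to the set instead of an afternoon of console clicking.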

Longer term: Contribute to open source Terraform modules. I've learned enough to help others.


Chapter 10's Most Important Insight

Immutable versioned artifacts.

The same Docker image that passes tests in CI gets promoted to staging, then production. No rebuilding. No "but it worked in dev."

For infrastructure, the saved .tfplan file is that immutable artifact. The plan reviewed in the PR is the exact plan applied in production. No drift. No surprises.

This insight changes everything. Infrastructure deployment becomes predictable.


The Bottom Line

Twenty-two days ago, I was afraid of terraform destroy.

Today, I trust it because I know:

  • State is backed up in versioned S3
  • Plans are reviewed before apply
  • Tests run automatically
  • Sentinel enforces rules
  • Rollback is one command away

Infrastructure as Code isn't just a tool. It's a discipline.

And it's one I'll carry into every project from now on.

