DEV Community

Production-Ready Terraform - Part 3: The Finale — Automating Terraform with CI/CD

Hello again,
In [Part 2], we moved our state to the cloud using S3 and secured it with locking (via DynamoDB or S3 Native). But we still have a problem: we are running Terraform from our laptops.

Running terraform apply locally is risky. It relies on local credentials, lacks audit trails, and "works on my machine" doesn't cut it for production. In this final chapter, we will build a fully automated CI/CD Pipeline that moves Terraform execution to a controlled environment.

We will cover:

  1. Automating the Plan: Running terraform plan automatically on every Pull Request.
  2. The "Human in the Loop": Using GitHub Environments or Atlantis to approve changes safely.
  3. The "PR-to-Prod" Workflow: A complete end-to-end lifecycle for your infrastructure.

1. Automating the Plan (GitHub Actions)

The first rule of CI/CD for infrastructure is visibility. When a developer opens a Pull Request (PR), we want to immediately see what would happen if we merged that code.

We will use GitHub Actions to automatically run terraform plan and post the results as a comment on the PR.

Prerequisites: OIDC (Stop using Keys!)

Security Note: Do not store long-lived AWS Access Keys in your GitHub Secrets. Instead, use OpenID Connect (OIDC). It allows GitHub Actions to assume an IAM Role in your AWS account temporarily.

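  • Register GitHub's OIDC provider (token.actions.githubusercontent.com) as an identity provider in IAM, if your account does not have it yet.
  • Create an IAM Role in AWS with a trust policy that only accepts tokens from your GitHub repo, typically by matching the token's sub claim. Pull request jobs present repo:<owner>/<repo>:pull_request, while jobs tied to a GitHub Environment present repo:<owner>/<repo>:environment:<name>, so the policy must cover every pattern your workflows use.
  • Grant this role permissions to manage your infrastructure and access the S3 backend.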
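
Before wiring Terraform in, it can help to confirm that the role and its trust policy actually work. Below is a minimal sketch of a throwaway workflow. It assumes the placeholder role ARN and region used throughout this article, and that the AWS CLI is available on the runner (it is preinstalled on GitHub-hosted Ubuntu images at the time of writing). The file name and job name are hypothetical.

```yaml
# .github/workflows/oidc-check.yml (hypothetical file name)
name: "OIDC Smoke Test"

on:
  workflow_dispatch: # Run manually from the Actions tab

permissions:
  id-token: write # Lets the job request an OIDC token
  contents: read

jobs:
  whoami:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS Credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsTerraformRole # Placeholder ARN
          aws-region: us-east-1

      - name: Show the Assumed Identity
        run: aws sts get-caller-identity # Prints the account and assumed-role ARN
```

If the trust policy is right, the last step prints an assumed-role ARN. If not, the credentials step fails before Terraform is involved. Keep in mind that a manually dispatched run presents a branch-based sub claim, so a trust policy restricted to pull requests or environments would reject it.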

The "Plan" Workflow

Create a file at .github/workflows/terraform-plan.yml:

name: "Terraform Plan"

on:
  pull_request:
    branches: [ main ]

permissions:
  id-token: write # Required for OIDC
  contents: read
  pull-requests: write # Required to post comments

jobs:
  plan:
    name: "Terraform Plan"
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Configure AWS Credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsTerraformRole
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init

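      - name: Terraform Plan
        id: plan
        run: terraform plan -no-color -input=false -out=tfplan
        continue-on-error: true # Don't fail the job yet, we want to see the error in the comment

      - name: Post Plan to PR
        uses: actions/github-script@v7
        env:
          # Pass the output via env vars rather than interpolating it into the script,
          # so backticks or ${...} in the plan cannot break the JavaScript.
          PLAN: ${{ steps.plan.outputs.stdout }}
          PLAN_ERRORS: ${{ steps.plan.outputs.stderr }}
          OUTCOME: ${{ steps.plan.outcome }}
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            // Terraform writes errors to stderr, so include both streams
            const plan = [process.env.PLAN, process.env.PLAN_ERRORS].filter(Boolean).join('\n');
            const outcome = process.env.OUTCOME;

            const output = `#### Terraform Plan 📖 \`${outcome}\`
            <details><summary>Show Plan</summary>

            \`\`\`hcl
            ${plan}
            \`\`\`

            </details>`;

            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            });

      - name: Fail the Job if the Plan Failed
        if: steps.plan.outcome == 'failure'
        run: exit 1 # The comment is posted, so now surface the failure on the PR check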

What this does:

  1. Triggers whenever a PR targeting main is opened, reopened, or updated with new commits.
  2. Authenticates to AWS securely via OIDC.
  3. Runs terraform plan.
  4. Critically: It uses a script to post the plan output directly into the PR conversation. Now, your team can review infrastructure changes just like they review code.
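
Many teams also run cheap static checks before the plan so that formatting and syntax problems fail fast. Below is a minimal sketch of two extra steps, assuming they are placed between "Terraform Init" and "Terraform Plan" in the workflow above. The step names are illustrative.

```yaml
    steps:
      # ... Checkout, Configure AWS Credentials, Setup Terraform, Terraform Init ...

      - name: Terraform Format Check
        run: terraform fmt -check -recursive # Non-zero exit if any file needs reformatting

      - name: Terraform Validate
        run: terraform validate -no-color # Checks syntax and internal consistency without calling AWS
```

Both commands exit with a non-zero code on problems, so the job stops before a plan is attempted.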

2. The "Human in the Loop": Approving the Apply

We never want to automatically apply changes to production without a human sanity check. Infrastructure mistakes can be catastrophic (e.g., accidentally deleting a database).

We have two main strategies for this:

Option A: The "GitHub Native" Way (Environments)

GitHub has a feature called Environments. You can create an environment called production and add a protection rule: "Required Reviewers".

When you set this up, the "Apply" workflow (triggered after merge) will pause and wait for a designated person to click "Approve" in the GitHub UI before running terraform apply.

  • Pros: Native to GitHub, easy to set up, and available on public repos on every plan. For private repos, required reviewers depend on your GitHub plan, so check the current docs.
  • Cons: Requires maintaining two separate workflows (Plan on PR, Apply on Merge).

Option B: The "ChatOps" Way (Atlantis)

Atlantis is a popular open-source tool dedicated to Terraform automation. It runs as a service (e.g., in Fargate or K8s) and listens to webhooks from your repo.

Instead of clicking buttons, you interact via comments:

  1. You comment atlantis plan on the PR (or let autoplan trigger it when the PR is opened). Atlantis locks the project (directory and workspace), runs the plan, and replies with the output.
  2. You comment atlantis apply. Atlantis applies the saved plan and, if automerge is enabled, merges the PR once every plan has been applied successfully.
  • Pros: Strong locking (while one PR holds the lock, no other PR can plan or apply against the same project), and it keeps the "Plan" and "Apply" strictly coupled (you apply exactly what you planned).
  • Cons: Requires hosting and maintaining a server/container.
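
As a rough illustration of how this is configured, Atlantis reads an optional atlantis.yaml from the repository root. Below is a minimal sketch, assuming a single Terraform project at the root of the repo. The project name is hypothetical, and the apply_requirements override only takes effect if the server-side repo config allows it.

```yaml
# atlantis.yaml (repo root)
version: 3
automerge: true # Merge the PR after all plans are applied successfully
projects:
  - name: production # Hypothetical project name
    dir: . # Directory containing the Terraform code
    autoplan:
      enabled: true
      when_modified: ["*.tf", "*.tfvars"] # Re-plan when these files change
    apply_requirements: [approved, mergeable] # Block apply until the PR is approved and mergeable
```

With apply_requirements set, atlantis apply is rejected until a reviewer approves the PR, which gives you the same "Human in the Loop" gate as Option A.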

3. The Full "PR-to-Prod" Automated Workflow

Let's assume we are using Option A (GitHub Native) for simplicity. Here is what the full lifecycle looks like:

1. The Trigger

A developer creates a new branch, changes an EC2 instance type in main.tf, and opens a Pull Request.

2. The Automated Plan

GitHub Actions triggers the Plan Workflow. It runs terraform plan, sees that an instance needs to change from t2.micro to t3.medium, and comments this diff on the PR.

3. The Human Review

The Tech Lead reviews the PR code and the Plan comment. They see that the change is intentional and safe. They approve the PR.

4. The Merge & Apply

The developer merges the PR into main. This triggers the Apply Workflow:

name: "Terraform Apply"

on:
  push:
    branches: [ main ]

permissions:
  id-token: write # Required for OIDC
  contents: read

jobs:
  apply:
    name: "Terraform Apply"
    runs-on: ubuntu-latest
    environment: production # <--- This enforces the "Human Approval" rule!

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsTerraformRole
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init

      - name: Terraform Apply
        run: terraform apply -auto-approve

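Because we added environment: production, this job will enter a "Waiting" state. GitHub notifies the required reviewers. Once approved, the job resumes, runs terraform apply -auto-approve, and your infrastructure is updated. Keep in mind that terraform apply computes a fresh plan at this point, so the result can differ from the plan in the PR comment if main or the real infrastructure changed in the meantime.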
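
If two PRs merge in quick succession, two apply runs can be active at once. State locking keeps them from corrupting state, but the second run fails immediately if it cannot acquire the lock (Terraform's default lock timeout is 0s). One option is to serialize the runs with a concurrency group. Below is a minimal sketch of a block that could be added at the top level of the Apply workflow. The group name is an arbitrary label.

```yaml
concurrency:
  group: terraform-apply-production # Arbitrary label; runs sharing it execute one at a time
  cancel-in-progress: false # Never cancel an apply that is already running
```

GitHub keeps at most one pending run per group, so an older queued run is cancelled when a newer one arrives. Since each apply works from the latest main, that is usually acceptable.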


Conclusion & Series Takeaways 🎓

We have come a long way. We started with a fragile terraform.tfstate file on a laptop and ended with a robust, automated, team-ready CI/CD pipeline.

Key Takeaways from the Series:

  1. Remote State is Non-Negotiable: Never keep state local. Use S3 (or compatible storage) to share state and prevent data loss.
  2. Locking Saves Lives: Always implement locking (DynamoDB or S3 Native) to prevent two people from corrupting state by applying at the same time.
  3. Don't Apply Blindly: Use CI/CD to visualize terraform plan on every Pull Request. Treat infrastructure changes with the same rigor as application code.
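  4. Least Privilege: Use OIDC for your pipelines so they receive short-lived credentials, and scope the assumed role to what Terraform actually needs. Don't scatter long-lived AWS Access Keys across developer laptops or CI systems.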

By following this pattern, you transform Terraform from a "dangerous tool only the Ops lead can touch" into a safe, collaborative platform for the entire engineering team.

Happy Terraforming! Feel free to leave your questions in the comments, and I will be glad to connect on LinkedIn.


Disclaimer: Parts of this article were drafted with the help of an AI assistant. The technical concepts, code examples, and overall structure were directed, curated, and verified by the author to ensure technical accuracy and reflect real-world experience.
