DEV Community

Aisalkyn Aidarova

Lab Part 3: Production-Grade Terraform Modules + GitLab for a Legacy System

🎯 Goal

When you push to GitLab:

  • CI runs terraform plan for envs/legacy
  • It uses remote S3 backend
  • It assumes AWS role (OIDC)
  • It does NOT auto-apply
  • Apply is manual + protected

๐Ÿ— Current Structure (Correct)

infra-live/
├── envs/
│   └── legacy/
│       ├── backend.tf
│       ├── main.tf
│       ├── providers.tf
│       ├── variables.tf
│       ├── terraform.tfvars
│       └── .terraform.lock.hcl
├── scripts/
│   └── assume_role.sh
├── .gitlab-ci.yml
└── .gitignore

Backend already configured:

backend "s3" {}

State is already in S3. Good.


🟢 STEP 1 — Make Sure CI Has AWS Access

Your pipeline already uses OIDC:

scripts/assume_role.sh

It should:

  • Assume IAM role
  • Export AWS credentials
  • Print caller identity
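The three steps above can be sketched as a sourced script. This is a hypothetical sketch, not the repo's actual file: the variable names `AWS_ROLE_ARN` and `GITLAB_OIDC_TOKEN` are assumptions you must match to your own CI configuration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/assume_role.sh -- AWS_ROLE_ARN and
# GITLAB_OIDC_TOKEN are assumed names, not taken from the real repo.
# Source this file so the exports survive: . scripts/assume_role.sh
set -eu

assume_aws_role() {
  # Exchange the GitLab OIDC token for temporary AWS credentials.
  creds=$(aws sts assume-role-with-web-identity \
    --role-arn "$AWS_ROLE_ARN" \
    --role-session-name "gitlab-ci-${CI_PIPELINE_ID:-local}" \
    --web-identity-token "$GITLAB_OIDC_TOKEN" \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text)

  # Split the three whitespace-separated values and export them
  # for terraform and the aws CLI.
  set -- $creds
  export AWS_ACCESS_KEY_ID="$1" AWS_SECRET_ACCESS_KEY="$2" AWS_SESSION_TOKEN="$3"

  # Prove the role was assumed -- this should appear in the CI log.
  aws sts get-caller-identity
}

# Only attempt the exchange when an OIDC token is present (i.e. in CI).
if [ -n "${GITLAB_OIDC_TOKEN:-}" ]; then
  assume_aws_role
fi
```

Sourcing (rather than executing) the script is what lets the exported credentials reach the later `terraform` commands in the same job.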

In CI logs you should see:

aws sts get-caller-identity

If that works → AWS access is ready.
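You can make that check stricter than eyeballing the log by parsing the account ID out of the JSON. The sketch below runs against a sample payload (the role and session names in it are made up); in CI you would capture the real output of `aws sts get-caller-identity` instead.

```shell
# Sample get-caller-identity payload -- role/session names are made up;
# only the account ID comes from this lab's setup.
identity='{
  "UserId": "AROAEXAMPLE:gitlab-ci-1234",
  "Account": "021399177326",
  "Arn": "arn:aws:sts::021399177326:assumed-role/gitlab-terraform/gitlab-ci-1234"
}'

# Extract the Account value (sed only, no jq needed).
account=$(printf '%s\n' "$identity" | sed -nE 's/.*"Account": "([0-9]+)".*/\1/p')

# Fail the job early if CI assumed a role in the wrong AWS account.
if [ "$account" != "021399177326" ]; then
  echo "Wrong AWS account: $account" >&2
  exit 1
fi
echo "AWS access confirmed for account $account"
```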


🟢 STEP 2 — Add Legacy Plan Job in .gitlab-ci.yml

Add this block:

stages:
  - plan
  - apply

legacy-plan:
  stage: plan
  image:
    name: hashicorp/terraform:1.7
    # The image's default entrypoint is the terraform binary itself,
    # which breaks GitLab's shell-based scripts; override it.
    entrypoint: [""]
  before_script:
    - apk add --no-cache bash curl jq aws-cli
    - . scripts/assume_role.sh
  script:
    - cd envs/legacy
    # Folded (>) style joins these lines into one shell command.
    - >
      terraform init
      -backend-config="bucket=$TF_STATE_BUCKET"
      -backend-config="key=legacy/terraform.tfstate"
      -backend-config="region=$AWS_REGION"
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"
      -backend-config="encrypt=true"
    - terraform plan -var="aws_region=$AWS_REGION"
  only:
    - main

🟢 STEP 3 — Add Manual Apply (Protected)

legacy-apply:
  stage: apply
  image:
    name: hashicorp/terraform:1.7
    entrypoint: [""]
  before_script:
    - apk add --no-cache bash curl jq aws-cli
    - . scripts/assume_role.sh
  script:
    - cd envs/legacy
    - >
      terraform init
      -backend-config="bucket=$TF_STATE_BUCKET"
      -backend-config="key=legacy/terraform.tfstate"
      -backend-config="region=$AWS_REGION"
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"
      -backend-config="encrypt=true"
    - terraform apply -auto-approve -var="aws_region=$AWS_REGION"
  when: manual
  only:
    - main

Important:

when: manual

This prevents automatic production changes.


🟢 STEP 4 — Add CI Variables in GitLab

Go to:

GitLab → Settings → CI/CD → Variables

Add:

AWS_REGION = us-east-2
TF_STATE_BUCKET = jumptotech-terraform-state-021399177326
TF_LOCK_TABLE = terraform-lock-table

Do NOT hardcode these in YAML.
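If you prefer to script this instead of clicking through the UI, the GitLab REST API can create project variables (`POST /projects/:id/variables`). A hedged sketch: `PROJECT_ID` and `GITLAB_TOKEN` are assumed names for your project ID and a personal access token, not values from this lab.

```shell
# Hypothetical helper: create a protected CI/CD variable via the GitLab
# REST API. PROJECT_ID and GITLAB_TOKEN are assumptions you must supply.
create_ci_variable() {
  key=$1; value=$2
  curl --silent --fail --request POST \
    --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    "https://gitlab.com/api/v4/projects/$PROJECT_ID/variables" \
    --form "key=$key" \
    --form "value=$value" \
    --form "protected=true"
}

# Example calls (not executed here):
# create_ci_variable AWS_REGION us-east-2
# create_ci_variable TF_STATE_BUCKET jumptotech-terraform-state-021399177326
# create_ci_variable TF_LOCK_TABLE terraform-lock-table
```

Marking the variables `protected` means they are only exposed to pipelines on protected branches, which pairs well with Step 5.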


🟢 STEP 5 — Protect Main Branch

GitLab → Settings → Repository → Protected Branches

  • Protect main
  • Require merge request
  • Require approval
  • Disable direct push

Enterprise rule:

No direct push to production branch.
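Branch protection can also be applied via the GitLab REST API (`POST /projects/:id/protected_branches`), which is handy when you manage many repos. A sketch under the same assumptions as before (`PROJECT_ID` and `GITLAB_TOKEN` are placeholders): access level `0` means no one may push directly, `30` lets Developers and above merge.

```shell
# Hypothetical helper: protect a branch via the GitLab REST API.
# PROJECT_ID and GITLAB_TOKEN are assumptions you must supply.
protect_branch() {
  curl --silent --fail --request POST \
    --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    "https://gitlab.com/api/v4/projects/$PROJECT_ID/protected_branches" \
    --data "name=$1" \
    --data "push_access_level=0" \
    --data "merge_access_level=30"
}

# Example call (not executed here):
# protect_branch main
```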


🟢 STEP 6 — Test It

Make a small harmless change:

Example:

tags = {
  ManagedBy = "Terraform"
}

Commit and push:

git add .
git commit -m "Test CI plan"
git push

Pipeline should:

  • Run legacy-plan
  • Show plan output
  • NOT auto-apply

Then manually click:

legacy-apply

If needed.


๐Ÿ” Safety Checklist Before Running CI

Before first CI apply:

  • terraform plan locally shows NO destroy
  • prevent_destroy added
  • Branch protected
  • Apply job is manual
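The first checklist item can be automated. A minimal sketch of a "no destroy" gate: it parses the destroy count out of Terraform's `Plan:` summary line. Here it runs against a sample line; in CI you would pipe `terraform plan -no-color | grep '^Plan:'` into the same function.

```shell
# Extract N from a "Plan: X to add, Y to change, N to destroy." line.
extract_destroy_count() {
  sed -nE 's/.*Plan: .* ([0-9]+) to destroy\..*/\1/p'
}

# Sample summary line -- in CI, capture the real one from:
#   terraform plan -no-color | grep '^Plan:'
summary="Plan: 0 to add, 1 to change, 0 to destroy."

destroys=$(printf '%s\n' "$summary" | extract_destroy_count)
if [ "$destroys" != "0" ]; then
  echo "Refusing to continue: plan destroys $destroys resource(s)" >&2
  exit 1
fi
echo "Safe: no resources destroyed"
```

Note that when there is nothing to change, Terraform prints "No changes." instead of a `Plan:` line, so treat a missing summary as zero destroys.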

๐Ÿ Final Result

After this:

Developer push → CI plan → MR approval → Manual apply → Production update

That is correct enterprise workflow.

Perfect ✅
Your CI just did this:

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

That means:

  • CI successfully assumed role
  • Backend S3 is working
  • DynamoDB locking is working
  • State is correct
  • No drift
  • Infrastructure matches code

You are officially running Terraform from CI.


🎯 What Is Next?

Now we move from:

“CI works”

to

“Production is protected and controlled”

There are 4 important next steps.


🟢 1️⃣ STOP Auto-Apply in Production

Right now your job runs:

terraform apply -auto-approve

On its own, this is NOT safe for production: every push would change real infrastructure.

Gate the job with:

legacy-apply:
  stage: apply
  when: manual

Now apply runs only when a human clicks the job in the GitLab UI; -auto-approve just makes that manual run non-interactive.

Enterprise rule:

Production changes must never auto-apply on push.


🟢 2️⃣ Add prevent_destroy Protection

In envs/legacy/main.tf add to BOTH resources:

lifecycle {
  prevent_destroy = true
}

This prevents accidental deletion of:

  • EKS cluster
  • Node group

Now if someone runs terraform destroy, or pushes a change that forces the resource to be replaced, Terraform will refuse to destroy it. (It does not help if the whole resource block is deleted from the code, because the lifecycle setting is deleted with it.)


🟢 3️⃣ Protect main Branch

Go to:

GitLab → Settings → Repository → Protected Branches

Protect:

main

Enable:

  • No direct push
  • Merge request required
  • Approval required

Now production cannot be modified without review.


🟢 4️⃣ Test Drift Detection (Very Important)

Now simulate a real-world scenario:

  1. Go to AWS Console
  2. Change node group desired size from 0 → 1
  3. Push nothing
  4. Run CI plan

It should show:

~ desired_size = 1 -> 0

That proves:

Terraform is the source of truth.
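This drift check can be wired into a scheduled pipeline using `terraform plan -detailed-exitcode`, which exits 0 when there are no changes, 1 on error, and 2 when a diff (drift) exists. Below is a sketch of the exit-code handling, exercised against a sample code rather than a live plan.

```shell
# Classify the exit code of `terraform plan -detailed-exitcode`:
# 0 = no changes, 2 = drift detected, anything else = plan error.
classify_plan_exit() {
  case "$1" in
    0) echo "no-drift" ;;
    2) echo "drift" ;;
    *) echo "error" ;;
  esac
}

# In CI you would run (not executed here):
#   terraform plan -detailed-exitcode -input=false; rc=$?
rc=2   # sample exit code standing in for a detected drift

if [ "$(classify_plan_exit "$rc")" = "drift" ]; then
  echo "Drift detected: AWS no longer matches the code" >&2
fi
```

A scheduled pipeline that fails on the "drift" branch gives you an alert whenever someone changes production outside Terraform.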


๐Ÿ— Big Picture โ€” Where You Are Now

You have:

  • Legacy production cluster adopted
  • Remote state
  • Locked backend
  • CI-controlled plan
  • Working apply
  • Clean Git repo

This is real enterprise DevOps.


🔮 What Comes After This?

Now you can move into advanced level:

  1. Convert legacy folder into prod
  2. Refactor into reusable EKS module
  3. Add separate dev, stage, prod
  4. Introduce environment variables instead of hardcoded ARNs
  5. Add cost monitoring
  6. Add security scanning in CI
  7. Simulate Kubernetes version upgrade

🚀 Immediate Action For You

Before moving forward:

  • Change apply to manual
  • Add prevent_destroy
  • Protect branch

After that, your production infrastructure is protected to an enterprise-grade standard.


✅ envs/legacy/main.tf (with prevent_destroy)

############################################################
# EXISTING EKS CLUSTER (LEGACY - IMPORTED)
############################################################

resource "aws_eks_cluster" "legacy" {
  name     = "jum-eks"
  role_arn = "arn:aws:iam::021399177326:role/eks-admin-role"
  version  = "1.34"

  bootstrap_self_managed_addons = false
  enabled_cluster_log_types     = []

  access_config {
    authentication_mode                         = "API_AND_CONFIG_MAP"
    bootstrap_cluster_creator_admin_permissions = true
  }

  kubernetes_network_config {
    ip_family         = "ipv4"
    service_ipv4_cidr = "10.100.0.0/16"

    elastic_load_balancing {
      enabled = false
    }
  }

  upgrade_policy {
    support_type = "STANDARD"
  }

  vpc_config {
    subnet_ids = [
      "subnet-07378454a0b7e50ab",
      "subnet-0b7b72eb9bdb0786a",
      "subnet-0d8b4bfe228a38a18"
    ]

    security_group_ids      = []
    endpoint_public_access  = true
    endpoint_private_access = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }

  zonal_shift_config {
    enabled = false
  }

  tags = {}

  ##########################################################
  # PRODUCTION PROTECTION
  ##########################################################
  lifecycle {
    prevent_destroy = true
  }
}

############################################################
# EXISTING NODE GROUP (LEGACY - IMPORTED)
############################################################

resource "aws_eks_node_group" "legacy_nodes" {
  cluster_name    = "jum-eks"
  node_group_name = "nodes"

  node_role_arn = "arn:aws:iam::021399177326:role/node-roles"

  subnet_ids = [
    "subnet-07378454a0b7e50ab",
    "subnet-0b7b72eb9bdb0786a",
    "subnet-0d8b4bfe228a38a18"
  ]

  capacity_type  = "ON_DEMAND"
  instance_types = ["t3.medium"]
  ami_type       = "AL2023_x86_64_STANDARD"
  disk_size      = 20
  version        = "1.34"

  scaling_config {
    min_size     = 0
    max_size     = 1
    desired_size = 0
  }

  update_config {
    max_unavailable = 1
  }

  node_repair_config {
    enabled = false
  }

  labels = {}
  tags   = {}

  ##########################################################
  # PRODUCTION PROTECTION
  ##########################################################
  lifecycle {
    prevent_destroy = true
  }
}

🔒 What prevent_destroy Does

If someone:

  • Runs terraform destroy
  • Tries to destroy via CI
  • Changes the name (which forces the resource to be replaced)

Terraform will fail with:

Error: Instance cannot be destroyed
Resource aws_eks_cluster.legacy has lifecycle.prevent_destroy set, but the plan calls for this resource to be destroyed.

One caveat: deleting the resource block itself also deletes its lifecycle setting, so prevent_destroy cannot stop that case. Branch protection and code review are the backstop there.

This protects your production EKS.


🟢 Next Steps

After saving this file:

git add envs/legacy/main.tf
git commit -m "Add prevent_destroy to legacy EKS resources"
git push

Pipeline will run plan.

You should see:

No changes. Your infrastructure matches the configuration.

Because lifecycle does not change infrastructure — it only affects Terraform behavior.
