🎯 Goal
When you push to GitLab:
- CI runs `terraform plan` for `envs/legacy`
- It uses the remote S3 backend
- It assumes an AWS role via OIDC
- It does NOT auto-apply
- Apply is manual + protected
📁 Current Structure (Correct)
```
infra-live/
├── envs/
│   └── legacy/
│       ├── backend.tf
│       ├── main.tf
│       ├── providers.tf
│       ├── variables.tf
│       ├── terraform.tfvars
│       └── .terraform.lock.hcl
├── scripts/
│   └── assume_role.sh
├── .gitlab-ci.yml
└── .gitignore
```
Backend already configured:

```hcl
backend "s3" {}
```

State is already in S3. Good.
🟢 STEP 1 – Make Sure CI Has AWS Access
Your pipeline already uses OIDC via `scripts/assume_role.sh`.

It should:
- Assume the IAM role
- Export AWS credentials
- Print the caller identity

In the CI logs you should see successful output from:

```shell
aws sts get-caller-identity
```

If that works → AWS access is ready.
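The script itself isn't shown here, so this is only a minimal sketch of what `scripts/assume_role.sh` could look like. The variable names (`GITLAB_OIDC_TOKEN`, `ROLE_ARN`) are assumptions, not the real file:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/assume_role.sh -- variable names
# (GITLAB_OIDC_TOKEN, ROLE_ARN) are assumptions, not the actual file.
set -euo pipefail

# Crude extractor for "Key":"value" pairs in JSON, so the sketch needs no jq.
json_field() {
  printf '%s\n' "$1" | sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

if [ -n "${GITLAB_OIDC_TOKEN:-}" ]; then
  # Exchange the GitLab-issued OIDC token for temporary AWS credentials.
  CREDS=$(aws sts assume-role-with-web-identity \
    --role-arn "$ROLE_ARN" \
    --role-session-name "gitlab-ci-${CI_JOB_ID:-local}" \
    --web-identity-token "$GITLAB_OIDC_TOKEN" \
    --duration-seconds 3600)
  export AWS_ACCESS_KEY_ID=$(json_field "$CREDS" AccessKeyId)
  export AWS_SECRET_ACCESS_KEY=$(json_field "$CREDS" SecretAccessKey)
  export AWS_SESSION_TOKEN=$(json_field "$CREDS" SessionToken)
  aws sts get-caller-identity   # prove the role was assumed
fi
```

In GitLab, a token like `GITLAB_OIDC_TOKEN` would come from an `id_tokens:` block in the job definition, and the IAM role's trust policy must list GitLab as an OIDC provider.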
🟢 STEP 2 – Add a Legacy Plan Job in .gitlab-ci.yml

Add this block:
```yaml
stages:
  - plan
  - apply

legacy-plan:
  stage: plan
  image:
    name: hashicorp/terraform:1.7
    entrypoint: [""]   # the image's terraform entrypoint breaks GitLab's script runner
  before_script:
    - apk add --no-cache bash curl jq aws-cli
    - . scripts/assume_role.sh
  script:
    - cd envs/legacy
    - terraform init
      -backend-config="bucket=$TF_STATE_BUCKET"
      -backend-config="key=legacy/terraform.tfstate"
      -backend-config="region=$AWS_REGION"
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"
      -backend-config="encrypt=true"
    - terraform plan -var="aws_region=$AWS_REGION"
  only:
    - main
```
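`only:` works, but it is older GitLab syntax. On newer GitLab versions the equivalent `rules:` form (a sketch with the same behavior) would be:

```yaml
legacy-plan:
  # ...same image, before_script and script as above...
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
```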
🟢 STEP 3 – Add Manual Apply (Protected)
```yaml
legacy-apply:
  stage: apply
  image:
    name: hashicorp/terraform:1.7
    entrypoint: [""]   # the image's terraform entrypoint breaks GitLab's script runner
  before_script:
    - apk add --no-cache bash curl jq aws-cli
    - . scripts/assume_role.sh
  script:
    - cd envs/legacy
    - terraform init
      -backend-config="bucket=$TF_STATE_BUCKET"
      -backend-config="key=legacy/terraform.tfstate"
      -backend-config="region=$AWS_REGION"
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"
      -backend-config="encrypt=true"
    - terraform apply -auto-approve -var="aws_region=$AWS_REGION"
  when: manual
  only:
    - main
```
Important:

`when: manual`

This prevents automatic production changes.
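One common hardening on top of `when: manual` (an optional pattern, not part of the setup above) is to save the plan as an artifact and apply exactly that file, so the manual apply executes the plan that was reviewed rather than computing a fresh one:

```yaml
legacy-plan:
  script:
    - cd envs/legacy
    - terraform init ...   # same -backend-config flags as in Step 2
    - terraform plan -out=tfplan -var="aws_region=$AWS_REGION"
  artifacts:
    paths:
      - envs/legacy/tfplan
    expire_in: 1 week

legacy-apply:
  script:
    - cd envs/legacy
    - terraform init ...   # same -backend-config flags as in Step 2
    - terraform apply tfplan   # a saved plan applies without a prompt; no -auto-approve needed
  needs: ["legacy-plan"]
  when: manual
```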
🟢 STEP 4 – Add CI Variables in GitLab
Go to:
GitLab → Settings → CI/CD → Variables

Add:

AWS_REGION      = us-east-2
TF_STATE_BUCKET = jumptotech-terraform-state-021399177326
TF_LOCK_TABLE   = terraform-lock-table

Do NOT hardcode these in YAML.
🟢 STEP 5 – Protect the Main Branch
GitLab → Settings → Repository → Protected Branches
- Protect `main`
- Require merge requests
- Require approval
- Disable direct push

Enterprise rule:
No direct push to the production branch.
🟢 STEP 6 – Test It
Make a small, harmless change. Example:

```hcl
tags = {
  ManagedBy = "Terraform"
}
```

Commit and push:

```shell
git add .
git commit -m "Test CI plan"
git push
```
The pipeline should:
- Run `legacy-plan`
- Show the plan output
- NOT auto-apply

Then, if needed, manually trigger `legacy-apply` in the GitLab UI.
🔒 Safety Checklist Before Running CI
Before the first CI apply:
- `terraform plan` locally shows NO destroy
- `prevent_destroy` added
- Branch protected
- Apply job is manual
🏁 Final Result
After this:

Developer push → CI plan → MR approval → Manual apply → Production update

That is the correct enterprise workflow.
Perfect ✅

Your CI just did this:

```
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
```
That means:
- CI successfully assumed role
- Backend S3 is working
- DynamoDB locking is working
- State is correct
- No drift
- Infrastructure matches code
You are officially running Terraform from CI.
🎯 What Is Next?
Now we move from:
"CI works"
to
"Production is protected and controlled"

There are 4 important next steps.
🟢 1️⃣ STOP Auto-Apply in Production
Right now your job runs on every push:

```shell
terraform apply -auto-approve
```

This is NOT safe for production.

Add `when: manual` to the job:

```yaml
legacy-apply:
  stage: apply
  when: manual
```

Now the apply must be triggered manually in the GitLab UI. (Keeping `-auto-approve` is still fine: CI cannot answer an interactive prompt, so the manual gate is the click on the job itself.)

Enterprise rule:
Production changes must never auto-apply on push.
🟢 2️⃣ Add prevent_destroy Protection
In `envs/legacy/main.tf`, add to BOTH resources:

```hcl
lifecycle {
  prevent_destroy = true
}
```

This prevents accidental deletion of:
- The EKS cluster
- The node group

Now if someone runs `terraform destroy`, or makes a change that would force a replacement, Terraform will refuse to destroy the resource. (Note: it does not guard against deleting the resource block itself – removing the block removes the lifecycle setting with it, which is why branch protection and review still matter.)
🟢 3️⃣ Protect the main Branch
Go to:
GitLab → Settings → Repository → Protected Branches

Protect `main` and enable:
- No direct push
- Merge request required
- Approval required

Now production cannot be modified without review.
🟢 4️⃣ Test Drift Detection (Very Important)
Now simulate a real-world scenario:
- Go to the AWS Console
- Change the node group desired size from 0 → 1
- Push nothing
- Run the CI plan

It should show:

```
~ desired_size = 1 -> 0
```

That proves Terraform is the source of truth.
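If you prefer the CLI over the console, the same drift can be created with a small helper. This is a sketch: the cluster and node group names are taken from the legacy config above, and the call must run from a shell that already has AWS credentials.

```shell
# Hypothetical drift helper -- wraps the AWS CLI call in a function so
# nothing touches AWS when this file is merely sourced.
simulate_drift() {
  aws eks update-nodegroup-config \
    --cluster-name jum-eks \
    --nodegroup-name nodes \
    --scaling-config minSize=0,maxSize=1,desiredSize=1
}
# Call simulate_drift manually, then trigger the CI plan and watch it
# flag the out-of-band change.
```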
📍 Big Picture – Where You Are Now
You have:
- Legacy production cluster adopted
- Remote state
- Locked backend
- CI-controlled plan
- Working apply
- Clean Git repo
This is real enterprise DevOps.
🔮 What Comes After This?
Now you can move into the advanced level:
- Convert the legacy folder into `prod`
- Refactor into a reusable EKS module
- Add separate `dev`, `stage`, and `prod` environments
- Introduce environment variables instead of hardcoded ARNs
- Add cost monitoring
- Add security scanning in CI
- Simulate a Kubernetes version upgrade
🏁 Immediate Action For You
Before moving forward:
- Change apply to manual
- Add prevent_destroy
- Protect branch
After that, your production infrastructure is enterprise-grade and production-safe.
✅ envs/legacy/main.tf (with prevent_destroy)
```hcl
############################################################
# EXISTING EKS CLUSTER (LEGACY - IMPORTED)
############################################################
resource "aws_eks_cluster" "legacy" {
  name     = "jum-eks"
  role_arn = "arn:aws:iam::021399177326:role/eks-admin-role"
  version  = "1.34"

  bootstrap_self_managed_addons = false
  enabled_cluster_log_types     = []

  access_config {
    authentication_mode                         = "API_AND_CONFIG_MAP"
    bootstrap_cluster_creator_admin_permissions = true
  }

  kubernetes_network_config {
    ip_family         = "ipv4"
    service_ipv4_cidr = "10.100.0.0/16"

    elastic_load_balancing {
      enabled = false
    }
  }

  upgrade_policy {
    support_type = "STANDARD"
  }

  vpc_config {
    subnet_ids = [
      "subnet-07378454a0b7e50ab",
      "subnet-0b7b72eb9bdb0786a",
      "subnet-0d8b4bfe228a38a18"
    ]
    security_group_ids      = []
    endpoint_public_access  = true
    endpoint_private_access = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }

  zonal_shift_config {
    enabled = false
  }

  tags = {}

  ##########################################################
  # PRODUCTION PROTECTION
  ##########################################################
  lifecycle {
    prevent_destroy = true
  }
}

############################################################
# EXISTING NODE GROUP (LEGACY - IMPORTED)
############################################################
resource "aws_eks_node_group" "legacy_nodes" {
  cluster_name    = "jum-eks"
  node_group_name = "nodes"
  node_role_arn   = "arn:aws:iam::021399177326:role/node-roles"

  subnet_ids = [
    "subnet-07378454a0b7e50ab",
    "subnet-0b7b72eb9bdb0786a",
    "subnet-0d8b4bfe228a38a18"
  ]

  capacity_type  = "ON_DEMAND"
  instance_types = ["t3.medium"]
  ami_type       = "AL2023_x86_64_STANDARD"
  disk_size      = 20
  version        = "1.34"

  scaling_config {
    min_size     = 0
    max_size     = 1
    desired_size = 0
  }

  update_config {
    max_unavailable = 1
  }

  node_repair_config {
    enabled = false
  }

  labels = {}
  tags   = {}

  ##########################################################
  # PRODUCTION PROTECTION
  ##########################################################
  lifecycle {
    prevent_destroy = true
  }
}
```
🔒 What prevent_destroy Does
If someone:
- Runs `terraform destroy`
- Tries to destroy via CI
- Changes an attribute (such as `name`) that forces the resource to be replaced

Terraform will fail with:

```
Error: Instance cannot be destroyed
Resource has lifecycle.prevent_destroy set
```

This protects your production EKS. (Caveat: deleting the resource block entirely also deletes the lifecycle setting, so the protection only applies while the config is in place.)
🟢 Next Steps
After saving this file:

```shell
git add envs/legacy/main.tf
git commit -m "Add prevent_destroy to legacy EKS resources"
git push
```
The pipeline will run a plan. You should see:

```
No changes.
```

Because `lifecycle` does not change infrastructure – it only affects Terraform's behavior.