Series: From "Just Put It on a Server" to Production DevOps
Reading time: 18 minutes
Level: Intermediate
The ClickOps Problem
Your Kubernetes cluster is running perfectly. Traffic is growing. Your CTO says:
"We need a disaster recovery environment in another region."
Your thought process:
- Log into Linode dashboard
- Click "Create Cluster"
- Fill out form (region, node count, instance type)
- Wait 10 minutes
- Download kubeconfig
- Apply all Kubernetes manifests
- Update DNS
- Configure monitoring
- Set up backups
- Repeat for staging environment
- Realize you made a typo in production and it's different from staging
Time: 2-3 hours
Error-prone: ✅ (what settings did I use for production?)
Reproducible: ❌ (did I enable autoscaling? what was the node size?)
Documented: ❌ (it's in my head)
Your manager: "Can you create a dev cluster for the new engineer?"
You: [internal screaming]
This is called "ClickOps": managing infrastructure through web UI clicks.
Problems:
- ❌ Not reproducible - Can't recreate exact environment
- ❌ No version control - No history of changes
- ❌ No code review - No approval process
- ❌ Hard to scale - Can't manage 10+ environments
- ❌ No automation - Manual work for every change
Solution: Infrastructure as Code (IaC)
What is Infrastructure as Code?
Infrastructure as Code (IaC): Define infrastructure in code files, apply with CLI tools.
Instead of clicking:
# Clicks in Linode dashboard:
1. Create Kubernetes Cluster
2. Choose region: us-east
3. Select nodes: 3x g6-standard-4
4. Enable autoscaling: 3-10 nodes
5. Click "Create"
You write code:
# infrastructure/terraform/main.tf
resource "linode_lke_cluster" "sspp_prod" {
label = "sspp-prod"
k8s_version = "1.28"
region = "us-east"
pool {
type = "g6-standard-4"
count = 3
autoscaler {
min = 3
max = 10
}
}
}
Then apply:
terraform apply
Benefits:
- ✅ Version controlled - Git tracks all changes
- ✅ Reproducible - Spin up identical environments
- ✅ Code review - PRs for infrastructure changes
- ✅ Automated - CI/CD can apply changes
- ✅ Self-documenting - Code is the documentation
Why Terraform?
IaC Tools:
- Terraform (HashiCorp) - Multi-cloud, largest ecosystem
- Pulumi - Use real programming languages (TypeScript, Python)
- CloudFormation - AWS-only, JSON/YAML templates
- Ansible - Configuration management that can also provision infrastructure
- CDK (Cloud Development Kit) - AWS; write real code that synthesizes to CloudFormation
We're using Terraform because:
- ✅ Multi-cloud (Linode, AWS, GCP, Azure)
- ✅ Declarative (describe desired state)
- ✅ Large provider ecosystem
- ✅ Industry standard
- ✅ Free to use (note: since 2023 Terraform is source-available under the BSL rather than open source; OpenTofu is the open-source fork)
Terraform Fundamentals
Core Concepts
1. Providers: Plugins for cloud platforms
terraform {
required_providers {
linode = {
source = "linode/linode"
version = "~> 2.9.0"
}
}
}
provider "linode" {
token = var.linode_token
}
2. Resources: Infrastructure components
resource "linode_instance" "web" {
label = "web-server"
region = "us-east"
type = "g6-standard-1"
image = "linode/ubuntu22.04"
}
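Resources can also reference each other's attributes, and Terraform builds a dependency graph from those references to decide creation order. A sketch (the `linode_domain.example` resource is assumed to exist elsewhere):

```hcl
# Because the record reads the instance's IP address, Terraform knows
# to create linode_instance.web before this DNS record.
resource "linode_domain_record" "web" {
  domain_id   = linode_domain.example.id
  name        = "www"
  record_type = "A"
  target      = linode_instance.web.ip_address
}
```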
3. Variables: Parameterize configuration
variable "environment" {
type = string
default = "production"
}
resource "linode_lke_cluster" "cluster" {
label = "sspp-${var.environment}"
}
4. Outputs: Extract values
output "cluster_id" {
value = linode_lke_cluster.cluster.id
}
output "kubeconfig" {
value = linode_lke_cluster.cluster.kubeconfig
sensitive = true
}
5. State: Terraform tracks what it created
terraform.tfstate # Current infrastructure state
Terraform Workflow
# 1. Write configuration
vim main.tf
# 2. Initialize (download providers)
terraform init
# 3. Preview changes
terraform plan
# 4. Apply changes
terraform apply
# 5. Destroy infrastructure (cleanup)
terraform destroy
Building SSPP Infrastructure with Terraform
Project Structure
infrastructure/terraform/
├── main.tf # Main configuration
├── variables.tf # Input variables
├── outputs.tf # Output values
├── versions.tf # Provider versions
├── terraform.tfvars # Variable values (gitignored)
├── backend.tf # Remote state config
├── modules/
│ ├── lke-cluster/ # Kubernetes cluster module
│ ├── networking/ # VPC, firewall rules
│ └── monitoring/ # Monitoring setup
└── environments/
├── dev/
├── staging/
└── prod/
Step 1: Provider Configuration
# infrastructure/terraform/versions.tf
terraform {
required_version = ">= 1.6.0"
required_providers {
linode = {
source = "linode/linode"
version = "~> 2.9.0"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.24.0"
}
}
}
provider "linode" {
token = var.linode_token
}
# The LKE kubeconfig is base64-encoded YAML; decode it once and pull
# out the API endpoint, CA certificate, and token.
# (api_endpoints is just a list of URL strings, with no token or CA fields.)
locals {
  kubeconfig = yamldecode(base64decode(linode_lke_cluster.sspp.kubeconfig))
}

provider "kubernetes" {
  host                   = local.kubeconfig.clusters[0].cluster.server
  cluster_ca_certificate = base64decode(local.kubeconfig.clusters[0].cluster["certificate-authority-data"])
  token                  = local.kubeconfig.users[0].user.token
}
Step 2: Variables
# infrastructure/terraform/variables.tf
variable "linode_token" {
description = "Linode API token"
type = string
sensitive = true
}
variable "environment" {
description = "Environment name (dev, staging, prod)"
type = string
default = "prod"
}
variable "region" {
description = "Linode region"
type = string
default = "us-east"
}
variable "k8s_version" {
description = "Kubernetes version"
type = string
default = "1.28"
}
variable "node_type" {
description = "Linode instance type for nodes"
type = string
default = "g6-standard-4"
}
variable "node_count" {
description = "Initial node count"
type = number
default = 3
}
variable "autoscaler_min" {
description = "Minimum nodes for autoscaling"
type = number
default = 3
}
variable "autoscaler_max" {
description = "Maximum nodes for autoscaling"
type = number
default = 10
}
variable "tags" {
description = "Tags for resources"
type = list(string)
default = ["sspp", "production"]
}
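Variables can also carry validation rules that fail `terraform plan` early on bad input. For example, the `environment` variable above could be tightened like this:

```hcl
variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  default     = "prod"

  # Reject anything outside the known environment names at plan time
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}
```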
Step 3: LKE Cluster
# infrastructure/terraform/main.tf
resource "linode_lke_cluster" "sspp" {
label = "sspp-${var.environment}"
k8s_version = var.k8s_version
region = var.region
tags = var.tags
pool {
type = var.node_type
count = var.node_count
autoscaler {
min = var.autoscaler_min
max = var.autoscaler_max
}
}
control_plane {
high_availability = var.environment == "prod"
}
}
# Save kubeconfig to file
resource "local_file" "kubeconfig" {
content = base64decode(linode_lke_cluster.sspp.kubeconfig)
filename = "${path.module}/kubeconfig-${var.environment}"
file_permission = "0600"
}
Step 4: NodeBalancer for LoadBalancer Services
# infrastructure/terraform/nodebalancer.tf
resource "linode_nodebalancer" "sspp_api" {
label = "sspp-api-${var.environment}"
region = var.region
tags = var.tags
}
resource "linode_nodebalancer_config" "sspp_api_https" {
nodebalancer_id = linode_nodebalancer.sspp_api.id
port = 443
protocol = "https"
check = "http"
check_path = "/api/v1/health"
check_attempts = 3
check_timeout = 5
ssl_cert = var.ssl_certificate
ssl_key = var.ssl_private_key
}
# Output NodeBalancer IP
output "api_load_balancer_ip" {
value = linode_nodebalancer.sspp_api.ipv4
description = "Load balancer IP for API service"
}
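The config above references `var.ssl_certificate` and `var.ssl_private_key`, which aren't declared in variables.tf yet; a minimal sketch:

```hcl
variable "ssl_certificate" {
  description = "PEM-encoded TLS certificate served by the NodeBalancer"
  type        = string
  sensitive   = true
}

variable "ssl_private_key" {
  description = "PEM-encoded private key for the TLS certificate"
  type        = string
  sensitive   = true
}
```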
Step 5: Object Storage for Backups
# infrastructure/terraform/object-storage.tf
resource "linode_object_storage_bucket" "backups" {
cluster = "us-east-1"
label = "sspp-backups-${var.environment}"
lifecycle_rule {
enabled = true
expiration {
days = 30
}
}
}
resource "linode_object_storage_key" "backup_access" {
label = "sspp-backup-access-${var.environment}"
bucket_access {
bucket_name = linode_object_storage_bucket.backups.label
cluster = linode_object_storage_bucket.backups.cluster
permissions = "read_write"
}
}
output "backup_bucket_url" {
value = "https://${linode_object_storage_bucket.backups.cluster}.linodeobjects.com/${linode_object_storage_bucket.backups.label}"
}
Step 6: Firewall Rules
# infrastructure/terraform/firewall.tf
resource "linode_firewall" "cluster" {
label = "sspp-cluster-${var.environment}"
tags = var.tags
inbound {
label = "allow-https"
action = "ACCEPT"
protocol = "TCP"
ports = "443"
ipv4 = ["0.0.0.0/0"]
ipv6 = ["::/0"]
}
inbound {
label = "allow-http"
action = "ACCEPT"
protocol = "TCP"
ports = "80"
ipv4 = ["0.0.0.0/0"]
ipv6 = ["::/0"]
}
inbound {
label = "allow-k8s-api"
action = "ACCEPT"
protocol = "TCP"
ports = "6443"
ipv4 = var.allowed_ip_ranges
}
outbound_policy = "ACCEPT"
linodes = [for node in linode_lke_cluster.sspp.pool[0].nodes : node.instance_id]
}
Note: this attaches the firewall to the node instances that exist at apply time; nodes the autoscaler adds later won't be covered until the next terraform apply.
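`var.allowed_ip_ranges` also needs a declaration; a sketch that defaults to allowing nothing until you add your office or VPN ranges:

```hcl
variable "allowed_ip_ranges" {
  description = "CIDR blocks allowed to reach the Kubernetes API (port 6443)"
  type        = list(string)
  default     = [] # e.g. ["203.0.113.0/24"]
}
```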
Step 7: DNS Records
# infrastructure/terraform/dns.tf
resource "linode_domain" "sspp" {
domain = var.domain_name
soa_email = var.admin_email
type = "master"
tags = var.tags
}
resource "linode_domain_record" "api" {
domain_id = linode_domain.sspp.id
name = "api"
record_type = "A"
target = linode_nodebalancer.sspp_api.ipv4
ttl_sec = 300
}
resource "linode_domain_record" "wildcard" {
domain_id = linode_domain.sspp.id
name = "*"
record_type = "A"
target = linode_nodebalancer.sspp_api.ipv4
ttl_sec = 300
}
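As with the earlier variables, `domain_name` and `admin_email` need declarations; a sketch:

```hcl
variable "domain_name" {
  description = "Base domain for the project, e.g. sspp.example.com"
  type        = string
}

variable "admin_email" {
  description = "Contact email used in the domain's SOA record"
  type        = string
}
```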
Step 8: Outputs
# infrastructure/terraform/outputs.tf
output "cluster_id" {
description = "LKE cluster ID"
value = linode_lke_cluster.sspp.id
}
output "cluster_endpoint" {
description = "Kubernetes API endpoint"
value = linode_lke_cluster.sspp.api_endpoints[0].endpoint
}
output "kubeconfig_path" {
description = "Path to kubeconfig file"
value = local_file.kubeconfig.filename
}
output "api_endpoint" {
description = "API public endpoint"
value = "https://api.${var.domain_name}"
}
output "backup_bucket" {
description = "S3-compatible backup bucket URL"
value = "https://${linode_object_storage_bucket.backups.cluster}.linodeobjects.com/${linode_object_storage_bucket.backups.label}"
}
output "backup_access_key" {
description = "Backup bucket access key"
value = linode_object_storage_key.backup_access.access_key
sensitive = true
}
output "backup_secret_key" {
description = "Backup bucket secret key"
value = linode_object_storage_key.backup_access.secret_key
sensitive = true
}
Remote State Management
Problem: terraform.tfstate is stored locally by default. Lose the file and Terraform can no longer track what it created; keep it on one laptop and your team can't collaborate safely.
Solution: Store state remotely (S3, Terraform Cloud).
# infrastructure/terraform/backend.tf
terraform {
backend "s3" {
bucket = "sspp-terraform-state"
key = "prod/terraform.tfstate"
region = "us-east-1"
encrypt = true
# State locking with DynamoDB
dynamodb_table = "terraform-state-lock"
}
}
Benefits:
- ✅ Team collaboration (shared state)
- ✅ State locking (prevents concurrent modifications)
- ✅ Encrypted at rest
- ✅ Version history
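Remote state also lets other Terraform configurations consume your outputs through the `terraform_remote_state` data source; a sketch using the bucket and key configured above:

```hcl
data "terraform_remote_state" "infra" {
  backend = "s3"
  config = {
    bucket = "sspp-terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"
  }
}

# Then reference any output from that state, for example:
# data.terraform_remote_state.infra.outputs.cluster_id
```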
Alternative: Terraform Cloud
terraform {
cloud {
organization = "sspp"
workspaces {
name = "sspp-prod"
}
}
}
Using Terraform
Initialize
cd infrastructure/terraform
# Download providers
terraform init
Output:
Initializing the backend...
Initializing provider plugins...
- Finding linode/linode versions matching "~> 2.9.0"...
- Installing linode/linode v2.9.3...
Terraform has been successfully initialized!
Create Variable File
# infrastructure/terraform/terraform.tfvars
linode_token = "YOUR_LINODE_API_TOKEN"
environment = "prod"
region = "us-east"
domain_name = "sspp.example.com"
admin_email = "admin@example.com"
node_type = "g6-standard-4"
node_count = 3
autoscaler_min = 3
autoscaler_max = 10
tags = ["sspp", "production", "managed-by-terraform"]
Add to .gitignore:
echo "terraform.tfvars" >> .gitignore
echo "*.tfstate*" >> .gitignore
echo "kubeconfig-*" >> .gitignore
Plan (Preview Changes)
terraform plan
Output:
Terraform will perform the following actions:
# linode_lke_cluster.sspp will be created
+ resource "linode_lke_cluster" "sspp" {
+ id = (known after apply)
+ label = "sspp-prod"
+ k8s_version = "1.28"
+ region = "us-east"
+ pool {
+ count = 3
+ type = "g6-standard-4"
+ autoscaler {
+ min = 3
+ max = 10
}
}
}
# linode_object_storage_bucket.backups will be created
+ resource "linode_object_storage_bucket" "backups" {
+ cluster = "us-east-1"
+ label = "sspp-backups-prod"
}
Plan: 8 to add, 0 to change, 0 to destroy.
Review carefully! This is your code review checkpoint.
Apply
terraform apply
Terraform asks for confirmation:
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
Provisioning:
linode_lke_cluster.sspp: Creating...
linode_object_storage_bucket.backups: Creating...
linode_lke_cluster.sspp: Still creating... [10s elapsed]
linode_lke_cluster.sspp: Still creating... [5m0s elapsed]
linode_lke_cluster.sspp: Creation complete after 8m23s
Apply complete! Resources: 8 added, 0 changed, 0 destroyed.
Outputs:
cluster_id = "12345"
api_endpoint = "https://api.sspp.example.com"
kubeconfig_path = "./kubeconfig-prod"
Your infrastructure is now live! 🎉
Configure kubectl
export KUBECONFIG=$(terraform output -raw kubeconfig_path)
kubectl get nodes
Output:
NAME STATUS ROLES AGE VERSION
lke12345-67890-abc123 Ready <none> 5m v1.28.3
lke12345-67890-def456 Ready <none> 5m v1.28.3
lke12345-67890-ghi789 Ready <none> 5m v1.28.3
Terraform Modules for Reusability
Problem: Duplicating configuration for dev, staging, prod.
Solution: Create reusable modules.
Module Structure
infrastructure/terraform/modules/lke-cluster/
├── main.tf
├── variables.tf
└── outputs.tf
Module definition:
# infrastructure/terraform/modules/lke-cluster/main.tf
resource "linode_lke_cluster" "cluster" {
label = var.cluster_name
k8s_version = var.k8s_version
region = var.region
tags = var.tags
pool {
type = var.node_type
count = var.node_count
autoscaler {
min = var.autoscaler_min
max = var.autoscaler_max
}
}
control_plane {
high_availability = var.high_availability
}
}
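A complete module also declares its inputs and outputs; an illustrative excerpt of the outputs (names mirror the resources above):

```hcl
# infrastructure/terraform/modules/lke-cluster/outputs.tf (sketch)
output "cluster_id" {
  value = linode_lke_cluster.cluster.id
}

output "kubeconfig" {
  value     = linode_lke_cluster.cluster.kubeconfig
  sensitive = true
}
```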
Using the module:
# infrastructure/terraform/environments/prod/main.tf
module "prod_cluster" {
source = "../../modules/lke-cluster"
cluster_name = "sspp-prod"
region = "us-east"
k8s_version = "1.28"
node_type = "g6-standard-4"
node_count = 3
autoscaler_min = 3
autoscaler_max = 10
high_availability = true
tags = ["sspp", "prod"]
}
# infrastructure/terraform/environments/dev/main.tf
module "dev_cluster" {
source = "../../modules/lke-cluster"
cluster_name = "sspp-dev"
region = "us-east"
k8s_version = "1.28"
node_type = "g6-standard-2" # Smaller nodes
node_count = 2
autoscaler_min = 2
autoscaler_max = 5
high_availability = false # Dev doesn't need HA
tags = ["sspp", "dev"]
}
Benefits:
- ✅ DRY (Don't Repeat Yourself)
- ✅ Consistent configuration
- ✅ Easy to update (change module, affects all environments)
Terraform Best Practices
1. Use Variables for Everything
# ❌ Bad
resource "linode_lke_cluster" "cluster" {
label = "sspp-prod"
region = "us-east"
}
# ✅ Good
resource "linode_lke_cluster" "cluster" {
label = "sspp-${var.environment}"
region = var.region
}
2. Tag All Resources
tags = [
"environment:${var.environment}",
"managed-by:terraform",
"project:sspp",
"owner:devops-team"
]
Why: Cost tracking, resource filtering, compliance.
3. Use Remote State
# Never commit terraform.tfstate to git!
terraform {
backend "s3" {
bucket = "terraform-state"
key = "prod/terraform.tfstate"
}
}
4. Lock Provider Versions
# ❌ Bad - Could break with provider updates
terraform {
required_providers {
linode = {
source = "linode/linode"
}
}
}
# ✅ Good - Explicit version
terraform {
required_providers {
linode = {
source = "linode/linode"
version = "~> 2.9.0" # Allow 2.9.x, not 2.10.x
}
}
}
5. Use terraform fmt and validate
# Format code
terraform fmt -recursive
# Validate syntax
terraform validate
# Check for issues
tflint
6. Plan Before Apply
# Always review plan
terraform plan -out=tfplan
# Review changes
less tfplan
# Apply only if looks good
terraform apply tfplan
7. Use Workspaces for Environments
# Create workspaces
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod
# Switch workspace
terraform workspace select prod
# List workspaces
terraform workspace list
Different state per workspace.
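Inside the configuration, the active workspace is exposed as `terraform.workspace`, so one set of files can adapt per environment. A sketch (the sizing logic is illustrative):

```hcl
locals {
  # "dev", "staging", or "prod", depending on the selected workspace
  env        = terraform.workspace
  node_count = terraform.workspace == "prod" ? 3 : 1
}

resource "linode_lke_cluster" "cluster" {
  label       = "sspp-${local.env}"
  region      = var.region
  k8s_version = var.k8s_version

  pool {
    type  = var.node_type
    count = local.node_count
  }
}
```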
Terraform in CI/CD
GitHub Actions Workflow
# .github/workflows/terraform.yml
name: Terraform Infrastructure
on:
push:
branches: [main]
paths:
- 'infrastructure/terraform/**'
pull_request:
branches: [main]
paths:
- 'infrastructure/terraform/**'
env:
TF_VERSION: 1.6.0
jobs:
terraform:
name: Terraform Plan & Apply
runs-on: ubuntu-latest
defaults:
run:
working-directory: infrastructure/terraform
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: ${{ env.TF_VERSION }}
- name: Terraform Format Check
run: terraform fmt -check -recursive
- name: Terraform Init
env:
LINODE_TOKEN: ${{ secrets.LINODE_TOKEN }}
run: terraform init
- name: Terraform Validate
run: terraform validate
- name: Terraform Plan
env:
LINODE_TOKEN: ${{ secrets.LINODE_TOKEN }}
run: |
terraform plan -out=tfplan
terraform show -no-color tfplan > plan.txt
- name: Comment PR with Plan
if: github.event_name == 'pull_request'
uses: actions/github-script@v7
with:
script: |
const fs = require('fs');
const plan = fs.readFileSync('infrastructure/terraform/plan.txt', 'utf8');
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: '```terraform\n' + plan + '\n```'
});
- name: Terraform Apply
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
env:
LINODE_TOKEN: ${{ secrets.LINODE_TOKEN }}
run: terraform apply tfplan # a saved plan applies without an interactive prompt
Workflow:
- PR: runs terraform plan and comments the plan on the PR
- Merge to main: runs terraform apply
- Infrastructure changes are code-reviewed!
Managing Kubernetes Resources with Terraform
You can also manage Kubernetes resources with Terraform:
# infrastructure/terraform/kubernetes.tf
provider "kubernetes" {
  # Connection details come from the base64-encoded kubeconfig
  # (api_endpoints is only a list of URL strings, with no token or CA fields)
  host                   = yamldecode(base64decode(linode_lke_cluster.sspp.kubeconfig)).clusters[0].cluster.server
  cluster_ca_certificate = base64decode(yamldecode(base64decode(linode_lke_cluster.sspp.kubeconfig)).clusters[0].cluster["certificate-authority-data"])
  token                  = yamldecode(base64decode(linode_lke_cluster.sspp.kubeconfig)).users[0].user.token
}
resource "kubernetes_namespace" "sspp_prod" {
metadata {
name = "sspp-prod"
labels = {
environment = "production"
managed-by = "terraform"
}
}
}
resource "kubernetes_secret" "sspp_secrets" {
metadata {
name = "sspp-secrets"
namespace = kubernetes_namespace.sspp_prod.metadata[0].name
}
data = {
DB_PASSWORD = var.db_password
JWT_SECRET = var.jwt_secret
}
type = "Opaque"
}
But should you?
Pros:
- Everything in one place
- Terraform manages both cluster and apps
Cons:
- Mixing infrastructure and application concerns
- kubectl is faster for iteration
- ArgoCD is better for GitOps
Best practice: Use Terraform for infrastructure (cluster, nodes), use Kubernetes manifests or ArgoCD for applications.
Real-World Example: Multi-Environment Setup
infrastructure/terraform/
├── modules/
│ └── lke-cluster/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── terraform.tfvars
│ │ └── backend.tf
│ ├── staging/
│ │ ├── main.tf
│ │ ├── terraform.tfvars
│ │ └── backend.tf
│ └── prod/
│ ├── main.tf
│ ├── terraform.tfvars
│ └── backend.tf
Deploy dev:
cd environments/dev
terraform init
terraform apply
Deploy staging:
cd environments/staging
terraform init
terraform apply
Deploy prod:
cd environments/prod
terraform init
terraform apply
Identical configuration, different parameters.
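For illustration, the dev environment's `terraform.tfvars` might look like this (assuming its `main.tf` exposes matching variables; never commit real tokens):

```hcl
# environments/dev/terraform.tfvars (illustrative)
environment    = "dev"
region         = "us-east"
node_type      = "g6-standard-2"
node_count     = 2
autoscaler_min = 2
autoscaler_max = 5
```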
Cost Estimation with Infracost
How much will this cost?
# Install Infracost
brew install infracost
# Get cost estimate
infracost breakdown --path infrastructure/terraform
# Output:
# ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
# ┃ Resource ┃ Monthly cost ┃
# ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━━┫
# ┃ linode_lke_cluster.sspp ┃ ┃
# ┃ └─ Node pool (3x g6-4) ┃ $120.00 ┃
# ┃ linode_nodebalancer ┃ $10.00 ┃
# ┃ object_storage_bucket ┃ $5.00 ┃
# ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━━┫
# ┃ TOTAL ┃ $135.00 ┃
# ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━┛
Add to CI/CD:
- name: Cost estimate
run: |
infracost breakdown --path . --format json > cost.json
infracost comment github --path cost.json
PR comments show cost changes!
What We Solved
✅ Reproducible infrastructure - Code defines everything
✅ Version controlled - Git tracks all changes
✅ Code reviewed - PRs for infrastructure changes
✅ Multi-environment - Dev, staging, prod from same code
✅ Self-documenting - Terraform files are the docs
✅ Automated - CI/CD applies changes
✅ Cost visible - Infracost shows price before apply
What's Next?
We have infrastructure as code with Terraform. But application deployment is still manual:
kubectl apply -f k8s/
Problems:
- ❌ No rollback mechanism
- ❌ No deployment history
- ❌ Manual sync between Git and cluster
- ❌ No automatic sync when manifests change
In Part 9, we'll add GitOps with ArgoCD:
- Git as single source of truth
- Automatic sync (Git → Cluster)
- Deployment history and rollback
- Multi-cluster management
- Self-service deployments
Push to Git → ArgoCD deploys automatically.
Try It Yourself
Challenge: Create complete Terraform infrastructure:
- Create Linode account (free $100 credit)
- Get API token
- Write Terraform configuration for LKE cluster
- Add NodeBalancer, object storage, firewall
- Create dev, staging, prod environments
- Apply with Terraform
- Deploy SSPP to each environment
- Add cost estimation with Infracost
- Set up CI/CD for Terraform
Bonus: Manage Kubernetes namespaces with Terraform.
Discussion
Do you use Terraform? Pulumi? CloudFormation? What's your IaC tool of choice?
Share on GitHub Discussions.
Previous: Part 6: Kubernetes Without Magic
Next: Part 8: Helm - Packaging Kubernetes Applications
About the Author
Building this series for my Proton.ai application to demonstrate real DevOps thinking.
- GitHub: @daviesbrown
- LinkedIn: David Nwosu