For infra engineers tired of overengineering. Real-world infra-as-code setups that scale without eating your weekends. Hooks into CI/CD and container orchestration, too.
Introduction: DevOps shouldn’t feel like defusing a bomb
There was a point when I dreaded opening Terraform files.
One module update would cascade into a 200-line plan, followed by a 45-minute CI run, and then a production deployment that broke staging for reasons no one could explain.
Sound familiar?
DevOps tools are supposed to make our lives easier, not trap us in YAML dungeons or TypeScript inferno. But when every deploy feels like walking through a minefield of hand-rolled scripts, snowflake infra, and tribal CI rituals… burnout isn’t far behind.
This article isn’t a “which tool is best” kind of post. It’s about the stack that actually let me sleep on weekends.
The combo of Terraform, Pulumi, and Ansible, wired up with clean CI/CD and container deployment patterns, gave me a setup I could trust, scale, and explain to a junior without writing a 90-page wiki.
If you’re tired of being a YAML archaeologist or a Jenkins hostage, this one’s for you.
Table of contents
- Terraform: My go-to for infrastructure, but with guardrails
- Pulumi: TypeScript over YAML? Yes, please
- Ansible: Still unbeatable for config management
- CI/CD glue: GitHub Actions + sane workflows
- Container orchestration without Kubernetes-induced stress
- What not to do: Anti-patterns I’ve painfully debugged
- Conclusion: The stack that didn’t burn me out
- Helpful resources: Docs, examples, and communities worth bookmarking
1. Terraform: My go-to for infrastructure, but with guardrails
Terraform is like that friend who’s incredibly helpful but only if you set boundaries.
It’ll happily spin up your entire cloud infra (VPCs, EC2 instances, RDS databases, IAM roles, DNS records) and then try to delete half of it if you sneeze in the wrong `main.tf`.
That’s why I keep my Terraform usage simple, modular (not overmodular), and always behind a CI step that yells at me if I mess up formatting, validation, or state.
The rule of thumb: Minimal modules, maximum sanity
You don’t need a `network-vpc-shared-core` module that depends on another `terraform-aws-vpc-core-enhanced` module that calls a `child_private_subnet` module with 23 inputs.
This is enough to get an EC2 instance up in a dedicated VPC:
```hcl
provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "main" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0" # AMI IDs are region-specific; this one is us-east-1
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.main.id

  tags = {
    Name = "web-server"
  }
}
```
That’s it. You can add modules later. But this? This deploys and works.
State management: remote backend or bust
If you’re still committing `terraform.tfstate` to Git... please stop. You’re not just versioning infra, you’re playing Russian roulette with prod.
Use a remote backend; S3 with DynamoDB locking is the classic AWS-safe combo:
```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state-bucket"
    key            = "envs/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tf-state-lock"
  }
}
```
This prevents coworkers (or your future self) from stepping on your plans at the worst moment.
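One gotcha: the backend block assumes the bucket and lock table already exist. A minimal bootstrap sketch, applied once from a separate config (names match the backend above; adjust to taste):

```hcl
# One-time bootstrap: create the state bucket and lock table
# before switching this project over to the S3 backend.
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-tf-state-bucket"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "tf_lock" {
  name         = "tf-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # Terraform's S3 backend locking requires this exact key

  attribute {
    name = "LockID"
    type = "S"
  }
}
```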
CI sanity: Format it, validate it, plan it
My go-to `terraform` steps in GitHub Actions:
```yaml
- name: Terraform Format
  run: terraform fmt -check

- name: Terraform Init
  run: terraform init

- name: Terraform Validate
  run: terraform validate

- name: Terraform Plan
  run: terraform plan -out=tfplan
```
You can skip the apply in CI unless you trust your team like family. And even then… maybe not.
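If you do let CI apply, a sane middle ground is gating the apply job behind a GitHub environment with required reviewers, so a human clicks approve first. A sketch (job names and the `production` environment are assumptions; it re-runs the plan rather than passing the plan file between jobs):

```yaml
apply:
  needs: plan
  runs-on: ubuntu-latest
  environment: production # configure required reviewers on this environment
  steps:
    - uses: actions/checkout@v3
    - run: terraform init
    # Re-plans and applies in one step, after a human approved the job
    - run: terraform apply -auto-approve
```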
Terraform’s not perfect. Sometimes the plan lies. Sometimes the diff gaslights you. But when it’s kept in check (no overmodularization, no spaghetti outputs), it’s still the most stable way I’ve found to manage core infra across AWS, Azure, and GCP.
2. Pulumi: TypeScript over YAML? Yes, please
I love Terraform, but sometimes I just want to use real programming logic without abusing `count` or `for_each` like some sort of HCL warlock. That’s where Pulumi comes in: it lets you use TypeScript, Python, Go, or even .NET to define infrastructure.
And no, this isn’t “just another YAML templating thing.” It’s actual code. With variables. And loops. And functions. Like a grown-up.
Why Pulumi clicks for app-level infra
When I’m building something app-centric (spinning up a web API, wiring S3 buckets, defining scalable cloud-native workloads), Pulumi’s flexibility lets me move faster and debug with the same tools I use for dev.
Here’s an example: deploying a DigitalOcean app with just TypeScript.
```typescript
import * as digitalocean from "@pulumi/digitalocean";

const app = new digitalocean.App("my-app", {
  spec: {
    name: "my-ts-app",
    region: "nyc",
    services: [{
      name: "frontend",
      runCommand: "npm start",
      github: {
        repo: "username/my-app",
        branch: "main",
      },
      envs: [
        { key: "NODE_ENV", value: "production" },
      ],
    }],
  },
});
```
That’s it. You get full type safety, use familiar package managers, and it works like code, not some cursed DSL designed by an intern with Stockholm syndrome.
CI/CD is cleaner too
Pulumi integrates beautifully into GitHub Actions or any pipeline. And there’s no need for a separate state management backend if you use Pulumi Cloud. But you can store state in S3 or anywhere else if you want full control.
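Pointing Pulumi at your own bucket is a single command (the bucket name here is a placeholder):

```bash
# Self-managed state: store Pulumi checkpoints in your own S3 bucket
pulumi login s3://my-pulumi-state-bucket
```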
A basic CI/CD job:
```yaml
- uses: actions/setup-node@v3
  with:
    node-version: '18'
- run: npm install
- run: pulumi login
- run: pulumi preview
- run: pulumi up --yes
```
Oh, and you can mix in other logic (API calls, loops, filters) like you’d do in any app.
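For instance, a quick sketch using the real `@pulumi/aws` package (bucket names and tags are made up): looping over environments is just a loop:

```typescript
import * as aws from "@pulumi/aws";

// One bucket per environment: a plain array and a plain loop,
// no count/for_each gymnastics required.
const environments = ["dev", "staging", "prod"];

for (const env of environments) {
  new aws.s3.Bucket(`assets-${env}`, {
    // Tag each bucket so cost reports can group by environment
    tags: { Environment: env },
  });
}
```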
When do I use Pulumi vs Terraform?
- Terraform: for core infra, especially across clouds (VPCs, networking, IAM, etc.)
- Pulumi: for developer-owned infra such as apps, CI/CD glue, and feature infra (SQS queues, CloudFront, etc.)
Use both. Or either. The key is to not write 800 lines of YAML when 40 lines of TS would do.
3. Ansible: Still unbeatable for config management
Ansible is like Vim. Old, powerful, and everyone’s got an opinion about it.
And just like Vim, once you learn a few keystrokes (or in this case, a few playbooks), it’s shockingly effective.
Yes, it’s 2025 and we’re still SSH-ing into boxes sometimes. No, that’s not a failure; it’s just reality when you’re not running everything in Kubernetes (bless your soul if you are).
When I still reach for Ansible
If I need to:
- Bootstrap a fresh server
- Install Docker and add a non-root user
- Harden a Linux box for production
- Sync dotfiles across VMs like it’s 2012
…I’m reaching for Ansible. Every. Time.
It’s declarative. It uses plain ol’ SSH. It doesn’t need a central daemon or weird config servers. And once your playbooks work, they stay working.
Here’s an example that installs Docker and adds a deploy user:
```yaml
- name: Setup basic server
  hosts: all
  become: yes
  tasks:
    - name: Install required packages
      apt:
        name:
          - docker.io
          - git
          - curl
        update_cache: yes

    - name: Create deploy user
      user:
        name: deploy
        groups: docker
        shell: /bin/bash
        create_home: yes
```
Run it with:
```bash
ansible-playbook -i inventory setup.yml
```
You can version it. Re-run it. Share it. It’s reproducible and doesn’t rely on mystery bash scripts you wrote during a caffeine binge in 2018.
When not to use Ansible
If you’re trying to deploy apps every 30 minutes, Ansible is not your guy.
It’s better for provisioning, not for CI/CD-level frequency.
Instead of having Ansible copy over app binaries and restart services every deploy, let Docker (or Pulumi, or CI/CD tools) handle that. Use Ansible once to get the host ready, then let other tools do their thing.
Also: don’t abuse `shell:` tasks. If 60% of your playbook is `shell: "apt-get install..."`, you’re probably doing it wrong.
Where it fits in my stack
- I use Ansible once per host, early in the lifecycle
- It sets up the base OS, security hardening, monitoring agents, users, and sometimes Docker
- After that, everything else is handled by containers, CI/CD, or cloud-native tools
Ansible’s not dead. It’s just graduated to that stable, quiet part of your toolchain that never pages you at 2am.
4. CI/CD glue: GitHub Actions + sane workflows
You don’t need a Jenkins tower built on 300 Groovy scripts and duct tape.
Unless you’re deploying rockets or banks, GitHub Actions (or something similarly straightforward) will do just fine.
CI/CD is glue. Not your infra engine, not your production brain. Glue.
Its job is to run `terraform plan`, test your code, build your container, deploy your app, and get out of the way.
Let’s keep it that way.
My real-world GitHub Actions setup
Here’s a simple but solid workflow I’ve used in production:
```yaml
name: Deploy to staging

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Node
        uses: actions/setup-node@v3
        with:
          node-version: '18'

      - name: Install deps
        run: npm install

      - name: Lint and test
        run: |
          npm run lint
          npm test

      - name: Set up Pulumi
        uses: pulumi/actions@v4
        with:
          command: up
          stack-name: org/project/staging
          work-dir: ./infra
        env:
          PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}
```
This is clean, readable, and works for deploying apps and infra in one go. Add caching if needed. Add secret scanning if you’re feeling spicy. But don’t turn it into Jenkins 2.0.
Secret management 101
Don’t commit secrets. Don’t even commit fake secrets. Just don’t.
Use GitHub Secrets or, even better, a dedicated secrets manager.
And inject them only at runtime. Never store them in state, in logs, or in plain text outputs.
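In GitHub Actions, that means mapping the secret to an env var on exactly the step that needs it (the secret name and script here are hypothetical):

```yaml
- name: Deploy
  run: ./scripts/deploy.sh
  env:
    # Available to this step only; GitHub masks the value in logs
    API_TOKEN: ${{ secrets.API_TOKEN }}
```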
Caching for speed, but keep it readable
A 45-minute CI job is a cry for help.
Use basic caching for `node_modules`, `venv`, `.terraform`, and Docker layers.
GitHub Actions makes this easy:
```yaml
- name: Cache node_modules
  uses: actions/cache@v3
  with:
    path: ~/.npm
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-npm-
```
One line of caching can cut minutes off your job and hours of frustration off your life.
5. Container orchestration without Kubernetes-induced stress
I respect Kubernetes.
I also respect my free time, my weekend sleep, and my sanity.
And sometimes, those two things don’t align.
Kubernetes is great if you actually need it: if you’re running a fleet of microservices, autoscaling globally, or working at a company with a dedicated Platform team.
But if you’re a solo engineer or small team deploying a few apps? Kubernetes might be complete overkill. It’s the “buying a tank to commute to work” situation.
When I don’t use Kubernetes
Let’s kill the myth: “You’re not a real DevOps engineer unless you use Kubernetes.”
Here’s when I skip it:
- I have <10 containers per environment
- No need for complex autoscaling logic
- CI/CD already handles rolling deploys
- I don’t want to maintain Helm charts in my nightmares
Instead, I reach for lighter tools that Just Work™.
The real MVPs: Docker Compose, ECS Fargate, and Nomad
1. Docker Compose
- Local dev and small prod setups
- Great for one-box deploys
- Dead simple: define services, volumes, and networks, then run it
Example:
```yaml
version: '3'

services:
  web:
    image: nginx
    ports:
      - "80:80"

  api:
    build: ./api
    environment:
      - NODE_ENV=production
```
You can run this locally, or even wrap it in `systemd` or a simple Ansible-managed service file for production.
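A minimal sketch of such a `systemd` unit, assuming the Compose file lives in `/opt/myapp` (path and names are made up):

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=myapp via Docker Compose
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/myapp
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target
```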
2. ECS Fargate
- AWS-managed containers without managing nodes
- Use Terraform or Pulumi to define services
- No VMs, no patching: just define task + container + CPU/mem and go
```hcl
resource "aws_ecs_task_definition" "app" {
  family                   = "my-app"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"

  container_definitions = jsonencode([
    {
      name         = "api"
      image        = "myrepo/api:latest"
      portMappings = [{ containerPort = 3000 }]
    }
  ])
}
```
It’s clean, serverless (sorta), and battle-tested.
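A task definition alone runs nothing; you pair it with a service that keeps copies alive. A sketch (it assumes a cluster resource named `aws_ecs_cluster.main` and the subnet from the earlier example):

```hcl
# The service keeps desired_count copies of the task running on Fargate
resource "aws_ecs_service" "app" {
  name            = "my-app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = [aws_subnet.main.id]
    assign_public_ip = true
  }
}
```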
3. Nomad
- From HashiCorp, like Terraform’s cousin who likes containers
- Great for hybrid or on-prem workloads
- Easier learning curve than Kubernetes
- Uses HCL like Terraform
If you’re already using HashiCorp tooling, Nomad might be the easiest orchestration layer you ever try.
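A minimal Nomad job, just to show the flavor (the datacenter name and image are assumptions):

```hcl
job "web" {
  datacenters = ["dc1"]

  group "app" {
    network {
      port "http" {
        static = 80
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:1.27"
        ports = ["http"]
      }
    }
  }
}
```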
Helm: Useful, until it isn’t
Helm is the duct tape of Kubernetes.
Need to deploy PostgreSQL? There’s a chart for that. Redis? Chart. Full app with ingress and autoscaling? Yup, chart.
But be warned:
- Helm charts can drift from your actual app logic
- Debugging failed releases is sometimes a nightmare
- It’s another templating language to learn… inside a YAML DSL… inside Kubernetes
Use Helm for quick setups or third-party apps. But if your own app has 1,200 lines of Helm templating? Rethink your life choices.
My rule: Use the lightest tool that gets the job done
If I can get by with Compose + systemd, I’ll do it.
If I need managed deploys with cloud networking, ECS or Cloud Run is my go-to.
Kubernetes? I’ll reach for it only when I know the complexity is worth the benefits and I’ve got the team to maintain it.
6. What not to do: Anti-patterns I’ve painfully debugged
Let’s be real: most infra horror stories aren’t about tools.
They’re about how we use them. Or abuse them. Or duct-tape them together in 2am Slack-fueled chaos.
I’ve learned more from debugging DevOps disasters than from any tutorial. So here’s a short list of anti-patterns that turned into long weekends.
Terraform modules from hell
Modular Terraform is good.
But 9 layers of nested modules with variables passed like Matryoshka dolls?
Not good.
I once inherited a repo where a simple change like adding a new tag required modifying a root `main.tf`, a child module, an inner module, a reused VPC module, and an external `terraform-aws-tags` wrapper.
Terraform Plan: 180 resources
Terraform Apply: broke staging and created a duplicate NAT gateway for no reason
Fix: Keep your modules shallow. Use composition, not recursion, as in the sketch below. And don’t fear repetition: 3 clean resource blocks are better than 1 magical abstracted mess.
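Shallow composition looks like this: a root module that wires together flat, single-purpose modules (module paths and outputs here are hypothetical):

```hcl
# Root module: one flat layer of single-purpose modules, no nesting
module "network" {
  source     = "./modules/network"
  cidr_block = "10.0.0.0/16"
}

module "web" {
  source    = "./modules/web"
  subnet_id = module.network.subnet_id
}
```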
CI/CD pipelines that plan for 30 minutes
CI/CD should be your ally, not your trial-by-fire. But I’ve seen:
- `terraform init` and `terraform get` run in every single step
- Separate jobs for plan, format, and validate per environment
- Parallel jobs that bottleneck on secrets or state locks
These pipelines make you afraid to merge code.
Fix: Cache intelligently. Combine steps when it makes sense. Only run `terraform plan` if code changes in `infra/`. Use `paths` filters to avoid over-triggering:
```yaml
on:
  push:
    paths:
      - 'infra/**'
```
Pulumi projects with 6000 lines of TS spaghetti
Yes, Pulumi lets you use real programming logic.
No, that doesn’t mean you should recreate an ORM, a routing engine, and an AI assistant in your infra code.
I’ve seen Pulumi stacks with:
- 8 nested loops over S3 buckets
- Conditional logic based on Git branches
- Abstractions that look like React components with props
Fix: Use TypeScript to simplify, not complicate. Group infra by domain (`api`, `db`, `frontend`). Use helper functions, not frameworks. Keep logic readable by humans, not just VS Code IntelliSense.
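For instance, a small helper function (names are hypothetical, using the real `@pulumi/aws` package) that encodes one convention without becoming a framework:

```typescript
import * as aws from "@pulumi/aws";

// One convention, one function: every queue gets a dead-letter
// queue and consistent tags. No classes, no "framework".
function makeQueue(name: string, env: string): aws.sqs.Queue {
  const dlq = new aws.sqs.Queue(`${name}-dlq`, {
    tags: { Environment: env },
  });

  return new aws.sqs.Queue(name, {
    redrivePolicy: dlq.arn.apply((arn) =>
      JSON.stringify({ deadLetterTargetArn: arn, maxReceiveCount: 5 })
    ),
    tags: { Environment: env },
  });
}

export const orders = makeQueue("orders", "prod");
```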
Overusing Ansible when containers would’ve been better
Yes, Ansible is great. But no, you shouldn’t use it to:
- Copy your app code to servers every 10 minutes
- Restart services with `systemctl` in 15 playbooks
- Handle rolling updates manually
One time I debugged an Ansible deploy system where every service had its own inventory file, `.env`, and post-deploy SSH hook.
Docker Compose could’ve done the same thing with one YAML file and a volume mount.
Fix: Use Ansible to set up hosts, not deploy apps. Containers were invented for this reason.
Avoiding these patterns isn’t just about code quality; it’s about your quality of life.
Nobody wants to be on-call for a Terraform plan.
Nobody wants to explain why `terraform destroy` wiped out prod but left the RDS instance somehow.
Make your infra predictable, debuggable, and most of all, boring.
7. Conclusion: The stack that didn’t burn me out
There’s a quiet beauty in infrastructure that just works: no surprise deploys, no midnight alerts, no mysterious YAML incantations.
This stack (Terraform + Pulumi + Ansible, held together with clean CI/CD and container orchestration that fits the job) didn’t just get the job done. It gave me mental space to focus on features, not firefighting.
Not because it’s the trendiest setup. Not because I saw it in a conference talk. But because:
- It’s composable: each tool does one thing well
- It’s debuggable: I can reason through issues without praying to Stack Overflow
- It’s boring: and boring is beautiful when your pager’s on
Would this stack work for every team? No.
But if you’re an infra engineer or solo dev tired of gluing together overengineered systems for basic needs, this setup might just be your way out of burnout land.
The best infra isn’t the one that makes you feel like a DevOps god.
It’s the one that lets you forget about it until it needs a tweak.
8. Helpful resources
Here’s a short list of docs and examples that helped me build (and survive) this stack:
- Terraform: the official HashiCorp docs and the Terraform Registry
- Pulumi: the official Pulumi docs and examples
- Ansible: Ansible Galaxy and the Ansible Examples repo
- CI/CD: the GitHub Actions docs
- Container orchestration: the Docker Compose, ECS Fargate, and Nomad docs