Introduction
"Should we use Ansible or Terraform?" is one of the most common questions teams ask when building their infrastructure automation. The honest answer is: you'll probably end up using both, and that's fine.
But understanding when to reach for each tool - and where they overlap and diverge - saves you from building brittle automation that fights against the tool's design. Terraform is a provisioning tool. Ansible is a configuration management tool. They solve different problems, and using the wrong one for the job leads to hacks, workarounds, and unmaintainable code.
This article breaks down the real differences, shows concrete examples of each tool doing what it does best, and presents a battle-tested pattern for combining them in production.
The Fundamental Difference
Terraform is declarative infrastructure provisioning. You describe what you want to exist, and Terraform figures out how to get there. It maintains a state file that tracks what it has created, and on each run it calculates the diff between desired state and actual state.
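To make the "diff between desired state and actual state" idea concrete, here is a minimal Python sketch of how such a plan could be computed. It is an illustration only, not Terraform's actual algorithm: the `plan` function and the dictionaries are hypothetical, and real Terraform also refreshes state from the provider and handles replacements and dependencies.

```python
def plan(desired: dict, state: dict) -> dict:
    """Compare desired resources with recorded state and list the actions needed."""
    actions = {"create": [], "update": [], "delete": []}
    for name, config in desired.items():
        if name not in state:
            actions["create"].append(name)   # declared but not yet created
        elif state[name] != config:
            actions["update"].append(name)   # exists, but attributes differ
    for name in state:
        if name not in desired:
            actions["delete"].append(name)   # tracked in state, no longer declared
    return actions


# Hypothetical inputs: what the configuration declares vs. what state recorded.
desired = {
    "aws_vpc.main": {"cidr_block": "10.0.0.0/16"},
    "aws_subnet.private": {"cidr_block": "10.0.1.0/24"},
}
state = {
    "aws_vpc.main": {"cidr_block": "10.0.0.0/16"},
    "aws_instance.old": {"instance_type": "t3.micro"},
}

result = plan(desired, state)
print(result)
```

The point of the sketch is that the delete action is only possible because something recorded what was created earlier.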
Ansible is procedural configuration management. You describe steps to execute on machines, and Ansible runs them in order. It's agentless (connects via SSH or WinRM) and idempotent (most modules check before acting).
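The "check before acting" behavior behind idempotent modules can be sketched in a few lines of Python. This is a simplified, hypothetical stand-in for a single task (`ensure_line` is not an Ansible API): it inspects the current state first and reports whether it changed anything, which is what makes a second run a no-op.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Append `line` to the file only if it is missing. Returns True if changed."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line in existing:
        return False  # already in the desired state, nothing to do
    with open(path, "a") as f:
        f.write(line + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app.conf")
    first = ensure_line(target, "port=8080")    # first run makes the change
    second = ensure_line(target, "port=8080")   # second run finds nothing to do
    with open(target) as f:
        contents = f.read()

print(first, second)
```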
Here's the clearest way to think about it:
- Terraform: Creates the infrastructure (VPCs, EC2 instances, RDS databases, load balancers, DNS records)
- Ansible: Configures what's running on that infrastructure (installs packages, deploys applications, manages config files, sets up users)
There's overlap in the middle - Terraform can run shell scripts via provisioners, and Ansible can create AWS resources via its cloud modules. But using them in their sweet spot produces much cleaner automation.
Terraform: What It Does Best
Terraform excels at managing cloud resources with complex dependency graphs. Its state management and plan/apply workflow give you confidence that changes won't surprise you.
# Production VPC with public and private subnets
# (aws_db_subnet_group.main and aws_security_group.db are assumed to be defined elsewhere)
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name        = "production"
    Environment = "prod"
  }
}

resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 1}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "private-${count.index + 1}"
    Tier = "private"
  }
}

resource "aws_db_instance" "postgres" {
  identifier              = "app-db"
  engine                  = "postgres"
  engine_version          = "16.4"
  instance_class          = "db.r6g.large"
  allocated_storage       = 100
  storage_encrypted       = true
  db_subnet_group_name    = aws_db_subnet_group.main.name
  vpc_security_group_ids  = [aws_security_group.db.id]
  username                = "app_admin" # "admin" is a reserved word for RDS PostgreSQL
  password                = var.db_password
  backup_retention_period = 14
  multi_az                = true
  deletion_protection     = true
}

# Output the database endpoint (host:port) for Ansible to use
output "db_endpoint" {
  value = aws_db_instance.postgres.endpoint
}
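The references between these resources form a dependency graph, and the order of creation falls out of it. The Python sketch below illustrates the idea with the standard library's `graphlib`; it is not how Terraform is implemented, and the edges for `aws_db_subnet_group.main` and `aws_security_group.db` are assumptions, since those resources are not shown in the example.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
dependencies = {
    "aws_vpc.main": set(),
    "aws_subnet.private": {"aws_vpc.main"},
    "aws_db_subnet_group.main": {"aws_subnet.private"},   # assumed edge
    "aws_security_group.db": {"aws_vpc.main"},            # assumed edge
    "aws_db_instance.postgres": {
        "aws_db_subnet_group.main",
        "aws_security_group.db",
    },
}

# Dependencies come first when creating, and last when destroying.
create_order = list(TopologicalSorter(dependencies).static_order())
destroy_order = list(reversed(create_order))

print(create_order)
```

Destroying in the reverse order is what lets a tool with a dependency graph tear down infrastructure safely.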
Key Terraform strengths:
- terraform plan shows you exactly what will change before you apply
- State file tracks resource dependencies and enables safe destruction
- Provider ecosystem covers every major cloud and SaaS platform
- Modules enable reusable infrastructure patterns
- Import existing resources into state management
Where Terraform struggles:
- Configuring the OS inside a VM (it can do it via provisioners, but it's hacky)
- Managing application deployments and rolling updates
- Anything that requires imperative logic or conditional execution chains
- Keeping secrets out of state files (state stores sensitive values in plaintext, so use an encrypted remote backend with restricted access)
Ansible: What It Does Best
Ansible shines at configuring systems, deploying applications, and orchestrating multi-step workflows across fleets of machines.
# playbooks/configure-web-servers.yml
---
- name: Configure web servers
  hosts: web_servers
  become: yes
  vars:
    app_user: deploy
    app_dir: /opt/myapp
    node_version: "20"
  tasks:
    - name: Create application user
      user:
        name: "{{ app_user }}"
        shell: /bin/bash
        create_home: yes
    - name: Install Node.js repository
      shell: |
        curl -fsSL https://deb.nodesource.com/setup_{{ node_version }}.x | bash -
      args:
        creates: /etc/apt/sources.list.d/nodesource.list
    - name: Install system packages
      apt:
        name:
          - nodejs
          - nginx
          - certbot
          - python3-certbot-nginx
        state: present
        update_cache: yes
    - name: Deploy Nginx configuration
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/sites-available/myapp
        mode: "0644"
      notify: Reload Nginx
    - name: Deploy application
      synchronize:
        src: "{{ playbook_dir }}/../dist/"
        dest: "{{ app_dir }}/"
        delete: yes
      become_user: "{{ app_user }}"
      notify: Restart Application
    - name: Install npm dependencies
      npm:
        path: "{{ app_dir }}"
        production: yes
      become_user: "{{ app_user }}"
    - name: Configure systemd service
      template:
        src: templates/myapp.service.j2
        dest: /etc/systemd/system/myapp.service
      notify:
        - Reload systemd
        - Restart Application
  handlers:
    - name: Reload Nginx
      service:
        name: nginx
        state: reloaded
    - name: Reload systemd
      systemd:
        daemon_reload: yes
    - name: Restart Application
      service:
        name: myapp
        state: restarted
Key Ansible strengths:
- Agentless - just needs SSH access
- Massive module library (3,000+ modules)
- Jinja2 templating for dynamic configuration files
- Rolling deployments with the serial keyword
- Vault for encrypting secrets in your repo
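The serial keyword limits how many hosts a play touches at once, which is what makes a rolling deployment possible. As a rough illustration of the batching idea (not Ansible's implementation; `batches` and the host names are hypothetical), a fixed batch size splits the inventory like this:

```python
def batches(hosts: list[str], serial: int) -> list[list[str]]:
    """Split hosts into consecutive groups of at most `serial` hosts."""
    if serial < 1:
        raise ValueError("serial must be at least 1")
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]


# Hypothetical fleet of five web servers, updated two at a time.
fleet = ["web-1", "web-2", "web-3", "web-4", "web-5"]
rollout = batches(fleet, serial=2)

for number, group in enumerate(rollout, start=1):
    print(f"batch {number}: {group}")
```

Each batch finishes before the next one starts, so the remaining hosts keep serving traffic during the update.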
Where Ansible struggles:
- No state file - it doesn't know what it created previously
- Drift detection is limited (it checks current state per-task, not holistically)
- Cloud resource provisioning works but lacks Terraform's plan/apply safety
- Performance degrades with large inventories unless you tune it (for example, SSH pipelining or the Mitogen strategy plugin)
The Combined Pattern: Terraform Provisions, Ansible Configures
Here's the pattern that works in production. Terraform creates infrastructure and outputs connection details. Ansible picks up those outputs and configures the machines.
Step 1: Terraform creates the instances
resource "aws_instance" "web" {
  count                  = 3
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t3.medium"
  key_name               = aws_key_pair.deploy.key_name
  subnet_id              = aws_subnet.private[count.index].id
  vpc_security_group_ids = [aws_security_group.web.id]

  tags = {
    Name = "web-${count.index + 1}"
    Role = "web_server"
  }
}

# Generate Ansible inventory from Terraform state
resource "local_file" "ansible_inventory" {
  # Resolve the template relative to this module, not the current working directory
  content = templatefile("${path.module}/templates/inventory.tftpl", {
    web_servers = aws_instance.web[*].private_ip
    # Note: endpoint is "host:port"; use .address if you need the hostname only
    db_endpoint = aws_db_instance.postgres.endpoint
  })
  filename = "${path.module}/../ansible/inventory/production"
}
inventory.tftpl template:
[web_servers]
%{ for ip in web_servers ~}
${ip}
%{ endfor ~}
[web_servers:vars]
db_host=${db_endpoint}
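An alternative to rendering the inventory inside Terraform is to build it from the JSON that terraform output -json prints. The Python sketch below assumes that approach. The `web_server_ips` output name and the sample values are hypothetical (only `db_endpoint` is defined in the Terraform example), and in practice the JSON would come from running the command rather than from a literal string.

```python
import json


def render_inventory(outputs_json: str) -> str:
    """Build an INI-style Ansible inventory from `terraform output -json` text."""
    outputs = json.loads(outputs_json)
    # Each output is an object with "value", "type" and "sensitive" keys.
    web_servers = outputs["web_server_ips"]["value"]   # hypothetical output name
    db_endpoint = outputs["db_endpoint"]["value"]
    lines = ["[web_servers]", *web_servers,
             "[web_servers:vars]", f"db_host={db_endpoint}"]
    return "\n".join(lines) + "\n"


# Sample data in the shape `terraform output -json` produces (values are made up).
sample = json.dumps({
    "web_server_ips": {
        "sensitive": False,
        "type": ["list", "string"],
        "value": ["10.0.1.10", "10.0.2.10"],
    },
    "db_endpoint": {
        "sensitive": False,
        "type": "string",
        "value": "app-db.example.internal:5432",
    },
})

inventory = render_inventory(sample)
print(inventory)
```

The trade-off is that the inventory is no longer a Terraform-managed file, so it has to be regenerated as a pipeline step after each apply.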
Step 2: Ansible configures the instances
# After terraform apply completes:
cd ../ansible
ansible-playbook -i inventory/production playbooks/configure-web-servers.yml
Step 3: Wrap it in a Makefile or CI pipeline
.PHONY: deploy infra configure

deploy: infra configure

infra:
	cd terraform && terraform apply -auto-approve

configure:
	cd ansible && ansible-playbook -i inventory/production playbooks/site.yml
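If the wrapper lives in a script rather than a Makefile, the same "run in order, stop on first failure" behavior can be sketched in Python. The `run_steps` helper is hypothetical, and the demo below substitutes trivial commands for the real ones so the example runs without Terraform or Ansible installed.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, command, cwd) steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command, cwd in steps:
        result = subprocess.run(command, cwd=cwd)
        if result.returncode != 0:
            break  # later steps must not run against half-built infrastructure
        completed.append(name)
    return completed


# The real pipeline from this article would look like this (not executed here):
REAL_STEPS = [
    ("infra", ["terraform", "apply", "-auto-approve"], "terraform"),
    ("configure", ["ansible-playbook", "-i", "inventory/production",
                   "playbooks/site.yml"], "ansible"),
]

# Demo with stand-in commands: the second step fails, so the third never runs.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
fail = [sys.executable, "-c", "raise SystemExit(1)"]
demo_result = run_steps([
    ("infra", ok, None),
    ("configure", fail, None),
    ("smoke-test", ok, None),
])
print(demo_result)
```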
Common Anti-Patterns to Avoid
1. Using Terraform provisioners for configuration management
# DON'T do this
resource "aws_instance" "web" {
  ami           = "ami-123456"
  instance_type = "t3.medium"

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",
      "sudo systemctl start nginx",
      # ... 50 more lines of shell commands
    ]
  }
}
Provisioners run once at creation time, aren't tracked in state, can't be re-run easily, and have no idempotency guarantees. Use them only for bootstrapping (installing Ansible, setting up SSH keys) if you must.
2. Using Ansible to create cloud resources at scale
# This works but is painful to maintain
- name: Create VPC
  amazon.aws.ec2_vpc_net:
    name: production
    cidr_block: 10.0.0.0/16
    region: us-east-1
  register: vpc
- name: Create subnets
  amazon.aws.ec2_vpc_subnet:
    vpc_id: "{{ vpc.vpc.id }}"
    cidr: "10.0.{{ item }}.0/24"
    az: "us-east-1{{ ['a','b','c'][item - 1] }}"
  loop: [1, 2, 3]
  register: subnets
This works for simple setups, but without a state file you can't easily destroy resources, detect drift, or see a plan before applying changes. For anything beyond a handful of resources, use Terraform.
3. Duplicating logic between both tools
If Terraform creates a security group allowing port 443, don't also have Ansible configure ufw to allow port 443 unless you have a specific reason, such as defense in depth. Duplicated rules lead to confusion about which tool is the source of truth.
Real-World Decision Framework
Ask these questions when deciding which tool to use for a specific task:
| Question | Terraform | Ansible |
|---|---|---|
| Am I creating a cloud resource? | Yes | No |
| Am I installing software on a server? | No | Yes |
| Do I need to see a plan before executing? | Yes | N/A |
| Am I managing config files on servers? | No | Yes |
| Do I need to track resource lifecycle? | Yes | N/A |
| Am I orchestrating a multi-step deployment? | No | Yes |
| Is this a one-time setup or ongoing? | Both | Both |
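The table can also be read as a simple lookup. The Python sketch below encodes it that way purely as an illustration; the task keys and the `recommend_tool` function are hypothetical names, and real decisions will have more nuance than a dictionary can capture.

```python
# Hypothetical task categories mapped to the tool the table above suggests.
RECOMMENDATIONS = {
    "create_cloud_resource": "terraform",
    "install_software": "ansible",
    "preview_plan": "terraform",
    "manage_config_files": "ansible",
    "track_resource_lifecycle": "terraform",
    "orchestrate_deployment": "ansible",
}


def recommend_tool(task: str) -> str:
    """Return the suggested tool, or "either" when the table does not decide."""
    return RECOMMENDATIONS.get(task, "either")


print(recommend_tool("create_cloud_resource"))
print(recommend_tool("one_time_or_ongoing"))
```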
For Kubernetes specifically: use Terraform to create the cluster (EKS, GKE, AKS), then use kubectl/Helm/ArgoCD to manage workloads inside it. Ansible can work here, but Kubernetes has its own declarative model that is better suited to managing those workloads.
Directory Structure for Combined Projects
infrastructure/
├── terraform/
│ ├── environments/
│ │ ├── production/
│ │ │ ├── main.tf
│ │ │ ├── variables.tf
│ │ │ └── terraform.tfvars
│ │ └── staging/
│ ├── modules/
│ │ ├── vpc/
│ │ ├── compute/
│ │ └── database/
│ └── templates/
│ └── inventory.tftpl
├── ansible/
│ ├── inventory/
│ │ ├── production # Generated by Terraform
│ │ └── staging # Generated by Terraform
│ ├── playbooks/
│ │ ├── site.yml
│ │ ├── web-servers.yml
│ │ └── monitoring.yml
│ ├── roles/
│ │ ├── common/
│ │ ├── nginx/
│ │ ├── nodejs/
│ │ └── monitoring/
│ └── group_vars/
│ ├── all.yml
│ └── web_servers.yml
├── Makefile
└── .github/workflows/deploy.yml
This structure keeps concerns separated while maintaining a clear flow: Terraform creates, Ansible configures.
Need Help with Your DevOps?
Building infrastructure automation that scales requires knowing when to use provisioning tools, configuration management, or both. At InstaDevOps, we design and implement your complete automation stack - Terraform, Ansible, CI/CD, and everything in between - starting at $2,999/month.
Book a free 15-minute consultation to discuss your infrastructure automation: https://calendly.com/instadevops/15min