Introduction: The Tale of Two Revolutions
When you first provision a cloud VM, it's like moving into a bare house. The structure is there (thanks, Terraform), but it's empty. Someone still needs to paint the walls, install the plumbing, and wire the electricity. That someone, in the DevOps world, is Ansible.
This post marks Week 8 of our 12-week DevOps Micro Internship Cohort, where I worked through hands-on assignment projects that collectively taught me how to go from clicking buttons in the Azure Portal to orchestrating entire fleets of servers with a few lines of YAML.
Let me ask myself the critical questions first, then unfold the answers:
- What's the real difference between Terraform and Ansible?
- Why do we need ad-hoc commands when we have playbooks?
- How do Ansible roles turn chaos into reusable components?
- Where does 70% of real-world DevOps effort actually go?
By the end of this post, you'll have answers to all of these questions, backed by the real errors I hit, how I debugged them, and the lessons I took away.
PART 1: The Foundation - Infrastructure as Code with Terraform
Section 1.1: From Portal Clicks to Code
Question: Why does clicking buttons in Azure Portal feel like the enemy?
Answer: Because it doesn't scale. Imagine deploying 100 identical servers. Clicking a button 100 times is human; running one apply that provisions 100 servers is DevOps.
The task: provision 4 Azure VMs using Terraform.
The naive approach would be writing 4 separate azurerm_linux_virtual_machine resource blocks. The smarter approach? Use the count parameter:
resource "azurerm_public_ip" "vms" {
  count               = var.vm_count
  name                = "pip-vm${count.index}-${var.environment}"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_linux_virtual_machine" "vms" {
  count = var.vm_count
  name  = "vm-${var.environment}-${count.index}"
  # ... rest of config
}
This single block, repeated via count, creates N VMs dynamically. Change vm_count from 4 to 3, and Terraform automatically adds/removes resources while maintaining state. That's not just convenience; that's reproducibility at scale.
Section 1.2: The Azure Quota Lesson
The Error I Hit:
Error: "PublicIPCountLimitReached": Cannot create more than 3 public IP addresses
for this subscription in this region.
What Happened:
I provisioned 4 VMs, each with a public IP. Azure Free Tier allows only 3 public IPs per region. My infrastructure demand exceeded the platform's constraints.
The Debugging Process:
# Step 1: Validate quota
az network list-usages --location centralindia \
--query "[?localName=='Public IP Addresses']"
# Output showed:
# "name": { "value": "PublicIPAddresses" },
# "limit": 3,
# "currentValue": 3
The Fix:
Edited variables.tf:
variable "vm_count" {
  default = 3  # Changed from 4 to 3
}
Reapplied:
terraform plan
terraform apply
Key Learning: Infrastructure constraints are real. They're not bugs; they're limits. Always read your platform's quotas before coding. Free tiers teach discipline.
PART 2: Connecting Machines - The SSH Key Saga
Section 2.1: Why Passwordless SSH Matters
Question: Why do we obsess over SSH keys in DevOps?
Answer: Because they're the gateway to everything. If you can SSH to a machine, you can deploy, configure, or destroy it. Passwords are dinosaurs; keys are modern security.
The task: set up passwordless SSH to 4 Azure VMs.
The naive approach: Generate a key, hope it works.
The professional approach: Generate a key, validate the format, ensure the VM accepts it, test the connection.
Section 2.2: The SSH Key Format Error
The Error I Hit:
Error: "admin_ssh_key.0.public_key" is not a complete SSH2 Public Key
Root Cause:
I passed the file path to Terraform instead of the file contents.
# WRONG:
terraform apply -var="ssh_public_key=~/.ssh/id_rsa.pub"
# RIGHT:
terraform apply -var="ssh_public_key=$(cat ~/.ssh/id_rsa.pub)"
The Debugging Process:
# Step 1: Inspect what Terraform received
terraform console
> var.ssh_public_key
=> "~/.ssh/id_rsa.pub"  # <- the literal path string, not the key content
# Step 2: Generate the actual key content
cat ~/.ssh/id_rsa.pub
# Output: ssh-rsa AAAAB3NzaC1yc2E... user@machine
# Step 3: Use $(...) to inject the content
terraform apply -var="ssh_public_key=$(cat ~/.ssh/id_rsa.pub)"
Key Learning: Variables are strings until evaluated. Terraform received the literal path (it never expands ~ for you), not the key content. Always inspect your variable values in terraform console.
Section 2.3: Testing SSH Connectivity
Once deployed, I verified access:
# Get public IP
PUBLIC_IP=$(terraform output -raw public_ip)
# Test SSH (no password!)
ssh azureuser@$PUBLIC_IP hostname
# Output: vm-demo-0
This single command confirms: the VM is reachable, the SSH key is accepted, and we have command execution. It's the foundation for all automation that follows.
PART 3: Ad-Hoc Automation - The Speed Advantage
Section 3.1: What Are Ad-Hoc Commands?
Question: Why write a playbook when you can run a single command?
Answer: Sometimes you don't. Ad-hoc commands are your rapid-response team for:
- Quick checks (is nginx running?)
- Emergency fixes (restart a service)
- One-time deployments (install a package)
The task: Ansible ad-hoc commands on 3 VMs:
# Update all packages
ansible web -i inventory.ini -m ansible.builtin.apt \
-a "update_cache=yes" --become
# Output:
# [OK] vm-demo-0
# [OK] vm-demo-1
# [OK] vm-demo-2
In 5 seconds, I updated package caches on 3 servers. Try doing that manually. I'll wait.
Section 3.2: Building inventory.ini for Dynamic Hosts
The magic link between Terraform and Ansible is inventory.ini:
[web]
52.172.1.10
52.172.1.11
52.172.1.12
[all:vars]
ansible_user=azureuser
ansible_ssh_private_key_file=~/.ssh/id_rsa
ansible_ssh_common_args='-o StrictHostKeyChecking=accept-new'
This file tells Ansible:
- Which servers to target (the [web] group)
- How to connect (SSH, with this user and key)
- To trust new hosts automatically (first-connection safety)
I generated this dynamically:
# Get IPs from Terraform output
PUBLIC_IPS=$(terraform output -json public_ips)
# Build inventory
cat > inventory.ini << EOF
[web]
$(echo $PUBLIC_IPS | jq -r '.[]')
[all:vars]
ansible_user=azureuser
ansible_ssh_private_key_file=~/.ssh/id_rsa
EOF
Key Learning: Repeatable automation starts with repeatable data. Inventory files are that data. Generate them programmatically; never hardcode IPs.
Section 3.3: Ad-Hoc Commands for System Discovery
I used ad-hoc commands to validate my infrastructure:
# Check uptime
ansible web -i inventory.ini -m ansible.builtin.command -a "uptime"
# Check disk usage
ansible web -i inventory.ini -m ansible.builtin.command -a "df -h"
# Check listening ports
ansible web -i inventory.ini -m ansible.builtin.command -a "netstat -tlnp"
These one-liners became my infrastructure audit. Before writing any playbooks, I knew:
- All servers are online
- They have enough disk space
- No unexpected services are running
PART 4: Multi-Play Orchestration - The Playbook Revolution
Section 4.1: Why Playbooks Beat Ad-Hoc Commands
Question: If ad-hoc is faster, why do playbooks exist?
Answer: Reusability. A playbook written today runs identically tomorrow, on 3 VMs or 3,000 VMs, with zero drift.
The task: multi-play playbooks for static web deployment:
---
# Play 1: Install and configure Nginx
- name: "Install Nginx"
  hosts: web
  become: true
  tasks:
    - name: Update package lists
      ansible.builtin.apt:
        update_cache: true

    - name: Install Nginx
      ansible.builtin.apt:
        name: nginx
        state: present

# Play 2: Deploy content
- name: "Deploy Static Content"
  hosts: web
  become: true
  tasks:
    - name: Copy index.html
      ansible.builtin.copy:
        src: files/index.html
        dest: /var/www/html/index.html
        owner: www-data
        group: www-data
        mode: '0644'

# Play 3: Verify
- name: "Verify Deployment"
  hosts: localhost
  tasks:
    - name: Test HTTP connectivity
      ansible.builtin.uri:
        url: "http://{{ item }}"
        status_code: 200
      loop: "{{ groups['web'] }}"
Why Three Plays?
- Play 1 runs on all web servers, installs nginx
- Play 2 runs on all web servers, deploys content
- Play 3 runs on localhost (your machine), verifies each server responds with HTTP 200
If Play 1 fails, Play 2 never runs. This fail-fast approach prevents half-deployed states. This is reliability.
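Within a single play you can tighten this even further: by default Ansible keeps going on the hosts that are still healthy, but the any_errors_fatal keyword aborts the whole play as soon as any host fails. A minimal sketch (an illustration, not part of the assignment playbook):

- name: "Install Nginx (abort everywhere if any host fails)"
  hosts: web
  become: true
  any_errors_fatal: true   # one failing host stops the play for all hosts
  tasks:
    - name: Install Nginx
      ansible.builtin.apt:
        name: nginx
        state: present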
Section 4.2: The strftime Filter Saga
The Error:
Error while resolving value for 'msg': The filter plugin 'ansible.builtin.strftime'
failed: Invalid value for epoch value (%Y-%m-%d %H:%M:%S)
What Happened:
I tried to format a file's modification time (mtime) using the strftime filter:
- name: Print file stats
  ansible.builtin.debug:
    msg: "Modified: {{ file_stat.stat.mtime | int | strftime('%Y-%m-%d %H:%M:%S') }}"
The issue: the filter arguments are reversed. ansible.builtin.strftime expects the format string as its input and the epoch timestamp as its argument, so in this expression the format string ended up in the epoch slot, which is exactly what the error message complains about.
The Fix:
Added defensive filtering:
- name: Print file stats
  ansible.builtin.debug:
    msg: |
      ✓ File deployed:
      Path: {{ file_stat.stat.path | default('unknown') }}
      Modified (epoch): {{ file_stat.stat.mtime | default('unknown') }}
By using default(), I ensured the playbook wouldn't fail if mtime was missing or malformed. The output might say "unknown," but the playbook continues.
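For reference, the original formatting idea also works once the arguments are in the order the filter expects: the format string is the filter input and the epoch is the argument. A minimal sketch using the same registered variable:

- name: Print formatted modification time
  ansible.builtin.debug:
    msg: "Modified: {{ '%Y-%m-%d %H:%M:%S' | strftime(file_stat.stat.mtime | int) }}"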
Key Learning: Production code assumes inputs may be invalid. Always handle edge cases gracefully.
PART 5: Git-Based Deployment - The Dynamic Content Era
Section 5.1: Cloning Repositories During Deployment
Question: How do we deploy applications that live in GitHub?
Answer: We let Ansible clone them directly during provisioning.
The task: Terraform + Ansible for end-to-end deployment:
# Step 1: Provision VM with Terraform
terraform apply
# Step 2: Get IP and build inventory
PUBLIC_IP=$(terraform output -raw public_ip)
# Step 3: Deploy app with Ansible
ansible-playbook -i inventory.ini site.yml
The Ansible playbook included a git clone task:
- name: Clone Mini Finance Repository
  ansible.builtin.git:
    repo: "https://github.com/suvrajeetbanerjee/mini_finance.git"
    dest: /var/www/html
    version: main
    depth: 1
    force: true
Parameters:
- repo: GitHub URL (public repo, no auth needed)
- dest: Where to clone on the target VM
- version: Branch or tag to check out
- depth: 1: Shallow clone (latest commit only, faster)
- force: true: Overwrite the directory if it already exists
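The git module also reports which commit it ended up on, which is handy for logging exactly what was deployed. A small sketch building on the task above (the registered variable name is mine):

- name: Clone Mini Finance Repository
  ansible.builtin.git:
    repo: "https://github.com/suvrajeetbanerjee/mini_finance.git"
    dest: /var/www/html
    version: main
  register: clone_result

- name: Show deployed commit
  ansible.builtin.debug:
    msg: "Deployed commit: {{ clone_result.after }}"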
Section 5.2: The Permission Boundary Problem
The Error:
fatal: [74.225.240.117]: FAILED! =>
{
"msg": "Unexpected AnsibleActionFail error: Could not find or access '/tmp/epicbook_clone/'
on the Ansible Controller."
}
Root Cause:
I was trying to use the copy module to move files from /tmp/epicbook_clone/ (on the remote VM) to /var/www/html/ (also on the remote VM). But copy by default operates controller → remote; it looked for the source on my local machine.
Initial (Wrong) Solution:
- name: Copy App Content
  ansible.builtin.copy:
    src: "/tmp/epicbook_clone/"
    dest: "/var/www/html/"
    owner: www-data
    mode: '0755'
  become: true
This failed because Ansible couldn't find /tmp/epicbook_clone/ locally.
Correct Solution:
Use the shell module to do the copy on the remote machine (a shell is needed here so the * glob actually expands; the bare command module does not process wildcards):

- name: Copy App Content
  ansible.builtin.shell: >
    cp -r /tmp/epicbook_clone/* {{ epicbook_app_path }}/
  become: true

- name: Set Permissions Recursively
  ansible.builtin.file:
    path: "{{ epicbook_app_path }}"
    owner: www-data
    group: www-data
    mode: '0755'
    recurse: true
  become: true
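An alternative that avoids shelling out entirely is the copy module's remote_src parameter, which tells Ansible that the source path lives on the managed host rather than on the controller. A minimal sketch, not the solution used in the assignment:

- name: Copy App Content (remote-to-remote)
  ansible.builtin.copy:
    src: /tmp/epicbook_clone/        # path on the remote VM, not the controller
    dest: "{{ epicbook_app_path }}/"
    remote_src: true                 # treat src as a path on the managed host
    owner: www-data
    group: www-data
    mode: '0755'
  become: true

Recursive remote_src copies need a reasonably recent Ansible (2.8 or later).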
Key Learning: Understand the execution context. Some Ansible modules look for files on the controller; others operate entirely on the remote host. Know the difference, and you save hours of debugging.
PART 6: Ansible Roles - The Modularization Game-Changer
Section 6.1: What Are Ansible Roles?
Question: How do we scale from 1 playbook to 100 playbooks without chaos?
Answer: We decompose them into reusable, self-contained roles.
A role is a collection of:
- tasks/ → What to do
- handlers/ → React to changes
- templates/ → Configuration files with variables
- files/ → Static content to copy
- vars/ → Role variables (high precedence, rarely overridden)
- defaults/ → Overridable defaults (lowest precedence)
Learning to structure the deployment as 3 independent roles:
roles/
├── common/
│   └── tasks/main.yml               # System updates, baseline packages
├── nginx/
│   ├── tasks/main.yml               # Install, configure Nginx
│   └── templates/epicbook.conf.j2   # Nginx site config with Jinja2
└── epicbook/
    └── tasks/main.yml               # Clone repo, deploy app
Section 6.2: Role Syntax and Execution
The site.yml that orchestrates all roles:
---
- name: "Play 1 - Common Role"
  hosts: web
  become: true
  roles:
    - common

- name: "Play 2 - Nginx Role"
  hosts: web
  become: true
  roles:
    - nginx

- name: "Play 3 - EpicBook Role"
  hosts: web
  become: true
  roles:
    - epicbook
Execution Flow:
Play 1: Run all tasks in roles/common/tasks/main.yml on [web] hosts
        ↓
Play 2: Run all tasks in roles/nginx/tasks/main.yml on [web] hosts
        ↓
Play 3: Run all tasks in roles/epicbook/tasks/main.yml on [web] hosts
Each role is isolated, testable, and reusable. If I need the 'common' role on a database server, I simply add it to that server's group.
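For example, if the inventory had a db group (hypothetical here), reusing the baseline would be one extra play in site.yml:

- name: "Baseline for database servers"
  hosts: db            # hypothetical inventory group
  become: true
  roles:
    - common           # the same role, zero changes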
Section 6.3: Inside roles/common/tasks/main.yml
---
- name: Update Package Lists
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: 3600

- name: Install Essential Packages
  ansible.builtin.apt:
    name:
      - curl
      - wget
      - git
      - python3
    state: present

- name: Set Hostname
  ansible.builtin.hostname:
    name: "epicbook-server"
Why separate this into a role?
- Every infrastructure deployment needs baseline updates
- Every new server needs curl, git, Python
- Other projects can reuse this exact role without modification
- Changes to common baseline propagate everywhere
Section 6.4: Inside roles/nginx/tasks/main.yml
---
- name: Install Nginx
  ansible.builtin.apt:
    name: nginx
    state: present

- name: Create App Directory
  ansible.builtin.file:
    path: "{{ epicbook_app_path }}"
    state: directory
    owner: "{{ nginx_user }}"
    group: "{{ nginx_group }}"
    mode: '0755'

- name: Copy Nginx Configuration
  ansible.builtin.template:
    src: epicbook.conf.j2
    dest: /etc/nginx/sites-available/epicbook
    owner: root
    group: root
    mode: '0644'

- name: Enable Nginx Site
  ansible.builtin.file:
    src: /etc/nginx/sites-available/epicbook
    dest: /etc/nginx/sites-enabled/epicbook
    state: link

- name: Start And Enable Nginx
  ansible.builtin.service:
    name: nginx
    state: started
    enabled: true
Notice the use of variables like {{ epicbook_app_path }} and {{ nginx_user }}. These are defined in group_vars/web.yml:
---
epicbook_app_repo: "https://github.com/pravinmishraaws/theepicbook.git"
epicbook_app_path: /var/www/epicbook
nginx_user: www-data
nginx_group: www-data
Why? DRY (Don't Repeat Yourself). If I need to change the app path, I change it once in group_vars, and all 3 roles automatically use the new value.
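If a value should also have a sensible fallback when no group_vars entry exists, the role can ship one in defaults/main.yml, which group_vars then overrides. A sketch with assumed contents (not taken from the actual repo):

# roles/nginx/defaults/main.yml: lowest-precedence fallbacks
epicbook_app_path: /var/www/html
nginx_user: www-data
nginx_group: www-data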
Section 6.5: The Template Game - Jinja2 in roles/nginx/templates/epicbook.conf.j2
server {
    listen 80 default_server;
    listen [::]:80 default_server;

    server_name _;

    root {{ epicbook_app_path }};
    index index.html index.htm;

    location / {
        try_files $uri $uri/ =404;
    }

    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;
}
The {{ epicbook_app_path }} variable is interpolated at deployment time. So if I deploy to /var/www/app1 on server-A and /var/www/app2 on server-B, the template automatically adapts for each server. That's the power of templates.
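The per-host values would typically come from host_vars, while the template task itself stays unchanged. A sketch with hypothetical host names:

# host_vars/server-a.yml
epicbook_app_path: /var/www/app1

# host_vars/server-b.yml
epicbook_app_path: /var/www/app2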
PART 7: Handlers - The Change-Reactive Automation
Section 7.1: What Are Handlers?
Question: How do we ensure services reload only when configuration changes?
Answer: Handlers. They're tasks that run only if a prior task reports a change.
The notifying tasks live in roles/nginx/tasks/main.yml:

- name: Copy Nginx Configuration
  ansible.builtin.template:
    src: epicbook.conf.j2
    dest: /etc/nginx/sites-available/epicbook
  notify: Reload Nginx

- name: Enable Nginx Site
  ansible.builtin.file:
    src: /etc/nginx/sites-available/epicbook
    dest: /etc/nginx/sites-enabled/epicbook
    state: link
  notify: Reload Nginx

The handler itself lives in roles/nginx/handlers/main.yml (handlers get their own file; placed among the tasks it would run as an ordinary task):

- name: Reload Nginx
  ansible.builtin.service:
    name: nginx
    state: reloaded
Flow:
- The Copy Nginx Configuration task runs → the config file changes → the task reports changed: true
- The task sends notify: Reload Nginx
- At the end of the play, the Reload Nginx handler runs (once, even if notified multiple times)
Idempotency Example:
# First run: config changed, handler fires
CHANGED [web] → Reload Nginx [web]

# Second run: config unchanged, handler doesn't fire
OK [web] → (no reload)
This is intelligent automation: reload only when needed, not on every run.
PART 8: Real-World Errors and How I Fixed Them
Error 1: Parser Error from YAML Syntax
The Error:
unexpected parameter type in action: <class 'ansible.module_utils._internal._datatag._AnsibleTaggedList'>
Cause:
I accidentally used list syntax where a string was expected:
# WRONG:
state: [link] # or state: ['link']
# RIGHT:
state: link
Fix:
Reviewed entire role with ansible-lint, corrected all such instances.
ansible-lint roles/nginx/tasks/main.yml
# Errors reported with line numbers; fixed each one
Error 2: Risky File Permissions
The Error:
risky-file-permissions: File permissions unset or incorrect.
Cause:
File operations lacked explicit mode parameter:
# WRONG:
- name: Copy config
  ansible.builtin.template:
    src: config.j2
    dest: /etc/nginx/sites-available/config

# RIGHT:
- name: Copy config
  ansible.builtin.template:
    src: config.j2
    dest: /etc/nginx/sites-available/config
    mode: '0644'
Fix:
Added explicit mode to every file operation.
Error 3: Missing index.html (The 403 Forbidden)
The Error (Browser):
403 Forbidden
nginx/1.18.0 (Ubuntu)
Cause:
Deployment succeeded, but /var/www/epicbook/index.html didn't exist. Nginx had no content to serve.
Debug:
ssh azureuser@VM_IP
ls -l /var/www/epicbook/
# Empty directory!
Fix:
Verified upstream repo structure, ensured git clone captured all files:
ssh azureuser@VM_IP
find /tmp/epicbook_clone -name index.html
# index.html exists in the clone, so the clone itself succeeded
Created placeholder for testing:
echo '<h1>EpicBook Deployed!</h1>' | sudo tee /var/www/epicbook/index.html
sudo chown www-data:www-data /var/www/epicbook/index.html
sudo chmod 644 /var/www/epicbook/index.html
sudo systemctl reload nginx
The browser now showed content. ✓
PART 9: Ansible Galaxy and Extending Your Roles
Section 9.1: What Is Ansible Galaxy?
Question: Do I need to write every role from scratch?
Answer: No. Ansible Galaxy is the package manager for roles: like npm for Node or pip for Python.
Accessing Galaxy:
# Search for nginx roles
ansible-galaxy search nginx

# Install a community role
ansible-galaxy install geerlingguy.nginx

# Use it in a playbook
- hosts: web
  roles:
    - geerlingguy.nginx  # Uses defaults, or override with vars
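On a team, Galaxy dependencies are usually pinned in a requirements.yml and installed in one step with ansible-galaxy install -r requirements.yml. A sketch (the role name is real; the version shown is only illustrative, pin whichever release you have tested):

# requirements.yml
roles:
  - name: geerlingguy.nginx
    version: "3.1.4"   # illustrative pin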
Section 9.2: When to Use Galaxy Roles vs. Custom Roles
Use Galaxy roles when:
- A community role exactly matches your needs (nginx, docker, postgresql)
- The role is well maintained (100+ GitHub stars, recent updates)
- You want to avoid reinventing the wheel
Write custom roles when:
- Your deployment is unique (custom app, specific company standards)
- You need tight control over every detail
- You want the team's knowledge captured in in-house code
My Approach:
I wrote custom roles because:
- This is a learning assignment; I needed to understand every line
- My requirements are specific (EpicBook, not generic nginx)
- Custom roles document my infrastructure decisions for future reference
PART 10: Documentation and References
Section 10.1: Where to Learn Ansible
Official Resources:
- Ansible Documentation → Comprehensive module reference
- Ansible Galaxy → Community roles and collections
- Ansible Best Practices → Production-grade guidelines
For the hands-on assignments demonstrated in this post:
- Modules used: ansible.builtin.apt, ansible.builtin.template, ansible.builtin.file, ansible.builtin.service, ansible.builtin.git
- apt module docs → Package management
- template module docs → Jinja2 file generation
- file module docs → File/directory permissions
- service module docs → Service lifecycle
Section 10.2: Ansible Execution End-to-End Flow Diagram
[Control Node]
     ↓
1. Load playbook (site.yml)
     ↓
2. Parse variables (group_vars/web.yml)
     ↓
3. Resolve inventory (inventory.ini)
     ↓
4. For each play:
   ├─ For each host in group:
   │    ├─ Connect via SSH
   │    ├─ Execute tasks sequentially
   │    ├─ If task notifies → trigger handler
   │    ├─ Collect results
   │    └─ Close connection
   └─ After all hosts complete:
        ├─ Execute handlers (once per notified handler)
        └─ Move to next play
     ↓
5. Generate play recap (OK/CHANGED/FAILED)
     ↓
[Fully Deployed, Idempotent Infrastructure]
What This Shows:
- Plays execute sequentially
- Hosts within a play execute in parallel (faster!)
- Handlers run after all tasks, not immediately
- If any task fails, subsequent tasks/plays may be skipped (depending on failed_when, ignore_errors)
PART 11: Advanced Concepts I Learned
Section 11.1: Variable Precedence and Scoping
Question: If I define a variable in multiple places, which one wins?
Answer: Ansible has a strict precedence order (simplified, from highest to lowest):
- Extra vars (--extra-vars)
- Task vars
- Block vars
- Play vars
- Host facts / discovered vars
- Host vars
- Group vars
- Role defaults
Example:
# This wins (highest precedence)
ansible-playbook site.yml --extra-vars "epicbook_app_path=/custom/path"
# Falls back to group_vars/web.yml
epicbook_app_path: /var/www/epicbook
# Falls back to role defaults/
epicbook_app_path: /var/www/html
Section 11.2: Conditionals and Loops
When to conditionally run tasks:
- name: Install Nginx (Ubuntu only)
  ansible.builtin.apt:
    name: nginx
    state: present
  when: ansible_distribution == "Ubuntu"
When to loop over lists:
- name: Install multiple packages
  ansible.builtin.apt:
    name: "{{ item }}"
    state: present
  loop:
    - curl
    - wget
    - git
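Loops and conditionals also compose: when is evaluated once per item, so a single task can skip individual entries. A small sketch (the package list is illustrative):

- name: Install optional tools where they apply
  ansible.builtin.apt:
    name: "{{ item.name }}"
    state: present
  loop:
    - { name: curl, wanted: true }
    - { name: htop, wanted: false }   # skipped, because when is false for this item
  when: item.wanted | bool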
Section 11.3: Error Handling and Recovery
- name: Deploy app
  ansible.builtin.command: /opt/deploy.sh
  register: deploy_result
  ignore_errors: true  # Don't stop if this fails

- name: Notify on failure
  ansible.builtin.debug:
    msg: "Deployment failed: {{ deploy_result.stderr }}"
  when: deploy_result.rc != 0
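The same recover-and-continue pattern is often written with block/rescue/always, which groups tasks and runs the rescue section only if something in the block fails. A sketch, not part of the assignment (the rollback script path is hypothetical):

- name: Deploy with automatic recovery
  block:
    - name: Deploy app
      ansible.builtin.command: /opt/deploy.sh
  rescue:
    - name: Report the failure
      ansible.builtin.debug:
        msg: "Deployment failed, attempting rollback"
    - name: Roll back
      ansible.builtin.command: /opt/rollback.sh   # hypothetical rollback script
  always:
    - name: Note that a deploy run finished
      ansible.builtin.debug:
        msg: "Deploy run finished (success or failure)"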
PART 12: Lessons from the Week
Section 12.1: The 70% Rule
My biggest realization: 70% of DevOps is not writing code; it's debugging permissions, networking, and assumptions.
In this week alone:
- 30% time: Writing Terraform, Ansible, YAML
- 70% time: Fixing SSH key formats, file permissions, missing files, YAML syntax errors
This is the craft. Embrace it.
Section 12.2: IaC is About Control, Not Typing
Terraform and Ansible aren't about saving keystrokes. They're about:
- Reproducibility: Same code → same result, always
- Auditability: Git history shows who changed what and when
- Scalability: 1 server or 1,000; same codebase
- Safety: Mistakes are caught in terraform plan, not in production
Section 12.3: Ansible Roles Are Career Capital
Understanding roles is a superpower because:
- Every production Ansible deployment uses them
- They're portable across companies, industries, tech stacks
- Once you master 1 role (nginx), you can architect 50 (docker, mysql, postgresql, etc.)
PART 13: Visual Diagram of the Complete Ansible Architecture
┌─────────────────────────────────────────────────────────┐
│               CONTROL NODE (Your Machine)               │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │ site.yml (Main Playbook)                          │  │
│  │   - Play 1: common role                           │  │
│  │   - Play 2: nginx role                            │  │
│  │   - Play 3: epicbook role                         │  │
│  └───────────────────────────────────────────────────┘  │
│                           ↓                             │
│  ┌──────────────────┬────────────────────────────────┐  │
│  │ inventory.ini    │ group_vars/web.yml             │  │
│  │ [web]            │ epicbook_app_path: /var..      │  │
│  │ 52.172.1.10      │ nginx_user: www-data           │  │
│  │ 52.172.1.11      │ epicbook_app_repo: https..     │  │
│  │ 52.172.1.12      │                                │  │
│  └──────────────────┴────────────────────────────────┘  │
│                           ↓                             │
│  ┌───────────────────────────────────────────────────┐  │
│  │ Ansible Parser                                    │  │
│  │   - Load playbook                                 │  │
│  │   - Resolve variables                             │  │
│  │   - Parse roles                                   │  │
│  └───────────────────────────────────────────────────┘  │
│                           ↓                             │
└─────────────────────────────────────────────────────────┘
                 SSH Connections (Port 22)
         ↓                  ↓                  ↓
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  REMOTE VM1  │    │  REMOTE VM2  │    │  REMOTE VM3  │
│ 52.172.1.10  │    │ 52.172.1.11  │    │ 52.172.1.12  │
│              │    │              │    │              │
│ Execute:     │    │ Execute:     │    │ Execute:     │
│  ├─ common   │    │  ├─ common   │    │  ├─ common   │
│  ├─ nginx    │    │  ├─ nginx    │    │  ├─ nginx    │
│  └─ epicbook │    │  └─ epicbook │    │  └─ epicbook │
│              │    │              │    │              │
│ Result:      │    │ Result:      │    │ Result:      │
│  ✓ App       │    │  ✓ App       │    │  ✓ App       │
│    Running   │    │    Running   │    │    Running   │
└──────────────┘    └──────────────┘    └──────────────┘
Conclusion: From Learner to Production Engineer
This week transformed my understanding of automation. I didn't just learn tools; I learned the mindset:
Start simple. Ad-hoc commands teach you what's possible.
Scale progressively. Playbooks teach you repeatability.
Decompose ruthlessly. Roles teach you sustainability.
Automate everything. Terraform + Ansible together teach you true IaC.
The errors I faced (SSH key formats, file permissions, missing files) aren't bugs. They're lessons. Every error message is Ansible telling you exactly what went wrong. Read it, fix it, commit the lesson.
Learning Outcomes
By completing these hands-on assignment practicals, I can now:
✅ Provision cloud infrastructure using Terraform with zero manual clicks
✅ Connect to servers via passwordless SSH with validated key formats
✅ Use Ansible ad-hoc commands for rapid system checks and updates
✅ Orchestrate multi-play deployments with handlers and idempotency
✅ Deploy applications from GitHub repositories automatically
✅ Architect reusable Ansible roles for any infrastructure need
✅ Debug Ansible errors methodically and fix them with confidence
✅ Scale from 1 VM to 1,000 VMs with identical, version-controlled code
Reflection and Next Steps
This journey is Week 8 of 12 of our free DevOps Micro Internship Cohort, organized by Pravin Mishra sir, and continues from Terraform Production Battle-Tested: Remote State, Workspaces & Full-Stack AWS Deployment [Week 7, Part 2].
Week 9 will dive into Azure DevOps. The skills built this week (roles, modular architecture, error handling) are the foundation for enterprise deployments.
To anyone reading this: If you can write a 3-role Ansible deployment, you've climbed the steepest hill. Everything else is refinement.
Resources for Further Learning
- Ansible Official Documentation → Go-to reference
- Ansible Galaxy → Pre-built roles marketplace
- Terraform + Ansible Integration → IaC orchestration
- YouTube Channel: Ansible Tutorials → Video learning
- Dev.to Ansible Tag → Community articles
Thank you for reading! Drop your questions and learnings in the comments. Let's automate the world together!
Tags:
#Ansible #Terraform #DevOps #Azure #IaC #ConfigurationManagement #Automation #LearningJourney #AWS