Table of Contents
- Introduction
- VMSS and its components
- What is Terraform?
- Prerequisites
- Prepare environment & authenticate to Azure
- Create a local Terraform project with modular .tf files
- Initialize / Plan / Apply (local)
- Verify
- Conclusion
Introduction
Deploying an Azure Virtual Machine Scale Set Using Terraform
In today's cloud-first world, scalability and automation are non-negotiable. This article explores how to deploy Azure Virtual Machine Scale Sets using Terraform, a powerful Infrastructure as Code (IaC) tool that enables consistent, repeatable, and efficient cloud provisioning. Whether you are building resilient applications or optimizing resource management, this guide walks you through defining, deploying, and managing VM scale sets with Terraform, unlocking the full potential of Azure's elastic compute capabilities.
VMSS and its components
Virtual Machine Scale Sets (VMSS) let you create and manage a group of identical, load-balanced virtual machines (VMs). Think of them as a way to automatically increase or decrease computing power based on demand, similar to how supermarkets open more checkout lanes when there are many customers and fewer when the store is quiet.
Example: Imagine a movie theater. On a normal weekday afternoon, only one or two ticket counters may be open. But on Friday night, when crowds rush in, more counters open automatically. VM Scale Sets work the same way, automatically adding or removing "counters" (VMs) depending on the number of "customers" (requests) arriving.

Key Benefits of VM Scale Sets:
- Automatic Scaling: Handles traffic spikes without manual intervention.
- Load Balancing: Ensures no single VM is overloaded.
- High Availability: Keeps your app running even if some VMs fail.
- Cost Efficiency: You only pay for what you use.
Another analogy: think of VM Scale Sets like ride-hailing services (e.g., Uber). During peak hours, more drivers appear on the road (scaling out). Fewer drivers are available at night when demand is low (scaling in). This ensures efficiency and availability without wasting resources.
In summary, Virtual Machine Scale Sets are like a flexible team that grows or shrinks automatically, ensuring smooth operations and optimized costs, whether you're running a small website or a large-scale application.
What is Terraform?
Terraform is an open-source Infrastructure as Code (IaC) tool developed by HashiCorp. It allows you to define and provision cloud infrastructure using declarative configuration files. Instead of manually creating resources through a cloud portal, you write code that describes the desired infrastructure state, and Terraform handles the deployment and updates.
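For instance, here is a minimal, illustrative snippet: a resource group declared in HCL. Terraform compares this desired state against what exists in Azure and creates or updates the resource to match. The names here are arbitrary examples, not part of the project built below.

resource "azurerm_resource_group" "example" {
  name     = "demo-rg"    # arbitrary example name
  location = "westus3"
}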

Why Terraform?
We use Terraform to deploy a Virtual Machine Scale Set (VMSS) in Azure because it offers:
- Repeatability: You can deploy identical environments (e.g., dev, test, prod) using the same configuration.
- Automation: Terraform automates the creation, scaling, and management of VM instances.
- Version Control: Infrastructure changes are tracked in source control (e.g., GitHub), enabling collaboration and rollback.
- Scalability: VMSS can automatically adjust the number of VMs based on demand, and Terraform makes configuring autoscale rules seamless.
Let's turn infrastructure into code and scale like a pro.
Prerequisites
The following are the prerequisites for this task:
- VS Code (or another editor)
- An Azure subscription and permissions to create resources (or ask an admin).
- Azure CLI (az) installed and logged in, or Service Principal credentials.
- Terraform CLI (>= 1.4 recommended).
- Git and (optionally) GitHub CLI (gh).
- An SSH key pair for VM admin access (~/.ssh/id_rsa.pub); see the command after this list if you need to generate one.
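If you do not already have a key pair, a standard OpenSSH command generates one (accept the default path so ~/.ssh/id_rsa.pub matches the Terraform default used later):

ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -C "vmss-demo"   # creates id_rsa and id_rsa.pub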
Prepare environment & authenticate to Azure
A. Authenticate to Azure
Interactive (dev machine)
Open VS Code and create the project directory:

mkdir vmss-terraform        # creates the project directory
cd vmss-terraform           # moves you into the project root

Then, log in to Azure:
az login                    # opens a browser prompt to sign in to your Azure account
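If your account has access to multiple subscriptions, confirm which one is active before creating anything (standard Azure CLI commands; the subscription ID below is a placeholder):

az account show --output table                      # shows the active subscription
az account set --subscription "<subscription-id>"   # switch if needed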

B. Service Principal (recommended for CI)
Run the following, replacing <subscription-id> with your own subscription ID:
az ad sp create-for-rbac --name "http://tf-vmss-sp" \
  --role "Contributor" \
  --scopes "/subscriptions/<subscription-id>"

Save the output (appId, password, tenant); the values below are examples only.
{
  "appId": "6d509bb1-981f-4193-ad42-05bf45e72f12",
  "displayName": "http://tf-vmss-sp",
  "password": "epe8Q~hRGUg7f5hrBK_Pl~4N3wml1PutgPSiqaMn",
  "tenant": "01d4fe9c-2658-4141-be2f-b1f11eca9673"
}
Then export environment variables for Terraform, replacing the examples below with your actual values, and run them together in the terminal.
export ARM_SUBSCRIPTION_ID="14e8ed5c-bc41-4d1b-b5b7-ff9419e2f0d6"
export ARM_CLIENT_ID="6d509bb1-981f-4193-ad42-05bf45e72f12"
export ARM_CLIENT_SECRET="epe8Q~hRGUg7f5hrBK_Pl~4N3wml1PutgPSiqaMn"
export ARM_TENANT_ID="01d4fe9c-2658-4141-be2f-b1f11eca9673"
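Optionally, sanity-check the service principal credentials before Terraform uses them by logging in with them directly (a standard Azure CLI flow; log back in with az login afterwards if you prefer your user account):

az login --service-principal \
  --username "$ARM_CLIENT_ID" \
  --password "$ARM_CLIENT_SECRET" \
  --tenant "$ARM_TENANT_ID"
az account show --output table    # should report the target subscription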

Create a local Terraform project with modular .tf files
Create the project directory layout:

vmss-terraform/
├── setup-backend.sh
├── backend.tf
├── frontend.tf
├── main.tf            # provider + top-level resources (or split)
├── variables.tf
├── network.tf
├── compute.tf
├── autoscale.tf
├── outputs.tf
├── user-data.sh       # cloud-init to install nginx (optional)
├── versions.tf
└── .gitignore
touch setup-backend.sh (paste the script below into it)
#!/bin/bash
# ------------------------------------------------------------------
# setup-backend.sh
# Creates an Azure Storage backend for Terraform remote state
# ------------------------------------------------------------------

# ====== CONFIGURABLE VARIABLES ======
RESOURCE_GROUP_NAME="terraform-state-rg"
LOCATION="westus3"
STORAGE_ACCOUNT_NAME="tfstate$(openssl rand -hex 3 | tr -d '\n' | tr '[:upper:]' '[:lower:]')"  # random suffix ensures a globally unique name
CONTAINER_NAME="tfstate"
KEY_NAME="vmss.terraform.tfstate"
# ====================================

echo "Checking Azure CLI login..."
if ! az account show >/dev/null 2>&1; then
  echo "You are not logged in to Azure CLI. Run: az login"
  exit 1
fi

SUBSCRIPTION_ID=$(az account show --query id -o tsv)
echo "Using Azure Subscription: $SUBSCRIPTION_ID"

echo "Creating resource group if it does not exist..."
az group create --name "$RESOURCE_GROUP_NAME" --location "$LOCATION" >/dev/null

echo "Creating storage account: $STORAGE_ACCOUNT_NAME ..."
az storage account create \
  --name "$STORAGE_ACCOUNT_NAME" \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --location "$LOCATION" \
  --sku Standard_LRS \
  --encryption-services blob >/dev/null

echo "Creating blob container: $CONTAINER_NAME ..."
az storage container create \
  --name "$CONTAINER_NAME" \
  --account-name "$STORAGE_ACCOUNT_NAME" >/dev/null

# Print summary
ACCOUNT_KEY=$(az storage account keys list \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --account-name "$STORAGE_ACCOUNT_NAME" \
  --query "[0].value" -o tsv)

echo ""
echo "Terraform backend storage created successfully!"
echo ""
echo "Use the following values in your backend.tf file:"
echo "---------------------------------------------------"
echo "resource_group_name  = \"$RESOURCE_GROUP_NAME\""
echo "storage_account_name = \"$STORAGE_ACCOUNT_NAME\""
echo "container_name       = \"$CONTAINER_NAME\""
echo "key                  = \"$KEY_NAME\""
echo "---------------------------------------------------"
echo ""
echo "Storage Account Key (keep secret):"
echo "$ACCOUNT_KEY"
echo ""
echo "Tip: run 'terraform init -reconfigure' after updating backend.tf."

Make it executable:
chmod +x setup-backend.sh

Run it:
./setup-backend.sh


Update your backend.tf file with the actual values the script printed, for example:

resource_group_name  = "terraform-state-rg"
storage_account_name = "tfstatee0133f"
container_name       = "tfstate"
key                  = "vmss.terraform.tfstate"
touch backend.tf (paste the code below into it)
terraform {
  backend "azurerm" {
    resource_group_name  = "terraform-state-rg"
    storage_account_name = "tfstatee0133f" 
    container_name       = "tfstate"
    key                  = "vmss.terraform.tfstate"
  }
}
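If terraform init cannot authenticate to the state storage account, one supported option for the azurerm backend is to export the storage account key the script printed (placeholder below; Azure CLI credentials also work):

export ARM_ACCESS_KEY="<storage-account-key>"   # use the key printed by setup-backend.sh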

Run terraform init -reconfigure after updating backend.tf.

If the first run failed, re-run the script:
chmod +x setup-backend.sh && ./setup-backend.sh

If the script's Azure CLI commands still fail because of line-break formatting, create the resources manually with the name it generated (tfstateb3ecc2). First, the storage account:
az storage account create --name tfstateb3ecc2 --resource-group terraform-state-rg --location westus3 --sku Standard_LRS --encryption-services blob

Now, create the container:
az storage container create --name tfstate --account-name tfstateb3ecc2

Then initialize Terraform against the remote backend:
terraform init -reconfigure

touch frontend.tf      (paste the code below into it)
# frontend.tf
resource "azurerm_public_ip" "frontend_pip" {
  name                = "vmss-frontend-pip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Static"
  domain_name_label   = "vmss-frontend-${random_string.fqdn.result}"
  sku                 = "Standard"
  tags                = var.tags
}
resource "azurerm_lb" "frontend_lb" {
  name                = "vmss-frontend-lb"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  sku                 = "Standard"
frontend_ip_configuration {
    name                 = "PublicFrontendIP"
    public_ip_address_id = azurerm_public_ip.frontend_pip.id
  }
tags = var.tags
}
resource "azurerm_lb_backend_address_pool" "frontend_bepool" {
  loadbalancer_id = azurerm_lb.frontend_lb.id
  name            = "FrontendBackendPool"
}
resource "azurerm_lb_probe" "frontend_probe" {
  loadbalancer_id = azurerm_lb.frontend_lb.id
  name            = "http-probe"
  protocol        = "Tcp"
  port            = var.application_port
}
resource "azurerm_lb_rule" "frontend_rule" {
  loadbalancer_id                = azurerm_lb.frontend_lb.id
  name                           = "http-rule"
  protocol                       = "Tcp"
  frontend_port                  = var.application_port
  backend_port                   = var.application_port
  frontend_ip_configuration_name = "PublicFrontendIP"
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.frontend_bepool.id]
  probe_id                       = azurerm_lb_probe.frontend_probe.id
}

touch main.tf        (paste the code below into it)
provider "azurerm" { 
    features {} 
    } 
resource "random_string" "fqdn" {
  length  = 6
  upper   = false
  special = false
  numeric = false
} 

touch variables.tf               (paste the code below into it)
variable "resource_group_name" {
  type    = string
  default = "terraform-vmss-rg"
}
variable "location" {
  type    = string
  default = "westus3"
}
variable "admin_user" {
  type    = string
  default = "azureuser"
}
variable "ssh_public_key_path" {
  type    = string
  default = "~/.ssh/id_rsa.pub"
}
variable "instances" {
  type    = number
  default = 2
}
variable "vm_size" {
  type    = string
  default = "Standard_DS1_v2"
}
variable "application_port" {
  type    = number
  default = 80
}
variable "tags" {
  type = map(string)
  default = {
    env     = "dev"
    project = "vmss-demo"
  }
}
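These defaults can be overridden per environment without editing the configuration, for example with a terraform.tfvars file (an optional, illustrative sketch; note that *.tfvars is excluded by the .gitignore below, which keeps environment-specific values out of source control):

instances = 3
vm_size   = "Standard_DS1_v2"
tags = {
  env     = "test"
  project = "vmss-demo"
}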
touch network.tf       (paste the code below into it) 
resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.location
  tags     = var.tags
}
resource "azurerm_virtual_network" "vnet" {
  name                = "vmss-vnet"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  address_space       = ["10.0.0.0/16"]
  tags                = var.tags
}
resource "azurerm_subnet" "subnet" {
  name                 = "vmss-subnet"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.2.0/24"]
}
resource "azurerm_public_ip" "vmss_pip" {
  name                = "vmss-public-ip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Static"
  domain_name_label   = "vmss-demo-${random_string.fqdn.result}"
  tags                = var.tags
}
resource "azurerm_lb" "lb" {
  name                = "vmss-lb"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
frontend_ip_configuration {
    name                 = "PublicIPAddress"
    public_ip_address_id = azurerm_public_ip.vmss_pip.id
  }
tags = var.tags
}
resource "azurerm_lb_backend_address_pool" "bpepool" {
  loadbalancer_id = azurerm_lb.lb.id
  name            = "BackEndPool"
}
resource "azurerm_lb_probe" "http_probe" {
  loadbalancer_id = azurerm_lb.lb.id
  name            = "http-probe"
  port            = var.application_port
}
resource "azurerm_lb_rule" "http" {
  loadbalancer_id                = azurerm_lb.lb.id
  name                           = "http"
  protocol                       = "Tcp"
  frontend_port                  = var.application_port
  backend_port                   = var.application_port
  frontend_ip_configuration_name = "PublicIPAddress"
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.bpepool.id]
  probe_id                       = azurerm_lb_probe.http_probe.id
}
touch compute.tf              (paste the code below into it)
resource "azurerm_virtual_machine_scale_set" "vmss" {
  name                = "tf-vmss"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  upgrade_policy_mode = "Manual"
sku {
    name     = var.vm_size
    tier     = "Standard"
    capacity = var.instances
  }
storage_profile_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "22_04-lts"
    version   = "latest"
  }
storage_profile_os_disk {
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }
os_profile {
    computer_name_prefix = "vmss"
    admin_username       = var.admin_user
    # NOTE: for demo simplicity MS sample uses a password. For production, prefer SSH keys or orchestrated VMSS with ssh keys.
    admin_password = "ReplaceThisWithASecurePassword123!"
  }
os_profile_linux_config {
    disable_password_authentication = false
  }
network_profile {
    name    = "terraformnetworkprofile"
    primary = true
ip_configuration {
  name                                   = "IPConfiguration"
  subnet_id                              = azurerm_subnet.subnet.id
  load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.bpepool.id]
  primary                                = true
}
}
tags = var.tags
}
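One gap worth flagging: the Verify section below expects nginx to answer on the public IP, but nothing in compute.tf installs it. A minimal sketch, assuming you create the optional user-data.sh from the project layout: add this line inside the os_profile block of compute.tf,

custom_data = file("${path.module}/user-data.sh")

and put the following in user-data.sh (cloud-init executes it on first boot of each instance):

#!/bin/bash
# user-data.sh - installs and starts nginx on Ubuntu
apt-get update
apt-get install -y nginx
systemctl enable --now nginx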
touch autoscale.tf                 (paste the code below into it)
resource "azurerm_monitor_autoscale_setting" "autoscale" {
  name                = "vmss-autoscale"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  target_resource_id  = azurerm_virtual_machine_scale_set.vmss.id
  enabled             = true
profile {
    name = "autoscale"
capacity {
  default = var.instances
  minimum = 1
  maximum = 5
}
rule {
  metric_trigger {
    metric_name        = "Percentage CPU"
    metric_resource_id = azurerm_virtual_machine_scale_set.vmss.id
    operator           = "GreaterThan"
    statistic          = "Average"
    time_aggregation   = "Average"
    time_window        = "PT2M"
    time_grain         = "PT1M"
    threshold          = 75
  }
  scale_action {
    direction = "Increase"
    type      = "ChangeCount"
    value     = "1"
    cooldown  = "PT5M"
  }
}
rule {
  metric_trigger {
    metric_name        = "Percentage CPU"
    metric_resource_id = azurerm_virtual_machine_scale_set.vmss.id
    operator           = "LessThan"
    statistic          = "Average"
    time_aggregation   = "Average"
    time_window        = "PT2M"
    time_grain         = "PT1M"
    threshold          = 25
  }
  scale_action {
    direction = "Decrease"
    type      = "ChangeCount"
    value     = "1"
    cooldown  = "PT5M"
  }
}
}
}
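After apply, you can confirm the autoscale setting exists with the Azure CLI (assuming the default resource group name from variables.tf):

az monitor autoscale list --resource-group terraform-vmss-rg --output table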

touch outputs.tf              (paste the code below into it)
output "public_ip" {
  description = "The public IP address of the load balancer"
  value       = azurerm_public_ip.vmss_pip.ip_address
}
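Optionally (not in the original file), since the public IP sets a domain_name_label, you can also output the generated DNS name:

output "public_fqdn" {
  description = "DNS name of the load balancer public IP"
  value       = azurerm_public_ip.vmss_pip.fqdn
}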
touch .gitignore                (paste the code below into the file)
# Terraform
.terraform/
*.tfstate
*.tfstate.backup
*.tfvars
crash.log
override.tf
override.tf.json
.terraform.lock.hcl

# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# IDE files
.vscode/
.idea/
*.swp
*.swo
*~

# Environment variables
.env
.env.local
.env.*.local
touch versions.tf               (paste the code below into it)
terraform {
  required_version = ">= 1.4.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      # "~> 3.0" means any 3.x version: >= 3.0 and < 4.0
      version = "~> 3.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
  }
}

NB: Remove the duplicate public IP and load balancer blocks from network.tf (they are now defined in frontend.tf). Then, in compute.tf, change this reference:
load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.bpepool.id]
to:
load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.frontend_bepool.id]

Initialize / Plan / Apply (local)
Initialize providers:
terraform init

Review the plan:
terraform plan -out main.tfplan

Apply (using the plan you reviewed):
terraform apply main.tfplan
This returned an error: the UbuntuServer/22_04-lts image reference in compute.tf is not a valid marketplace image.

In your compute.tf file, update the image reference to a valid Ubuntu 20.04 LTS (Focal) image:
storage_profile_image_reference {
  publisher = "Canonical"
  offer     = "0001-com-ubuntu-server-focal"
  sku       = "20_04-lts-gen2"
  version   = "latest"
}
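If you want to verify available images yourself, query the marketplace with the Azure CLI (the --all flag can take a moment):

az vm image list --publisher Canonical --offer 0001-com-ubuntu-server-focal --sku 20_04-lts-gen2 --all --output table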


Re-run terraform plan -out main.tfplan, then terraform apply main.tfplan.

Copy and save your public IP
(public_ip = "4.227.9.177")
Verify
Azure CLI: run the command below, replacing <resource-group> with your resource group:
az vm list --resource-group <resource-group> --output table

az vm list --resource-group terraform-state-rg --output table

The command returned no results, meaning no VMs exist in the terraform-state-rg resource group; it likely contains only your Terraform state storage account.
Now, run:
az vm list --resource-group terraform-vmss-rg --output table
Note that instances of a uniform-mode scale set (which this legacy VMSS resource creates) do not appear in az vm list, so use the az vmss commands below instead.

This confirms that terraform-state-rg holds only the Terraform state storage account; the actual scale set lives in a different resource group.
List all resource groups to find where your resources are deployed:
az group list --output table

List all Virtual Machine Scale Sets:
az vmss list --output table

Then scope to the resource group:
az vmss list --resource-group terraform-vmss-rg --output table

Now check the instances running in the scale set:
az vmss list-instances --resource-group terraform-vmss-rg --name tf-vmss --output table

One instance (tf-vmss_0) is running successfully.

Check instance connection info:
az vmss list-instance-connection-info --resource-group terraform-vmss-rg --name tf-vmss --output table

This returns no output because the load balancer has no inbound NAT rules configured, so there are no SSH connection endpoints to report.
To see the nginx page, use the public_ip output or curl the domain name from azurerm_public_ip (public_ip = "4.227.9.177"). Open http://4.227.9.177 in a browser, or from your terminal:
curl --connect-timeout 15 -m 30 -v http://4.227.9.177
Conclusion
Deploying an Azure Virtual Machine Scale Set using Terraform and pushing the configuration to GitHub combines infrastructure as code with version control for scalability, repeatability, and team collaboration. With Terraform's declarative power and GitHub's collaborative backbone, you've laid the foundation for resilient cloud architecture and agile development workflows. This approach not only accelerates deployment but also ensures traceability, security, and continuous improvement across environments.
 
 
              


 
    