DEV Community: Emmanuel Chukwudi

Agile, Scrum, and Azure Boards: The Theory Behind the Tool

Emmanuel Chukwudi — Thu, 18 Jun 2026 14:15:50 +0000

Most of us land in Azure Boards, Jira, or Trello before we ever read the Agile Manifesto. We learn to drag cards across columns and fill in story points because that's what the team does, not because we understand why the framework is shaped that way. This post works backward: starting from the theory (Agile, Scrum, Sprints, Backlogs) and ending at how Azure Boards actually implements it, so the tool stops feeling like a checkbox exercise and starts making sense as a system.

What Agile Actually Is

Agile isn't a process; it's a set of values for how software gets built. The Agile Manifesto (2001) boils down to four preferences:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

None of that says "use two-week sprints" or "hold a daily standup." Those are implementation details that belong to specific frameworks. Agile itself is just the philosophy: build in small increments, get feedback constantly, and stay willing to change direction. Scrum, Kanban, and XP are different ways of operationalizing that philosophy. Scrum is simply the one that became the default in most companies, and it is the one Azure Boards' sprint tooling is modeled around.

Scrum: Roles, Events, Artifacts

Scrum structures Agile into a repeatable rhythm built from three categories.

Roles

Product Owner: owns the backlog, decides what gets built and in what order, represents the business/customer side.
Scrum Master: owns the process, not the product. Removes blockers, protects the team from scope creep mid-sprint, facilitates the events below.
Development Team: the engineers actually building the thing. Cross-functional and self-organizing; nobody outside the team assigns individual tasks.

Events

Sprint Planning: the team pulls items from the product backlog into the sprint backlog and commits to a sprint goal.
Daily Scrum (Standup): a short daily sync, traditionally answering: what did I do, what will I do, what's blocking me.
Sprint Review: a demo of what was actually completed, shown to stakeholders, at the end of the sprint.
Sprint Retrospective: the team reflects on how the sprint went and agrees on one or two process improvements for next time.

Artifacts

Product Backlog: the full, ever-evolving list of everything that might get built, ranked by priority.
Sprint Backlog: the slice of that list the team has committed to for the current sprint.
Increment: the actual working, shippable output of the sprint.

What a Sprint Actually Is

A Sprint is a fixed-length time box, usually one, two, or four weeks, and the length doesn't change once a team picks it. Inside that window the team plans, builds, and reviews one slice of work toward a sprint goal. The fixed length is the point: it forces scope to flex around time instead of time flexing around scope, which is what keeps a project from quietly sliding into "we'll ship it when it's done."

Backlogs: Product vs. Sprint

These two get conflated constantly, so it's worth being precise.

The Product Backlog is the master list: every feature, bug fix, and technical debt item the team might ever do, ranked roughly by priority. It's never "finished"; it's a living document that the Product Owner continuously reorders as priorities shift.

The Sprint Backlog is a temporary, much smaller subset: the items pulled out of the product backlog during sprint planning that the team has actually committed to delivering in the current sprint. Once a sprint starts, the sprint backlog is meant to stay stable; new work doesn't get added mid-sprint just because it seems urgent (that's actually one of the more common ways teams sabotage their own velocity).

Where Azure Boards Fits In

Azure Boards is Microsoft's implementation of all of the above, and once you know the theory, the tool stops feeling arbitrary.

The work item hierarchy mirrors how Agile thinking breaks down scope:

Epic        → large business objective, spans months/quarters
  Feature   → a shippable slice of that objective
    Story   → a single piece of user-facing value (or a Bug, same level)
      Task  → the technical steps an engineer actually executes

A concrete example from a typical DevOps backlog: an Epic like "Migrate to GitOps delivery," a Feature underneath it like "Integrate ArgoCD with AKS," a Story like "As a DevOps engineer, I want ArgoCD to auto-sync manifest changes," and Tasks like "Install ArgoCD via Helm" or "Configure the Application CRD."

The sprint backlog in Azure Boards is the practical manifestation of the Scrum artifact: stories get pulled into the current sprint, estimated in story points, and broken into tasks with hour estimates. The Capacity tab compares planned hours against each person's actual availability, and the Taskboard lets the team drag tasks across To Do → In Progress → Done daily. Those daily updates are what generate the Burndown chart, a declining line tracking remaining work against the sprint timeline.

Velocity is the historical record of completed story points per sprint, shown as a bar chart over recent sprints. It exists purely for forecasting: if a team averages 30 points a sprint, that becomes a sane ceiling for planning the next one. It's a trailing indicator of real throughput, not a target to optimize; teams that start inflating point estimates to chase a "better" velocity number just make the number meaningless.
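
The forecasting arithmetic is small enough to sketch in shell. This is only an illustration: average_velocity is a made-up helper and the three sprint totals are invented numbers, while Azure Boards builds its own chart from completed work items.

```shell
# Average completed story points over recent sprints (integer division).
# average_velocity is a hypothetical helper, not an Azure Boards command.
average_velocity() (
  total=0
  count=0
  for points in "$@"; do
    total=$((total + points))
    count=$((count + 1))
  done
  # Guard against an empty history to avoid dividing by zero.
  if [ "$count" -eq 0 ]; then
    echo 0
  else
    echo $((total / count))
  fi
)

# Three hypothetical sprints: 28, 32, and 30 points completed.
average_velocity 28 32 30
```

The result is a planning ceiling for the next sprint, not a score to push upward.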

Kanban boards with WIP limits are Azure Boards' answer to continuous flow rather than time-boxed sprints. Columns represent actual process states (Backlog → Dev → Code Review → Testing → Done), and a WIP limit caps how many items should sit in a column at once. Hit the limit, and the team has to clear existing work before pulling anything new, which is the entire mechanism that turns a Kanban board into a flow-management tool instead of a glorified to-do list. Azure Boards flags a column that goes over its limit rather than blocking the move, so the limit only works if the team honors it. The Cumulative Flow Diagram stacks item counts per state over time, making bottlenecks visible as a widening band before anyone has to notice manually.
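
As a rough sketch of the pull rule a WIP limit implies, the check below compares a column's current item count against its limit. The helper name can_pull and the numbers are assumptions for illustration; this models the team's decision, not anything Azure Boards runs.

```shell
# Decide whether a column can accept another card under its WIP limit.
# can_pull is a hypothetical helper that models the team rule only.
can_pull() (
  current=$1   # items already in the column
  limit=$2     # WIP limit configured for the column
  if [ "$current" -lt "$limit" ]; then
    echo "pull"
  else
    echo "blocked"
  fi
)

can_pull 2 3   # room for one more item
can_pull 3 3   # at the limit: finish something first
```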

Scrum vs. Kanban, in One Line

Scrum optimizes for predictable, time-boxed delivery with a fixed commitment per cycle. Kanban optimizes for continuous flow with no fixed cycle, using WIP limits instead of sprints to control pace. Azure Boards doesn't force a choice: many teams run sprints for planning cadence and a Kanban board for the daily execution view of that same backlog.

Takeaway

The tool only makes sense once you see it as a direct implementation of the theory: Agile sets the values, Scrum operationalizes them into roles/events/artifacts, the Sprint is the time box that everything else hangs off of, and the Backlog is the prioritized list that feeds it. Azure Boards just gives all of that a UI: Epics and Features for scope, Sprint Backlogs and Capacity views for commitment, Velocity for forecasting, and Kanban WIP limits for flow control.

Azure Virtual Machine Scale Sets (VMSS): A Complete Guide

Emmanuel Chukwudi — Sun, 14 Jun 2026 10:16:23 +0000

Learn how to deploy, configure, and auto-scale fleets of identical VMs on Azure, from zero to production-ready.

Introduction

Imagine your application suddenly gets a traffic spike: maybe a product launch, a viral post, or a scheduled batch job. Without automation, you're either over-provisioned (paying for idle VMs) or scrambling to manually spin up instances while users hit errors.

Azure Virtual Machine Scale Sets (VMSS) solve this. They let you deploy and manage a group of identical, load-balanced VMs that automatically scale in or out based on demand, all from a single configuration.

In this guide, we'll cover:

What VMSS is and how it works under the hood
Orchestration modes: Uniform vs Flexible
How to create a VMSS via the Portal and Azure CLI
Configuring autoscaling rules
Integrating with a Load Balancer
Updating your Scale Set (rolling upgrades)
Real-world best practices

Let's build.

What is a Virtual Machine Scale Set?

A Virtual Machine Scale Set is an Azure compute resource that lets you create and manage a group of load-balanced VMs. Key characteristics:

All VMs in a scale set are created from the same base image and configuration
The set can automatically increase or decrease the number of VM instances based on demand or a schedule
VMs are distributed across Availability Zones or Fault/Update Domains for high availability
Integrates natively with Azure Load Balancer, Application Gateway, and Azure Monitor

How It Works

                        ┌─────────────────────────────────────┐
                        │         Azure Load Balancer          │
                        └────────────────┬────────────────────┘
                                         │
                   ┌─────────────────────┼─────────────────────┐
                   │                     │                       │
            ┌──────▼──────┐      ┌───────▼─────┐      ┌────────▼────┐
            │   VM #1      │      │    VM #2     │      │   VM #3     │
            │ (instance 0) │      │ (instance 1) │      │ (instance 2)│
            └─────────────┘      └─────────────┘      └─────────────┘
                   │                     │                       │
                   └─────────────────────┼─────────────────────┘
                                         │
                              ┌──────────▼──────────┐
                              │   Autoscale Engine   │
                              │  (Azure Monitor)     │
                              └─────────────────────┘

When CPU crosses a threshold (or any metric you define), Azure's autoscale engine fires and adds or removes VM instances automatically. The Load Balancer redistributes traffic across the new fleet.

Orchestration Modes: Uniform vs Flexible

Before creating a VMSS, you need to choose an orchestration mode. This is one of the most important decisions and a common source of confusion.

Uniform Orchestration (Classic)

All VMs are identical: same size, same image, same config
Azure manages the VMs as a fleet; you interact with the scale set, not individual VMs
Best for stateless workloads: web servers, API backends, batch processing
Supports up to 1,000 VM instances (with platform images)
Built-in integration with autoscale

Flexible Orchestration (Modern — Recommended)

VMs can have different sizes and configurations within the same scale set
You get full VM-level control: SSH, unique managed identities, individual updates
Supports mixing spot and on-demand instances in the same set
Works across Availability Zones with zone balancing
Supports up to 1,000 instances
Microsoft's recommended mode for new workloads

Feature	Uniform	Flexible
VM customization	All identical	Individual VM control
Max instances	1,000	1,000
Autoscale	✅	✅
Spot + On-demand mix	❌	✅
Availability Zones	✅	✅
Use case	Stateless fleets	General purpose

For new projects, default to Flexible orchestration unless you have a specific reason for Uniform.

Prerequisites

Before creating a VMSS, make sure you have:

An Azure subscription
Azure CLI installed: verify with az --version (Windows installer: aka.ms/installazurecliwindows)
A Resource Group and Virtual Network ready (we'll create these below)

# Login to Azure
az login

# Set your subscription
az account set --subscription "your-subscription-id"

# Create a resource group
az group create \
  --name vmss-demo-rg \
  --location eastus

# Create a VNet and subnet
az network vnet create \
  --resource-group vmss-demo-rg \
  --name vmss-vnet \
  --address-prefix 10.0.0.0/16 \
  --subnet-name vmss-subnet \
  --subnet-prefix 10.0.1.0/24

Part 1: Create a VMSS via the Azure Portal

Step 1: Navigate to Virtual Machine Scale Sets

In the Azure Portal, search for "Virtual machine scale sets" in the top search bar
Click + Create

Step 2: Basics Tab

Fill in the following:

Field	Value
Subscription	Your subscription
Resource group	`vmss-demo-rg`
Virtual machine scale set name	`my-app-vmss`
Region	East US
Availability zone	Zones 1, 2, 3 (select all for HA)
Orchestration mode	Flexible (recommended)
Security type	Standard
Image	Ubuntu Server 22.04 LTS
VM architecture	x64
Size	Standard_B2s (2 vCPUs, 4 GB RAM)
Authentication type	SSH public key
Username	`azureuser`
SSH public key	Paste your public key

Tip on Availability Zones: Selecting all three zones means Azure will spread your VMs across three physically separate datacenters. If one zone goes down, the others keep serving traffic.
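
To picture what an even spread looks like, here is a small sketch that divides an instance count across zones so that no two zones differ by more than one VM. The helper zone_spread is hypothetical, and which zone receives the extra instance is an assumption; Azure makes that placement decision itself.

```shell
# Split an instance count across zones as evenly as possible.
# zone_spread is a hypothetical helper; giving the remainder to the
# lowest-numbered zones is an assumption made for illustration.
zone_spread() (
  instances=$1
  zones=$2
  base=$((instances / zones))    # every zone gets at least this many
  extra=$((instances % zones))   # leftover instances to hand out
  zone=1
  line=""
  while [ "$zone" -le "$zones" ]; do
    count=$base
    if [ "$zone" -le "$extra" ]; then count=$((count + 1)); fi
    line="$line zone$zone=$count"
    zone=$((zone + 1))
  done
  echo "${line# }"   # drop the leading space
)

zone_spread 10 3   # the autoscale maximum used later in this guide
```

With the initial count of 2 used in this guide, one of the three zones starts empty until the set scales out.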

Step 3: Disks Tab

OS disk type: Premium SSD (for production) or Standard SSD (for dev/test)
Encryption: Platform-managed key (default) or Customer-managed key

Step 4: Networking Tab

Virtual network: Select vmss-vnet
Subnet: vmss-subnet
Load balancing: Select Azure load balancer
Click Create a load balancer:
- Name: my-lb
- Type: Public (for internet-facing) or Internal (for private)
- Protocol: TCP
- Frontend port: 80
- Backend port: 80
Public IP address: Create new → my-app-lb-pip
NIC network security group: Advanced → Create NSG
- Add inbound rule: Allow TCP 80 from Internet
- Add inbound rule: Allow TCP 22 from your IP (for SSH)

Step 5: Scaling Tab

This is where VMSS gets powerful.

Field	Value
Initial instance count	2
Scaling policy	Autoscale

Configure autoscale:

Click Configure
Set minimum instances: 2
Set maximum instances: 10
Set default instance count: 2

Add a scale-out rule:

Metric: Percentage CPU
Operator: Greater than
Threshold: 75%
Duration: 5 minutes
Action: Increase count by 2
Cool down: 5 minutes

Add a scale-in rule:

Metric: Percentage CPU
Operator: Less than
Threshold: 25%
Duration: 5 minutes
Action: Decrease count by 1
Cool down: 5 minutes

Cool down periods prevent flapping, where the scale set rapidly adds and removes instances in response to brief spikes. Always set a cool down of at least 5 minutes.
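
The two rules above, plus the minimum and maximum, amount to a small piece of arithmetic. The sketch below models a single evaluation with the values from this step; next_capacity is a hypothetical helper, and it deliberately ignores the 5-minute duration and cool down windows that the real autoscale engine applies.

```shell
# Evaluate one autoscale decision using the rule values from this guide.
# next_capacity is a hypothetical helper that models the arithmetic only.
next_capacity() (
  cpu=$1        # average CPU percentage over the evaluation window
  current=$2    # current instance count
  min=2
  max=10
  target=$current
  if [ "$cpu" -gt 75 ]; then
    target=$((current + 2))      # scale-out rule: increase count by 2
  elif [ "$cpu" -lt 25 ]; then
    target=$((current - 1))      # scale-in rule: decrease count by 1
  fi
  # Clamp to the configured minimum and maximum instance counts.
  if [ "$target" -gt "$max" ]; then target=$max; fi
  if [ "$target" -lt "$min" ]; then target=$min; fi
  echo "$target"
)

next_capacity 80 2    # high CPU at 2 instances
next_capacity 80 9    # high CPU near the maximum
next_capacity 10 2    # low CPU at the minimum
```

Scaling out by 2 but in by 1 is a deliberately asymmetric choice: the set reacts quickly to load and gives capacity back slowly.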

Step 6: Health Tab

Enable Application health monitoring:

Extension: Application Health Extension
Protocol: HTTP
Port: 80
Path: /health (or / if you don't have a health endpoint)

This lets Azure know whether individual VM instances are actually serving traffic successfully — not just whether the VM is running.

Step 7: Advanced Tab

Custom data (cloud-init): You can pass a startup script that runs on every new instance:

#cloud-config
package_update: true
packages:
  - nginx
runcmd:
  - systemctl enable nginx
  - systemctl start nginx
  - echo "Hello from $(hostname)" > /var/www/html/index.html

Paste this in the Custom data field (base64 encoding is handled automatically by the portal).

Step 8: Review + Create

Review all settings, then click Create. Azure will:

Provision the Load Balancer and Public IP
Create the initial VM instances (2 in our case)
Register them with the load balancer backend pool
Apply the autoscale policy

The deployment typically takes 3–5 minutes.

Part 2: Create a VMSS via Azure CLI

The CLI approach is faster, scriptable, and version-controllable, which makes it ideal for CI/CD pipelines and IaC workflows.

Create the VMSS

az vmss create \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --image Ubuntu2204 \
  --vm-sku Standard_B2s \
  --instance-count 2 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --vnet-name vmss-vnet \
  --subnet vmss-subnet \
  --public-ip-address my-app-lb-pip \
  --lb my-app-lb \
  --backend-pool-name my-app-backend \
  --lb-sku Standard \
  --zones 1 2 3 \
  --orchestration-mode Flexible \
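  --upgrade-policy-mode Manual \
  --custom-data cloud-init.yaml

The --lb flag automatically creates an Azure Load Balancer and wires the VMSS backend pool to it. The --custom-data flag expects the cloud-init snippet from Step 7 saved as cloud-init.yaml in the current directory. The scale set starts in Manual upgrade mode because Rolling upgrades require health monitoring (a health probe or the Application Health extension) to be in place first; Part 3 shows how to switch modes afterwards.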

Open Port 80 on the Load Balancer

# Create a health probe first, so the rule below can reference it
az network lb probe create \
  --resource-group vmss-demo-rg \
  --lb-name my-app-lb \
  --name healthProbe \
  --protocol http \
  --port 80 \
  --path /

# Create a load balancer rule for HTTP traffic
az network lb rule create \
  --resource-group vmss-demo-rg \
  --lb-name my-app-lb \
  --name http-rule \
  --protocol tcp \
  --frontend-port 80 \
  --backend-port 80 \
  --frontend-ip-name loadBalancerFrontEnd \
  --backend-pool-name my-app-backend \
  --probe-name healthProbe

Configure Autoscale Rules

# Get the VMSS resource ID
VMSS_ID=$(az vmss show \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --query id \
  --output tsv)

# Create the autoscale profile
az monitor autoscale create \
  --resource-group vmss-demo-rg \
  --resource "$VMSS_ID" \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name my-app-autoscale \
  --min-count 2 \
  --max-count 10 \
  --count 2

# Add scale-out rule (CPU > 75% → add 2 instances)
az monitor autoscale rule create \
  --resource-group vmss-demo-rg \
  --autoscale-name my-app-autoscale \
  --condition "Percentage CPU > 75 avg 5m" \
  --scale out 2 \
  --cooldown 5

# Add scale-in rule (CPU < 25% → remove 1 instance)
az monitor autoscale rule create \
  --resource-group vmss-demo-rg \
  --autoscale-name my-app-autoscale \
  --condition "Percentage CPU < 25 avg 5m" \
  --scale in 1 \
  --cooldown 5

Verify the Deployment

# List all instances in the scale set
az vmss list-instances \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --output table

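# Check instance health. Use an instance ID from the list above;
# in Flexible mode the IDs are VM names rather than 0, 1, 2.
az vmss get-instance-view \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --instance-id <instance-id>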

# Get the public IP of the load balancer
az network public-ip show \
  --resource-group vmss-demo-rg \
  --name my-app-lb-pip \
  --query ipAddress \
  --output tsv

Open a browser and navigate to the public IP — you should see your Nginx page.

Part 3: Upgrade Policies (How to Update Your Fleet)

One of the trickiest parts of managing a scale set is rolling out updates (new OS image, new app version) without downtime. VMSS supports three upgrade modes:

Automatic Upgrades

Azure automatically upgrades VM instances as soon as the scale set model is updated. No manual intervention needed, but instances may restart without warning.

az vmss update \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --set upgradePolicy.mode=Automatic

Best for: Dev/test environments.

Rolling Upgrades (Recommended for Production)

Azure upgrades VMs in batches, validating health before moving to the next batch. Requires health probes to be configured.

az vmss update \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --set upgradePolicy.mode=Rolling \
  --set upgradePolicy.rollingUpgradePolicy.maxBatchInstancePercent=20 \
  --set upgradePolicy.rollingUpgradePolicy.maxUnhealthyInstancePercent=20 \
  --set upgradePolicy.rollingUpgradePolicy.maxUnhealthyUpgradedInstancePercent=5 \
  --set upgradePolicy.rollingUpgradePolicy.pauseTimeBetweenBatches=PT30S

With this config, Azure upgrades 20% of VMs at a time, waits 30 seconds between batches, and stops if more than 5% of upgraded instances become unhealthy.
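
To get a feel for what those numbers mean for a given fleet, the sketch below estimates batch size, batch count, and total pause time. The helper rolling_plan is hypothetical, and the rounding (round down, but never below one instance per batch) is an assumption; Azure may size batches differently, and health checks add time on top of the pauses.

```shell
# Estimate how a rolling upgrade splits a fleet into batches.
# rolling_plan is a hypothetical helper; the rounding rule is an assumption.
rolling_plan() (
  instances=$1       # total instances in the scale set
  batch_percent=$2   # maxBatchInstancePercent
  pause_seconds=$3   # pauseTimeBetweenBatches, in seconds
  if [ "$instances" -lt 1 ]; then
    echo "batch_size=0 batches=0 pause_total=0s"
  else
    batch_size=$((instances * batch_percent / 100))
    if [ "$batch_size" -lt 1 ]; then batch_size=1; fi
    # Ceiling division: the last batch may be smaller than the others.
    batches=$(( (instances + batch_size - 1) / batch_size ))
    # Pauses happen between batches, so there is one fewer than batches.
    total_pause=$(( (batches - 1) * pause_seconds ))
    echo "batch_size=$batch_size batches=$batches pause_total=${total_pause}s"
  fi
)

rolling_plan 10 20 30   # 10 instances, 20% per batch, 30-second pause
```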

Manual Upgrades

The scale set model updates, but instances are only upgraded when you explicitly tell Azure to do so. Maximum control, but requires operator action.

# Switch the scale set to Manual upgrade mode
az vmss update \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --set upgradePolicy.mode=Manual

# After changing the model (e.g., a new image version), upgrade specific instances.
# Take the IDs from az vmss list-instances; 0 1 2 are Uniform-style examples.
az vmss update-instances \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --instance-ids 0 1 2

Part 4: Scaling Operations

Manual Scaling

# Scale to 5 instances manually
az vmss scale \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --new-capacity 5

Schedule-Based Scaling

Useful for predictable traffic patterns (e.g., scale up at 8am, scale down at 8pm).

# Scale up at 8am UTC on weekdays
az monitor autoscale profile create \
  --resource-group vmss-demo-rg \
  --autoscale-name my-app-autoscale \
  --name weekday-peak \
  --min-count 4 \
  --max-count 10 \
  --count 4 \
  --recurrence week mon tue wed thu fri \
  --timezone "UTC" \
  --start 08:00 \
  --end 20:00
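
The effect of that recurring profile can be sketched as a lookup: inside the weekday window the minimum is 4, otherwise the default profile's minimum of 2 applies. The helper active_minimum is hypothetical and takes the day and hour as plain arguments (both assumed to be UTC) instead of reading the clock.

```shell
# Return the minimum instance count in effect for a given UTC day and hour.
# active_minimum is a hypothetical helper mirroring the profiles above.
active_minimum() (
  day=$1     # mon, tue, wed, thu, fri, sat, or sun
  hour=$2    # 0-23
  minimum=2  # default profile minimum
  case "$day" in
    mon|tue|wed|thu|fri)
      # weekday-peak profile: 08:00 inclusive to 20:00 exclusive
      if [ "$hour" -ge 8 ] && [ "$hour" -lt 20 ]; then
        minimum=4
      fi
      ;;
  esac
  echo "$minimum"
)

active_minimum mon 9    # inside the weekday window
active_minimum sat 9    # weekend: default profile applies
active_minimum fri 20   # window has just closed
```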

SSH Into a Specific Instance

# Get the NAT rules to find which port maps to which instance
az network lb inbound-nat-rule list \
  --resource-group vmss-demo-rg \
  --lb-name my-app-lb \
  --output table

# SSH using the NAT port (e.g., port 50000 maps to instance 0)
ssh -p 50000 azureuser@<load-balancer-public-ip>

Part 5: Monitoring Your Scale Set

View Autoscale Activity

az monitor activity-log list \
  --resource-group vmss-demo-rg \
  --max-events 20 \
  --query "[?contains(operationName.value, 'Autoscale')]" \
  --output table

Key Metrics to Monitor in Azure Monitor

Metric	Description	Alert threshold
Percentage CPU	Average CPU across all instances	> 80% for 10 min
Network In/Out	Traffic volume	Spike detection
Disk Read/Write	Storage I/O	> 90% of provisioned IOPS
VmAvailabilityMetric	Instance health status	Any unhealthy
Autoscale Scale Actions	Scale in/out events	Alert on unexpected scale-in

# Create a CPU alert
az monitor metrics alert create \
  --resource-group vmss-demo-rg \
  --name high-cpu-alert \
  --scopes "$VMSS_ID" \
  --condition "avg Percentage CPU > 85" \
  --window-size 5m \
  --evaluation-frequency 1m \
  --action my-action-group \
  --description "VMSS CPU exceeded 85% for 5 minutes"

Part 6: Using a Custom Image with VMSS

Remember the Azure Custom Image we built in the previous article? Here's how to plug it into a VMSS.

# Get your gallery image version ID
IMAGE_ID=$(az sig image-version show \
  --resource-group my-resource-group \
  --gallery-name MyAppGallery \
  --gallery-image-definition MyAppImage \
  --gallery-image-version 1.0.0 \
  --query id \
  --output tsv)

# Create VMSS using your custom image
az vmss create \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --image "$IMAGE_ID" \
  --vm-sku Standard_B2s \
  --instance-count 2 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --lb my-app-lb \
  --zones 1 2 3 \
  --orchestration-mode Flexible

This is the golden image pattern in action — every instance spins up from your pre-configured, pre-hardened image.

Architecture: Production-Ready VMSS Setup

Here's what a production VMSS deployment typically looks like:

                         Internet
                            │
                   ┌────────▼────────┐
                   │  Azure Front    │
                   │  Door / WAF     │
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  App Gateway /  │
                   │  Load Balancer  │
                   └────────┬────────┘
                            │
              ┌─────────────┼─────────────┐
              │ Zone 1      │ Zone 2       │ Zone 3
         ┌────▼────┐   ┌────▼────┐   ┌────▼────┐
         │  VM #1  │   │  VM #2  │   │  VM #3  │
         │ (VMSS)  │   │ (VMSS)  │   │ (VMSS)  │
         └────┬────┘   └────┬────┘   └────┬────┘
              │              │              │
              └──────────────┼──────────────┘
                             │
                    ┌────────▼────────┐
                    │  Azure Monitor  │
                    │  + Autoscale    │
                    └────────┬────────┘
                             │
              ┌──────────────┼───────────────┐
              │              │               │
      ┌───────▼──┐   ┌───────▼──┐   ┌───────▼──┐
      │ Azure DB │   │Key Vault │   │  Storage  │
      └──────────┘   └──────────┘   └───────────┘

Key components:

Azure Front Door or WAF: DDoS protection and global routing at the edge
Application Gateway or Load Balancer: Layer 7 or Layer 4 traffic distribution
VMSS across 3 Availability Zones: High availability against datacenter failures
Azure Monitor + Autoscale: Reactive and scheduled scaling
Azure Key Vault: Secrets injected at runtime, never baked into images
Managed Identity: VM instances authenticate to Azure services without credentials

Best Practices

Design for Statelessness

VMs in a scale set can be added or removed at any time. Your application should:

Store session data in Azure Cache for Redis, not in-memory
Write files to Azure Blob Storage or a shared file system, not local disk
Use Azure Service Bus or Event Hub for message queuing

Use Spot Instances for Cost Savings

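For fault-tolerant, interruptible workloads (batch jobs, rendering, CI runners), Spot capacity cuts costs sharply. The command below creates a Spot-only scale set that starts with zero instances; in Flexible mode you can also mix Spot and on-demand instances in the same set: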

az vmss create \
  --resource-group vmss-demo-rg \
  --name my-batch-vmss \
  --priority Spot \
  --eviction-policy Deallocate \
  --max-price 0.05 \
  --image Ubuntu2204 \
  --vm-sku Standard_D4s_v3 \
  --instance-count 0

Spot instances can save up to 90% compared to on-demand pricing — with the tradeoff that Azure can evict them when capacity is needed.

Always Configure Health Probes

Without health probes, Azure doesn't know if your application is actually working. A VM could be running but serving 500 errors, and it would stay in the pool and keep receiving traffic.

az vmss extension set \
  --resource-group vmss-demo-rg \
  --vmss-name my-app-vmss \
  --name ApplicationHealthLinux \
  --publisher Microsoft.ManagedServices \
  --version 1.0 \
  --settings '{"protocol": "http", "port": 80, "requestPath": "/health"}'

Protect Against Accidental Scale-In

In production, you may want to prevent certain instances from being terminated during a scale-in event (e.g., an instance running a long job).

# Protect a specific instance from scale-in
az vmss update \
  --resource-group vmss-demo-rg \
  --name my-app-vmss \
  --instance-id 2 \
  --protect-from-scale-in true

Quick Reference — Common CLI Commands

# Create VMSS
az vmss create --resource-group <rg> --name <name> --image <image> --instance-count <n>

# List instances
az vmss list-instances --resource-group <rg> --name <name> --output table

# Scale manually
az vmss scale --resource-group <rg> --name <name> --new-capacity <n>

# Update instances to latest model
az vmss update-instances --resource-group <rg> --name <name> --instance-ids "*"

# Reimage an instance (fresh OS disk)
az vmss reimage --resource-group <rg> --name <name> --instance-id <id>

# Delete a specific instance
az vmss delete-instances --resource-group <rg> --name <name> --instance-ids <id>

# Show autoscale settings
az monitor autoscale show --resource-group <rg> --name <autoscale-name>

# Delete the entire VMSS
az vmss delete --resource-group <rg> --name <name>

Conclusion

Azure Virtual Machine Scale Sets are one of the most powerful tools in a cloud engineer's toolkit. Once you understand the orchestration modes, upgrade policies, and autoscale configuration, you can build infrastructure that handles anything from a quiet weekend to a viral traffic spike without manual intervention.

Recap of what we covered:

Uniform vs Flexible orchestration modes
Creating a VMSS via Portal and Azure CLI
Wiring up a Load Balancer with health probes
Configuring CPU-based and schedule-based autoscaling
Rolling upgrade strategies for zero-downtime deployments
Using custom images from Azure Compute Gallery
Production architecture patterns and best practices


Deploying Your First App on Kubernetes: A Beginner's Guide (Minikube & Kind)

Emmanuel Chukwudi — Mon, 25 May 2026 11:44:23 +0000

If you've just learned the basics of Kubernetes (Pods, Deployments, ReplicaSets, and Services), the best next step is to actually use them. Reading about self-healing and rolling updates is one thing; watching Kubernetes recreate a deleted Pod in real time is another.

In this guide, you'll deploy a simple Node.js app on a local Kubernetes cluster. We'll cover both Minikube and Kind (Kubernetes in Docker), so you can follow along whichever tool you prefer.

By the end, you'll have:

A containerised Node.js app running in Kubernetes
3 replicas managed by a Deployment and ReplicaSet
A Service exposing the app to your browser
Hands-on experience with self-healing and scaling

Prerequisites

Before we start, make sure you have these installed:

Docker: required by both Minikube and Kind
kubectl: the Kubernetes CLI
Either Minikube or Kind (installation covered below)

Part 1: Setting Up Your Local Cluster

You only need one of these. If you're not sure which to pick:

Minikube: slightly friendlier for beginners, has a built-in way to open services in the browser
Kind: lighter, faster, great if you already have Docker set up

Option A: Minikube

Install Minikube

macOS (Homebrew):

brew install minikube

Linux:

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

Windows (via winget):

winget install Kubernetes.minikube

Start your cluster:

minikube start

Verify it's running:

kubectl get nodes
# NAME       STATUS   ROLES           AGE   VERSION
# minikube   Ready    control-plane   10s   v1.x.x

Option B: Kind (Kubernetes in Docker)

Install Kind

macOS (Homebrew):

brew install kind

Linux:

curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.22.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind

Windows (via Chocolatey):

choco install kind

Create your cluster:

kind create cluster --name hello-cluster

Verify it's running:

kubectl get nodes
# NAME                         STATUS   ROLES           AGE   VERSION
# hello-cluster-control-plane  Ready    control-plane   10s   v1.x.x

Part 2: Build the Node.js App

Create a new folder for the project:

mkdir k8s-hello && cd k8s-hello

Create app.js:

nano app.js   # or vim, or any editor you prefer

const http = require('http');
const os = require('os');

const server = http.createServer((req, res) => {
  res.end(`Hello from Pod: ${os.hostname()}\n`);
});

server.listen(3000, () => console.log('Running on port 3000'));

Why os.hostname()? In Kubernetes, each Pod gets a unique hostname. When the Service load-balances traffic across multiple Pods, you'll see different hostnames across requests, proving which Pod served you.

Create Dockerfile:

FROM node:18-alpine
WORKDIR /app
COPY app.js .
CMD ["node", "app.js"]

Part 3: Build and Load the Docker Image

This step differs between Minikube and Kind, so pay attention here.

Minikube

Minikube runs its own Docker daemon inside its VM (or container, depending on the driver). Point your local Docker CLI at it so your build lands inside Minikube directly. The setting only applies to the current shell session:

eval $(minikube docker-env)
docker build -t hello-app:v1 .

From this point, Minikube can see the image locally without needing Docker Hub.

Kind

Kind doesn't share a Docker daemon. You build the image normally, then explicitly load it into the cluster:

docker build -t hello-app:v1 .
kind load docker-image hello-app:v1 --name hello-cluster

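Skipping kind load is the most common beginner mistake with Kind. Without it, Kind can't find the image, and because the Deployment in Part 4 sets imagePullPolicy: Never, your Pods will sit in ErrImageNeverPull (with a pull policy that allows pulling, you would see ImagePullBackOff instead).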

Part 4: Write the Kubernetes YAML

Create deployment.yaml in your project folder:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: hello-app:v1
          imagePullPolicy: Never   # use local image, don't pull from Docker Hub
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: hello-service
spec:
  type: NodePort
  selector:
    app: hello          # matches the Pod label above; this is how Services find Pods
  ports:
    - port: 80
      targetPort: 3000
      nodePort: 30080

What's happening here:

The Deployment tells Kubernetes to keep 3 replicas of our Pod running at all times
It automatically creates a ReplicaSet to enforce that replica count
The Service uses the app: hello label selector to find all matching Pods and route traffic to them
imagePullPolicy: Never tells Kubernetes to use the locally available image instead of going to Docker Hub
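To make the label selector idea concrete, here is a minimal sketch in JavaScript. It is a simplified model of equality-based matching, not Kubernetes source code, and the Pod names are hypothetical.

```javascript
// Simplified model of equality-based label selection (not Kubernetes source code).
// A Pod matches when every key/value pair in the selector appears in its labels.
function matchesSelector(podLabels, selector) {
  return Object.entries(selector).every(
    ([key, value]) => podLabels[key] === value
  );
}

// Hypothetical Pods for illustration.
const pods = [
  { name: "hello-a", labels: { app: "hello" } },
  { name: "hello-b", labels: { app: "hello", tier: "web" } },
  { name: "other-c", labels: { app: "other" } },
];

// Same selector as the Service in deployment.yaml.
const selector = { app: "hello" };
const endpoints = pods
  .filter((pod) => matchesSelector(pod.labels, selector))
  .map((pod) => pod.name);

console.log(endpoints); // [ 'hello-a', 'hello-b' ]
```

Extra labels on a Pod don't prevent a match; only the keys named in the selector are compared.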

Part 5: Deploy It

kubectl apply -f deployment.yaml

You should see:

deployment.apps/hello-deployment created
service/hello-service created

Check your Pods are coming up:

kubectl get pods

Wait until all three show Running:

NAME                                READY   STATUS    RESTARTS   AGE
hello-deployment-57c4d87bf-abc12    1/1     Running   0          15s
hello-deployment-57c4d87bf-def34    1/1     Running   0          15s
hello-deployment-57c4d87bf-ghi56    1/1     Running   0          15s

Check your ReplicaSet and Service too:

kubectl get replicaset
kubectl get service hello-service

Part 6: Open the App in Your Browser

Minikube

minikube service hello-service

Minikube opens the URL in your browser automatically.

Kind

Kind doesn't expose NodePort services directly, so use port-forwarding:

kubectl port-forward service/hello-service 8080:80

Then open http://localhost:8080.

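Hit refresh a few times. You'll see the Pod hostname change: the Service is load-balancing across your 3 Pods. (With kubectl port-forward on Kind, the tunnel connects to a single Pod behind the Service, so you will likely keep seeing the same hostname there.)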

Hello from Pod: hello-deployment-57c4d87bf-abc12
Hello from Pod: hello-deployment-57c4d87bf-ghi56
Hello from Pod: hello-deployment-57c4d87bf-def34

Part 7: Experiments (The Real Learning)

Now that everything is running, try these one by one. Each one demonstrates a core Kubernetes behaviour.

1. Self-healing: delete a Pod manually

# grab any pod name
kubectl get pods

# delete it
kubectl delete pod hello-deployment-57c4d87bf-abc12

# watch what happens
kubectl get pods -w

Kubernetes detects the replica count dropped to 2 and immediately creates a new Pod. This is the ReplicaSet controller doing its job.
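The self-healing behaviour boils down to comparing a desired count with the actual count. The sketch below is a deliberately simplified model of one reconciliation step; the function name and return shape are assumptions made for illustration, not the real controller's API.

```javascript
// Simplified reconciliation step: compare desired vs. actual and decide what to do.
function reconcile(desiredReplicas, runningPods) {
  const diff = desiredReplicas - runningPods.length;
  if (diff > 0) return { action: "create", count: diff };
  if (diff < 0) return { action: "delete", count: -diff };
  return { action: "none", count: 0 };
}

// Three replicas desired, one Pod was just deleted by hand.
console.log(reconcile(3, ["pod-a", "pod-b"]));        // { action: 'create', count: 1 }
// Scaling down from 5 running Pods to 1 desired.
console.log(reconcile(1, ["a", "b", "c", "d", "e"])); // { action: 'delete', count: 4 }
```

The same comparison explains the scaling experiments that follow: changing --replicas only changes the desired number, and the controller works out the rest.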

2. Scaling up

kubectl scale deployment hello-deployment --replicas=5
kubectl get pods

Two new Pods appear almost instantly.

3. Scaling down

kubectl scale deployment hello-deployment --replicas=1
kubectl get pods

Four Pods terminate gracefully, one remains.

4. Rolling update with zero downtime

Edit app.js to change the response message:

res.end(`Hello from Pod v2: ${os.hostname()}\n`);

Build a new image:

# Minikube
eval $(minikube docker-env)
docker build -t hello-app:v2 .

# Kind
docker build -t hello-app:v2 .
kind load docker-image hello-app:v2 --name hello-cluster

Update the Deployment:

kubectl set image deployment/hello-deployment hello=hello-app:v2

Watch the rolling update:

kubectl rollout status deployment/hello-deployment

With the default rolling update strategy, Kubernetes replaces Pods gradually (effectively one at a time for a Deployment this small), keeping the app available throughout.
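How many Pods are swapped at once depends on the Deployment's rolling update settings. As a rough sketch, assuming the default 25% maxSurge and 25% maxUnavailable (the manifest above doesn't override them), the percentages translate into Pod counts like this. The helper name is hypothetical.

```javascript
// Default RollingUpdate settings are 25% maxSurge and 25% maxUnavailable.
// Percentages become absolute Pod counts: maxSurge rounds up, maxUnavailable rounds down.
function rollingUpdateLimits(replicas, surgePct = 25, unavailablePct = 25) {
  const maxSurge = Math.ceil((replicas * surgePct) / 100);
  const maxUnavailable = Math.floor((replicas * unavailablePct) / 100);
  return {
    maxSurge,
    maxUnavailable,
    maxTotalPods: replicas + maxSurge,           // upper bound during the rollout
    minAvailablePods: replicas - maxUnavailable, // lower bound during the rollout
  };
}

// For 3 replicas: maxSurge is 1 and maxUnavailable is 0,
// so one new Pod must become ready before an old one is removed.
console.log(rollingUpdateLimits(3));
```

With larger replica counts the same defaults allow several Pods to be replaced in parallel.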

5. Inspect a Pod

kubectl describe pod <pod-name>

This shows you the Pod's IP, which Node it's on, its labels, and a full event log — useful for debugging.

6. Roll back

If something goes wrong with an update:

kubectl rollout undo deployment/hello-deployment

Kubernetes switches back to the previous ReplicaSet.

Part 8: Clean Up

kubectl delete -f deployment.yaml

Minikube:

minikube stop

Kind:

kind delete cluster --name hello-cluster

Note for Kind users: Kind clusters don't always come back cleanly after a machine restart. If yours is missing after a reboot (or unhealthy, in which case delete it first), run kind create cluster --name hello-cluster and kind load docker-image hello-app:v1 --name hello-cluster before applying your YAML again.

What You Just Built

Here's what was happening under the hood the whole time:

Your Browser
     ↓
 [Service]             ← watched for Pods with label app: hello
     ↓
 [ReplicaSet]          ← enforced 3 running replicas at all times
  ↓     ↓     ↓
[Pod] [Pod] [Pod]      ← each ran your Node.js container

Every concept from the Kubernetes basics maps to something you just did:

Concept	What you observed
Pod	The unit running your container, with a unique hostname
ReplicaSet	Recreated a Pod immediately after you deleted one
Deployment	Managed the rolling update and rollback
Service	Load-balanced traffic across all 3 Pods using label selectors

What's Next?

Now that you have the fundamentals working, here are good next topics to explore:

Namespaces: isolate workloads for different teams or environments
ConfigMaps & Secrets: externalise config and credentials from your container
Ingress: a cleaner alternative to NodePort for routing external traffic
Persistent Volumes: attach storage that survives Pod restarts
Liveness & Readiness Probes: teach Kubernetes when your Pod is actually healthy

If you ran into issues or have questions, drop them in the comments. The most common problems are forgetting kind load docker-image (Kind) or not running eval $(minikube docker-env) before building (Minikube).

How to Read Any GitHub Repo and Write Its Dockerfile From Scratch

Emmanuel Chukwudi — Sun, 17 May 2026 14:13:07 +0000

A DevOps Engineer's Evidence-Based Approach

This guide walks through a real open-source project, Ridan Express, and shows you exactly how to analyze a repo's files, understand what the app needs, and write a production-ready Dockerfile without guessing.

Who This Is For

You're learning DevOps. You clone a repo, stare at it, and freeze: you don't know where to start. Should you Dockerize it? What base image do you use? What commands do you run inside the container?

This article teaches you the mental model professionals use: reading a project's files as clues and letting the evidence tell you what to build.

The Project: Ridan Express

Ridan Express is a React frontend for a ride/delivery platform (think Uber or DoorDash). It uses:

React 18 + Vite as the build tool
Tailwind CSS + Material UI for styling
Redux for state management
socket.io-client for real-time features (live tracking)
mapbox-gl for maps
Stripe for payments
Google OAuth for login

Currently deployed on Vercel. No Dockerfile exists. That's your job.

Step 1: Read the Files Before You Write Anything

The golden rule: never write a Dockerfile cold. Always read these files first:

File	What it tells you
`package.json`	Language, runtime version, dependencies, build commands
`package-lock.json`	Exact locked dependency versions
`vite.config.js` / `webpack.config.js`	Build tool and output configuration
`vercel.json` / `netlify.toml`	How it's currently deployed (big clue)
`build/` or `dist/` folder	Where compiled output lands
`.env.example`	Environment variables the app needs

Let's walk through each clue in Ridan Express.

Clue 1: `package.json` → Choose Your Base Image

Opening package.json, the first thing to notice is this:

"engines": {
  "node": "22.x"
}

The developer told you exactly which Node.js version this app needs. This directly maps to your Dockerfile's first line:

FROM node:22-alpine

Why alpine? The Alpine variant of the Node image is built on a minimal Linux distribution, so it weighs in at roughly 50MB instead of 900MB+ for the full image. Always prefer Alpine for production unless you need specific system libraries.

The rule: "engines" in package.json → your FROM node:X-alpine version.
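As a small illustration of that rule, the sketch below derives an image tag from a parsed package.json. The helper name and the fallback to node:lts-alpine are assumptions for this example, and taking the first number in the range is a simplification that may pick a lower bound for ranges like ">=18".

```javascript
// Hypothetical helper: derive a base image tag from package.json "engines".
function baseImageFromEngines(pkg, fallbackTag = "lts") {
  const range = pkg.engines && pkg.engines.node;
  if (!range) return `node:${fallbackTag}-alpine`;
  // Simplification: treat the first number in the range as the major version.
  const match = String(range).match(/(\d+)/);
  return match ? `node:${match[1]}-alpine` : `node:${fallbackTag}-alpine`;
}

console.log(baseImageFromEngines({ engines: { node: "22.x" } })); // node:22-alpine
console.log(baseImageFromEngines({}));                            // node:lts-alpine
```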

Clue 2: `package-lock.json` → Use `npm ci`, Not `npm install`

The presence of package-lock.json alongside package.json is a deliberate signal. Here's the critical distinction:

Command	Behavior
`npm install`	Installs dependencies, may update versions
`npm ci`	Installs exact versions from `package-lock.json`, fails if they don't match

In a Docker build, you always want npm ci. It's faster, deterministic, and prevents "it worked on my machine" bugs. Your Dockerfile should copy both files before installing:

COPY package*.json ./
RUN npm ci

The package*.json glob copies both package.json and package-lock.json in one line.

Why copy these before the rest of the code? Docker builds in layers and caches each one. If you copy everything first, any code change invalidates the dependency cache and forces a full npm ci on every build, which is slow. Copying package*.json first means Docker only re-runs npm ci when dependencies actually change.
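The caching argument can be modelled in a few lines. This is a toy model, not how Docker computes cache keys internally: it assumes each layer's key depends on the previous layer, the instruction text, and the content of the files it copies. File names and contents are hypothetical.

```javascript
const crypto = require("crypto");

// Toy model of a layer cache: each layer's key depends on the previous layer's
// key, the instruction text, and the content of any copied files.
function layerKeys(steps, files) {
  let parent = "";
  return steps.map((step) => {
    const inputs = (step.copies || []).map((name) => `${name}=${files[name]}`).join("|");
    parent = crypto.createHash("sha256").update(parent + step.instruction + inputs).digest("hex");
    return parent;
  });
}

// Which instructions must re-run when the files change from `before` to `after`?
function rebuiltLayers(steps, before, after) {
  const oldKeys = layerKeys(steps, before);
  const newKeys = layerKeys(steps, after);
  return steps.filter((_, i) => oldKeys[i] !== newKeys[i]).map((s) => s.instruction);
}

const allFiles = ["package.json", "package-lock.json", "src/App.jsx"];
const manifestsFirst = [
  { instruction: "COPY package*.json ./", copies: ["package.json", "package-lock.json"] },
  { instruction: "RUN npm ci" },
  { instruction: "COPY . .", copies: allFiles },
  { instruction: "RUN npm run build" },
];
const everythingFirst = [
  { instruction: "COPY . .", copies: allFiles },
  { instruction: "RUN npm ci" },
  { instruction: "RUN npm run build" },
];

// Only application code changes; dependencies stay the same.
const before = { "package.json": "v1", "package-lock.json": "v1", "src/App.jsx": "v1" };
const after = { ...before, "src/App.jsx": "v2" };

console.log(rebuiltLayers(manifestsFirst, before, after));  // [ 'COPY . .', 'RUN npm run build' ]
console.log(rebuiltLayers(everythingFirst, before, after)); // [ 'COPY . .', 'RUN npm ci', 'RUN npm run build' ]
```

In the first ordering npm ci stays cached after a code-only change; in the second it re-runs every time.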

Clue 3: `vite.config.js` + Scripts → Your Build Command

In package.json, the scripts section reads:

"scripts": {
  "ridan": "vite",
  "build": "vite build"
}

vite build compiles your entire React app (JSX, TypeScript, CSS modules, imports) into plain HTML, CSS, and JavaScript files. No more JSX to transform and no Node.js required at runtime. Just static files a browser can load directly.

This translates to:

COPY . .
RUN npm run build

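After this runs, a build/ folder appears containing your compiled app, ready to be served. (Vite writes to dist/ by default, so confirm the output directory via build.outDir in vite.config.js; this guide uses build/ throughout.)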

Clue 4: The `build/` Folder → You Don't Need Node Anymore

This is the insight that changes everything.

After npm run build finishes, the output in build/ is just:

build/
  index.html
  static/
    js/main.abc123.js
    css/main.def456.css
    media/logo.png

These are static files. A browser can load them directly. You no longer need Node.js, npm, React, or any of your dependencies. They were only needed during the build process.

So why keep a 500MB Node.js environment in your production image just to serve a few HTML files? You don't.

Clue 5: Static Output → Serve With Nginx

Since the output is static files, the right tool to serve them is Nginx, a battle-tested, lightweight web server used by some of the highest-traffic sites in the world.

FROM nginx:alpine
COPY --from=builder /app/build /usr/share/nginx/html

/usr/share/nginx/html is Nginx's default document root, the folder it serves files from. You're dropping your compiled app right there.

The nginx:alpine image is around 25MB total. Compare that to keeping Node.js around at 500MB+.

Putting It All Together: The Multi-Stage Dockerfile

Here's the complete Dockerfile with every line explained:

# ── STAGE 1: Build ─────────────────────────────────────────────
# Use Node 22 (matches "engines" in package.json), Alpine for small size
FROM node:22-alpine AS builder

# Set working directory inside the container
WORKDIR /app

# Copy dependency manifests FIRST (enables Docker layer caching)
# Only re-runs npm ci when package files change, not on every code change
COPY package*.json ./

# Install exact versions from package-lock.json (deterministic, production-safe)
RUN npm ci

# Copy the rest of the source code
COPY . .

# Compile React → static HTML/CSS/JS in the /app/build folder
RUN npm run build


# ── STAGE 2: Serve ─────────────────────────────────────────────
# Start fresh with a minimal Nginx image (~25MB vs 500MB+ for Node)
FROM nginx:alpine

# Copy ONLY the compiled output from Stage 1, nothing else
# Node.js, npm, node_modules, and source code are all left behind
COPY --from=builder /app/build /usr/share/nginx/html

# Tell Docker this container listens on port 80
EXPOSE 80

# Nginx starts automatically; no CMD needed for the default config

Understanding Multi-Stage Builds Visually

┌─────────────────────────────┐       ┌──────────────────────────┐
│      Stage 1: builder       │       │     Stage 2: final       │
│      node:22-alpine         │       │      nginx:alpine        │
│                             │       │                          │
│  ✓ Node.js runtime          │  ──▶  │  ✓ Compiled HTML/CSS/JS  │
│  ✓ npm + package manager    │  only │  ✓ Nginx web server      │
│  ✓ 300MB node_modules       │ /build│                          │
│  ✓ React source code        │       │  ✗ No Node.js            │
│  ✓ Vite build toolchain     │       │  ✗ No npm                │
│                             │       │  ✗ No source code        │
│  ← DISCARDED after build →  │       │  ← SHIPPED TO PROD →     │
│       ~600MB                │       │       ~30MB              │
└─────────────────────────────┘       └──────────────────────────┘

The first stage is a construction site. The second stage is the finished building. You ship the building, not the scaffolding.

The Evidence-to-Dockerfile Mental Map

Every line in the Dockerfile traces back to a file in the repo:

package.json
  ├── "engines": { "node": "22.x" }  ──────▶  FROM node:22-alpine
  └── "build": "vite build"  ────────────────▶  RUN npm run build

package-lock.json exists
  └──────────────────────────────────────────▶  RUN npm ci

vite.config.js exists
  └── output goes to /build folder  ─────────▶  COPY /app/build → nginx

Output is static files (no server-side rendering)
  └──────────────────────────────────────────▶  FROM nginx:alpine

Nothing is guessed. Everything is derived from evidence.

When to Use Nginx vs Keeping Node Running

You served static files with Nginx here because Vite pre-compiles everything. But not all apps work this way:

App type	Clue in repo	Serve with
Static React (Vite/CRA)	`vite build` or `react-scripts build`	Nginx (static files)
Next.js with SSR	`next start` in scripts	Keep Node running
Express API backend	`server.js` or `app.js` at root	Keep Node running
Nuxt.js	`nuxt start` in scripts	Keep Node running

If npm start runs a server (not just opens a browser), keep Node. If npm run build produces a folder of files, use Nginx.
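That rule of thumb can be written down as a rough heuristic over the scripts section of package.json. The function name, return values, and regular expressions below are assumptions for illustration; a real repo still deserves a manual read.

```javascript
// Hypothetical heuristic based on the table above; real projects need a human check.
function serveStrategy(pkg) {
  const scripts = pkg.scripts || {};
  const start = scripts.start || "";
  const build = scripts.build || "";

  // A start script that launches a server means Node must stay in the final image.
  if (/\b(next|nuxt) start\b/.test(start) || /\bnode\s+\S+\.js\b/.test(start)) {
    return "keep-node";
  }
  // A build script that only emits static files can be served by Nginx.
  if (/\b(vite build|react-scripts build)\b/.test(build)) {
    return "nginx-static";
  }
  return "unknown";
}

// The scripts section shown earlier for Ridan Express.
console.log(serveStrategy({ scripts: { ridan: "vite", build: "vite build" } })); // nginx-static
console.log(serveStrategy({ scripts: { start: "next start", build: "next build" } })); // keep-node
```

Anything the heuristic can't classify comes back as "unknown", which is the cue to read the repo more closely rather than guess.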

What Comes Next for This Project

The Dockerfile handles the frontend. But Ridan Express has more moving parts visible in package.json:

socket.io-client: there's a Socket.io server somewhere handling real-time ride tracking
stripe: there's a payment processing backend
mapbox-gl: likely server-side route calculations
@react-oauth/google: a backend endpoint validates the Google token

To fully containerize this platform you'd eventually write:

# docker-compose.yml (when you find/build the backend)
services:
  frontend:
    build: ./frontend
    ports: ["3000:80"]

  backend:
    build: ./backend
    ports: ["5000:5000"]
    environment:
      - STRIPE_SECRET_KEY=${STRIPE_SECRET_KEY}

  socket:
    build: ./socket-server
    ports: ["4000:4000"]

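  database:
    image: postgres:15-alpine
    environment:
      # the official postgres image refuses to start without a password
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    volumes: [pgdata:/var/lib/postgresql/data]

# named volumes must be declared at the top level
volumes:
  pgdata: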

And that's when Kubernetes becomes relevant: when you have multiple services that need to scale independently.

Quick Reference Cheat Sheet

See this in the repo          →  Do this in Dockerfile
─────────────────────────────────────────────────────────
package.json only             →  npm install
package.json + lock file      →  npm ci
"engines": node X             →  FROM node:X-alpine
"build": "vite build"         →  RUN npm run build → serve with Nginx
"build": "next build"         →  RUN npm run build → keep Node + CMD next start
server.js at root             →  Keep Node, CMD ["node", "server.js"]
requirements.txt              →  FROM python:3.X-slim
go.mod                        →  FROM golang:X-alpine + multi-stage
pom.xml (Java/Maven)          →  FROM maven:X AS builder + FROM eclipse-temurin

Summary

Writing a Dockerfile isn't about memorizing syntax; it's about reading the project. The files in every repo are instructions waiting to be translated:

package.json engines → tells you the runtime version
package-lock.json → tells you to use npm ci
Build scripts → tells you the compile command
Output type (static vs server) → tells you whether to use Nginx or keep Node
Multi-stage builds → keep images small by separating build from runtime

Next time you open a repo, don't stare at it blankly. Start with package.json, follow the clues, and let the evidence write the Dockerfile for you.

Found this useful? The same detective approach applies to CI/CD pipelines and Kubernetes manifests: the repo always tells you what it needs. Follow for more DevOps breakdowns.

Validate JWTs from Multiple Issuers in kgateway

Emmanuel Chukwudi — Sun, 17 May 2026 07:52:37 +0000

Production APIs often need to accept tokens from more than one identity provider: for example, a customer's own Auth0 tenant and Google Workspace for internal tools. kgateway's JWTPolicy resource lets you declare multiple issuers in one policy and attach it to any HTTPRoute, so you don't need a separate gateway per IdP.

This guide walks through a working, reproducible configuration. By the end you will have a policy that validates tokens from two issuers, rejects mismatched audiences, and forwards selected claims as upstream headers.

What is a JWT?

A JSON Web Token (JWT) is a compact, self-contained credential that an identity provider (IdP) issues to a user or service after they authenticate. Instead of your API checking a username and password on every request, the client attaches a JWT and your API trusts it because it was cryptographically signed by someone it already trusts.

Think of it like a signed event wristband. The venue (IdP) checks your ID once at the gate and gives you a wristband. Staff inside the venue (your APIs) can verify the wristband is genuine without phoning the front gate again. The wristband also says which areas you can access and it expires at midnight.

Structure of a JWT

A JWT is three Base64URL-encoded parts joined by dots: a JSON header, a JSON payload, and a signature.

eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9   ← Header
.
eyJzdWIiOiJ1c2VyXzEyMyIsImVtYWlsIjoiYWxpY2VAZXhhbXBsZS5jb20i...  ← Payload
.
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c  ← Signature

Part	What it contains
Header	Algorithm (`RS256`) and token type. Tells the verifier how to check the signature.
Payload	Claims about the user and the token: who issued it, who it's for, when it expires.
Signature	Cryptographic proof the token hasn't been tampered with. Verified against the IdP's public key.

You can paste any JWT into jwt.io to decode it instantly.
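If you'd rather not paste tokens into a website, the same decoding can be done locally. The sketch below builds a hypothetical unsigned sample token and decodes it; it assumes Node 15.7 or newer for the "base64url" encoding and does not verify the signature, so treat it as an inspection aid only.

```javascript
// Base64URL helpers built on Node's Buffer ("base64url" needs Node 15.7 or newer).
const encodePart = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
const decodePart = (part) => JSON.parse(Buffer.from(part, "base64url").toString("utf8"));

// Decode header and payload WITHOUT verifying the signature: for inspection only.
function decodeJwt(token) {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("a JWT must have exactly three parts");
  const [header, payload, signature] = parts;
  return { header: decodePart(header), payload: decodePart(payload), signature };
}

// Hypothetical sample token assembled here for illustration (placeholder signature).
const sampleToken = [
  encodePart({ alg: "RS256", typ: "JWT" }),
  encodePart({ iss: "https://my-tenant.auth0.com/", sub: "user_123", aud: "my-api" }),
  "signature-placeholder",
].join(".");

const decoded = decodeJwt(sampleToken);
console.log(decoded.header.alg);  // RS256
console.log(decoded.payload.sub); // user_123
```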

What's inside the payload?

The payload is a JSON object of claims: statements about the token and its subject. Some are standard; some are custom fields added by your IdP.

{
  "iss": "https://my-tenant.auth0.com/",  // issuer — who created the token
  "sub": "user_123",                       // subject — the user's unique ID
  "aud": "my-api",                         // audience — which service this token is for
  "exp": 1717000000,                       // expiry — Unix timestamp
  "email": "alice@example.com",            // custom claim added by Auth0
  "roles": ["admin", "editor"]             // custom claim for RBAC
}

How signature verification works (JWKS)

JWTs signed with RS256 use asymmetric cryptography: the IdP signs tokens with a private key that only it holds, and publishes the corresponding public keys at a well-known URL called the JWKS endpoint (JSON Web Key Set). Anyone, including kgateway, can fetch these public keys and verify that a token was genuinely issued by that IdP and hasn't been altered since.

This means kgateway never needs to call back to your IdP on every request. It fetches the JWKS, caches the keys, and verifies signatures locally at the data plane, making validation fast and offline-capable while the cache is valid.
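To see why a public key is enough, here is a self-contained sketch using Node's built-in crypto module. It generates a throwaway RSA key pair as a stand-in for an IdP, signs a token with RS256, and verifies it with the public key only. It illustrates the general mechanism, not kgateway's or Envoy's implementation, and skips JWKS parsing entirely.

```javascript
const crypto = require("crypto");

// Stand-in for an IdP: generate an RSA key pair locally. A real IdP keeps the
// private key secret and publishes only the public half via its JWKS endpoint.
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

const b64url = (value) => Buffer.from(value).toString("base64url");

// Sign "header.payload" with RS256 (RSA PKCS#1 v1.5 with SHA-256).
function signToken(payload, key) {
  const header = b64url(JSON.stringify({ alg: "RS256", typ: "JWT" }));
  const signingInput = `${header}.${b64url(JSON.stringify(payload))}`;
  const signature = crypto.sign("sha256", Buffer.from(signingInput), key);
  return `${signingInput}.${signature.toString("base64url")}`;
}

// Verify locally using only the public key; no call back to the issuer.
function verifySignature(token, key) {
  const [header, payload, signature] = token.split(".");
  return crypto.verify(
    "sha256",
    Buffer.from(`${header}.${payload}`),
    key,
    Buffer.from(signature, "base64url")
  );
}

const token = signToken({ sub: "user_123" }, privateKey);

// Swap in a different payload while keeping the original signature.
const [headerPart, , signaturePart] = token.split(".");
const tampered = `${headerPart}.${b64url(JSON.stringify({ sub: "admin" }))}.${signaturePart}`;

console.log(verifySignature(token, publicKey));    // true
console.log(verifySignature(tampered, publicKey)); // false
```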

Why this matters for multi-issuer setups: Each IdP has its own JWKS endpoint and its own signing keys. kgateway can hold keys from multiple providers simultaneously, matching each incoming token to the right key by checking its iss claim first.

What you'll build

An HTTPRoute on /api that:

Accepts RS256-signed JWTs from Auth0 and Google
Enforces aud: my-api on tokens from both providers
Forwards the sub and email claims as X-User-Id and X-User-Email headers to your upstream service

Before you begin

kgateway ≥ 1.2 installed in your cluster
kubectl access with permissions to create custom resources
An Auth0 tenant with an API audience configured
A Google OAuth 2.0 client ID

How kgateway validates JWTs

Validation happens in the Envoy data plane before a request ever reaches your upstream. On each request, kgateway:

Extracts the bearer token from the Authorization: Bearer <token> header (configurable to cookies or query params).
Resolves the matching issuer by comparing the token's iss claim against each issuer declared in JWTPolicy.spec.providers. The first match wins.
Fetches and caches JWKS from the provider's jwks_uri. Keys are cached per the cacheDuration you set and never re-fetched mid-request.
Validates claims and signature: verifies exp, nbf, aud, and the cryptographic signature. Any failure returns 401 Unauthorized immediately.
Forwards claims as headers: injects declared claims into request headers so your upstream can make authorization decisions without reparsing the JWT.
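The claim-checking part of those steps can be sketched as a plain function. This is a simplified model of the logic described above, not kgateway's code: the provider list and header map are hypothetical structures mirroring this guide's example, the 60-second skew is taken from the claim validation reference later in the guide, and signature verification is left out.

```javascript
// Hypothetical provider list mirroring the two issuers used in this guide.
const providers = [
  { name: "auth0", issuer: "https://my-tenant.auth0.com/", audiences: ["my-api"] },
  { name: "google", issuer: "https://accounts.google.com", audiences: ["my-api"] },
];

// Claims to forward upstream and the header each one maps to.
const headerMap = { sub: "X-User-Id", email: "X-User-Email" };

// Validate already-decoded claims; signature checking is out of scope here.
function validateClaims(claims, nowSeconds, skewSeconds = 60) {
  // Issuer resolution is an exact string comparison (trailing slashes matter).
  const provider = providers.find((p) => p.issuer === claims.iss);
  if (!provider) return { status: 401, reason: "unknown issuer" };

  // "aud" may be a single string or an array of strings.
  const tokenAud = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!tokenAud.some((aud) => provider.audiences.includes(aud))) {
    return { status: 401, reason: "audience mismatch" };
  }
  if (typeof claims.exp === "number" && nowSeconds > claims.exp + skewSeconds) {
    return { status: 401, reason: "token expired" };
  }
  if (typeof claims.nbf === "number" && nowSeconds < claims.nbf - skewSeconds) {
    return { status: 401, reason: "token not yet valid" };
  }

  // Forward selected claims as request headers for the upstream service.
  const headers = {};
  for (const [claim, header] of Object.entries(headerMap)) {
    if (claims[claim] !== undefined) headers[header] = String(claims[claim]);
  }
  return { status: 200, provider: provider.name, headers };
}

const result = validateClaims(
  { iss: "https://my-tenant.auth0.com/", aud: "my-api", sub: "user_123", exp: 2000 },
  1000
);
console.log(result.status, result.headers["X-User-Id"]); // 200 user_123
```

Dropping the trailing slash from the Auth0 issuer in the sample claims makes the call return 401 with "unknown issuer", which is the failure mode to watch for when configuring issuers.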

Step 1: Create the JWTPolicy

The JWTPolicy is a namespace-scoped custom resource that declares which issuers to trust, where to fetch their public keys, and which claims to forward upstream. Create a file named jwt-policy.yaml:

apiVersion: gateway.kgateway.dev/v1alpha1
kind: JWTPolicy
metadata:
  name: multi-issuer-policy
  namespace: default
spec:
  providers:

    # Provider 1: Auth0 tenant
    - name: auth0
      issuer: https://my-tenant.auth0.com/     # note the trailing slash
      audiences:
        - my-api
      remoteJwks:
        uri: https://my-tenant.auth0.com/.well-known/jwks.json
        cacheDuration: 10m
      claimsToHeaders:
        - claim: sub
          header: X-User-Id
        - claim: email
          header: X-User-Email

    # Provider 2: Google
    - name: google
      issuer: https://accounts.google.com      # no trailing slash
      audiences:
        - my-api
      remoteJwks:
        uri: https://www.googleapis.com/oauth2/v3/certs
        cacheDuration: 5m
      claimsToHeaders:
        - claim: sub
          header: X-User-Id
        - claim: email
          header: X-User-Email

⚠️ Issuer strings must be exact. The issuer field is compared character-for-character against the token's iss claim. Auth0 includes a trailing slash in its tokens; Google does not. A mismatch here means every token from that provider will be rejected, even if the signature is valid.

Step 2: Attach the policy to your HTTPRoute

Reference the policy via an annotation on your HTTPRoute. You do not need to modify the route's rules:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
  namespace: default
  annotations:
    gateway.kgateway.dev/jwt-policy: multi-issuer-policy
spec:
  parentRefs:
    - name: my-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: my-service
          port: 8080

Step 3: Apply and verify

# Apply both resources
$ kubectl apply -f jwt-policy.yaml -f httproute.yaml

# Confirm the policy is accepted by the control plane
$ kubectl get jwtpolicy multi-issuer-policy \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
True

# Test with a valid Auth0 token (print only the HTTP status code)
$ curl -s -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $AUTH0_TOKEN" https://my-gateway/api/health
200

# Test rejection: no token → 401
$ curl -s -o /dev/null -w '%{http_code}\n' https://my-gateway/api/health
401

# Confirm upstream receives the forwarded headers
$ kubectl logs deploy/my-service | grep X-User-Id
X-User-Id: user_123

⚠️ JWKS caching on first request: kgateway fetches JWKS the first time a token from a given issuer arrives. If the jwks_uri is unreachable at that moment, the request fails with 503. Use a cacheDuration of at least 5m in production; never use 0s outside of development.

Claim validation reference

Claim	Validated automatically	Notes
`iss`	Yes	Must exactly match a declared provider's `issuer`. First match wins; no fallback.
`aud`	Yes, if configured	Token must contain at least one value from the `audiences` list. Omit the field to skip audience validation (not recommended in production).
`exp`	Yes	Expired tokens are rejected with 401. Clock skew tolerance is 60 s by default.
`nbf`	Yes	Tokens with a future `nbf` (not-before) are rejected.
`sub`, `email`, `roles`	No	kgateway does not validate custom claims. Use `claimsToHeaders` to forward them and enforce access rules in your upstream service.

Next steps

Use a local JWKS secret: Mount JWKS as a Kubernetes secret for air-gapped or high-security environments.
Claim-based routing: Route requests to different backends based on forwarded claim headers.
Full OIDC with Auth0: Add the authorization code flow for browser-facing applications.
Monitor validation errors: Surface JWT rejection rates in Prometheus and set alerting thresholds.