Latchu@DevOps

Posted on Nov 13, 2025

🧩Scenario #15 — Horizontal Pod Autoscaler (HPA) in Kubernetes

#kubernetes #cicd #devops #containers

🎯 Goal

Automatically scale an NGINX deployment based on CPU usage using Kubernetes Horizontal Pod Autoscaler.

🧰 Step 1 — Create a Deployment

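Let’s create a simple NGINX deployment to act as the scaling target. The CPU request matters: the HPA computes utilization as a percentage of the requested CPU, so a pod without a CPU request cannot be scaled on CPU utilization.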

cat <<EOF > nginx-hpa.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-hpa-demo
  template:
    metadata:
      labels:
        app: nginx-hpa-demo
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
          limits:
            cpu: 200m
EOF

Apply it:

kubectl apply -f nginx-hpa.yaml

🧰 Step 2 — Expose the Deployment

Expose the deployment with a ClusterIP Service.

kubectl expose deployment nginx-hpa-demo --port=80 --target-port=80

Check:

kubectl get svc nginx-hpa-demo
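
If you prefer to keep the Service in version control next to the Deployment, the following is a rough declarative equivalent of the kubectl expose command above. It is a sketch, not the only way to write it: the file name nginx-hpa-svc.yaml is an assumption, and the selector simply reuses the app label from Step 1.

```shell
# Write a ClusterIP Service manifest that targets the pods labelled in Step 1.
# The file name is an assumption made for this example.
cat <<EOF > nginx-hpa-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-hpa-demo
spec:
  type: ClusterIP
  selector:
    app: nginx-hpa-demo
  ports:
  - port: 80
    targetPort: 80
EOF
```

Apply it with kubectl apply -f nginx-hpa-svc.yaml instead of running kubectl expose.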

⚙️ Step 3 — Enable Metrics Server

The HPA reads CPU usage from Metrics Server. If you’re on GKE, it’s usually preinstalled.
If not, install it:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Then verify it’s running:

kubectl get deployment metrics-server -n kube-system

🧠 Step 4 — Create an HPA

This HPA targets an average CPU utilization of 50% of each pod’s CPU request (50m of the 100m requested in Step 1), scaling between 1 and 5 replicas.

kubectl autoscale deployment nginx-hpa-demo \
  --cpu-percent=50 --min=1 --max=5
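
The same autoscaler can also be described as a manifest, which is easier to review and re-apply. This is a minimal sketch that assumes your cluster serves the autoscaling/v2 API; the file name nginx-hpa-autoscaler.yaml is an assumption. Use either the command above or this manifest, not both.

```shell
# Write an HPA manifest with the same settings as the kubectl autoscale command:
# target 50% average CPU utilization, between 1 and 5 replicas.
cat <<EOF > nginx-hpa-autoscaler.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
```

Create it with kubectl apply -f nginx-hpa-autoscaler.yaml.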

Check it:

kubectl get hpa

You’ll see something like this (TARGETS may show <unknown>/50% for the first minute or so, until metrics arrive):

NAME             REFERENCE                   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa-demo   Deployment/nginx-hpa-demo   0%/50%    1         5         1          1m
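
To get a feel for how the TARGETS column turns into a replica count, here is a small sketch of the core HPA calculation: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min and max. The function name desired_replicas is hypothetical, and the sketch deliberately ignores the controller's tolerance band and stabilization logic.

```shell
# Hypothetical helper illustrating the core HPA formula with integer math.
# Arguments: current replicas, current CPU %, target CPU %, min, max.
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Ceiling division: ceil(current * util / target)
  desired=$(( (current * util + target - 1) / target ))
  # Clamp the result to the configured replica bounds
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 1 120 50 1 5   # one pod at 120% of its request: prints 3
```

With the settings in this scenario, a single pod running at 120% of its 100m request would be scaled to 3 replicas, and no amount of load takes it past 5.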

🔥 Step 5 — Generate Load

Create a BusyBox pod to generate HTTP traffic against the Service, which drives up CPU usage in the NGINX pods:

kubectl run loader --image=busybox --restart=Never -it -- /bin/sh

Inside it, run this infinite loop to send requests continuously (stop it with Ctrl+C):

while true; do wget -q -O- http://nginx-hpa-demo.default.svc.cluster.local > /dev/null; done

Let it run for a few minutes.

📈 Step 6 — Watch Autoscaling in Action

Open another terminal and watch:

kubectl get hpa -w

You’ll see CPU usage rise and the replica count increase (up to 5).
When you stop the load, replicas scale back down, but not immediately: by default the HPA waits out a 5-minute scale-down stabilization window first.
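
If waiting for scale-down is too slow for a demo, the stabilization window can be shortened. The sketch below writes a small patch file; it assumes the cluster serves autoscaling/v2 (where the behavior field lives) and a reasonably recent kubectl that supports --patch-file. The file name and the 60-second value are assumptions chosen for illustration.

```shell
# Write a patch that shortens the scale-down stabilization window to 60 seconds.
# File name and value are assumptions; the default window is 300 seconds.
cat <<EOF > hpa-scaledown-patch.yaml
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
EOF
```

Apply it with kubectl patch hpa nginx-hpa-demo --patch-file hpa-scaledown-patch.yaml. A short window makes the demo snappier but can cause replica flapping under bursty real traffic.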

🧹 Step 7 — Clean Up

kubectl delete pod loader
kubectl delete hpa nginx-hpa-demo
kubectl delete svc nginx-hpa-demo
kubectl delete deployment nginx-hpa-demo

— Latchu | Senior DevOps & Cloud Engineer

☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions
