Latchu@DevOps

🧩Scenario #15 — Horizontal Pod Autoscaler (HPA) in Kubernetes

🎯 Goal

Automatically scale an NGINX Deployment based on CPU usage using the Kubernetes Horizontal Pod Autoscaler (HPA).


🧰 Step 1 — Create a Deployment

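Let’s create a simple NGINX Deployment to act as the scaling target. The CPU request in the manifest matters: HPA measures CPU utilization as a percentage of the requested CPU, so pods without a CPU request cannot be scaled on CPU utilization.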

cat <<EOF > nginx-hpa.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-hpa-demo
  template:
    metadata:
      labels:
        app: nginx-hpa-demo
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
          limits:
            cpu: 200m
EOF

Apply it:

kubectl apply -f nginx-hpa.yaml
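
Because HPA works from CPU requests, it can help to confirm the request landed on the live object. The following is a minimal sketch, assuming kubectl points at the cluster and namespace you deployed to; `cpu_requests` is a hypothetical helper name, not a kubectl feature.

```shell
# Hypothetical helper: print the CPU request of every container in a Deployment.
# Assumes kubectl is already configured for the target cluster and namespace.
cpu_requests() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests.cpu}'
}

# Usage (against a live cluster):
#   cpu_requests nginx-hpa-demo
```

For the manifest above this should print the single request value, 100m. An empty result would mean the container has no CPU request set.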

🧰 Step 2 — Expose the Deployment

Expose the deployment with a ClusterIP Service.

kubectl expose deployment nginx-hpa-demo --port=80 --target-port=80

Check:

kubectl get svc nginx-hpa-demo


⚙️ Step 3 — Enable Metrics Server

HPA reads CPU usage from the Metrics Server. If you’re on GKE, it’s usually preinstalled. If it isn’t, install it:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Then verify it’s running:

kubectl get deployment metrics-server -n kube-system
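
A running Metrics Server Deployment does not guarantee that metrics are being served yet; the first samples can take a little while to appear. Here is a rough sketch that polls `kubectl top pods` until it succeeds. `wait_for_metrics` is a hypothetical helper, and the attempt count and delay are illustrative defaults.

```shell
# Hypothetical helper: poll until the metrics API returns pod metrics, or give up.
# $1 = max attempts (default 12), $2 = seconds between attempts (default 10).
wait_for_metrics() {
  attempts="${1:-12}"
  delay="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # "kubectl top pods" fails while metrics are not yet available.
    if kubectl top pods >/dev/null 2>&1; then
      echo "metrics available after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "metrics still unavailable after $attempts attempt(s)" >&2
  return 1
}

# Usage (against a live cluster):
#   wait_for_metrics
```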



🧠 Step 4 — Create an HPA

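This HPA targets an average CPU utilization of 50% of each pod’s CPU request (50m here, given the 100m request). It adds or removes replicas, between 1 and 5, to keep the average near that target.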

kubectl autoscale deployment nginx-hpa-demo \
  --cpu-percent=50 --min=1 --max=5
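
If you prefer to keep the autoscaler in version control, the same settings can be written as a manifest. This is a sketch using the `autoscaling/v2` API, in the same heredoc style as Step 1; the file name `nginx-hpa-autoscaler.yaml` is an assumption chosen for illustration.

```shell
# Write an HPA manifest equivalent to the kubectl autoscale command above.
# The file name is illustrative; the values mirror --cpu-percent=50 --min=1 --max=5.
cat <<'EOF' > nginx-hpa-autoscaler.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
```

Apply it with `kubectl apply -f nginx-hpa-autoscaler.yaml` instead of running `kubectl autoscale`, not in addition to it, since both would create an HPA with the same name.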

Check it:

kubectl get hpa

You’ll see something like:

NAME              REFERENCE                    TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa-demo    Deployment/nginx-hpa-demo    0%/50%     1         5         1          1m
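
To read the TARGETS column, it helps to know the rule HPA applies: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min and max. The sketch below reproduces that arithmetic in shell. `desired_replicas` is a hypothetical helper, the utilization figures are made up for illustration, and the controller's tolerance and stabilization behavior are deliberately ignored.

```shell
# Sketch of the HPA scaling rule:
#   desired = ceil(current_replicas * current_util / target_util), clamped to [min, max]
# Arguments: current_replicas current_util% target_util% min max
desired_replicas() {
  current="$1"; util="$2"; target="$3"; min="$4"; max="$5"
  # Integer ceiling division: (a + b - 1) / b
  want=$(( (current * util + target - 1) / target ))
  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
}

desired_replicas 1 120 50 1 5   # 1 pod at 120% of its request -> prints 3
desired_replicas 3 20 50 1 5    # 3 pods at 20% after load stops -> prints 2
```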



🔥 Step 5 — Generate Load

Start a BusyBox pod and open a shell in it. It will send HTTP traffic to the NGINX Service:

kubectl run loader --image=busybox --restart=Never -it -- /bin/sh

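Inside it, run this infinite loop. It floods the Service with requests, which drives up CPU usage on the NGINX pods (the URL assumes the Service lives in the default namespace):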

while true; do wget -q -O- http://nginx-hpa-demo.default.svc.cluster.local > /dev/null & done
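
The loop above backgrounds every `wget`, so the number of concurrent processes is unbounded. If that overwhelms the loader pod, one alternative is a fixed pool of workers that each send requests sequentially. This is only a sketch: `run_load` is a hypothetical helper, and the worker and request counts in the usage line are illustrative, not tuned values.

```shell
# Hypothetical helper: start a fixed number of workers, each sending a fixed
# number of sequential requests, then wait for all of them to finish.
# Arguments: url workers requests_per_worker
run_load() {
  url="$1"; workers="$2"; per_worker="$3"
  w=1
  while [ "$w" -le "$workers" ]; do
    (
      r=1
      while [ "$r" -le "$per_worker" ]; do
        wget -q -O- "$url" >/dev/null 2>&1
        r=$((r + 1))
      done
    ) &
    w=$((w + 1))
  done
  # Block until every worker has sent all of its requests.
  wait
}

# Usage inside the loader pod (values are illustrative):
#   run_load http://nginx-hpa-demo.default.svc.cluster.local 8 100000
```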

Let it run for a few minutes.



📈 Step 6 — Watch Autoscaling in Action

Open another terminal and watch:

kubectl get hpa -w
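
If you would rather script the check than watch the output by eye, the replica count can be read from the HPA status. A minimal sketch, assuming the HPA name from Step 4; `wait_for_replicas` is a hypothetical helper and the attempt count and delay are illustrative defaults.

```shell
# Hypothetical helper: poll the HPA until it reports at least N replicas.
# Arguments: hpa_name wanted_replicas [max_attempts] [delay_seconds]
wait_for_replicas() {
  name="$1"; wanted="$2"; attempts="${3:-30}"; delay="${4:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # .status.currentReplicas is the replica count the HPA last observed.
    now="$(kubectl get hpa "$name" -o jsonpath='{.status.currentReplicas}')"
    echo "attempt $i: ${now:-0} replica(s)"
    if [ "${now:-0}" -ge "$wanted" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage (against a live cluster, while the load is running):
#   wait_for_replicas nginx-hpa-demo 2
```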

You’ll see CPU utilization rise and the replica count increase (up to 5).
When you stop the load, replicas scale back down, but not immediately: by default HPA waits out a 5-minute stabilization window before removing pods.


🧹 Step 7 — Clean Up

kubectl delete pod loader
kubectl delete hpa nginx-hpa-demo
kubectl delete svc nginx-hpa-demo
kubectl delete deployment nginx-hpa-demo

🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.


— Latchu | Senior DevOps & Cloud Engineer

☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions
