Ömer Berat Sezer

Scaling Up & Down, Horizontal Pod Autoscaling Kubernetes Deployments with Hands-on Samples

K8s deployment scaling is the process of adjusting the number of pod replicas in a deployment to meet the application’s changing resource requirements or traffic load.

Scaling can be done in two ways: manually using commands like kubectl scale, or automatically through features such as Horizontal Pod Autoscaling (HPA). HPA adjusts the number of pod replicas based on metrics like CPU or memory usage, automatically responding to resource demand.

This flexibility allows K8s to dynamically allocate resources, ensuring optimal performance, availability, and cost efficiency in environments with fluctuating workloads.
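As a rough illustration of how HPA picks a replica count: the Kubernetes documentation describes the core rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), bounded by the configured minimum and maximum. The sketch below reproduces that arithmetic in plain shell. The function name `desired_replicas` and the sample numbers are illustrative assumptions, and the real controller also applies a tolerance, readiness checks, and stabilization that are not modelled here.

```shell
#!/bin/sh
# Sketch of the HPA core formula (not the real controller):
#   desired = ceil(current_replicas * current_metric / target_metric)
# clamped to the range [min_replicas, max_replicas].
# Arguments: current_replicas current_metric target_metric min_replicas max_replicas
desired_replicas() {
  current=$1; metric=$2; target=$3; min=$4; max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then
    desired=$min
  fi
  if [ "$desired" -gt "$max" ]; then
    desired=$max
  fi
  echo "$desired"
}

# Hypothetical sample: 2 replicas averaging 90% CPU against a 50% target.
desired_replicas 2 90 50 1 5   # prints 4
```

Because the result is clamped, a very high metric value never pushes the count past the maximum, and an idle workload never drops below the minimum.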

The importance of manual scaling in K8s:

  • Immediate Control: Scale up or down right away for immediate needs, such as traffic spikes or maintenance.
  • Customization: Set replica counts independently of auto-scaling rules.
  • Testing: Simulate different load conditions for stress testing or performance evaluation.
  • Quick Recovery: Increase replicas right away to compensate for unresponsive pods or mitigate failures.
  • Operational Maintenance: Scale down during maintenance and scale back up afterwards.
  • Resource Management: Adjust replicas to specific resource needs, such as specialized workloads.
  • Quota Management: Keep replica counts within resource quotas or limits.
  • Scheduled Scaling: Run manual scaling commands at specific times with cron jobs.
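
Scheduled scaling from the last bullet could look roughly like this. It is a sketch under assumptions: the business-hours window (08:00 to 18:00), the replica counts, and the function name `replicas_for_hour` are all hypothetical, the deployment name is borrowed from Hands-on Sample #1, and the script only prints the kubectl scale command instead of running it.

```shell
#!/bin/sh
# Hypothetical schedule: more replicas during business hours, fewer otherwise.
# Argument: hour of day (0-23). Prints the replica count to use.
replicas_for_hour() {
  hour=$1
  if [ "$hour" -ge 8 ] && [ "$hour" -lt 18 ]; then
    echo 7
  else
    echo 3
  fi
}

# Dry run: print the command a cron job could execute for the current hour.
now=$(date +%H)
now=${now#0}   # strip one leading zero so "08" becomes "8"
echo "kubectl scale deployment firstdeployment --replicas=$(replicas_for_hour "$now")"
```

A cron entry that calls such a script once per hour would keep the replica count aligned with the schedule, with no autoscaler involved.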

Use Cases for K8s Horizontal Pod Autoscaling (HPA)

  • E-commerce Traffic Surges: HPA scales pods during flash sales, holiday events, or promotional campaigns to handle high traffic volumes.

  • Streaming Platforms: HPA automatically scales video encoding/streaming services based on the number of active users.

  • Financial Applications: It scales trading systems during market opening/closing hours when transaction rates spike.

  • IoT Data Ingestion: HPA scales up processing services when devices send large amounts of telemetry data.

  • Gaming Servers: It handles fluctuating player counts by scaling game server pods dynamically.

  • Web and API Services: HPA scales backend services based on CPU, memory usage, or request rates to maintain response times.

  • Machine Learning Workloads: HPA scales model training or inference services dynamically based on queue sizes or resource utilization.

  • Log Aggregation Systems: HPA adjusts pod counts for tools like ELK/EFK stacks based on log ingestion rates.

Hands-on Samples

I've implemented two hands-on samples:

  • Hands-on Sample #1 for Manual Scaling
  • Hands-on Sample #2 for Auto Scaling with HPA

Hands-on Sample #1 for Manual Scaling

This scenario shows:

  • how to create a deployment,
  • how to scale the deployment up/down manually with kubectl scale,
  • how to connect to one of the pods with bash,
  • how to show the Ethernet interfaces of the pod and ping other pods.


  • Run minikube:
omer@k8s:$ minikube start
😄  minikube v1.35.0 on Ubuntu 20.04
✨  Automatically selected the docker driver
📌  Using Docker driver with root privileges
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🚜  Pulling base image v0.0.46 ...
🔥  Creating docker container (CPUs=2, Memory=3100MB) ...
❗  Failing to connect to from both inside the minikube container and host machine
💡  To pull new external images, you may need to configure a proxy:
🐳  Preparing Kubernetes v1.32.0 on Docker 27.4.1 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

YAML File Explanation:

  • selector: => the deployment's selector
  • matchLabels: => the deployment selects pods labeled "app: frontend" and manages them
  • app: frontend => if one of these pods is killed, K8s compares the actual state with the desired state (replicas: 3) and creates a new pod to restore the replica count
  • labels: => pod labels; the deployment manages pods whose labels match its selector
  • app: frontend => key: value
  • image: nginx:latest => the image is pulled from Docker Hub
  • containerPort: 80 => the port the container listens on
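apiVersion: apps/v1
kind: Deployment
metadata:
  name: firstdeployment
  labels:
    team: development
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80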
  • Create deployment and list the deployment's pods:
omer@k8s:$ kubectl apply -f deployment1.yaml
deployment.apps/firstdeployment created

omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS              RESTARTS   AGE   IP       NODE       NOMINATED NODE   READINESS GATES
firstdeployment-54758c4c55-dh98p   0/1     ContainerCreating   0          9s    <none>   minikube   <none>           <none>
firstdeployment-54758c4c55-pnz5c   0/1     ContainerCreating   0          9s    <none>   minikube   <none>           <none>
firstdeployment-54758c4c55-zbn7t   0/1     ContainerCreating   0          9s    <none>   minikube   <none>           <none>

omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
firstdeployment-54758c4c55-dh98p   1/1     Running   0          19s   minikube   <none>           <none>
firstdeployment-54758c4c55-pnz5c   1/1     Running   0          19s   minikube   <none>           <none>
firstdeployment-54758c4c55-zbn7t   1/1     Running   0          19s   minikube   <none>           <none>
  • Delete one of the pods (e.g. -dh98p); K8s automatically creates a new pod (-jp69b):
omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
firstdeployment-54758c4c55-dh98p   1/1     Running   0          19s   minikube   <none>           <none>
firstdeployment-54758c4c55-pnz5c   1/1     Running   0          19s   minikube   <none>           <none>
firstdeployment-54758c4c55-zbn7t   1/1     Running   0          19s   minikube   <none>           <none>

omer@k8s:$ kubectl delete pod firstdeployment-54758c4c55-dh98p
pod "firstdeployment-54758c4c55-dh98p" deleted

omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
firstdeployment-54758c4c55-jp69b   1/1     Running   0          3s   minikube   <none>           <none>
firstdeployment-54758c4c55-pnz5c   1/1     Running   0          88s   minikube   <none>           <none>
firstdeployment-54758c4c55-zbn7t   1/1     Running   0          88s   minikube   <none>           <none>
  • Scale up to 7 replicas:
omer@k8s:$ kubectl scale deployments firstdeployment --replicas=7
deployment.apps/firstdeployment scaled

omer@k8s:$ kubectl get deployments
firstdeployment   7/7     7            7           5m35s

omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
firstdeployment-54758c4c55-8q2pz   1/1     Running   0          16s    minikube   <none>           <none>
firstdeployment-54758c4c55-d4lqh   1/1     Running   0          16s    minikube   <none>           <none>
firstdeployment-54758c4c55-jp69b   1/1     Running   0          4m13s    minikube   <none>           <none>
firstdeployment-54758c4c55-pnz5c   1/1     Running   0          5m38s    minikube   <none>           <none>
firstdeployment-54758c4c55-sbjbx   1/1     Running   0          16s    minikube   <none>           <none>
firstdeployment-54758c4c55-wxcvx   1/1     Running   0          16s   minikube   <none>           <none>
firstdeployment-54758c4c55-zbn7t   1/1     Running   0          5m38s    minikube   <none>           <none>
  • Scale down to 3 replicas:
omer@k8s:$ kubectl scale deployments firstdeployment --replicas=3
deployment.apps/firstdeployment scaled

omer@k8s:$ kubectl get deployments
firstdeployment   3/3     3            3           8m27s

omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP           NODE       NOMINATED NODE   READINESS GATES
firstdeployment-54758c4c55-jp69b   1/1     Running   0          7m6s   minikube   <none>           <none>
firstdeployment-54758c4c55-pnz5c   1/1     Running   0          8m31s   minikube   <none>           <none>
firstdeployment-54758c4c55-zbn7t   1/1     Running   0          8m31s   minikube   <none>           <none>
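
Note that `kubectl scale` changes the live object only. deployment1.yaml still declares 3 replicas, so re-applying the file would bring the count back to 3. If a new count should persist, one option is to change the manifest instead. The sketch below does that on a stand-in file (`demo-deployment.yaml` and its one-line content are hypothetical, so the real manifest is untouched) and prints the apply command rather than running it.

```shell
#!/bin/sh
# Stand-in for the replicas line of a Deployment manifest (hypothetical file).
printf '  replicas: 3\n' > demo-deployment.yaml

# Rewrite the replica count; the captured group keeps the original indentation.
sed 's/^\( *replicas:\) .*/\1 7/' demo-deployment.yaml > demo-deployment-scaled.yaml

cat demo-deployment-scaled.yaml   # prints "  replicas: 7"

# Not executed here; run it yourself against the cluster.
echo "kubectl apply -f demo-deployment-scaled.yaml"
```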
  • Connect to one of the pods with bash:
omer@k8s:$ kubectl exec -it firstdeployment-54758c4c55-jp69b -- bash
root@firstdeployment-54758c4c55-jp69b:/# ping
bash: ping: command not found

root@firstdeployment-54758c4c55-jp69b:/# ifconfig
bash: ifconfig: command not found
  • To install ifconfig, run: "apt update", "apt install net-tools"
  • To install ping, run: "apt install iputils-ping"

  • Show the Ethernet interfaces, then ping other pods to verify pod-to-pod connectivity:

omer@k8s:$ kubectl exec -it firstdeployment-54758c4c55-jp69b -- bash
root@firstdeployment-54758c4c55-jp69b:/# apt update

root@firstdeployment-54758c4c55-jp69b:/# apt install net-tools iputils-ping

root@firstdeployment-54758c4c55-jp69b:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet  netmask  broadcast
        inet6 fe80::787b:28ff:fe4c:1782  prefixlen 64  scopeid 0x20<link>
        ether 7a:7b:28:4c:17:82  txqueuelen 0  (Ethernet)
        RX packets 2744  bytes 9833003 (9.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1335  bytes 101396 (99.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet  netmask
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@firstdeployment-54758c4c55-jp69b:/# ping
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=2 ttl=64 time=0.110 ms
--- ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1015ms
rtt min/avg/max/mdev = 0.110/0.114/0.119/0.004 ms

root@firstdeployment-54758c4c55-jp69b:/# ping
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.092 ms
--- ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1079ms
rtt min/avg/max/mdev = 0.049/0.070/0.092/0.021 ms
  • Delete deployment:
omer@k8s:$ kubectl delete -f deployment1.yaml
deployment.apps "firstdeployment" deleted

omer@k8s:$ kubectl get pods -o wide
No resources found in default namespace.

Hands-on Sample #2 for Auto Scaling with HPA

This scenario shows:

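  • how to view HPA,
  • how to trigger HPA scaling up/down.

  • Enable the metrics-server addon, which supplies the CPU metrics that HPA reads, then list the addons: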
omer@k8s:$ minikube addons enable metrics-server
💡  metrics-server is an addon maintained by Kubernetes. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at:
    ▪ Using image
🌟  The 'metrics-server' addon is enabled

omer@k8s:$ minikube addons list
|         ADDON NAME          | PROFILE  |    STATUS    |           MAINTAINER           |
| ambassador                  | minikube | disabled     | 3rd party (Ambassador)         |
| amd-gpu-device-plugin       | minikube | disabled     | 3rd party (AMD)                |
| auto-pause                  | minikube | disabled     | minikube                       |
| cloud-spanner               | minikube | disabled     | Google                         |
| csi-hostpath-driver         | minikube | disabled     | Kubernetes                     |
| dashboard                   | minikube | disabled     | Kubernetes                     |
| default-storageclass        | minikube | enabled ✅   | Kubernetes                     |
| efk                         | minikube | disabled     | 3rd party (Elastic)            |
| freshpod                    | minikube | disabled     | Google                         |
| gcp-auth                    | minikube | disabled     | Google                         |
| gvisor                      | minikube | disabled     | minikube                       |
| headlamp                    | minikube | disabled     | 3rd party (         |
| inaccel                     | minikube | disabled     | 3rd party (InAccel             |
|                             |          |              | [])            |
| ingress                     | minikube | disabled     | Kubernetes                     |
| ingress-dns                 | minikube | disabled     | minikube                       |
| inspektor-gadget            | minikube | disabled     | 3rd party                      |
|                             |          |              | (          |
| istio                       | minikube | disabled     | 3rd party (Istio)              |
| istio-provisioner           | minikube | disabled     | 3rd party (Istio)              |
| kong                        | minikube | disabled     | 3rd party (Kong HQ)            |
| kubeflow                    | minikube | disabled     | 3rd party                      |
| kubevirt                    | minikube | disabled     | 3rd party (KubeVirt)           |
| logviewer                   | minikube | disabled     | 3rd party (unknown)            |
| metallb                     | minikube | disabled     | 3rd party (MetalLB)            |
| metrics-server              | minikube | enabled ✅   | Kubernetes                     |
| nvidia-device-plugin        | minikube | disabled     | 3rd party (NVIDIA)             |
| nvidia-driver-installer     | minikube | disabled     | 3rd party (NVIDIA)             |
| nvidia-gpu-device-plugin    | minikube | disabled     | 3rd party (NVIDIA)             |
| olm                         | minikube | disabled     | 3rd party (Operator Framework) |
| pod-security-policy         | minikube | disabled     | 3rd party (unknown)            |
| portainer                   | minikube | disabled     | 3rd party (       |
| registry                    | minikube | disabled     | minikube                       |
| registry-aliases            | minikube | disabled     | 3rd party (unknown)            |
| registry-creds              | minikube | disabled     | 3rd party (UPMC Enterprises)   |
| storage-provisioner         | minikube | enabled ✅   | minikube                       |
| storage-provisioner-gluster | minikube | disabled     | 3rd party (Gluster)            |
| storage-provisioner-rancher | minikube | disabled     | 3rd party (Rancher)            |
| volcano                     | minikube | disabled     | third-party (volcano)          |
| volumesnapshots             | minikube | disabled     | Kubernetes                     |
| yakd                        | minikube | disabled     | 3rd party (       |
💡  To see addons list for other profiles use: `minikube addons -p name list`
  • Create nginx-deployment.yaml, which defines the service and the deployment:
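apiVersion: v1
kind: Service
metadata:
  name: nginx-deployment
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: "30m"
          limits:
            cpu: "80m"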
  • Run nginx deployment and service:
omer@k8s:$ kubectl apply -f nginx-deployment.yaml
service/nginx-deployment created
deployment.apps/nginx-deployment created

omer@k8s:$ kubectl get pods -o wide -A
NAMESPACE     NAME                               READY   STATUS              RESTARTS      AGE    IP             NODE       NOMINATED NODE   READINESS GATES
default       nginx-deployment-d99898c47-2kdq5   0/1     Running   0             2s     <none>         minikube   <none>           <none>
kube-system   coredns-668d6bf9bc-7whqt           1/1     Running             0             23m     minikube   <none>           <none>
kube-system   etcd-minikube                      1/1     Running             0             23m   minikube   <none>           <none>
kube-system   kube-apiserver-minikube            1/1     Running             0             23m   minikube   <none>           <none>
kube-system   kube-controller-manager-minikube   1/1     Running             0             23m   minikube   <none>           <none>
kube-system   kube-proxy-z5ncc                   1/1     Running             0             23m   minikube   <none>           <none>
kube-system   kube-scheduler-minikube            1/1     Running             0             23m   minikube   <none>           <none>
kube-system   metrics-server-7496f689c7-jwdf4    1/1     Running             0             3m2s    minikube   <none>           <none>
kube-system   storage-provisioner                1/1     Running             1 (23m ago)   23m   minikube   <none>           <none>
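
The listing does not show how the HPA object itself was created. Judging from the `kubectl get hpa` output in this sample (50% CPU target, 1 to 5 pods), a manifest along the following lines would produce a comparable autoscaler. This is an assumption, not the author's file: the file name `nginx-hpa.yaml` is hypothetical, and the script only writes the manifest and prints the apply command.

```shell
#!/bin/sh
# Write a hypothetical HPA manifest (autoscaling/v2) targeting nginx-deployment.
# Values mirror the kubectl get hpa output: CPU target 50%, min 1, max 5.
cat > nginx-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF

# Not executed here; run it yourself against the cluster.
echo "kubectl apply -f nginx-hpa.yaml"
```

With the kubectl version used in this post, the imperative form `kubectl autoscale deployment nginx-deployment --cpu-percent=50 --min=1 --max=5` should create an equivalent object.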
  • Get HPA status:

omer@k8s:$ kubectl get hpa
NAME               REFERENCE                     TARGETS              MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   cpu: <unknown>/50%   1         5         1          2m
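
The TARGETS column is a percentage of the pods' CPU request, not of a whole core, and `<unknown>` typically means metrics-server has not reported usage for the pods yet. Assuming the 30m value in nginx-deployment.yaml is the CPU request, the arithmetic can be sketched as follows. The usage figures passed in are made-up sample values, and `cpu_utilization_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# HPA CPU utilization = average usage / requested CPU, as a percentage.
# Arguments: average usage in millicores, CPU request in millicores.
# Integer division truncates any fractional part.
cpu_utilization_pct() {
  usage_m=$1; request_m=$2
  echo $(( usage_m * 100 / request_m ))
}

# Made-up sample: pods averaging 21m of CPU against a 30m request.
cpu_utilization_pct 21 30   # prints 70
```

With such a small request, even a few millicores of extra load per pod move the percentage well above the 50% target, which is why a simple wget loop is enough to trigger scaling.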

In another terminal, generate load on the deployment to trigger HPA, which then increases the pod count automatically:

omer@k8s:$ kubectl run -i --tty load-generator --image=busybox -- /bin/sh -c "while true; do wget -q -O- http://nginx-deployment; done"
<!DOCTYPE html>
<title>Welcome to nginx!</title>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href=""></a>.<br/>
Commercial support is available at
<a href=""></a>.</p>

<p><em>Thank you for using nginx.</em></p>

After 5-10 minutes, the deployment is autoscaled:

omer@k8s:$ kubectl get hpa
NAME               REFERENCE                     TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   cpu: 68%/50%   1         5         5          28m

NOTE: If nothing changes after about 10 minutes, check the metrics-server configuration:

omer@k8s:$ kubectl edit deployment metrics-server -n kube-system
## if not present, add these arguments to the containers.args section:
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP

HPA automatically increased the nginx deployment to 5 pods:

omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS        AGE   IP            NODE       NOMINATED NODE   READINESS GATES
load-generator                     1/1     Running   1 (6m51s ago)   15m   minikube   <none>           <none>
nginx-deployment-d99898c47-2kdq5   1/1     Running   0               16m   minikube   <none>           <none>
nginx-deployment-d99898c47-5f726   1/1     Running   0               13m   minikube   <none>           <none>
nginx-deployment-d99898c47-b7qtn   1/1     Running   0               13m   minikube   <none>           <none>
nginx-deployment-d99898c47-f9bxf   1/1     Running   0               13m   minikube   <none>           <none>
nginx-deployment-d99898c47-s6xvk   1/1     Running   0               13m   minikube   <none>           <none>
  • Delete the load generator:
omer@k8s:$ kubectl delete pod load-generator
pod "load-generator" deleted
  • Check HPA: the CPU load gradually decreases, and the replicas drop from 5 to 1:
omer@k8s:$ kubectl get hpa
NAME               REFERENCE                     TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   cpu: 70%/50%   1         5         5          34m

omer@k8s:$ kubectl get hpa
NAME               REFERENCE                     TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   cpu: 33%/50%   1         5         5          35m

omer@k8s:$ kubectl get hpa
NAME               REFERENCE                     TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   cpu: 0%/50%   1         5         5          37m

omer@k8s:$ kubectl get hpa
NAME               REFERENCE                     TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   cpu: 0%/50%   1         5         1          41m
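
The delay between CPU dropping to 0% and the replica count falling is consistent with the HPA scale-down stabilization window (5 minutes by default): the controller keeps the highest replica recommendation seen during the window. A minimal sketch of that rule, with made-up recommendation values and a hypothetical function name:

```shell
#!/bin/sh
# Scale-down stabilization, simplified: among the recommendations computed
# during the window, the highest one wins, so replicas drop only after every
# recommendation in the window is low.
# Arguments: the replica recommendations inside the window.
stabilized_replicas() {
  highest=0
  for rec in "$@"; do
    if [ "$rec" -gt "$highest" ]; then
      highest=$rec
    fi
  done
  echo "$highest"
}

# Made-up window: load just stopped, older recommendations are still high.
stabilized_replicas 5 5 3 1   # prints 5
# Later, once the high recommendations have aged out of the window.
stabilized_replicas 1 1 1 1   # prints 1
```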
  • NOTE: If the deployment does not scale down automatically, check the HPA behavior policy and add it if it is missing:
omer@k8s:$ kubectl edit hpa nginx-deployment
## copy and add to HorizontalPodAutoscaler
spec:
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 15
        type: Percent
        value: 100
      - periodSeconds: 15
        type: Pods
        value: 4
      selectPolicy: Max
      stabilizationWindowSeconds: 0
    scaleUp:
      policies:
      - periodSeconds: 15
        type: Percent
        value: 100
      - periodSeconds: 15
        type: Pods
        value: 4
      selectPolicy: Max
      stabilizationWindowSeconds: 0
  maxReplicas: 5
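
To read each policy block above: in every 15-second period, the Percent policy allows a change of up to 100% of the current pods and the Pods policy a change of up to 4 pods, and `selectPolicy: Max` picks whichever allows the larger change. A rough sketch of that selection (the function name is hypothetical and the rounding is simplified):

```shell
#!/bin/sh
# selectPolicy: Max, simplified: choose the policy that permits the largest
# change in one period.
# Arguments: current_replicas percent_value pods_value
max_change_per_period() {
  current=$1; percent=$2; pods=$3
  by_percent=$(( current * percent / 100 ))
  if [ "$by_percent" -gt "$pods" ]; then
    echo "$by_percent"
  else
    echo "$pods"
  fi
}

# With 5 replicas, Percent=100 allows 5 pods and Pods=4 allows 4, so 5 wins.
max_change_per_period 5 100 4   # prints 5
```

Combined with `stabilizationWindowSeconds: 0`, this lets the deployment fall from 5 pods to the minimum within a single period once the load is gone.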
  • Finally, only 1 pod is running:
omer@k8s:$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
nginx-deployment-d99898c47-f9bxf   1/1     Running   0          24m   minikube   <none>           <none>
  • Delete deployment and delete minikube:
omer@k8s:$ kubectl delete -f nginx-deployment.yaml
service "nginx-deployment" deleted
deployment.apps "nginx-deployment" deleted

omer@k8s:$ kubectl get pods -o wide
No resources found in default namespace.

omer@k8s:$ minikube delete
🔥  Deleting "minikube" in docker ...
🔥  Deleting container "minikube" ...
🔥  Removing /home/omer/.minikube/machines/minikube ...
💀  Removed all traces of the "minikube" cluster.


This post focused on manual scaling and horizontal auto scaling with load. With sample scenarios, we tested scaling up/down and automatic HPA.

If you found the tutorial interesting, I’d love to hear your thoughts in the blog post comments. Feel free to share your reactions or leave a comment. I truly value your input and engagement 😉

Follow for Tips, Tutorials, Hands-On Labs for AWS, K8s, Docker, Linux, DevOps, Ansible, Machine Learning, Generative AI.
