Latchu@DevOps

Posted on Oct 9

Part-118: 🚀Implement Vertical Pod Autoscaling (VPA) in Google Kubernetes Engine (GKE)

#kubernetes #devops #googlecloud #devto

Vertical Pod Autoscaling (VPA) in GKE analyzes the CPU and memory usage of your workloads and adjusts their resource requests to match, so pods are neither starved nor heavily over-provisioned.

In this guide, we’ll walk through how to implement VPA for a sample app in GKE.

🧩 Step 1: Introduction

We’ll implement Vertical Pod Autoscaling in a GKE cluster.

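Pre-requisites:

On a Standard cluster, enable Vertical Pod Autoscaling at the cluster level (Autopilot clusters have it enabled by default).
Enabling Cluster Autoscaler on your node pool is also recommended. It allows GKE to add nodes automatically when VPA increases resource requests beyond what the current nodes can fit.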
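
On a Standard cluster, VPA is a cluster-level setting and node autoscaling is configured per node pool, and both can be switched on with gcloud. The following is a minimal sketch rather than the exact setup used here: the cluster name, location, node pool name, and node counts are placeholders, and the helper name `enable_vpa_prereqs` is invented for illustration.

```shell
# Sketch: enable VPA on a Standard cluster and node autoscaling on one pool.
# All argument values are placeholders; substitute your own.
# Older gcloud releases may need --zone or --region instead of --location.
enable_vpa_prereqs() {
  cluster="$1"; location="$2"; pool="$3"

  # Turn on Vertical Pod Autoscaling for the cluster.
  gcloud container clusters update "$cluster" \
    --location "$location" \
    --enable-vertical-pod-autoscaling || return 1

  # Let the node pool grow when VPA raises pod requests (counts are examples).
  gcloud container clusters update "$cluster" \
    --location "$location" \
    --node-pool "$pool" \
    --enable-autoscaling --min-nodes 1 --max-nodes 3
}

# Example (hypothetical names):
# enable_vpa_prereqs my-cluster us-central1 default-pool
```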

📁 Step 2: Review Kubernetes Manifests

Create a new directory for your manifests:

mkdir kube-manifests-vpa
cd kube-manifests-vpa

01-kubernetes-deployment.yaml

apiVersion: apps/v1
kind: Deployment 
metadata: 
  name: myapp1-deployment
spec: 
  replicas: 1
  selector:
    matchLabels:
      app: myapp1
  template:  
    metadata:
      name: myapp1-pod
      labels:
        app: myapp1  
    spec:
      containers: 
        - name: myapp1-container
          image: ghcr.io/stacksimplify/kubenginx:1.0.0
          ports: 
            - containerPort: 80  
          resources:
            requests:
              memory: "5Mi"
              cpu: "25m"
            limits:
              memory: "50Mi"
              cpu: "50m"

02-kubernetes-cip-service.yaml

apiVersion: v1
kind: Service 
metadata:
  name: myapp1-cip-service
spec:
  type: ClusterIP
  selector:
    app: myapp1
  ports: 
    - name: http
      port: 80
      targetPort: 80

🚀 Step 3: Deploy the Kubernetes Resources

# Deploy (run from inside the kube-manifests-vpa directory)
kubectl apply -f .

# Verify
kubectl get deploy
kubectl get pods
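
Before enabling VPA, it can help to record the starting requests so that later changes are easy to spot. A small sketch using kubectl's JSONPath output; the helper name `show_requests` is hypothetical.

```shell
# Sketch: print the resource requests and limits of the first container
# in a Deployment's pod template, as a baseline before VPA changes them.
show_requests() {
  deploy="$1"
  kubectl get deployment "$deploy" \
    -o jsonpath='{.spec.template.spec.containers[0].resources}'
  echo
}

# Example:
# kubectl rollout status deployment/myapp1-deployment
# show_requests myapp1-deployment
```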

⚙️ Step 4: Enable Vertical Pod Autoscaler (VPA)

You can create the VPA for the workload either in the GKE Console or with a YAML manifest.

Option 1: Using GKE Console

Go to Workloads → myapp1-deployment
Click Configure
Under Vertical Pod Autoscaling > Configure

Mode: Auto
Define Resource Policies
Add your default container and save

Click Save

Option 2: Using YAML

Create 03-vpa.yaml:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp1-deployment
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp1-deployment
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: myapp1-container
        mode: Auto
        controlledResources: ["cpu", "memory"]
        minAllowed:
          cpu: 25m
          memory: 50Mi
        maxAllowed:
          cpu: 100m
          memory: 100Mi
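
One interaction with the Deployment is worth checking. VPA manages requests, and when a container also defines limits, it keeps the limit-to-request ratio from the pod template by default. The Deployment sets a 5Mi memory request with a 50Mi limit (10:1), so a 50Mi request would, under that rule, come with a limit of about 500Mi. The sketch below only does that arithmetic; `scaled_limit` is a made-up helper that handles plain integers in a single unit.

```shell
# Sketch: proportional limit scaling, assuming the limit-to-request ratio
# from the pod template is preserved. Integers only, all in the same unit.
scaled_limit() {
  old_request="$1"; old_limit="$2"; new_request="$3"
  echo $(( $new_request * $old_limit / $old_request ))
}

# Memory in Mi: request 5, limit 50, new request 50
scaled_limit 5 50 50     # prints 500
# CPU in millicores: request 25, limit 50, new request 100
scaled_limit 25 50 100   # prints 200
```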

Apply it:

kubectl apply -f 03-vpa.yaml
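
If the apply fails with an error along the lines of "no matches for kind VerticalPodAutoscaler", the VPA API is probably not being served by the cluster. One way to check, sketched here with a hypothetical helper name, is to list the resources in the `autoscaling.k8s.io` API group and then validate the manifest with a server-side dry run.

```shell
# Sketch: report whether the cluster serves the VerticalPodAutoscaler API.
vpa_api_available() {
  kubectl api-resources --api-group=autoscaling.k8s.io -o name 2>/dev/null \
    | grep -q '^verticalpodautoscalers'
}

# Example:
# if vpa_api_available; then
#   kubectl apply --dry-run=server -f 03-vpa.yaml   # validate without creating
# else
#   echo "VPA API not found; enable VPA on the cluster first"
# fi
```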

🧠 Step 5: Check VPA Status

kubectl get vpa

Example output:

NAME                MODE   CPU   MEM    PROVIDED   AGE
myapp1-deployment   Auto   25m   50Mi   True       6m

Describe for more details:

kubectl describe vpa myapp1-deployment
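
The describe output includes a recommendation per container with lowerBound, target, uncappedTarget, and upperBound values. To pull out only the target, something like the following sketch should work; `vpa_target` is a hypothetical helper, and it reads only the first container's recommendation.

```shell
# Sketch: print the target CPU and memory recommendation for the
# first container covered by a VPA object.
vpa_target() {
  vpa="$1"
  kubectl get vpa "$vpa" \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target.cpu}{" "}{.status.recommendation.containerRecommendations[0].target.memory}{"\n"}'
}

# Example:
# vpa_target myapp1-deployment
```

The output stays empty until VPA has collected enough usage data to publish a recommendation.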

💥 Step 6: (Optional) Quick Load Test

Run a simple load generator to trigger autoscaling:

kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://myapp1-cip-service; done"

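Watch the pod: once the VPA recommendation moves away from the current requests, the pod is evicted and recreated with updated CPU and memory requests. Recommendations are based on usage observed over time, so this can take several minutes. Note that by default VPA only evicts pods when the workload has at least two replicas. With replicas: 1 as in this Deployment, you may need to raise the replica count, or set updatePolicy.minReplicas: 1 in the VPA, before the pod is replaced.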
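
To see the effect from a second terminal, you can watch the pods and print the requests they are running with. A minimal sketch, assuming the `app=myapp1` label from the Deployment; the helper name `pod_requests` is hypothetical.

```shell
# Sketch: list each myapp1 pod with its current CPU and memory requests
# (first container only), tab-separated.
pod_requests() {
  kubectl get pods -l app=myapp1 \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources.requests.cpu}{"\t"}{.spec.containers[0].resources.requests.memory}{"\n"}{end}'
}

# Example: run while the load generator is active.
# kubectl get pods -l app=myapp1 --watch   # shows pods being replaced
# pod_requests                             # shows requests on the current pods
```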

🧹 Step 7: Clean Up

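# Run from inside kube-manifests-vpa; this also removes the VPA if 03-vpa.yaml is in this directory
kubectl delete -f .
# Only needed if the VPA was created from the Console
kubectl delete vpa myapp1-deployment --ignore-not-found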

✅ Summary

VPA helps right-size your workloads automatically
It pairs well with Cluster Autoscaler, which adds nodes when larger requests no longer fit
In Autopilot clusters, VPA is enabled by default
In Standard clusters, you need to enable VPA on the cluster first, then create a VerticalPodAutoscaler object for each workload

With VPA, your GKE cluster stays efficient: requests track what pods actually use, and you spend less time guessing resource values.

🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.

— Latchu | Senior DevOps & Cloud Engineer

☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions
