Latchu@DevOps

Part-118: 🚀Implement Vertical Pod Autoscaling (VPA) in Google Kubernetes Engine (GKE)

Vertical Pod Autoscaling (VPA) in GKE analyzes the observed CPU and memory usage of your workloads and automatically adjusts their resource requests, so pods are neither starved nor over-provisioned.

In this guide, we’ll walk through how to implement VPA for a sample app in GKE.


🧩 Step 1: Introduction

We’ll implement Vertical Pod Autoscaling in a GKE cluster.

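Prerequisites:

Make sure Cluster Autoscaler is enabled on your node pool, so GKE can add nodes when VPA raises resource requests beyond what the current nodes can fit.
On a Standard cluster, also enable Vertical Pod Autoscaling at the cluster level. Autopilot clusters have it enabled by default.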
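
On a Standard cluster, both prerequisites can be set from the command line. The following is a minimal sketch rather than a definitive setup: the cluster name, location, node pool name, and node counts are placeholder assumptions, and the script only prints the gcloud commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own cluster details.
CLUSTER_NAME="my-gke-cluster"
LOCATION="us-central1"
NODE_POOL="default-pool"

# Enable Cluster Autoscaler on the node pool (node counts are illustrative).
ca_cmd="gcloud container clusters update $CLUSTER_NAME --location $LOCATION --node-pool $NODE_POOL --enable-autoscaling --min-nodes 1 --max-nodes 3"

# Enable Vertical Pod Autoscaling on the cluster (needed on Standard clusters).
vpa_cmd="gcloud container clusters update $CLUSTER_NAME --location $LOCATION --enable-vertical-pod-autoscaling"

# Preview the commands first.
echo "$ca_cmd"
echo "$vpa_cmd"

# Run them only when explicitly requested with APPLY=1.
if [ "${APPLY:-0}" = "1" ]; then
  $ca_cmd && $vpa_cmd
fi
```

Printing before running makes it easy to check the target cluster and node pool before changing anything.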


📁 Step 2: Review Kubernetes Manifests

Create a new directory for your manifests:

mkdir kube-manifests-vpa
cd kube-manifests-vpa

01-kubernetes-deployment.yaml

apiVersion: apps/v1
kind: Deployment 
metadata: 
  name: myapp1-deployment
spec: 
  replicas: 1
  selector:
    matchLabels:
      app: myapp1
  template:  
    metadata:
      name: myapp1-pod
      labels:
        app: myapp1  
    spec:
      containers: 
        - name: myapp1-container
          image: ghcr.io/stacksimplify/kubenginx:1.0.0
          ports: 
            - containerPort: 80  
          resources:
            requests:
              memory: "5Mi"
              cpu: "25m"
            limits:
              memory: "50Mi"
              cpu: "50m"

02-kubernetes-cip-service.yaml

apiVersion: v1
kind: Service 
metadata:
  name: myapp1-cip-service
spec:
  type: ClusterIP
  selector:
    app: myapp1
  ports: 
    - name: http
      port: 80
      targetPort: 80

🚀 Step 3: Deploy the Kubernetes Resources

# Deploy (run from inside the kube-manifests-vpa directory)
kubectl apply -f .

# Verify
kubectl get deploy
kubectl get pods


⚙️ Step 4: Enable Vertical Pod Autoscaler (VPA)

You can create the VPA object either from the GKE Console or with a YAML manifest.

Option 1: Using GKE Console

Go to Workloads → myapp1-deployment
Click Configure
Under Vertical Pod Autoscaling > Configure

  • Mode: Auto
  • Define Resource Policies
  • Add your default container and save

Click Save

Option 2: Using YAML

Create 03-vpa.yaml:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp1-deployment
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp1-deployment
  updatePolicy:
    updateMode: Auto
    minReplicas: 1  # default is 2; lets VPA evict this single-replica Deployment (GKE 1.22+)
  resourcePolicy:
    containerPolicies:
      - containerName: myapp1-container
        mode: Auto
        controlledResources: ["cpu", "memory"]
        minAllowed:
          cpu: 25m
          memory: 50Mi
        maxAllowed:
          cpu: 100m
          memory: 100Mi

Apply it:

kubectl apply -f 03-vpa.yaml
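
One consequence of this policy is worth checking by hand. minAllowed and maxAllowed bound the recommended requests, and by default VPA scales each container's limits to keep the limit-to-request ratio from the Deployment. The sketch below does that arithmetic for the values in this guide. scaled_limit is a hypothetical helper written for illustration, and VPA's own rounding may differ.

```shell
#!/bin/sh
# scaled_limit ORIG_REQUEST ORIG_LIMIT NEW_REQUEST
# Prints the limit that keeps the original limit-to-request ratio (integer units).
scaled_limit() {
  echo $(( $3 * $2 / $1 ))
}

# Memory: Deployment has request 5Mi, limit 50Mi; VPA minAllowed forces a request of at least 50Mi.
mem_limit=$(scaled_limit 5 50 50)
# CPU: Deployment has request 25m, limit 50m; assume VPA recommends the 100m maxAllowed.
cpu_limit=$(scaled_limit 25 50 100)

echo "memory limit: ${mem_limit}Mi"   # prints: memory limit: 500Mi
echo "cpu limit: ${cpu_limit}m"       # prints: cpu limit: 200m
```

A 50Mi memory request would therefore come with a 500Mi limit, well above the 100Mi maxAllowed, because maxAllowed caps the request and not the limit. If that is not what you want, tighten the ratio in the Deployment or set controlledValues: RequestsOnly in the container policy.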

🧠 Step 5: Check VPA Status

kubectl get vpa

Example output:

NAME                MODE   CPU   MEM    PROVIDED   AGE
myapp1-deployment   Auto   25m   50Mi   True       6m

Describe for more details:

kubectl describe vpa myapp1-deployment
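
To pull just the recommendation out of the VPA status, a jsonpath query can be easier to read than the full describe output. This is a sketch that assumes the standard autoscaling.k8s.io/v1 status layout (status.recommendation.containerRecommendations). vpa_targets is a hypothetical helper name.

```shell
#!/bin/sh
# vpa_targets NAME
# Prints "container<TAB>cpu<TAB>memory" for each container's recommended target.
vpa_targets() {
  kubectl get vpa "$1" -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
```

Call it as `vpa_targets myapp1-deployment`. The output stays empty until VPA has collected enough usage data to publish a recommendation.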


💥 Step 6: (Optional) Quick Load Test

Run a simple load generator to push CPU usage up, which gives VPA new usage data to base its recommendation on:

kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://myapp1-cip-service; done"

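In Auto mode, VPA evicts a pod and lets the Deployment recreate it with updated CPU and memory requests once the recommendation moves far enough from the current values. By default VPA only evicts pods of workloads that have at least two replicas, so a single-replica Deployment like this one is recreated only when the VPA sets updatePolicy.minReplicas: 1 (GKE 1.22 and later).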
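
To confirm that a recreated pod actually carries new values, compare the requests on the running pods before and after the load test. A minimal sketch, assuming the app=myapp1 label from the Deployment above. pod_requests is a hypothetical helper name.

```shell
#!/bin/sh
# pod_requests LABEL_SELECTOR
# Prints each matching pod's name and the resource requests of its first container.
pod_requests() {
  kubectl get pods -l "$1" -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources.requests}{"\n"}{end}'
}
```

Call it as `pod_requests app=myapp1`. Running `kubectl get pods -l app=myapp1 --watch` in a second terminal shows the old pod terminating and its replacement starting.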


🧹 Step 7: Clean Up

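# Run from inside kube-manifests-vpa; this also deletes the VPA if 03-vpa.yaml is in the directory
kubectl delete -f .
# Remove the VPA if it was created from the Console instead
kubectl delete vpa myapp1-deployment --ignore-not-found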

✅ Summary

  • VPA helps right-size your workloads automatically
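  • It pairs well with Cluster Autoscaler, which adds nodes when larger requests no longer fit
  • In Autopilot clusters, VPA is enabled by default
  • In Standard clusters, you enable VPA on the cluster first, then create a VerticalPodAutoscaler object per workload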

With VPA, your GKE cluster stays efficient: pod requests track what the workloads actually use, and you spend less time guessing resource values.


🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.


— Latchu | Senior DevOps & Cloud Engineer

☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions