Vertical Pod Autoscaling (VPA) in GKE automatically adjusts CPU and memory requests for your workloads, ensuring pods get exactly what they need — no more, no less.
In this guide, we’ll walk through how to implement VPA for a sample app in GKE.
🧩 Step 1: Introduction
We’ll implement Vertical Pod Autoscaling in a GKE cluster.
Prerequisite:
Make sure Cluster Autoscaler is enabled on your node pool.
This allows GKE to scale nodes automatically when VPA increases resource requests.
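If Cluster Autoscaler isn't enabled yet, you can turn it on for an existing node pool. A minimal sketch, where the cluster name, node pool, and zone are placeholders for your own values:
# Enable Cluster Autoscaler on an existing node pool (placeholder names)
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --node-pool default-pool \
  --min-nodes 1 --max-nodes 3 \
  --zone us-central1-a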
📁 Step 2: Review Kubernetes Manifests
Create a new directory for your manifests:
mkdir kube-manifests-vpa
cd kube-manifests-vpa
01-kubernetes-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp1-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp1
  template:
    metadata:
      name: myapp1-pod
      labels:
        app: myapp1
    spec:
      containers:
        - name: myapp1-container
          image: ghcr.io/stacksimplify/kubenginx:1.0.0
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "5Mi"
              cpu: "25m"
            limits:
              memory: "50Mi"
              cpu: "50m"
02-kubernetes-cip-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp1-cip-service
spec:
  type: ClusterIP
  selector:
    app: myapp1
  ports:
    - name: http
      port: 80
      targetPort: 80
🚀 Step 3: Deploy the Kubernetes Resources
# Deploy
kubectl apply -f kube-manifests-vpa
# Verify
kubectl get deploy
kubectl get pods
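Optionally, note the pod's starting requests so you have a baseline to compare against once VPA adjusts them (the label selector matches the app: myapp1 label from the Deployment):
# Show the initial CPU/memory requests from the manifest (25m / 5Mi)
kubectl get pods -l app=myapp1 -o custom-columns=NAME:.metadata.name,CPU_REQ:.spec.containers[0].resources.requests.cpu,MEM_REQ:.spec.containers[0].resources.requests.memory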
⚙️ Step 4: Enable Vertical Pod Autoscaler (VPA)
You can enable VPA for the workload either through the GKE Console or with a YAML manifest.
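Note: on a Standard cluster, vertical Pod autoscaling must also be enabled at the cluster level before either option takes effect (Autopilot clusters have it on by default). A minimal sketch, with a placeholder cluster name and zone:
# One-time, cluster-level setting for Standard clusters
gcloud container clusters update my-cluster \
  --enable-vertical-pod-autoscaling \
  --zone us-central1-a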
Option 1: Using GKE Console
Go to Workloads → myapp1-deployment
Click Configure
Under Vertical Pod Autoscaling > Configure
- Mode: Auto
- Define Resource Policies
- Add your default container and save
Click Save
Option 2: Using YAML
Create 03-vpa.yaml:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp1-deployment
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp1-deployment
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: myapp1-container
        mode: Auto
        controlledResources: ["cpu", "memory"]
        minAllowed:
          cpu: 25m
          memory: 50Mi
        maxAllowed:
          cpu: 100m
          memory: 100Mi
Apply it:
kubectl apply -f 03-vpa.yaml
🧠 Step 5: Check VPA Status
kubectl get vpa
Example output:
NAME                MODE   CPU   MEM    PROVIDED   AGE
myapp1-deployment   Auto   25m   50Mi   True       6m
Describe for more details:
kubectl describe vpa myapp1-deployment
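If you only want the raw numbers, the recommendations are exposed in the VPA object's status under recommendation.containerRecommendations (field names follow the upstream VPA API):
# Print the target recommendation for the first container
kubectl get vpa myapp1-deployment -o jsonpath='{.status.recommendation.containerRecommendations[0].target}{"\n"}'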
💥 Step 6: (Optional) Quick Load Test
Run a simple load generator so the pod builds up CPU and memory usage for VPA to act on:
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://myapp1-cip-service; done"
Watch your pod being recreated with updated CPU and memory requests as VPA adjusts resources. This can take a few minutes, since the recommender bases its targets on observed usage.
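A couple of optional checks from a second terminal (the selector matches the Deployment's label):
# Watch pods get evicted and recreated by the VPA updater
kubectl get pods -l app=myapp1 -w
# Once a new pod is Running, confirm its updated requests
kubectl describe pods -l app=myapp1 | grep -A 2 Requests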
🧹 Step 7: Clean Up
kubectl delete -f kube-manifests-vpa
kubectl delete vpa myapp1-deployment
✅ Summary
- VPA helps right-size your workloads automatically
- It works great with Cluster Autoscaler
- In Autopilot clusters, VPA is enabled by default
- In Standard clusters, you enable VPA at the cluster level and create a VPA object per workload
With VPA, your GKE cluster stays efficient — pods use exactly what they need, and you spend less time guessing resource values.
🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.
— Latchu | Senior DevOps & Cloud Engineer
☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions