🧠 Introduction
In this guide, we’ll implement the Cluster Autoscaler in Google Kubernetes Engine (GKE) to automatically scale the number of nodes in a cluster based on workload demand.
You’ll see how GKE automatically adds nodes when resources are insufficient and removes them when the cluster is underutilized — ensuring cost efficiency and performance optimization.
🏗️ Step 1: Create a GKE Cluster (Pre-requisite)
Let’s start by creating a GKE Standard Cluster with a default node pool.
gcloud container clusters create "standard-public-cluster-1" \
--machine-type "e2-micro" \
--disk-size "20" \
--spot \
--num-nodes "1" \
--region "us-central1"
Explanation:
- --machine-type "e2-micro" → small, cost-effective VMs for testing.
- --spot → uses Spot VMs (low-cost, preemptible capacity) to save money.
- --num-nodes "1" → one node per zone; a regional cluster spans three zones, so expect three nodes in total.
- --region "us-central1" → creates a regional cluster in us-central1.
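Once the cluster is up, you can verify it from the CLI. Because this is a regional cluster, `--num-nodes "1"` means one node per zone, so expect three nodes in total:

```shell
# Fetch credentials so kubectl can talk to the new cluster
gcloud container clusters get-credentials standard-public-cluster-1 --region us-central1

# A regional cluster with --num-nodes "1" runs one node per zone (three total)
kubectl get nodes
```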
⚙️ Step 2: Enable Node Pool Autoscaling
- Go to Google Cloud Console → Kubernetes Engine → Clusters
- Select your cluster → standard-public-cluster-1
- Click NODES → default-pool → EDIT
- Enable Cluster Autoscaler
Recommended settings:
Location Policy: Balanced
Nodes Per Zone:
Minimum number of nodes: 1
Maximum number of nodes: 3
💡 Balanced policy ensures new nodes are evenly distributed across zones.
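If you prefer the CLI over the console, the same autoscaling settings can be applied with `gcloud` (a sketch of the equivalent command; exact flag support may vary by gcloud version):

```shell
# Enable Cluster Autoscaler on the default node pool (1–3 nodes per zone)
gcloud container clusters update standard-public-cluster-1 \
  --enable-autoscaling \
  --node-pool default-pool \
  --min-nodes 1 \
  --max-nodes 3 \
  --region us-central1
```

The Balanced location policy is a node-pool setting; if your gcloud version supports it, it can be set with `gcloud container node-pools update ... --location-policy BALANCED`.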
📄 Step 3: Review the Deployment Manifest
Create a new YAML file named 01-kubernetes-deployment.yaml under a folder named kube-manifests.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp1-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp1
  template:
    metadata:
      name: myapp1-pod
      labels:
        app: myapp1
    spec:
      containers:
        - name: myapp1-container
          image: ghcr.io/stacksimplify/kubenginx:1.0.0
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "5Mi"
              cpu: "25m"
            limits:
              memory: "50Mi"
              cpu: "50m"
Key Points:
- Requests & limits define the CPU/memory resource needs.
- We start with 1 replica, but we’ll scale it later to test autoscaling.
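To get a feel for when new nodes will be needed, you can estimate how many of these pods fit on a single node. The allocatable CPU figure below is an assumption for illustration only; run `kubectl describe node` on your cluster to see the real value:

```shell
# Assumed allocatable CPU on a small node after system reservations (illustrative)
ALLOCATABLE_CPU_M=940
POD_CPU_REQUEST_M=25   # matches the deployment's CPU request of 25m

# Integer division: pods per node, by CPU request alone
echo $((ALLOCATABLE_CPU_M / POD_CPU_REQUEST_M))   # → 37
```

Once the replica count exceeds what the existing nodes can hold, pods go Pending and the autoscaler kicks in.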
🚀 Step 4: Deploy and Verify Autoscaling
Deploy the resources:
kubectl apply -f kube-manifests/01-kubernetes-deployment.yaml
Check the pods:
kubectl get pods
View the nodes:
kubectl get nodes
Scale up the deployment:
kubectl scale deployment myapp1-deployment --replicas=10
kubectl scale deployment myapp1-deployment --replicas=30
kubectl scale deployment myapp1-deployment --replicas=50
kubectl scale deployment myapp1-deployment --replicas=70
Now, check your nodes again:
kubectl get nodes
🧩 Observation:
- Some pods may initially be Pending (unschedulable).
- GKE will automatically detect the need for more resources.
- New nodes will be created automatically to accommodate the extra pods.
- Once created, pods will start running successfully.
This confirms that Cluster Autoscaler is working correctly.
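While the scale-up is in progress, you can watch the autoscaler at work (a sketch; the exact event wording varies by GKE version):

```shell
# List pods that are still waiting for capacity
kubectl get pods --field-selector=status.phase=Pending

# Recent cluster events; look for scale-up-related messages
kubectl get events --sort-by=.metadata.creationTimestamp | tail -20
```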
🧹 Step 5: Clean Up
When scaling down workloads, Cluster Autoscaler automatically removes idle nodes.
# Delete the deployed resources
kubectl delete -f kube-manifests/01-kubernetes-deployment.yaml
Wait for 5–10 minutes and verify the nodes:
kubectl get nodes
Observation:
- GKE will gradually terminate unused nodes.
- Within ~15–30 minutes, the cluster shrinks back to its configured minimum (1 node per zone).
- Autoscaler maintains the minimum node count defined in settings.
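If you're finished with the exercise entirely, delete the cluster as well so nothing keeps accruing charges:

```shell
# Remove the cluster and its node pools (irreversible)
gcloud container clusters delete standard-public-cluster-1 --region us-central1
```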
✅ Summary
| Action | Result |
| --- | --- |
| Deploy workloads | Cluster scales out, adding nodes |
| Workload decreases | Cluster scales in automatically |
| Benefits | Cost efficiency, improved availability, optimal resource usage |
💡 Key Takeaways
- Cluster Autoscaler ensures your workloads always have enough compute capacity.
- It’s cost-effective — automatically removes idle nodes.
- Manual autoscaler configuration applies only to Standard clusters (Autopilot manages node provisioning automatically).
- Always define realistic min/max node limits to balance cost and performance.
🎯 Final Thoughts
By implementing Cluster Autoscaler in your GKE cluster, you ensure that your Kubernetes workloads run smoothly and cost-efficiently — scaling up when you need more power and scaling down when you don’t.
🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.
— Latchu | Senior DevOps & Cloud Engineer
☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions