
Part-114: 🚀 Implementing Cluster Autoscaler in Google Kubernetes Engine (GCP)

🧠 Introduction

In this guide, we’ll implement the Cluster Autoscaler in Google Kubernetes Engine (GKE) to automatically scale the number of nodes in a cluster based on workload demand.

You’ll see how GKE automatically adds nodes when resources are insufficient and removes them when the cluster is underutilized — ensuring cost efficiency and performance optimization.


🏗️ Step 1: Create a GKE Cluster (Prerequisite)

Let’s start by creating a GKE Standard Cluster with a default node pool.

gcloud container clusters create "standard-public-cluster-1" \
  --machine-type "e2-micro" \
  --disk-size "20" \
  --spot \
  --num-nodes "1" \
  --region "us-central1"

Explanation:

  • --machine-type "e2-micro" → small, cost-effective VMs for testing.
  • --spot → uses Spot VMs (heavily discounted, preemptible capacity) to save cost.
  • --num-nodes "1" → starts with one node per zone; a regional cluster spreads nodes across three zones by default, so expect three nodes total.
  • --region "us-central1" → creates a regional cluster in us-central1.
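
Once the cluster is up, point kubectl at it before moving on (assuming the cluster name and region used above):

gcloud container clusters get-credentials standard-public-cluster-1 \
  --region "us-central1"

# Verify connectivity
kubectl get nodes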

⚙️ Step 2: Enable Node Pool Autoscaling

  1. Go to Google Cloud Console → Kubernetes Engine → Clusters
  2. Select your cluster → standard-public-cluster-1
  3. Click NODES → default-pool → EDIT
  4. Enable Cluster Autoscaler

Recommended settings:

Location policy: Balanced
Nodes per zone:
  Minimum number of nodes: 1
  Maximum number of nodes: 3

💡 The Balanced location policy spreads new nodes evenly across the cluster's zones.
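
If you prefer the CLI over the console, the same node pool autoscaling can be enabled with gcloud. A minimal sketch, using the cluster and pool names from this guide (for regional clusters, min/max apply per zone, matching the console's "Nodes per zone"):

gcloud container clusters update standard-public-cluster-1 \
  --enable-autoscaling \
  --node-pool "default-pool" \
  --min-nodes 1 \
  --max-nodes 3 \
  --region "us-central1"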


📄 Step 3: Review the Deployment Manifest

Create a new YAML file named 01-kubernetes-deployment.yaml under a folder named kube-manifests.

apiVersion: apps/v1
kind: Deployment 
metadata: 
  name: myapp1-deployment
spec: 
  replicas: 1
  selector:
    matchLabels:
      app: myapp1
  template:  
    metadata:
      name: myapp1-pod
      labels:
        app: myapp1  
    spec:
      containers: 
        - name: myapp1-container
          image: ghcr.io/stacksimplify/kubenginx:1.0.0
          ports: 
            - containerPort: 80  
          resources:
            requests:
              memory: "5Mi"
              cpu: "25m"
            limits:
              memory: "50Mi"
              cpu: "50m"

Key Points:

  • Requests are what the scheduler (and Cluster Autoscaler) compare against node capacity; limits cap what a container may actually consume.
  • We start with 1 replica, then scale it up later to trigger autoscaling.
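
Before scaling, it helps to see how much allocatable CPU and memory each node actually offers — that is what the scheduler compares the requests above against. A quick way to check (standard kubectl, no extra tooling assumed):

# Show allocatable CPU and memory per node
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory

With 25m CPU requested per pod, you can roughly estimate how many replicas fit on one node before new nodes must be provisioned.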

🚀 Step 4: Deploy and Verify Autoscaling

Deploy the resources:

kubectl apply -f kube-manifests/01-kubernetes-deployment.yaml

Check the pods:

kubectl get pods

View the nodes:

kubectl get nodes


Scale up the deployment:

kubectl scale deployment myapp1-deployment --replicas=10
kubectl scale deployment myapp1-deployment --replicas=30
kubectl scale deployment myapp1-deployment --replicas=50
kubectl scale deployment myapp1-deployment --replicas=70

Now, check your nodes again:

kubectl get nodes

🧩 Observation:

  1. Some pods may initially be Pending (unschedulable).
  2. GKE will automatically detect the need for more resources.
  3. New nodes will be created automatically to accommodate the extra pods.
  4. Once created, pods will start running successfully.

This confirms that Cluster Autoscaler is working correctly.
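
To watch the scale-up happen in real time, a few handy commands (the pod name below is a placeholder — substitute one of your own Pending pods):

# List pods stuck in Pending (unschedulable)
kubectl get pods --field-selector=status.phase=Pending

# A Pending pod's events should include a "TriggeredScaleUp" entry
kubectl describe pod <pending-pod-name>

# Watch new nodes join the cluster
kubectl get nodes -w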



🧹 Step 5: Clean Up

When scaling down workloads, Cluster Autoscaler automatically removes idle nodes.

# Delete the deployed resources
kubectl delete -f kube-manifests/01-kubernetes-deployment.yaml

Wait for 5–10 minutes and verify the nodes:

kubectl get nodes

Observation:

  • GKE gradually terminates unused nodes.
  • Within roughly 15–30 minutes, the cluster scales back to its configured minimum.
  • The autoscaler never drops below the minimum node count defined in the settings.
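
If you want to peek at the autoscaler's own view of the cluster while it scales in, GKE publishes a Cluster Autoscaler status ConfigMap in kube-system (the ConfigMap name below is the standard one used by Cluster Autoscaler):

# Inspect autoscaler status: node group health, scale-down candidates
kubectl describe configmap cluster-autoscaler-status -n kube-system

# Watch nodes being drained and removed
kubectl get nodes -w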

✅ Summary

  • Deploy workloads → cluster scales out nodes to fit them.
  • Workload decreases → cluster scales in automatically.
  • Net benefits → cost efficiency, improved availability, optimal resource usage.

💡 Key Takeaways

  • Cluster Autoscaler ensures your workloads always have enough compute capacity.
  • It’s cost-effective — automatically removes idle nodes.
  • Works only with Standard Clusters (Autopilot uses Node Auto-Provisioning).
  • Always define realistic min/max node limits to balance cost and performance.

🎯 Final Thoughts

By implementing Cluster Autoscaler in your GKE cluster, you ensure that your Kubernetes workloads run smoothly and cost-efficiently — scaling up when you need more power and scaling down when you don’t.


🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.


— Latchu | Senior DevOps & Cloud Engineer

☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions
