
Part-114: 🚀 Implementing Cluster Autoscaler in Google Kubernetes Engine (GCP)

🧠 Introduction

In this guide, we’ll implement the Cluster Autoscaler in Google Kubernetes Engine (GKE) to automatically scale the number of nodes in a cluster based on workload demand.

You’ll see how GKE automatically adds nodes when resources are insufficient and removes them when the cluster is underutilized — ensuring cost efficiency and performance optimization.


🏗️ Step 1: Create a GKE Cluster (Prerequisite)

Let’s start by creating a GKE Standard Cluster with a default node pool.

gcloud container clusters create "standard-public-cluster-1" \
  --machine-type "e2-micro" \
  --disk-size "20" \
  --spot \
  --num-nodes "1" \
  --region "us-central1"

Explanation:

  • --machine-type "e2-micro" → small, cost-effective VMs for testing.
  • --spot → uses Spot VMs (heavily discounted, preemptible capacity) to save cost.
  • --num-nodes "1" → starts with one node per zone; a regional cluster spreads nodes across three zones by default, so expect three nodes total.
  • --region "us-central1" → creates a regional cluster in us-central1.
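
Once the cluster is up, point kubectl at it before moving on (assuming the cluster name and region used above):

gcloud container clusters get-credentials standard-public-cluster-1 \
  --region "us-central1"

# Verify connectivity
kubectl get nodes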

⚙️ Step 2: Enable Node Pool Autoscaling

  1. Go to Google Cloud Console → Kubernetes Engine → Clusters
  2. Select your cluster → standard-public-cluster-1
  3. Click NODES → default-pool → EDIT
  4. Enable Cluster Autoscaler

Recommended settings:

Location policy: Balanced
Nodes per zone:
  Minimum number of nodes: 1
  Maximum number of nodes: 3

💡 The Balanced location policy spreads new nodes evenly across the cluster's zones.
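
If you prefer the CLI over the console, the same node pool autoscaling can be enabled with gcloud. A minimal sketch, using the cluster and pool names from this guide (for regional clusters, min/max apply per zone, matching the console's "Nodes per zone"):

gcloud container clusters update standard-public-cluster-1 \
  --enable-autoscaling \
  --node-pool "default-pool" \
  --min-nodes 1 \
  --max-nodes 3 \
  --region "us-central1"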


📄 Step 3: Review the Deployment Manifest

Create a new YAML file named 01-kubernetes-deployment.yaml under a folder named kube-manifests.

apiVersion: apps/v1
kind: Deployment 
metadata: 
  name: myapp1-deployment
spec: 
  replicas: 1
  selector:
    matchLabels:
      app: myapp1
  template:  
    metadata:
      name: myapp1-pod
      labels:
        app: myapp1  
    spec:
      containers: 
        - name: myapp1-container
          image: ghcr.io/stacksimplify/kubenginx:1.0.0
          ports: 
            - containerPort: 80  
          resources:
            requests:
              memory: "5Mi"
              cpu: "25m"
            limits:
              memory: "50Mi"
              cpu: "50m"

Key Points:

  • Requests are what the scheduler (and Cluster Autoscaler) compare against node capacity; limits cap what a container may actually consume.
  • We start with 1 replica, then scale it up later to trigger autoscaling.
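
Before scaling, it helps to see how much allocatable CPU and memory each node actually offers — that is what the scheduler compares the requests above against. A quick way to check (standard kubectl, no extra tooling assumed):

# Show allocatable CPU and memory per node
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory

With 25m CPU requested per pod, you can roughly estimate how many replicas fit on one node before new nodes must be provisioned.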

🚀 Step 4: Deploy and Verify Autoscaling

Deploy the resources:

kubectl apply -f kube-manifests/01-kubernetes-deployment.yaml

Check the pods:

kubectl get pods

View the nodes:

kubectl get nodes


Scale up the deployment:

kubectl scale deployment myapp1-deployment --replicas=10
kubectl scale deployment myapp1-deployment --replicas=30
kubectl scale deployment myapp1-deployment --replicas=50
kubectl scale deployment myapp1-deployment --replicas=70

Now, check your nodes again:

kubectl get nodes

🧩 Observation:

  1. Some pods may initially be Pending (unschedulable).
  2. GKE will automatically detect the need for more resources.
  3. New nodes will be created automatically to accommodate the extra pods.
  4. Once created, pods will start running successfully.

This confirms that Cluster Autoscaler is working correctly.
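
To watch the scale-up happen in real time, a few handy commands (the pod name below is a placeholder — substitute one of your own Pending pods):

# List pods stuck in Pending (unschedulable)
kubectl get pods --field-selector=status.phase=Pending

# A Pending pod's events should include a "TriggeredScaleUp" entry
kubectl describe pod <pending-pod-name>

# Watch new nodes join the cluster
kubectl get nodes -w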



🧹 Step 5: Clean Up

When scaling down workloads, Cluster Autoscaler automatically removes idle nodes.

# Delete the deployed resources
kubectl delete -f kube-manifests/01-kubernetes-deployment.yaml

Wait for 5–10 minutes and verify the nodes:

kubectl get nodes

Observation:

  • GKE gradually terminates unused nodes.
  • Within roughly 15–30 minutes, the cluster scales back to its configured minimum.
  • The autoscaler never drops below the minimum node count defined in the settings.
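
If you want to peek at the autoscaler's own view of the cluster while it scales in, GKE publishes a Cluster Autoscaler status ConfigMap in kube-system (the ConfigMap name below is the standard one used by Cluster Autoscaler):

# Inspect autoscaler status: node group health, scale-down candidates
kubectl describe configmap cluster-autoscaler-status -n kube-system

# Watch nodes being drained and removed
kubectl get nodes -w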

✅ Summary

  • Deploy workloads → cluster scales out nodes to fit them.
  • Workload decreases → cluster scales in automatically.
  • Net benefits → cost efficiency, improved availability, optimal resource usage.

💡 Key Takeaways

  • Cluster Autoscaler ensures your workloads always have enough compute capacity.
  • It’s cost-effective — automatically removes idle nodes.
  • Works only with Standard Clusters (Autopilot uses Node Auto-Provisioning).
  • Always define realistic min/max node limits to balance cost and performance.

🎯 Final Thoughts

By implementing Cluster Autoscaler in your GKE cluster, you ensure that your Kubernetes workloads run smoothly and cost-efficiently — scaling up when you need more power and scaling down when you don’t.


🌟 Thanks for reading! If this post added value, a like ❤️, follow, or share would encourage me to keep creating more content.


— Latchu | Senior DevOps & Cloud Engineer

☁️ AWS | GCP | ☸️ Kubernetes | 🔐 Security | ⚡ Automation
📌 Sharing hands-on guides, best practices & real-world cloud solutions
