Table of Contents
- Introduction
- What is a Pod Disruption Budget?
- The Problem — PDB Blocking Node Scale Down
- Why This Is Easy to Miss
- The Fix — PDB Only For Stateful Services
- Key Takeaway
Introduction
Kubernetes cost optimization on AWS EKS often focuses on scaling up efficiently — but scaling down is where hidden costs live. The Cluster Autoscaler identifies nodes that are consuming few resources and scales them down (as long as the node count stays above the minimum configured for the cluster) to reduce overall EKS cost. In a production environment we worked on, we noticed nodes weren't being scaled down by the autoscaler even when resource usage was very low. After investigating, the culprit turned out to be something small and easy to overlook — a Pod Disruption Budget configured on a stateless service.
What is a Pod Disruption Budget?
A Pod Disruption Budget (PDB) is a Kubernetes resource that limits how many pods of a deployment can be down at the same time during voluntary disruptions — things like node drains, cluster upgrades, or autoscaler scale-down events. Its purpose is to keep critical services available while those disruptions happen. It does not protect against involuntary disruptions such as node failures or pod OOM kills.
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-service
```
This tells Kubernetes: "at least 1 pod of this service must always be running" (minAvailable: 1).
For stateful services like Redis or Kafka this makes complete sense — you don't want all the pods of these services going down unexpectedly, and you want some minimum number of pods available at all costs. The problem starts when you apply the same logic to stateless services.
The Problem — PDB Blocking Node Scale Down
Here's the exact scenario we ran into:
- A stateless service was running with 1 replica. It had a PDB with minAvailable: 1. The pod was consuming very low CPU and memory.
- The AWS EKS Cluster Autoscaler identified the node as underutilized and tried to scale it down.
- To scale down the node, it needed to evict the pod first. But the PDB said a minimum of 1 pod must be available at all times, and since there was only 1 replica, evicting it would violate the PDB.
- Result — the autoscaler couldn't evict the pod, and the node stayed up indefinitely. The node was essentially stuck: too empty to be useful, too protected to be removed.
Cluster Autoscaler → tries to drain node
→ attempts to evict pod
→ PDB blocks eviction (minAvailable: 1, replicas: 1)
→ node scale down blocked
→ you keep paying for an underutilized node
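The combination that produces this deadlock is easy to reproduce. A minimal sketch — the names and image are illustrative, not from our actual cluster:

```yaml
# Avoid: replicas: 1 + minAvailable: 1 means zero allowed disruptions
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 1               # only one pod exists...
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: my-service:latest
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  minAvailable: 1           # ...and the PDB says one must always be up,
  selector:                 # so no voluntary eviction is ever allowed
    matchLabels:
      app: my-service
```

Any voluntary eviction of that single pod would drop availability below the budget, so the eviction API refuses it and the autoscaler can never drain the node.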
Why This Is Easy to Miss
The pod itself showed no issues. CPU and memory were fine. HPA wasn't triggering. Everything looked healthy from an application perspective. The only sign was nodes not scaling down during low traffic periods — which is easy to dismiss as "autoscaler being slow" rather than investigating deeper.
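One quick way to spot the condition is the PDB's own status, which reports how many disruptions it currently allows. For a minAvailable: 1 budget guarding a single healthy replica, the status (as returned by kubectl get pdb my-service-pdb -o yaml — name assumed for illustration) would look roughly like:

```yaml
# status section of a minAvailable: 1 PDB with one healthy pod
status:
  currentHealthy: 1
  desiredHealthy: 1
  disruptionsAllowed: 0   # zero → no voluntary eviction can proceed
  expectedPods: 1
```

A plain kubectl get pdb surfaces the same number in its ALLOWED DISRUPTIONS column — any PDB sitting at 0 during normal operation is a candidate for blocking scale-down.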
The Fix — PDB Only For Stateful Services
The solution was straightforward once we identified the cause:
- Removed the PDB entirely from stateless services.
- Kept PDBs only for stateful services — Redis, Kafka, and similar infra components.
- Moved these stateful services to a dedicated node group, so that high resource usage by stateless pods can't affect them when scheduled on the same node — a PDB doesn't protect against that case. With the stateful services isolated on their own node group and protected by PDBs, drain events can't take down these critical infra pods and trigger a production-wide outage.
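Pinning the stateful services to their own node group can be done with a nodeSelector, plus a taint/toleration pair to keep stateless pods out. A minimal sketch — the node group name stateful-ng and the dedicated taint are assumptions, adjust to your cluster (EKS managed node groups label their nodes with eks.amazonaws.com/nodegroup):

```yaml
# Schedule Redis onto a dedicated node group (label/taint values are illustrative)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      nodeSelector:
        eks.amazonaws.com/nodegroup: stateful-ng  # label applied by EKS managed node groups
      tolerations:
        - key: dedicated        # matches a taint like dedicated=stateful:NoSchedule
          operator: Equal       # applied to the node group, keeping other pods out
          value: stateful
          effect: NoSchedule
      containers:
        - name: redis
          image: redis:7
```

The nodeSelector keeps Redis on the dedicated nodes; the taint on those nodes keeps everything else off them.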
Stateless services by definition can handle being evicted and rescheduled — that's the whole point of being stateless. They don't need disruption protection by default. If even brief disruptions are unacceptable, a PDB using maxUnavailable (with more than one replica) can be considered, or such services can be isolated to a separate EKS node group on a higher instance tier, sized according to whether they are CPU- or memory-intensive.
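Where a stateless service really does need smoother drains, a maxUnavailable PDB over multiple replicas limits how many pods are evicted at once without ever fully blocking scale-down — a sketch, with illustrative names:

```yaml
# With 3 replicas, at most 1 pod is evicted at a time during drains,
# but eviction is never completely blocked
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-stateless-service-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-stateless-service
```

Unlike minAvailable: 1 on a single replica, this budget always permits at least one eviction, so the autoscaler can still drain nodes one pod at a time.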
```yaml
# PDB makes sense here — stateful service
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: redis-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: redis
```
```yaml
# Avoid this — stateless service with single replica
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-stateless-service-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-stateless-service
```
Key Takeaway
If your AWS EKS cluster autoscaler isn't scaling down nodes during low-traffic periods, check your PDBs before anything else. A minAvailable: 1 on a single-replica stateless service is effectively telling your cluster — "this node can never be removed."
Reserve PDBs for services that genuinely need them. Your AWS bill will thank you.
Have you run into unexpected autoscaler behaviour in EKS? Drop a comment — would love to hear other gotchas people have faced.