DEV Community

Naman

PodDisruptionBudgets: Your Kubernetes Outage Insurance

It's Tuesday morning. The platform team starts draining nodes for a Kubernetes upgrade. Sixty seconds later, Slack explodes: the payment service is fully down. All 3 replicas had landed on the same two nodes, and both were drained simultaneously. There was nothing wrong with the app. The cluster did exactly what it was told.

This is what PodDisruptionBudgets prevent.


The Problem

Kubernetes has two kinds of pod disruptions:

  • Involuntary: Node crashes, OOM kills, hardware failures. Unpredictable. You handle these with replicas and health checks.
  • Voluntary: Node drains, cluster upgrades, autoscaler scale-downs, and spot instance reclaims that are handled by draining the node first. Planned and controlled.

For voluntary disruptions, tools such as kubectl drain and the cluster autoscaler remove pods through the eviction API. Without a PDB, the eviction API has zero awareness of your application's availability requirements. It will happily evict every replica of your service at once if they're all on the node being drained.

Replicas don't help if the system removes all of them simultaneously.


What Is a PodDisruptionBudget?

A PDB is a simple declaration: "During voluntary disruptions, always keep at least N pods (or at most M pods unavailable) for this application."

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb
  namespace: production
spec:
  minAvailable: 2          # OR use maxUnavailable: 1
  selector:
    matchLabels:
      app: payment-service

That's it. This tells the eviction API: "You may not evict a payment-service pod if doing so would drop the available count below 2."


How It Works

┌─────────────────────────────────────────────────────────────┐
│                    WITHOUT PDB                              │
│                                                             │
│  kubectl drain node-2                                       │
│       │                                                     │
│       ▼                                                     │
│  Evict pod-A ──── ✓ Gone                                    │
│  Evict pod-B ──── ✓ Gone                                    │
│  Evict pod-C ──── ✓ Gone                                    │
│                                                             │
│  Result: 0/3 replicas running. Service DOWN.                │
│  (New pods schedule eventually, but there's a gap)          │
└─────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────┐
│                     WITH PDB (minAvailable: 2)               │
│                                                              │
│  kubectl drain node-2                                        │
│       │                                                      │
│       ▼                                                      │
│  Evict pod-A ──── ✓ Allowed (3→2, still ≥ 2)                 │
│  Evict pod-B ──── ✗ BLOCKED (would go 2→1, violates PDB)     │
│       │                                                      │
│       ▼ (waits...)                                           │
│  pod-A reschedules on node-3 ──── ✓ Running                  │
│       │                                                      │
│       ▼ (now 3 available again)                              │
│  Evict pod-B ──── ✓ Allowed (3→2, still ≥ 2)                 │
│                                                              │
│  Result: Always ≥ 2 replicas running. Service STAYS UP.      │
└──────────────────────────────────────────────────────────────┘

The drain operation becomes serialized and respectful: it waits for replacements to become healthy before continuing.
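The flow in the diagram can be sketched as a tiny simulation. This is a toy model, not the real eviction API: the names `eviction_allowed` and `drain_node` are made up for illustration, and it assumes every evicted pod comes back healthy on another node before the next retry.

```python
from typing import List


def eviction_allowed(healthy: int, min_available: int) -> bool:
    """The PDB check: evicting one pod must leave at least min_available healthy."""
    return healthy - 1 >= min_available


def drain_node(pods_on_node: List[str], healthy: int, min_available: int) -> List[str]:
    """Simulate draining one node and return a log of each step.

    Assumption: while an eviction is blocked, pods evicted earlier are
    rescheduled elsewhere and become healthy, then the eviction is retried.
    """
    log = []
    replacements_pending = 0
    for pod in pods_on_node:
        if not eviction_allowed(healthy, min_available):
            log.append(f"{pod}: blocked at {healthy} healthy, waiting for replacement")
            # Replacements for earlier evictions come up on another node.
            healthy += replacements_pending
            replacements_pending = 0
            if not eviction_allowed(healthy, min_available):
                log.append(f"{pod}: still blocked, drain is stuck")
                break
        healthy -= 1
        replacements_pending += 1
        log.append(f"{pod}: evicted, {healthy} healthy remain")
    return log


if __name__ == "__main__":
    # 3 healthy replicas in total, two of them on the node being drained.
    for line in drain_node(["pod-A", "pod-B"], healthy=3, min_available=2):
        print(line)
```

With `min_available=2` the second eviction only goes through after the first pod's replacement is healthy, which is the "waits..." step in the diagram. With `min_available=3` the same call reports a stuck drain.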


minAvailable vs maxUnavailable

Two ways to express the same idea:

| Field | Meaning | Example (5 replicas) |
| --- | --- | --- |
| minAvailable: 3 | At least 3 must be running at all times | Can evict up to 2 at once |
| maxUnavailable: 2 | At most 2 can be down at once | Same effect |

You can also use percentages:

spec:
  maxUnavailable: "25%"    # For a 4-replica app: max 1 pod down

Rule of thumb: Use maxUnavailable for large deployments, because the number of pods that must stay up then tracks the replica count as you scale. Use minAvailable when you have a hard quorum requirement (e.g., a 3-member etcd cluster needs 2 members alive).
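To make the arithmetic concrete, here is a rough sketch of how many evictions each field permits. The function names are hypothetical, and the model is simplified: it treats `replicas` as the desired count and rounds percentages up, which is how the Kubernetes docs describe percentage handling for PDBs.

```python
def scaled(value, replicas):
    """Resolve an int or a percentage string such as "25%" against replicas."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Ceiling division: percentages that don't land on a whole pod round up.
        return -(-replicas * percent // 100)
    return int(value)


def allowed_disruptions(replicas, healthy, min_available=None, max_unavailable=None):
    """Return how many pods may be evicted right now under a PDB."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = scaled(min_available, replicas)
    else:
        desired_healthy = replicas - scaled(max_unavailable, replicas)
    return max(0, healthy - desired_healthy)


if __name__ == "__main__":
    # Same budget at 5 replicas, very different budgets after scaling to 10.
    for replicas in (5, 10):
        by_min = allowed_disruptions(replicas, replicas, min_available=3)
        by_max = allowed_disruptions(replicas, replicas, max_unavailable=2)
        print(replicas, "replicas:", by_min, "vs", by_max)
```

At 5 replicas both settings allow 2 evictions. At 10 replicas, `min_available=3` would allow 7 at once while `max_unavailable=2` still allows 2, which is the reason behind the rule of thumb.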


When You Need One

  • Any production service with > 1 replica
  • Stateful workloads with quorum (etcd, ZooKeeper, Kafka)
  • During cluster upgrades (nodes drain one by one)
  • When using cluster autoscaler (it respects PDBs during scale-down)
  • Spot/preemptible instances (the cloud provider can reclaim nodes; the PDB only applies if something drains the node before the reclaim deadline)

When to Be Careful

  • Don't set minAvailable equal to your replica count. A PDB of minAvailable: 3 on a 3-replica deployment means nothing can ever be evicted. Node drains will hang forever.
  • Don't forget PDBs block node drains. If your PDB is too strict and pods can't reschedule (due to resource pressure, node affinity, etc.), your drain operation will be stuck indefinitely.
  • Single-replica deployments: A PDB with minAvailable: 1 on a 1-replica app means the pod can never be evicted. Either accept downtime or add replicas.
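These pitfalls are easy to check for mechanically. Below is a minimal sketch of such a check, assuming absolute numbers rather than percentages. `pdb_warnings` is a hypothetical helper, not part of any Kubernetes tooling.

```python
from typing import List, Optional


def pdb_warnings(replicas: int,
                 min_available: Optional[int] = None,
                 max_unavailable: Optional[int] = None) -> List[str]:
    """Flag PDB settings that block every eviction or cannot prevent downtime."""
    warnings = []
    if min_available is not None and min_available >= replicas:
        warnings.append(
            f"minAvailable={min_available} with {replicas} replica(s): "
            "no pod can ever be evicted, drains will hang"
        )
    if max_unavailable is not None and max_unavailable <= 0:
        warnings.append("maxUnavailable=0: no pod can ever be evicted, drains will hang")
    if replicas == 1 and not warnings:
        # The PDB allows the eviction, but there is no second pod to serve traffic.
        warnings.append("single replica: any allowed eviction means downtime")
    return warnings


if __name__ == "__main__":
    for message in pdb_warnings(replicas=3, min_available=3):
        print(message)
```

Something like this could run in CI against your manifests before they reach the cluster, so a drain-blocking PDB is caught at review time instead of mid-upgrade.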

The Minimum Viable PDB for Every Service

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: <app>-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: <app>

One small manifest, and the line that matters is maxUnavailable: 1. It guarantees that voluntary disruptions take down at most one pod at a time, so any service with two or more replicas keeps at least one pod serving. The cost: drains take slightly longer because they wait for rescheduling. The benefit: you never get paged because a routine node drain cascaded into an outage.
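If you want to stamp this out for many services, a few lines of scripting will do. This is only a sketch: `minimal_pdb` is a hypothetical helper, the `namespace` parameter is my addition to the template above, and it emits JSON, which kubectl accepts alongside YAML.

```python
import json


def minimal_pdb(app: str, namespace: str = "default") -> dict:
    """Build the minimum viable PDB manifest for one app label."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{app}-pdb", "namespace": namespace},
        "spec": {
            "maxUnavailable": 1,
            "selector": {"matchLabels": {"app": app}},
        },
    }


if __name__ == "__main__":
    # Example values taken from the payment-service manifest earlier in the post.
    print(json.dumps(minimal_pdb("payment-service", "production"), indent=2))
```

Saved under any filename you like, the output can be piped straight into `kubectl apply -f -`.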


PDBs don't prevent disruptions. They civilize them, turning a shotgun blast into a controlled, one-at-a-time handoff. The five minutes it takes to add one is significantly less than the five hours you'd spend debugging why a cluster upgrade took down production at 2 AM.
