Executive Summary
Upgrades are where clusters live or die by their discipline. A rushed kubeadm upgrade
can silently break workloads, APIs, or CRDs.
In this post, I’ll walk you through how to plan, test, and execute Kubernetes upgrades safely, understand version-skew guarantees, manage feature gates, and validate everything before production.
By the end, you’ll know why cadence matters more than version chasing.
Prereqs
- Familiarity with kubectl, kubeadm, or managed control planes (EKS, GKE, AKS).
- Working understanding of pods, CRDs, controllers, and API objects.
- Access to a staging or pre-prod environment where you can test.
Concepts
1️⃣ Support Windows & Version Skew
Kubernetes maintains release branches for the three most recent minor releases (e.g., 1.29 → 1.31), which gives each minor roughly one year of patch support.
Control plane and kubelet skew rules:
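Component | Supported Skew |
---|---|
kube-apiserver (HA instances) | within 1 minor of each other |
kube-controller-manager, kube-scheduler | same minor as kube-apiserver, or up to 1 minor older; never newer |
kubelet | up to 3 minors older than kube-apiserver (2 for kubelets older than 1.25); never newer |
kubectl | ±1 minor of kube-apiserver |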
Decision cue: If your nodes or clients are >1 minor behind control plane → plan upgrade immediately.
Before → After
Before | After |
---|---|
1.24 control plane, 1.21 nodes | ❌ Unsupported (skew = 3) |
1.28 control plane, 1.27 nodes | ✅ Supported (skew = 1) |
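The skew check above is simple enough to script. The following is a minimal sketch, not an official tool: `minor_of` and `skew_ok` are hypothetical helper names, and the default limit of 1 mirrors the decision cue in this post rather than the upstream policy, so pass a different limit if your components allow more.

```shell
#!/usr/bin/env bash
# Sketch: compute minor-version skew between a control plane and a node.
# minor_of and skew_ok are hypothetical helper names, not kubectl features.

# Extract the minor number from strings like "v1.28.3" or "1.28".
minor_of() {
  local v="${1#v}"      # drop an optional leading "v"
  v="${v#*.}"           # drop the major part ("1.")
  echo "${v%%.*}"       # keep everything up to the next dot
}

# skew_ok CONTROL_PLANE NODE [MAX_SKEW]
# Succeeds when the node is not newer than the control plane and is at
# most MAX_SKEW minors behind (default 1, an assumption taken from the
# decision cue in this post).
skew_ok() {
  local cp node max skew
  cp=$(minor_of "$1")
  node=$(minor_of "$2")
  max="${3:-1}"
  skew=$(( cp - node ))
  [ "$skew" -ge 0 ] && [ "$skew" -le "$max" ]
}

# Usage with the two version pairs from the Before/After table.
while read -r cp_ver node_ver; do
  if skew_ok "$cp_ver" "$node_ver"; then
    echo "$cp_ver / $node_ver: ok"
  else
    echo "$cp_ver / $node_ver: upgrade needed"
  fi
done <<'EOF'
1.24 1.21
1.28 1.27
EOF
# prints:
# 1.24 / 1.21: upgrade needed
# 1.28 / 1.27: ok
```

In a real cluster you would feed it the versions reported by `kubectl version` and `kubectl get nodes`; the parsing of those outputs is left out here on purpose.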
2️⃣ Upgrade Order: Control Plane → Nodes
Always upgrade the control plane first, then worker nodes, then add-ons.
Flow:
etcd → apiserver → controller-manager → scheduler → kubelet → kube-proxy → CNI → CSI → custom controllers
Decision cue:
Never let newer nodes talk to an older API Server.
The API Server must always be the newest component in the cluster.
3️⃣ Deprecation Checklist & API Migration
Each minor release deprecates APIs. To find the list, run:
kubectl krew install deprecations
kubectl deprecations --k8s-version v1.31
Or if you don’t have that plugin:
kubectl get --raw /openapi/v2 | grep v1beta1
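Then convert the manifest (kubectl convert is a separate plugin, kubectl-convert, and must be installed first):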
kubectl convert -f old.yaml --output-version apps/v1 > new.yaml
Before → After Example
Before:
apiVersion: apps/v1beta2
kind: Deployment
After:
apiVersion: apps/v1
kind: Deployment
Decision cue:
→ Fix API versions before upgrading; once a version is removed, the API server rejects manifests that still use it, and the rollout that depends on them fails.
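If you do not have a deprecation plugin at hand, a plain text scan of your manifest directory is a rough first pass. This is a sketch under stated assumptions: `find_prerelease_apis` is a hypothetical helper, it only pattern-matches `apiVersion:` lines, and it cannot tell which alpha or beta versions your target release actually removed.

```shell
#!/usr/bin/env bash
# Sketch: list manifest lines that still declare an alpha or beta apiVersion.
# find_prerelease_apis is a hypothetical helper; it is a text scan only and
# does not know which versions a given Kubernetes release has removed.

find_prerelease_apis() {
  local dir="${1:-.}"
  # -r: recurse, -n: show line numbers, -E: extended regex.
  # Matches e.g. "apiVersion: apps/v1beta2" or "apiVersion: batch/v2alpha1".
  grep -rnE --include='*.yaml' --include='*.yml' \
    '^[[:space:]]*apiVersion:[[:space:]]*[^#[:space:]]*v[0-9]+(alpha|beta)[0-9]+' "$dir"
}

# Usage in CI: fail the job when anything is found.
# if find_prerelease_apis manifests/; then
#   echo "pre-release API versions found; migrate before upgrading" >&2
#   exit 1
# fi
```

Treat every hit as a prompt to check the release notes, not as proof of breakage; some beta versions remain served for several releases.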
4️⃣ Feature Gate Evaluation in Pre-Prod
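Feature gates toggle alpha and beta features on or off per component.
Check which gates your kube-apiserver build knows about (this shows the flag and the supported gates with their defaults, not what a running cluster has set):
kube-apiserver --help | grep feature-gates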
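Enable safely (kubeadm ClusterConfiguration; every gate name must exist in the target version, or kube-apiserver will refuse to start):
apiServer:
  extraArgs:
    feature-gates: "PodDisruptionPolicy=false,JobPodFailurePolicy=true"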
Decision cue:
Enable gates in pre-prod first, and validate e2e conformance there before rolling out to production.
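Before promoting a gate, it helps to see exactly how the pre-prod and prod settings differ. The sketch below compares two `--feature-gates` strings; `gates_to_lines` and `diff_gates` are hypothetical helper names, and how you obtain the two strings (config repo, manifest, flags) is an assumption left to your setup.

```shell
#!/usr/bin/env bash
# Sketch: show which feature gates differ between two --feature-gates strings,
# e.g. the value used in pre-prod versus the one in prod.
# gates_to_lines and diff_gates are hypothetical helper names.

# Turn "A=true,B=false" into one sorted "name=value" per line.
gates_to_lines() {
  printf '%s\n' "$1" | tr ',' '\n' | sed '/^$/d' | sort
}

# diff_gates PREPROD PROD
# Prints lines prefixed with "preprod-only:" or "prod-only:"; prints nothing
# when both strings contain exactly the same name=value pairs.
diff_gates() {
  local a b
  a=$(mktemp) && b=$(mktemp) || return 1
  gates_to_lines "$1" > "$a"
  gates_to_lines "$2" > "$b"
  # comm -23: lines only in the first file; comm -13: only in the second.
  comm -23 "$a" "$b" | sed 's/^/preprod-only: /'
  comm -13 "$a" "$b" | sed 's/^/prod-only: /'
  rm -f "$a" "$b"
}

diff_gates "JobPodFailurePolicy=true,DynamicResourceAllocation=true" "JobPodFailurePolicy=true"
# prints: preprod-only: DynamicResourceAllocation=true
```

An empty result means the two environments agree, which is the state you want before and after a promotion.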
🧪 Mini-Lab — Plan a Minor Upgrade
Goal: Upgrade 1.29 → 1.30 in staging safely.
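# 1. View current versions
kubectl get nodes -o wide
# 2. Upgrade the control plane first (run on the control-plane node)
sudo apt-get install -y kubeadm='1.30.1-*'
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.30.1
# 3. Drain one node
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
# 4. Upgrade kubelet/kubectl on that node (the package revision suffix depends on your repo)
sudo kubeadm upgrade node   # worker nodes only: refresh the local kubelet config
sudo apt-get install -y kubelet='1.30.1-*' kubectl='1.30.1-*'
sudo systemctl daemon-reload
sudo systemctl restart kubelet
# 5. Uncordon and validate
kubectl uncordon node-1
kubectl get nodes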
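Once the cycle works for one node, the same drain, upgrade, uncordon sequence repeats for the rest. The following is a sketch of that loop with a dry-run switch, so you can read the plan before anything touches the cluster. `DRY_RUN`, `run`, and `roll_node` are hypothetical names, and `node-2` is a made-up second node; the package upgrade itself still happens on each node and is only represented by a pause here.

```shell
#!/usr/bin/env bash
# Sketch: apply the drain -> upgrade -> uncordon cycle to several nodes.
# DRY_RUN, run, and roll_node are hypothetical names; with DRY_RUN=1 (the
# default here) commands are only printed, never executed.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# roll_node NODE: handle one node, stop at the first failing command.
roll_node() {
  local node="$1"
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  # The kubelet package upgrade runs on the node itself (e.g. over SSH),
  # so this script only waits for you to finish it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ (upgrade kubelet on $node)"
  else
    printf 'Upgrade kubelet on %s, then press Enter: ' "$node"
    read -r _
  fi
  run kubectl uncordon "$node" || return 1
  run kubectl wait --for=condition=Ready "node/$node" --timeout=300s
}

# node-2 is a hypothetical second node name.
for node in node-1 node-2; do
  roll_node "$node" || { echo "stopping at $node" >&2; exit 1; }
done
```

Processing one node at a time and stopping on the first failure keeps the blast radius to a single node; run it with `DRY_RUN=0` only after the printed plan looks right.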
Validate with conformance tests:
sonobuoy run --mode=certified-conformance
sonobuoy status
sonobuoy retrieve .
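Toggle a feature gate safely. On kubeadm clusters kube-apiserver runs as a static pod rather than a Deployment, so there is nothing for kubectl patch to target; edit the manifest on the control-plane node and the kubelet restarts the pod:
sudo cp /etc/kubernetes/manifests/kube-apiserver.yaml /root/kube-apiserver.yaml.bak
sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml   # add --feature-gates=DynamicResourceAllocation=true to the command list
kubectl -n kube-system get pods -l component=kube-apiserver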
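Preview a rollback with kubectl diff, then validate it with a server-side dry run before applying it:
kubectl diff -f rollback.yaml
kubectl apply -f rollback.yaml --server-side --dry-run=server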
Cheatsheet
Task | Command |
---|---|
List deprecations | kubectl deprecations --k8s-version X.Y |
Convert manifest | kubectl convert -f old.yaml --output-version apps/v1 |
Dry-run on server | kubectl apply -f file.yaml --server-side --dry-run=server |
Diff check | kubectl diff -f manifests/ |
Upgrade order | etcd → control plane → nodes → addons → CRDs |
Pitfalls
CRD / Version Drift
CRDs registered via Helm or Operators may pin old API versions (e.g., apiextensions.k8s.io/v1beta1).
→ Always run kubectl get crd and check .spec.versions.
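The .spec.versions check can be reduced to a small filter. This is a sketch, assuming you can list CRDs as "NAME VERSIONS" lines; `flag_prerelease_crds` is a hypothetical helper, and the CRD names in the offline example are made up.

```shell
#!/usr/bin/env bash
# Sketch: flag CRDs that still declare an alpha or beta version.
# flag_prerelease_crds is a hypothetical helper that reads "NAME VERSIONS"
# lines on stdin, where VERSIONS is a comma-separated list.

flag_prerelease_crds() {
  awk '$2 ~ /(alpha|beta)[0-9]+/ { print $1 ": " $2 }'
}

# Against a live cluster (assumes kubectl access):
# kubectl get crd --no-headers \
#   -o custom-columns='NAME:.metadata.name,VERSIONS:.spec.versions[*].name' \
#   | flag_prerelease_crds

# Offline example with made-up CRD names:
printf '%s\n' \
  'widgets.example.com v1' \
  'gadgets.example.com v1beta1,v1' \
  | flag_prerelease_crds
# prints: gadgets.example.com: v1beta1,v1
```

A flagged CRD is not necessarily broken; it tells you which operators or charts to check against the target release before the upgrade.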
Admission Policy Surprises
Mutating / Validating webhooks compiled against old libraries can block Pods post-upgrade.
→ Audit webhooks with kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations and test dry-runs in staging.
Etcd Storage Version Drift
Even after you migrate manifests to a new API version, etcd may still hold objects serialized in the old version.
→ Use kube-storage-version-migrator to rewrite them in the current storage version.
Diagram — Upgrade Workflow
✅ Wrap-Up
Upgrades are not a sprint—they’re a cadence.
Each minor version should move through your environments with a predictable rhythm, pre-validation, and a clear rollback path.
Next in the series (Post 4) → Smart Scaling & Cost Control: HPA/KEDA + Cluster Autoscaler vs Karpenter.