Vincent Du

Kubernetes Persistence Series Part 3: Controllers & Resilience — Why Kubernetes Self-Heals

What You'll Learn

  • How application controllers (NGINX Ingress, cert-manager) persist through evictions
  • Why controllers are stateless and can restart anywhere
  • The complete persistence chain from hardware to application
  • What survives pod evictions vs. what doesn't

Previously

In Part 1, we debugged a missing ingress after GKE node upgrades. In Part 2, we explored how systemd supervises kubelet, and how kubelet bootstraps the control plane through static pods.

Now we reach the final layer: your application controllers—and the elegant insight that makes Kubernetes truly resilient.


Layer 4: Application Controllers

How Application Controllers Persist

Controllers like NGINX Ingress, cert-manager, and Prometheus Operator are deployed as Deployments or StatefulSets:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
    spec:
      containers:
      - name: controller
        image: registry.k8s.io/ingress-nginx/controller:v1.9.0

When this pod is evicted:

  1. kubelet terminates the pod and reports it as failed (Evicted) to the API server
  2. The ReplicaSet controller notices: current replicas (0) < desired (1)
  3. It creates a new Pod object through the API server
  4. Scheduler assigns the pod to a healthy node
  5. kubelet on that node starts the container
  6. NGINX controller reconnects to API server and resumes watching ingresses

The controller itself doesn't store state—it reads everything from the API server (backed by etcd).
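
You can watch this loop run in a live cluster (assuming ingress-nginx is installed in the ingress-nginx namespace; adjust labels and namespaces to match your setup):

# In one terminal, watch the pods
kubectl get pods -n ingress-nginx -w

# In another, delete the controller pod; the ReplicaSet immediately
# creates a replacement, which reconnects to the API server and picks
# up every existing Ingress resource.
kubectl delete pod -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx

# The Ingress objects themselves never left etcd
kubectl get ingress -A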

Helm Release Persistence

Helm stores release information in Kubernetes secrets:

kubectl get secret -n monitoring -l owner=helm -o yaml
apiVersion: v1
kind: Secret
metadata:
  name: sh.helm.release.v1.prometheus.v3
  namespace: monitoring
  labels:
    owner: helm
    name: prometheus
    version: "3"
type: helm.sh/release.v1
data:
  release: H4sIAAAAAAAAA... # Base64-encoded, gzipped release payload

This secret contains:

  • The chart that was installed
  • The values that were used
  • The computed manifest of all resources

Because this is stored in etcd via the API server, Helm releases survive any pod eviction.
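
You can verify this yourself. Helm wraps a gzipped JSON payload in its own layer of base64 inside the Secret's base64 encoding, so decoding takes two base64 passes plus gunzip (the release and namespace names below are just the ones from the example above):

# The release history is intact after any pod eviction
helm history prometheus -n monitoring

# Decode the Secret directly: Kubernetes base64 -> Helm base64 -> gzip -> JSON
kubectl get secret sh.helm.release.v1.prometheus.v3 -n monitoring \
  -o jsonpath='{.data.release}' | base64 -d | base64 -d | gunzip \
  | jq '{name: .name, version: .version, chart: .chart.metadata.name}'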


The Complete Persistence Chain

┌─────────────────────────────────────────────────────────────────────┐
│                     Linux Host (Physical/VM)                        │
├─────────────────────────────────────────────────────────────────────┤
│  systemd (PID 1)                                                    │
│  ├── Supervises all system services                                 │
│  ├── Restarts failed services automatically                         │
│  └── Config: /etc/systemd/system/                                   │
│      │                                                              │
│      └── kubelet.service                                            │
│          ├── Started and supervised by systemd                      │
│          ├── Watches /etc/kubernetes/manifests/ for static pods     │
│          ├── Watches API server for scheduled pods                  │
│          └── Ensures containers match pod specs                     │
│              │                                                      │
│              ├── Static Pods (/etc/kubernetes/manifests/)           │
│              │   ├── etcd ──────────────────┐                       │
│              │   ├── kube-apiserver ◄───────┤ Persistent            │
│              │   ├── kube-controller-manager│ State Store           │
│              │   └── kube-scheduler         │                       │
│              │                              │                       │
│              └── Regular Pods ◄─────────────┘                       │
│                  │                 (scheduled via API server)       │
│                  │                                                  │
│                  ├── kube-system namespace                          │
│                  │   ├── CoreDNS                                    │
│                  │   ├── kube-proxy                                 │
│                  │   └── CNI plugins                                │
│                  │                                                  │
│                  ├── ingress-nginx namespace                        │
│                  │   └── NGINX Ingress Controller                   │
│                  │       └── Watches Ingress resources              │
│                  │                                                  │
│                  └── Application namespaces                         │
│                      ├── cert-manager                               │
│                      ├── Prometheus Operator                        │
│                      └── Your applications                          │
└─────────────────────────────────────────────────────────────────────┘

The Critical Insight: Controllers Are Stateless

This is the elegant core of the design: controllers don't store state.

Every controller:

  1. Reads desired state from the API server (backed by etcd)
  2. Watches for changes via the API server
  3. Makes changes through the API server
  4. Can be restarted anywhere, anytime, without losing information

The API server + etcd is the single source of truth, not the controllers.
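
One way to convince yourself of this, assuming an ingress-nginx install with the default Deployment name: look at what the controller pod actually mounts. You will typically see only projected service account tokens and webhook certificates, never a PersistentVolumeClaim holding controller state.

# Inspect the controller's volumes; for a typical install there is no
# persistent storage attached to the controller itself.
kubectl get deployment ingress-nginx-controller -n ingress-nginx -o json \
  | jq '.spec.template.spec.volumes'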

[Diagram: Stateless Controllers Architecture]

This is why you can:

  • Delete any controller pod → it restarts and catches up
  • Move controllers between nodes → they just reconnect
  • Scale controllers to multiple replicas → they coordinate via the API server
  • Upgrade controllers → new version reads the same state
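
For example, with more than one NGINX Ingress controller replica, leader election happens through a Lease object stored via the API server, so even the coordination state lives in etcd rather than in the controller processes. A quick way to check (the Lease name mentioned in the comment is what recent ingress-nginx releases use; yours may differ):

# Leader election for ingress-nginx is backed by a Lease object in the
# API server; recent releases typically name it "ingress-nginx-leader".
# The HOLDER column shows which replica currently holds the lock.
kubectl get lease -n ingress-nginx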

What Survives vs. What Doesn't

Survives Any Pod Eviction

Resource                   | Why it survives
---------------------------|------------------------------------------
Kubernetes objects in etcd | Stored independently of pods
Helm releases              | Stored as Secrets in etcd
Operator-managed CRDs      | Continuously reconciled by the operator
PersistentVolumes          | Backing storage exists outside the cluster
ConfigMaps/Secrets         | Stored in etcd

Doesn't Survive Without Help

Resource                                              | Why it doesn't survive
------------------------------------------------------|--------------------------------------------------
Pod-local emptyDir volumes                            | Deleted along with the pod
Manually applied resources with missing dependencies  | Validating webhooks can reject them on recreation
In-memory caches                                      | Lost when the process restarts
Node-local state                                      | Lost when the node is replaced, unless explicitly persisted
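
A minimal sketch of that difference, with illustrative names: the cache volume below vanishes with the pod, while the data volume survives because it is backed by a PersistentVolumeClaim that exists independently of the pod.

apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: app
    image: nginx:1.27
    volumeMounts:
    - name: cache     # emptyDir: deleted together with this pod
      mountPath: /cache
    - name: data      # PVC-backed: a replacement pod can reattach it
      mountPath: /data
  volumes:
  - name: cache
    emptyDir: {}
  - name: data
    persistentVolumeClaim:
      claimName: demo-data   # the PVC (and its PV) outlive the pod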

The Elegance of the Design

The Kubernetes architecture embodies several design principles:

  1. Declarative over imperative — Describe desired state, not steps to get there
  2. Reconciliation over transactions — Continuously converge to desired state
  3. Stateless controllers — State lives in etcd, not in components
  4. Hierarchical supervision — Each layer supervises the layer below it
  5. Failure is normal — Design for recovery, not prevention

This is why Kubernetes clusters can:

  • Lose nodes unexpectedly
  • Have pods evicted for resource pressure
  • Experience network partitions
  • Undergo rolling upgrades

...and still maintain application availability.


Conclusion

The journey from debugging a missing ingress to understanding the complete supervision hierarchy revealed the sophisticated machinery that makes Kubernetes resilient.

systemd → kubelet → static pods → control plane → controllers → your apps

Each layer supervises the next, with etcd as the persistent memory that survives any component failure.

The key insight: Kubernetes doesn't prevent failures—it recovers from them automatically through layers of supervision, persistent state in etcd, and continuous reconciliation loops.

This is the true power of Kubernetes: not that things don't fail, but that when they do, the system knows how to restore itself to the desired state.


Series Recap

  1. Part 1: When Our Ingress Vanished — The incident that started it all
  2. Part 2: The Foundation — systemd → kubelet → control plane
  3. Part 3: Controllers & Resilience — Why Kubernetes self-heals


Found this series useful? Follow for more Kubernetes internals content!
