If you've made it past the basics of Kubernetes — you know what a cluster is, you've spun up a local environment with kind or minikube — the next wall you hit is understanding the workload objects. There are seven of them: Pods, ReplicaSets, Deployments, DaemonSets, StatefulSets, Jobs, and CronJobs. They look similar in YAML. They all involve containers. But each exists for a very different reason.
## The Quick Reference: Which Workload Do You Need?

Before diving in, here's the decision tree you'll use every day:
| Workload | Use When |
|---|---|
| Deployment | Running stateless apps, APIs, web servers, workers |
| StatefulSet | Databases, message brokers, anything needing stable identity |
| DaemonSet | Log agents, monitoring — one pod per node |
| Job | One-off batch tasks: migrations, reports, cleanup |
| CronJob | Scheduled recurring tasks |
| Pod (bare) | Almost never in production — use one of the above |
Now let's understand why.
## Pods: The Atomic Unit

Everything in Kubernetes ultimately runs as a Pod — the smallest deployable unit. You don't run containers directly; you run Pods that contain containers.

Think of a Pod as a tiny apartment that one or more containers share:

- They share a single IP address
- Containers within a Pod talk to each other via `localhost`
- They can share volumes (disk storage)
- They always land on the same node — never split
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-server-pod
spec:
  containers:
  - name: api
    image: mycompany/api:2.1.0
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "1000m"
    livenessProbe:
      httpGet:
        path: /health
        port: 3000
      initialDelaySeconds: 15
      periodSeconds: 20
    readinessProbe:
      httpGet:
        path: /ready
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
```
### The Most Important Thing About Pods

Pods are ephemeral. When a Pod dies, a new one is created with a completely new name and a completely new IP. Any code that hardcoded the old IP breaks. This is why Services exist (Phase 3) — but it's the mental model you need to carry through everything else.
### Two Probes You Can't Skip

**Liveness probe** — "Is this container still alive?" If it fails, Kubernetes restarts the container. Use it for crash detection and deadlocks.

**Readiness probe** — "Is this container ready for traffic?" If it fails, the Pod is removed from the Service's endpoints — it stops receiving traffic — but it is not restarted. Use it for startup warmup and temporary overload.

These two probes do different jobs. Never confuse them.
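Probes aren't limited to HTTP checks, either — `tcpSocket` and `exec` probes are also supported. A sketch of the other two types (the Redis image and port here are illustrative, not from the article):

```yaml
# Hypothetical container spec fragment showing the non-HTTP probe types.
containers:
- name: cache
  image: redis:7
  livenessProbe:
    tcpSocket:        # liveness: can we open a TCP connection to the port?
      port: 6379
    periodSeconds: 10
  readinessProbe:
    exec:             # readiness: run a command inside the container; exit 0 = ready
      command: ["redis-cli", "ping"]
    initialDelaySeconds: 5
    periodSeconds: 5
```

Use `httpGet` when the app exposes a health endpoint, `tcpSocket` when it only listens on a port, and `exec` when readiness is best checked with the app's own CLI.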
### Pod Status Lifecycle

```
Pending → Running → Succeeded
              ↓
   Failed / CrashLoopBackOff / OOMKilled
```

When you see CrashLoopBackOff, run `kubectl logs my-pod --previous` — that's where your crash output lives.
### Quick Debugging Playbook

```bash
# Pod stuck in Pending?
kubectl describe pod my-pod            # read the Events section
# "Insufficient cpu" → reduce requests or add nodes

# ImagePullBackOff?
kubectl describe pod my-pod            # wrong image name? typo? private registry?

# CrashLoopBackOff?
kubectl logs my-pod --previous         # read the crash

# OOMKilled?
kubectl describe pod my-pod            # see Last State → increase memory limit

# Running but not working?
kubectl exec -it my-pod -- /bin/bash   # investigate from inside
```
### 🧪 Practice — Pods

**Lab 1: Your first pod**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
  labels:
    app: hello
spec:
  containers:
  - name: nginx
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        memory: "64Mi"
        cpu: "50m"
      limits:
        memory: "128Mi"
        cpu: "200m"
EOF

kubectl get pods --watch
kubectl describe pod hello-nginx   # read the Events section top to bottom
kubectl exec -it hello-nginx -- /bin/bash
# inside: curl localhost

kubectl port-forward pod/hello-nginx 8080:80 &
curl http://localhost:8080
kill %1
```
**Lab 2: Multi-container pod — shared localhost**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: multi-pod
spec:
  containers:
  - name: app
    image: nginx:1.25
  - name: sidecar
    image: busybox
    command: ['sh', '-c', 'while true; do echo sidecar running; sleep 5; done']
EOF

kubectl logs multi-pod -c app
kubectl logs multi-pod -c sidecar
kubectl exec -it multi-pod -c sidecar -- /bin/sh
# inside: wget -q -O- localhost   ← hits the nginx container via the shared IP
```
**Lab 3: Trigger and observe CrashLoopBackOff**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: crasher
spec:
  containers:
  - name: app
    image: busybox
    command: ['sh', '-c', 'echo crashing; exit 1']
EOF

kubectl get pods --watch          # watch it go into CrashLoopBackOff
kubectl logs crasher --previous   # see the crash output
```

**Cleanup**

```bash
kubectl delete pod hello-nginx multi-pod crasher
```
## Namespaces: Logical Isolation

A Namespace is a virtual cluster inside your real cluster. It's how you divide one Kubernetes cluster into isolated sections — by team, environment, or application.

Four namespaces exist by default:

- `default` — where objects land if you don't specify a namespace
- `kube-system` — Kubernetes system components (etcd, apiserver, coredns)
- `kube-public` — publicly readable data (rarely used)
- `kube-node-lease` — node heartbeat objects (internal, ignore)
### What Namespaces Give You

**Name reuse** — you can have a webapp Deployment in both staging and production with no conflict.

**RBAC scoping** — give team A access only to their namespace.

**Resource quotas** — cap a namespace to 4 CPUs and 8Gi memory total.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    pods: "50"
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
```
### One Critical Misconception

Namespaces do not provide network isolation by default. A pod in dev can still reach a pod in prod if it knows the IP or DNS name. For actual network isolation, you need NetworkPolicies.
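As an illustration, a NetworkPolicy that locks a namespace down to in-namespace traffic could look like the sketch below. Note this assumes your CNI plugin actually enforces NetworkPolicies (Calico and Cilium do; kind's default CNI does not):

```yaml
# Deny all ingress into the "prod" namespace except from pods in "prod" itself.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: prod
spec:
  podSelector: {}         # empty selector = applies to every pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}     # only pods from this same namespace may connect
```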
Cross-namespace DNS format: `<service>.<namespace>.svc.cluster.local`
### 🧪 Practice — Namespaces

**Lab: Isolation, quotas, and the -n flag**

```bash
# Create two environments
kubectl create namespace dev
kubectl create namespace prod

# Deploy the same app name in both — no conflict
kubectl create deployment webapp --image=nginx:1.25 -n dev
kubectl create deployment webapp --image=nginx:1.25 -n prod
kubectl get deployments -A   # see both side by side

# Apply a quota to prod
# (Pod-count only: quotas on requests.cpu/requests.memory would also require
#  every pod to declare resource requests, which these bare nginx pods don't.)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-quota
  namespace: prod
spec:
  hard:
    pods: "5"
EOF

# Try to exceed it
kubectl scale deployment webapp --replicas=10 -n prod
kubectl get pods -n prod                            # stops at the quota limit
kubectl describe resourcequota prod-quota -n prod   # usage vs hard

# Set dev as your default namespace
kubectl config set-context --current --namespace=dev
kubectl get pods   # now implicitly reads from dev

# Switch back
kubectl config set-context --current --namespace=default
```

**Cleanup**

```bash
kubectl delete namespace dev prod
```
## Labels, Selectors & Annotations

**Labels** are key-value pairs on any Kubernetes object. They're the connective tissue — how Services find Pods, how Deployments track their ReplicaSets, how you filter resources.

**Annotations** are also key-value pairs, but for non-identifying metadata: descriptions, tool config, URLs. They cannot be used in selectors.

```yaml
metadata:
  labels:
    app: api
    env: production
    version: "2.1.0"   # always quote values that look like numbers
  annotations:
    description: "Payments API server"
    prometheus.io/scrape: "true"
```
### How a Service Uses Labels

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-service
spec:
  selector:
    app: api           # matches pods with app=api
    env: production    # AND env=production
  ports:
  - port: 80
    targetPort: 3000
```

The Service routes traffic to any pod with both labels. Remove a label from a pod and it's immediately removed from the Service's endpoints. Add it back and it rejoins. This is live — no restart required.
### Set-Based Selectors

Beyond simple equality, you can use expressions:

```yaml
selector:
  matchExpressions:
  - key: env
    operator: In
    values: [production, staging]
  - key: tier
    operator: NotIn
    values: [frontend]
  - key: app
    operator: Exists
```

One gotcha: always quote version numbers in YAML. `version: 2.0` is parsed as a YAML float, not a string — depending on your tooling it either fails API validation (label values must be strings) or gets rendered as an unexpected value like `2`, quietly breaking your selectors.
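The fix is a pair of quotes:

```yaml
# BAD — YAML reads this as the float 2.0, not the string "2.0"
labels:
  version: 2.0

# GOOD — quoting forces a string, which is what label values must be
labels:
  version: "2.0"
```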
### 🧪 Practice — Labels & Selectors

**Lab: Live label manipulation and Service routing**

```bash
# Create three pods with different label combos
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: frontend-v1
  labels:
    app: frontend
    env: prod
    version: "1.0"
spec:
  containers:
  - name: nginx
    image: nginx:1.25
---
apiVersion: v1
kind: Pod
metadata:
  name: backend-v1
  labels:
    app: backend
    env: prod
    version: "1.0"
spec:
  containers:
  - name: nginx
    image: nginx:1.24
---
apiVersion: v1
kind: Pod
metadata:
  name: backend-staging
  labels:
    app: backend
    env: staging
    version: "2.0"
spec:
  containers:
  - name: nginx
    image: nginx:1.26
EOF

# Filter practice
kubectl get pods --show-labels
kubectl get pods -l app=backend
kubectl get pods -l app=backend,env=prod
kubectl get pods -l 'env in (prod,staging)'
kubectl get pods -l 'version notin (1.0)'

# Create a Service that selects app=backend + env=prod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: backend-svc
spec:
  selector:
    app: backend
    env: prod
  ports:
  - port: 80
    targetPort: 80
EOF

kubectl describe service backend-svc   # check the Endpoints section

# Live label surgery — remove env from backend-v1
kubectl label pod backend-v1 env-
kubectl describe service backend-svc   # endpoint disappears immediately

# Add it back
kubectl label pod backend-v1 env=prod
kubectl describe service backend-svc   # endpoint returns
```

**Cleanup**

```bash
kubectl delete pod frontend-v1 backend-v1 backend-staging
kubectl delete service backend-svc
```
## ReplicaSets: Self-Healing

A ReplicaSet ensures a specified number of identical Pods are always running.

```
spec.replicas: 3

Pod-2 crashes   → ReplicaSet creates Pod-4 → back to 3
Rogue pod added → ReplicaSet deletes one   → back to 3
```

The ReplicaSet continuously counts the pods matching its selector. Actual count ≠ desired count → it creates or deletes pods until they match.
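That control loop is simple enough to sketch in pseudocode (a simplification — the real controller also handles graceful deletion, creation expectations, and owner references):

```
loop forever:
    desired = replicaset.spec.replicas
    actual  = count(pods matching replicaset.spec.selector)

    if actual < desired:
        create (desired - actual) pods from the pod template
    else if actual > desired:
        delete (actual - desired) pods

    wait for the next change event (or the periodic resync)
```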
### Why You Almost Never Create ReplicaSets Directly

ReplicaSets don't support rolling updates or rollbacks. That's exactly what Deployments add. In practice, you create a Deployment and it creates and manages a ReplicaSet for you.

You still need to understand ReplicaSets because:

- Deployments own them under the hood
- Debugging often means inspecting the ReplicaSet
- It explains the self-healing behavior you see

The ReplicaSet name format when created by a Deployment is `<deployment-name>-<pod-template-hash>`. That hash changes every time the pod template changes — which is exactly how rolling updates work.
### 🧪 Practice — ReplicaSets

**Lab: Self-healing in action**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: demo-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
EOF

kubectl get pods -l app=demo
# Open a watch in a second terminal: kubectl get pods --watch

# Kill one pod — the RS heals immediately
kubectl delete pod $(kubectl get pods -l app=demo -o jsonpath='{.items[0].metadata.name}')
kubectl get pods -l app=demo   # new pod already created

# Scale up
kubectl scale rs demo-rs --replicas=5
kubectl get pods -l app=demo

# Try creating a stray pod with the same label
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: stray-pod
  labels:
    app: demo   # matches the RS selector!
spec:
  containers:
  - name: nginx
    image: nginx:1.25
EOF

kubectl get pods -l app=demo   # RS sees 6, wants 5 → deletes one
```

**Cleanup**

```bash
kubectl delete rs demo-rs
```
## Deployments: The Standard Way to Run Apps

A Deployment is what you use 90% of the time. It wraps a ReplicaSet and adds rolling updates, rollbacks, and version history.

```
You → Deployment → manages → ReplicaSets → manages → Pods
           │
           ├── RS v1 (old image) → 0 pods
           └── RS v2 (new image) → 3 pods ✅
```
### How a Rolling Update Actually Works

With `maxSurge: 1`, `maxUnavailable: 0`, and 3 replicas:

```
[v1][v1][v1]       # start
[v1][v1][v1][v2]   # spin up 1 new pod (surge)
[v1][v1][v2]       # v2 passes readiness → kill 1 v1
[v1][v1][v2][v2]   # spin up another v2
[v1][v2][v2]       # v2 passes → kill a v1
[v1][v2][v2][v2]   # final new pod
[v2][v2][v2]       # all updated, zero downtime ✅
```

The readiness probe gates each step. Without it, Kubernetes can't tell when a new pod is actually ready.
### Production Deployment YAML

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0   # never go below the desired count
  minReadySeconds: 10
  revisionHistoryLimit: 5
  template:
    metadata:
      labels:
        app: orders-api
    spec:
      containers:
      - name: api
        image: mycompany/orders-api:3.0.1
        readinessProbe:
          httpGet:
            path: /ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
```
### The Commands You'll Use Daily

```bash
# Update the image → triggers a rolling update
kubectl set image deployment/my-app app=nginx:1.26

# Watch it happen
kubectl rollout status deployment/my-app

# Something went wrong?
kubectl rollout undo deployment/my-app

# Check history
kubectl rollout history deployment/my-app

# Restart all pods (rolling, with zero downtime)
kubectl rollout restart deployment/my-app
```

`kubectl rollout undo` is your emergency brake. Run it first, debug second.
### 🧪 Practice — Deployments

**Lab: Full deployment lifecycle — deploy, update, break, rollback**

```bash
# Deploy v1
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  annotations:
    kubernetes.io/change-cause: "initial deployment nginx 1.24"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: nginx
        image: nginx:1.24
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
EOF

kubectl get pods --watch   # Ctrl+C when all 3 are Running
kubectl get rs             # see the RS created by the Deployment

# Rolling update to 1.25
kubectl set image deployment/webapp nginx=nginx:1.25
kubectl annotate deployment/webapp kubernetes.io/change-cause="upgrade to nginx 1.25" --overwrite
kubectl rollout status deployment/webapp
kubectl rollout history deployment/webapp   # see 2 revisions
kubectl get rs   # old RS now has 0 pods but still exists for rollback

# Simulate a bad update — nonexistent image
kubectl set image deployment/webapp nginx=nginx:NONEXISTENT
kubectl rollout status deployment/webapp   # hangs — new pods can't start
kubectl get pods   # old pods still serving! new pods in ErrImagePull

# Emergency rollback
kubectl rollout undo deployment/webapp
kubectl rollout status deployment/webapp   # watch the recovery
kubectl get pods                           # all healthy again

# Scale
kubectl scale deployment webapp --replicas=6
kubectl scale deployment webapp --replicas=2
```

**Cleanup**

```bash
kubectl delete deployment webapp
```
## Imperative vs Declarative: The Mental Model

There are two ways to work with Kubernetes.

**Imperative** — tell Kubernetes what to do, step by step:

```bash
kubectl create deployment web --image=nginx:1.25
kubectl scale deployment web --replicas=3
kubectl set image deployment/web nginx=nginx:1.26
```

**Declarative** — tell Kubernetes what you want, and let it figure out how:

```bash
kubectl apply -f web.yaml
# edit the file
kubectl apply -f web.yaml   # kubernetes computes the diff
```

The key advantage of declarative: idempotency.

```bash
# Imperative — fails on the second run
kubectl create deployment web --image=nginx
# Error: deployments.apps "web" already exists

# Declarative — safe to run forever
kubectl apply -f web.yaml
# deployment.apps/web created     (first run)
# deployment.apps/web unchanged   (no changes)
# deployment.apps/web configured  (file changed)
```

In production: always declarative. Store YAML in Git. CI/CD runs `kubectl apply`. This is GitOps — Git is the source of truth, and the cluster always mirrors it.
### Generate YAML Fast

```bash
kubectl create deployment web --image=nginx:1.25 --replicas=3 \
  --dry-run=client -o yaml > deployment.yaml
```

`--dry-run=client -o yaml` generates the YAML locally without sending anything to the API server. Edit it, then apply. This is extremely useful in the CKA/CKAD exams.
### 🧪 Practice — Imperative vs Declarative

**Lab: Generate, edit, and apply**

```bash
# Generate Deployment YAML without applying
kubectl create deployment web --image=nginx:1.25 --replicas=3 \
  --dry-run=client -o yaml > deployment.yaml
cat deployment.yaml   # inspect what was generated

# Add a resources block manually, then apply
kubectl apply -f deployment.yaml
kubectl apply -f deployment.yaml   # safe to run again — "unchanged"

# Edit the file — change replicas to 5
sed -i 's/replicas: 3/replicas: 5/' deployment.yaml
kubectl apply -f deployment.yaml   # "configured" — only the diff applied
kubectl get deployment web

# Generate Pod and Service YAML for practice
kubectl run my-pod --image=busybox --dry-run=client -o yaml > pod.yaml
kubectl expose deployment web --port=80 --type=ClusterIP \
  --dry-run=client -o yaml > service.yaml
```

**Cleanup**

```bash
kubectl delete deployment web
```
## DaemonSets: One Pod per Node

A DaemonSet ensures exactly one Pod runs on every node in the cluster. New node joins → pod auto-created. Node removed → pod garbage-collected.

```
Node 1   Node 2   Node 3   Node 4 (new)
[log]    [log]    [log]    [log] ← auto-created
```
### When to Use DaemonSets
| Use Case | Examples |
|---|---|
| Log collection | Fluentd, Filebeat, Promtail |
| Monitoring agents | Prometheus node-exporter, Datadog agent |
| Network plugins | Calico, Cilium, kube-proxy |
| Security agents | Falco, Sysdig |
kube-proxy itself is a DaemonSet. Verify with `kubectl get ds -n kube-system`.
### Running on Control-Plane Nodes

By default, DaemonSet pods won't run on control-plane nodes because of a taint: `node-role.kubernetes.io/control-plane:NoSchedule`. If you need them there, add a toleration to the pod template:

```yaml
spec:
  tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
```
### Update Strategies

- **RollingUpdate** (default): automatically updates pods one node at a time
- **OnDelete**: only updates pods when you manually delete them — useful for per-node maintenance windows
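For reference, both strategies are set via `spec.updateStrategy` — a minimal sketch of the two fragments (pick one):

```yaml
# Default: roll through nodes automatically.
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1   # how many nodes' pods may be down during the update

# Alternative: pods are only replaced when you delete them yourself.
updateStrategy:
  type: OnDelete
```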
### 🧪 Practice — DaemonSets

**Lab: Per-node coverage and system DaemonSets**

```bash
# See existing DaemonSets in the cluster
kubectl get ds -n kube-system -o wide
# kube-proxy is a DaemonSet — one pod per node

# Create a simple node-monitoring DaemonSet
# (note the \$ escapes — they stop your local shell from expanding
#  these before the manifest reaches kubectl)
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-monitor
spec:
  selector:
    matchLabels:
      app: node-monitor
  template:
    metadata:
      labels:
        app: node-monitor
    spec:
      containers:
      - name: monitor
        image: busybox
        command:
        - sh
        - -c
        - while true; do echo "Node \$NODE_NAME alive at \$(date)"; sleep 10; done
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        resources:
          requests:
            memory: "20Mi"
            cpu: "10m"
EOF

# Verify one pod per worker node
kubectl get pods -l app=node-monitor -o wide

# Check logs from one
POD=$(kubectl get pods -l app=node-monitor -o jsonpath='{.items[0].metadata.name}')
kubectl logs $POD
```

**Cleanup**

```bash
kubectl delete ds node-monitor
```
## Jobs & CronJobs: Batch and Scheduled Work

**Job** — runs pods to completion. Once finished successfully, it stops. For one-off batch tasks.

**CronJob** — runs a Job on a schedule (cron syntax). For recurring tasks.

```
Deployment: "Run this forever"
Job:        "Run this ONCE until it succeeds"
CronJob:    "Run this Job every day at 2am"
```
### Job YAML

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
spec:
  backoffLimit: 3               # retry up to 3 times
  activeDeadlineSeconds: 300    # kill after 5 minutes regardless
  ttlSecondsAfterFinished: 60   # auto-delete 60s after completion
  template:
    spec:
      restartPolicy: OnFailure  # REQUIRED — Always is invalid for Jobs
      containers:
      - name: migration
        image: mycompany/api:3.0.1
        command: ["node", "scripts/migrate.js"]
```

A `restartPolicy` of `OnFailure` (or `Never`) is mandatory — Jobs cannot use `Always`, because they'd never complete.

Set `ttlSecondsAfterFinished`, or finished Jobs pile up in your cluster forever.
### CronJob YAML

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"       # every day at 2:00 AM
  concurrencyPolicy: Forbid   # skip if the previous run is still running
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 2
      ttlSecondsAfterFinished: 300
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: reporter
            image: mycompany/reporter:1.0
            command: ["python", "generate_report.py"]
```
### Cron Schedule Quick Reference

```
0 2 * * *     Every day at 2:00 AM
*/5 * * * *   Every 5 minutes
0 9 * * 1     Every Monday at 9:00 AM
0 0 1 * *     1st of every month at midnight
@daily        Shortcut for 0 0 * * *
@hourly       Shortcut for 0 * * * *
```
### Trigger a CronJob Manually

```bash
kubectl create job --from=cronjob/nightly-report manual-run-01
```

This creates a Job immediately using the CronJob's template, without affecting scheduled runs.
### 🧪 Practice — Jobs & CronJobs

**Lab 1: Simple one-off Job**

```bash
# (\$(date) is escaped so it expands inside the container, not in your shell)
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-job
spec:
  backoffLimit: 2
  ttlSecondsAfterFinished: 60
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: hello
        image: busybox
        command: ['sh', '-c', 'echo "Job running at \$(date)"; sleep 5; echo Done!']
EOF

kubectl get pods --watch    # watch the pod go: Pending → Running → Completed
kubectl logs -l job-name=hello-job
kubectl get job hello-job   # COMPLETIONS should show 1/1
```

**Lab 2: CronJob that runs every minute**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: CronJob
metadata:
  name: minute-job
spec:
  schedule: "* * * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 60
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: echo
            image: busybox
            command: ['sh', '-c', 'echo "Scheduled run at \$(date)"']
EOF

# Wait 1-2 minutes
kubectl get jobs
kubectl get cronjob minute-job

# Trigger it manually right now
kubectl create job --from=cronjob/minute-job manual-run-01
kubectl logs -l job-name=manual-run-01

# Suspend future runs without deleting
kubectl patch cronjob minute-job -p '{"spec":{"suspend":true}}'
kubectl get cronjob minute-job   # SUSPEND should show True
```

**Cleanup**

```bash
kubectl delete cronjob minute-job
```
## StatefulSets: Stateful Applications

A StatefulSet is like a Deployment, but for applications that need three guarantees:

1. **Stable, predictable names** — pods are always `pod-0`, `pod-1`, `pod-2`
2. **Stable storage** — each pod gets its own PVC that sticks with it across restarts
3. **Ordered startup/shutdown** — pods start in order (0→1→2) and stop in reverse

```
DEPLOYMENT:                STATEFULSET:
my-app-abc12 (random)      mysql-0 (always 0)
my-app-def34 (random)      mysql-1 (always 1)
my-app-ghi56 (random)      mysql-2 (always 2)

Pod dies → new name        Pod dies → comes back as mysql-1
Pod dies → new IP          Pod dies → same storage re-attached
Starts in any order        Starts 0 first, then 1, then 2
```
### The Headless Service Requirement

StatefulSets require a headless Service (`clusterIP: None`) to provide a stable DNS name per pod:

```
mysql-0.mysql-svc.default.svc.cluster.local → 10.244.1.5
mysql-1.mysql-svc.default.svc.cluster.local → 10.244.2.8
```

A normal Service load-balances across pods. A headless Service returns an individual DNS record for each pod — so applications can say "connect to the primary at mysql-0.mysql-svc" and mean it.
StatefulSet YAML
# Headless Service — create this FIRST
apiVersion: v1
kind: Service
metadata:
name: mysql-svc
spec:
clusterIP: None # ← this makes it headless
selector:
app: mysql
ports:
- port: 3306
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
spec:
serviceName: mysql-svc # links to headless service above
replicas: 3
selector:
matchLabels:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: mysql:8.0
volumeMounts:
- name: data
mountPath: /var/lib/mysql
volumeClaimTemplates: # ← key difference from Deployment
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
volumeClaimTemplates creates individual PVCs per pod: data-mysql-0, data-mysql-1, data-mysql-2.
### The PVC Retention Gotcha

When you delete a StatefulSet, the PVCs are not deleted. This is intentional — data is precious. But it means they stick around until you clean them up manually:

```bash
kubectl delete sts mysql
kubectl delete pvc -l app=mysql   # must be done separately
```
### Canary Updates with partition

The `partition` field in `rollingUpdate` is powerful for staged rollouts:

```yaml
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    partition: 2   # only update pods with ordinal >= 2
```

Set `partition: 2` to update only mysql-2 first. Test it. Lower it to 1. Lower it to 0 to complete. This is native canary deployment for StatefulSets.
### 🧪 Practice — StatefulSets

**Lab: Stable identity, persistent storage, and ordered startup**

```bash
# Create headless service + StatefulSet
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  clusterIP: None
  selector:
    app: web
  ports:
  - port: 80
    name: http
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web-svc
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF

# Watch ordered startup — web-0 first, then web-1, then web-2
kubectl get pods --watch

# Confirm stable names
kubectl get pods -l app=web   # always web-0, web-1, web-2

# See the per-pod PVCs created
kubectl get pvc               # www-web-0, www-web-1, www-web-2

# Write data to web-0's volume
kubectl exec web-0 -- sh -c 'echo "pod-0 data" > /usr/share/nginx/html/index.html'

# Delete web-0 — watch it come back with the SAME name
kubectl delete pod web-0
kubectl get pods --watch      # web-0 comes back, not a random name

# Data persists — the PVC was reattached
kubectl exec web-0 -- cat /usr/share/nginx/html/index.html   # still "pod-0 data"

# Reach a specific pod by stable DNS from another pod
kubectl run tmp --image=busybox --rm -it --restart=Never -- \
  wget -q -O- web-0.web-svc.default.svc.cluster.local
```

**Cleanup — note PVCs must be deleted separately**

```bash
kubectl delete sts web
kubectl delete svc web-svc
kubectl delete pvc -l app=web   # PVCs are NOT auto-deleted
```
## The Master Cheat Sheet

```bash
# LIST
kubectl get pods / deploy / rs / ds / sts / jobs / cronjobs

# USEFUL FLAGS
kubectl get pods -o wide            # +IP +NODE
kubectl get pods --show-labels      # show label columns
kubectl get pods -l app=my-app      # filter by label
kubectl get pods -A                 # all namespaces

# INSPECT (always read the Events section)
kubectl describe pod/deploy/sts <name>

# LOGS
kubectl logs <pod>
kubectl logs <pod> --previous       # after a crash
kubectl logs <pod> -c <container>   # multi-container pod

# SHELL
kubectl exec -it <pod> -- /bin/bash

# DEPLOYMENTS
kubectl scale deploy <name> --replicas=5
kubectl set image deploy/<name> <container>=<image>:<tag>
kubectl rollout status deploy/<name>
kubectl rollout undo deploy/<name>
kubectl rollout restart deploy/<name>

# JOBS
kubectl create job test --from=cronjob/<name>   # manual trigger
kubectl patch cronjob <name> -p '{"spec":{"suspend":true}}'

# GENERATE YAML BOILERPLATE
kubectl create deployment web --image=nginx:1.25 --replicas=3 \
  --dry-run=client -o yaml > deployment.yaml
```
## API Versions Quick Reference

| Workload | apiVersion | kind |
|---|---|---|
| Pod | `v1` | `Pod` |
| Namespace | `v1` | `Namespace` |
| ReplicaSet | `apps/v1` | `ReplicaSet` |
| Deployment | `apps/v1` | `Deployment` |
| DaemonSet | `apps/v1` | `DaemonSet` |
| StatefulSet | `apps/v1` | `StatefulSet` |
| Job | `batch/v1` | `Job` |
| CronJob | `batch/v1` | `CronJob` |
## Restart Policy Rules

| Policy | Use In | Behaviour |
|---|---|---|
| `Always` | Deployments, DaemonSets | Restart on any exit — the default |
| `OnFailure` | Jobs | Restart only on non-zero exit |
| `Never` | Jobs | Never restart — a new pod on each retry |
## 🧪 Final Practice — End-to-End Scenario

This lab ties everything together. You'll deploy a full application stack using every concept from this article.

**Scenario:** Deploy a web application with a background worker, a nightly cleanup job, and per-node logging.
```bash
# Step 1: Create a dedicated namespace
kubectl create namespace myapp

# Step 2: Deploy the web app (Deployment)
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: myapp
  labels:
    app: web
    tier: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
        tier: frontend
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests: { memory: "64Mi", cpu: "50m" }
          limits: { memory: "128Mi", cpu: "200m" }
        readinessProbe:
          httpGet: { path: /, port: 80 }
          initialDelaySeconds: 3
EOF

# Step 3: Deploy a background worker (Deployment, different labels)
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
  namespace: myapp
  labels:
    app: worker
    tier: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
        tier: backend
    spec:
      containers:
      - name: worker
        image: busybox
        command: ['sh', '-c', 'while true; do echo "worker processing..."; sleep 10; done']
        resources:
          requests: { memory: "32Mi", cpu: "25m" }
EOF

# Step 4: Per-node log collector (DaemonSet)
# (\$NODE_NAME is escaped so it expands inside the container, not in your shell)
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: myapp
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: collector
        image: busybox
        command: ['sh', '-c', 'while true; do echo "collecting logs from \$NODE_NAME"; sleep 15; done']
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        resources:
          requests: { memory: "20Mi", cpu: "10m" }
EOF

# Step 5: Nightly cleanup Job (CronJob)
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup
  namespace: myapp
spec:
  schedule: "0 3 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 2
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 120
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: cleanup
            image: busybox
            command: ['sh', '-c', 'echo "cleaning up old data..."; sleep 3; echo done']
EOF

# Inspect the whole namespace
kubectl get all -n myapp

# Check labels
kubectl get pods -n myapp --show-labels

# Filter by tier
kubectl get pods -n myapp -l tier=frontend
kubectl get pods -n myapp -l tier=backend

# Trigger the cleanup job manually
kubectl create job --from=cronjob/cleanup manual-cleanup-01 -n myapp
kubectl logs -l job-name=manual-cleanup-01 -n myapp

# Simulate a rolling update on web
kubectl set image deployment/web nginx=nginx:1.26 -n myapp
kubectl rollout status deployment/web -n myapp

# Scale the worker up
kubectl scale deployment worker --replicas=3 -n myapp
kubectl get pods -n myapp
```
**Cleanup**

```bash
kubectl delete namespace myapp   # deletes everything inside it
```