Table of Contents
- Why Kubernetes Exists
- Cluster Architecture
- Control Plane
- Worker Node Components
- kubectl
- Local Cluster Setup
- Cheat Sheet
Why Kubernetes Exists
Before Kubernetes, running containers in production meant SSHing into servers, starting and stopping containers by hand, writing fragile shell scripts for restarts, and scaling by spinning up machines one by one.
Kubernetes is a container orchestration system. It solves five problems that every team hits at scale.
🔴 Problem 1 — Single Point of Failure
Without Kubernetes:
[App] ──► [Server A] ──✕──► App is DOWN until someone manually fixes it
With Kubernetes:
┌──► [Node A] ──✕── dies
[App] ─────┤──► [Node B] ✓ still running
└──► [Node C] ✓ still running
K8s auto-reschedules pod from A → B or C
Kubernetes detects the dead node and automatically reschedules all pods to healthy ones. No pager. No SSH.
📈 Problem 2 — Manual Scaling
Without Kubernetes:
Traffic spike → someone notices → SSH → docker run ×10 → 20 minutes wasted
With Kubernetes:
Traffic spike → HPA detects high CPU → pods spin up automatically → seconds, no human
The Horizontal Pod Autoscaler watches metrics and adds or removes pods automatically.
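As a minimal sketch, autoscaling can be enabled with one command — this assumes a Deployment named `my-app` already exists and that the metrics-server addon is installed so CPU metrics are available:

```shell
# Scale between 2 and 10 replicas, targeting 70% average CPU utilization
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10

# Watch the autoscaler's current vs. target metrics live
kubectl get hpa my-app --watch
```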
🚀 Problem 3 — Deployment Downtime
Without Kubernetes:
Deploy v2: stop v1 ──► [gap — users see errors] ──► start v2
With Kubernetes (rolling update):
[v1][v1][v1]
↓
[v2][v1][v1] ← v2 started and health-checked first
↓
[v2][v2][v1]
↓
[v2][v2][v2] ← v1 fully gone. Zero downtime.
New pods are health-checked before old ones are removed. Users never see an error page.
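The rollout behavior is tunable per Deployment. A sketch of the two key knobs (the name `my-app` and image are illustrative):

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most 1 extra pod during the rollout
      maxUnavailable: 0  # never drop below the desired replica count
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: nginx:1.25
EOF
```

With `maxUnavailable: 0`, every new pod must pass its health checks before an old one is terminated.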
🔑 Problem 4 — Config & Secrets Chaos
Without Kubernetes:
DB_PASSWORD="hunter2" ← hardcoded in source code
Different .env files per environment ← drift guaranteed
With Kubernetes:
ConfigMaps + Secrets ← config lives outside the image
Same binary ← works in dev / staging / prod
Config is decoupled from the container image entirely.
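A minimal sketch of creating both kinds of objects (the names and values are illustrative):

```shell
# Non-sensitive configuration
kubectl create configmap app-config --from-literal=LOG_LEVEL=info

# Sensitive values — stored as a separate object, base64-encoded at rest
kubectl create secret generic db-credentials --from-literal=DB_PASSWORD=hunter2

# Inspect — secret values are not printed by describe
kubectl get configmap app-config -o yaml
kubectl describe secret db-credentials
```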
🌐 Problem 5 — Service Discovery Breakage
Without Kubernetes:
Service A ──hardcoded──► IP 10.0.0.45 (Service B)
Service B restarts → new IP → Service A breaks
With Kubernetes:
Service A ──DNS──► "payment-service"
Kubernetes routes to the right pods — IPs are irrelevant
Pods are ephemeral and get new IPs on restart. Kubernetes Services provide stable DNS names that always resolve correctly.
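You can verify this from inside any pod — cluster DNS resolves the Service name. The `payment-service` name and `default` namespace below are illustrative:

```shell
# The short name works within the same namespace;
# the fully qualified form works from anywhere in the cluster
kubectl run dns-test --image=busybox --rm -it --restart=Never -- \
  nslookup payment-service.default.svc.cluster.local
```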
Common Misconceptions
| Misconception | Reality |
|---|---|
| Kubernetes replaces Docker | No — K8s orchestrates containers; Docker/containerd runs them |
| Only for large companies | Works equally well for a 2-person startup |
| Cloud-only | Runs on laptops, on-prem hardware, and cloud equally |
| Replaces CI/CD pipelines | K8s is the deployment target, not the pipeline |
Cluster Architecture
A Kubernetes cluster has two types of machines:
- Control Plane — the brain. Makes all scheduling and reconciliation decisions.
- Worker Nodes — the muscle. Runs your actual application containers.
All communication goes through the control plane. You never interact with worker nodes directly.
┌──────────────────────────────────────────────────────────────────────┐
│ KUBERNETES CLUSTER │
│ │
│ You (kubectl) │
│ │ HTTPS │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ CONTROL PLANE │ │
│ │ │ │
│ │ kube-apiserver ◄──────────────► etcd │ │
│ │ (front door · port 6443) (cluster database) │ │
│ │ │ │
│ │ kube-scheduler controller-manager │ │
│ │ (pod placement) (reconciliation loops) │ │
│  └───────────────┬──────────────────────────┬─────────────────┘  │
│ │ │ │
│ ┌────────────▼──────┐ ┌────────────────▼──┐ ┌────────────┐ │
│ │ Worker Node 1 │ │ Worker Node 2 │ │ Worker 3 │ │
│ │ Pod Pod │ │ Pod │ │ Pod Pod │ │
│ │ kubelet │ │ kubelet │ │ kubelet │ │
│ │ kube-proxy │ │ kube-proxy │ │ kube-proxy│ │
│ │ containerd │ │ containerd │ │ containerd│ │
│ └───────────────────┘ └───────────────────┘ └────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
▶ Full flow: kubectl create deployment → running container
1. kubectl reads ~/.kube/config → finds API server URL
2. kube-apiserver authenticates, RBAC check, validates request
3. kube-apiserver writes Deployment object to etcd
4. controller-mgr detects new Deployment → creates ReplicaSet → unscheduled Pod
5. kube-scheduler filters + scores nodes → binds pod to best node
6. kubelet sees assignment → calls containerd via CRI
7. containerd pulls image → calls runc → starts container
8. kubelet reports "Running" to API server
9. etcd pod status updated
10. kubectl get pods → my-app 1/1 Running ✓
Control Plane
kube-apiserver
The single entry point for everything — kubectl, CI/CD pipelines, dashboards, and internal components. Validates requests, enforces RBAC, persists state to etcd. Port 6443 over HTTPS.
▶ The 4-gate request pipeline
Incoming request
│
▼
┌─ GATE 1: AUTHENTICATION ─────────────────────────────────────────┐
│ "Who are you?" │
│ TLS client certificates · Bearer tokens · OIDC │
│ Failure → 401 Unauthorized │
└──────────────────────────────────────┬───────────────────────────┘
│
▼
┌─ GATE 2: AUTHORIZATION (RBAC) ───────────────────────────────────┐
│ "Are you allowed to do this?" │
│ Checks Roles and ClusterRoles bound to the user │
│ Failure → 403 Forbidden │
└──────────────────────────────────────┬───────────────────────────┘
│
▼
┌─ GATE 3: ADMISSION CONTROL ──────────────────────────────────────┐
│ Mutating webhooks → can modify the request │
│ Validating webhooks → can reject the request │
│ Failure → 400 / 422 │
└──────────────────────────────────────┬───────────────────────────┘
│
▼
┌─ GATE 4: PERSIST TO etcd ────────────────────────────────────────┐
│ Object written → 201 Created returned ✓ │
└──────────────────────────────────────────────────────────────────┘
Controllers and kubectl --watch use long-lived HTTP watch streams, not polling:
# GET /api/v1/namespaces/default/pods?watch=true
# Server streams:
# ADDED my-pod Pending
# MODIFIED my-pod Running
# DELETED my-pod
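You can observe the raw watch stream yourself via `kubectl proxy`, which handles authentication locally (a sketch; port 8001 is the proxy's default):

```shell
kubectl proxy --port=8001 &

# -N disables curl's buffering so events print as they arrive
curl -N "http://127.0.0.1:8001/api/v1/namespaces/default/pods?watch=true"
```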
etcd
Distributed key-value store. The single source of truth for all cluster state. Losing etcd without a backup = losing all cluster configuration.
/registry/pods/default/my-pod → { pod spec }
/registry/deployments/default/my-app → { deployment spec }
/registry/services/default/my-svc → { service config }
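In a kubeadm- or kind-provisioned cluster you can list these keys through the etcd pod itself. The `component=etcd` label and certificate paths below are the usual kubeadm defaults — treat them as assumptions:

```shell
# Find the etcd static pod, then list all pod keys in the registry
ETCD_POD=$(kubectl -n kube-system get pods -l component=etcd -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system exec "$ETCD_POD" -- etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/pods --prefix --keys-only
```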
etcd uses Raft consensus — a write commits only when a majority quorum agrees.
▶ Raft quorum explained
3-node etcd cluster:
Leader ──heartbeat──► Follower 1
──heartbeat──► Follower 2
Write committed when majority (2 of 3) agree ✓
Leader + one Follower offline → 1 of 3 → no majority → writes halt ✗
| Cluster Size | Quorum | Failures Tolerated |
|---|---|---|
| 1 | 1 | 0 |
| 3 | 2 | 1 ← production minimum |
| 5 | 3 | 2 ← recommended |
| 7 | 4 | 3 |
Always use an odd number. Even numbers risk a 50/50 split → no quorum → all writes halt.
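The quorum column follows directly from the majority rule: quorum = floor(n/2) + 1, and failures tolerated = n − quorum. A quick shell check reproduces the table:

```shell
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))        # majority of n members
  tolerated=$(( n - quorum ))    # members that can fail while writes continue
  echo "size=$n quorum=$quorum tolerates=$tolerated"
done
```

This also shows why even sizes buy nothing: a 4-node cluster needs a quorum of 3 and still tolerates only 1 failure — same as 3 nodes, with more coordination overhead.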
kube-scheduler
Decides which node a new pod runs on. Does not start the pod — only decides placement.
▶ How the scheduler picks a node
New pod (unscheduled)
│
▼
PHASE 1 — FILTER: "Which nodes can run this pod?"
Node 1: 90% CPU → skip
Node 2: 20% CPU → feasible ✓
Node 3: 55% CPU → feasible ✓
PHASE 2 — SCORE: "Which is the best node?"
Node 2: score 85 (less loaded, image already cached)
Node 3: score 62
PHASE 3 — BIND:
Pod assigned to Node 2 → kubelet starts the container
Diagnosing a pod stuck in Pending:
kubectl describe pod my-pod
# Events:
# Warning FailedScheduling 0/3 nodes are available:
# 3 Insufficient memory.
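Resource requests are exactly what the filter phase checks. A hypothetical pod that over-requests memory reproduces that event on purpose:

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: big-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "512Gi"   # more than any node offers -> no feasible node
EOF

kubectl describe pod big-pod   # Events will show FailedScheduling
kubectl delete pod big-pod     # clean up
```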
controller-manager
Runs all built-in controllers in a single binary. Each one watches for drift between desired and actual state and corrects it — forever.
# Simplified reconciliation loop (pseudocode)
while True:
    desired = api_server.get_desired_state()
    actual = api_server.get_actual_state()
    if desired != actual:
        take_corrective_action()  # create / delete / update
    sleep(brief_interval)
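You can watch a reconciliation loop in action: delete a pod that a Deployment owns, and the ReplicaSet controller recreates it within seconds (the name `demo` is illustrative):

```shell
kubectl create deployment demo --image=nginx --replicas=3
kubectl get pods -l app=demo      # 3 pods

# Kill one pod by hand
POD=$(kubectl get pods -l app=demo -o jsonpath='{.items[0].metadata.name}')
kubectl delete pod "$POD"

kubectl get pods -l app=demo      # still 3 — a replacement was created
kubectl delete deployment demo    # clean up
```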
▶ Built-in controllers
| Controller | Responsibility |
|---|---|
| Deployment Controller | Manages ReplicaSets to maintain desired replicas |
| ReplicaSet Controller | Ensures the correct number of Pods exist |
| Node Controller | Detects node failures, evicts pods from dead nodes |
| Job Controller | Runs pods to completion for batch workloads |
| CronJob Controller | Creates Jobs on a cron schedule |
| Endpoint Controller | Updates Service endpoints when backing pods change |
| Namespace Controller | Cleans up resources when a namespace is deleted |
Production note: Run 3 or 5 control plane nodes for HA. Back up etcd on a schedule. Never run user workloads on control plane nodes.
Worker Node Components
kubelet
Runs on every node — control plane and worker — as a systemd service, not a Kubernetes pod. If it crashes, systemd restarts it.
▶ What kubelet manages
| Responsibility | Detail |
|---|---|
| PodSpecs | Watches the API server for pods assigned to its node |
| Container lifecycle | Start, stop, restart via the container runtime |
| Health checks | Runs liveness, readiness, and startup probes |
| Resource reporting | Reports CPU/memory usage to the API server |
| Volume mounting | Mounts ConfigMaps, Secrets, and PVCs into containers |
| Static pods | Reads YAML from /etc/kubernetes/manifests/ directly |
▶ Static pods — how the control plane bootstraps itself
Static pods allow kubelet to start containers without a running API server, by reading YAML files from disk. This is how K8s starts itself from scratch:
/etc/kubernetes/manifests/
etcd.yaml
kube-apiserver.yaml
kube-controller-manager.yaml
kube-scheduler.yaml
Bootstrap order:
systemd starts kubelet
→ kubelet reads manifests → starts etcd
→ starts kube-apiserver
→ everything else connects normally
# Inspect in a kind cluster:
docker exec kind-control-plane ls /etc/kubernetes/manifests/
kube-proxy
Programs iptables or IPVS rules into the Linux kernel for Service routing. Not in the packet data path at runtime — it writes the rules and steps aside.
Client → GET http://my-service:80
↓
Linux kernel intercepts (rules written by kube-proxy)
├── 33% ──► Pod 10.244.1.5 (Node 1)
├── 33% ──► Pod 10.244.2.8 (Node 2)
└── 34% ──► Pod 10.244.3.2 (Node 3)
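You can inspect the rules kube-proxy wrote on a kind node — chains whose names start with KUBE- are its work. This assumes a kind cluster with the default node name and the default iptables proxy mode:

```shell
docker exec kind-control-plane iptables -t nat -L KUBE-SERVICES -n | head -20
```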
▶ Proxy modes comparison
| Mode | Mechanism | Lookup | Use Case |
|---|---|---|---|
| iptables (default) | Linux iptables chains | O(n) | < ~1,000 Services |
| ipvs | Kernel hash table | O(1) | Large clusters |
| eBPF (Cilium) | Bypasses iptables entirely | Best-in-class | Performance-critical |
containerd
The default container runtime. kubelet calls it via the CRI (Container Runtime Interface).
kubelet → (gRPC / CRI) → containerd → (OCI) → runc → container
▶ The "Docker was removed" myth — clarified
What was removed in K8s 1.24: dockershim (an adapter between kubelet and Docker's API).
What was NOT removed:
✓ Docker images and Dockerfiles → still work perfectly
✓ docker build → still valid locally
Before K8s 1.24: kubelet → dockershim → Docker → containerd → runc
After K8s 1.24: kubelet → containerd → runc
Docker was a middleman. containerd — which Docker itself uses internally — always did the actual work.
| Runtime | Used By | Notes |
|---|---|---|
| containerd | GKE, EKS, AKS, kind | Default — lightweight, fast |
| CRI-O | OpenShift, bare-metal | Built specifically for Kubernetes |
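Because kubelet speaks CRI, the matching debug CLI on a node is `crictl` rather than `docker`. A sketch on a kind node (default cluster name assumed):

```shell
# List containers and images through the CRI — the same interface kubelet uses
docker exec kind-control-plane crictl ps
docker exec kind-control-plane crictl images
```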
kubectl
kubectl reads ~/.kube/config to find your cluster's API server URL and credentials, then talks to kube-apiserver over HTTPS.
Managing Multiple Clusters
kubectl config view
kubectl config get-contexts
kubectl config use-context kind-k8s-lab
kubectl config current-context
kubectl config set-context --current --namespace=my-team
kubectl get pods --context=production-cluster
Command Structure
kubectl [verb] [resource] [name] [flags]
get pods my-pod -n production
describe deployment my-dep
delete service my-svc
apply -f app.yaml
exec -it my-pod -- /bin/bash
logs my-pod -f --previous
▶ Getting information
kubectl get pods
kubectl get pods -n kube-system # specific namespace
kubectl get pods -A # all namespaces
kubectl get pods -o wide # with IP and NODE columns
kubectl get pods -o yaml # full YAML spec
kubectl get pods --watch # live stream
kubectl get pods -l app=my-app # filter by label
kubectl get all -n my-namespace
kubectl describe pod my-pod # full details + Events ← debug here
kubectl describe node my-node
kubectl describe deployment my-dep
> Always read the Events section in kubectl describe — it is the primary debugging signal in Kubernetes.
▶ Creating and updating resources
kubectl apply -f app.yaml # create or update (idempotent) ← always prefer this
kubectl apply -f ./manifests/
kubectl create deployment my-app --image=nginx
kubectl create namespace my-team
kubectl apply is idempotent — creates if absent, updates if present. kubectl create fails if the resource already exists. Use apply in all scripts and pipelines.
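A quick demonstration of the difference (the namespace name is illustrative):

```shell
kubectl create namespace demo-team    # works the first time
kubectl create namespace demo-team    # fails: AlreadyExists

# apply of the same manifest is safe to repeat
kubectl create namespace demo-team --dry-run=client -o yaml | kubectl apply -f -
kubectl create namespace demo-team --dry-run=client -o yaml | kubectl apply -f -
kubectl delete namespace demo-team    # clean up
```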
▶ Deleting resources
kubectl delete pod my-pod
kubectl delete -f app.yaml
kubectl delete deployment my-app # removes ReplicaSet and pods too
kubectl delete pod my-pod --force --grace-period=0 # immediate (stuck pods)
▶ Logs
kubectl logs my-pod
kubectl logs my-pod -f # stream live
kubectl logs my-pod --previous # logs before last restart ← use after crashes
kubectl logs my-pod -c my-container # specific container
kubectl logs my-pod --tail=100
kubectl logs my-pod --since=1h
kubectl logs -l app=my-app # all matching pods
▶ Exec into a pod
kubectl exec -it my-pod -- /bin/bash
kubectl exec -it my-pod -- /bin/sh # if bash is unavailable
kubectl exec my-pod -- ls /app
kubectl exec -it my-pod -c sidecar -- /bin/sh
# Ephemeral debug pod — auto-deleted on exit
kubectl run debug --image=busybox --rm -it --restart=Never -- /bin/sh
▶ Port forwarding
kubectl port-forward pod/my-pod 8080:80
kubectl port-forward service/my-svc 8080:80
kubectl port-forward deployment/my-app 8080:3000
▶ Scaling and rollouts
kubectl scale deployment my-app --replicas=5
kubectl set image deployment/my-app nginx=nginx:1.25
kubectl rollout status deployment/my-app
kubectl rollout history deployment/my-app
kubectl rollout undo deployment/my-app
kubectl rollout undo deployment/my-app --to-revision=3
▶ JSONPath — extract specific fields
kubectl get pod my-pod -o jsonpath='{.status.podIP}'
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
# All pods: namespace, name, IP
kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'
Common Mistakes
| Mistake | Correct Approach |
|---|---|
| Using kubectl create in scripts | Use kubectl apply — idempotent |
| Forgetting -n <namespace> | Use -A or set a default namespace in the context |
| Deleting a pod and expecting it to stay gone | Delete the Deployment — the ReplicaSet recreates pods |
| Skipping the Events section in describe | Events are the primary debugging signal |
| Not using --previous after a crash | kubectl logs my-pod --previous shows pre-crash logs |
Local Cluster Setup
▶ Tool comparison
| Tool | Nodes | Use Case |
|---|---|---|
| kind (recommended) | Multi-node | CI/CD, multi-node testing, closest to production |
| minikube | Single-node | Quick experiments, rich addons |
Install
# macOS
brew install kubectl kind
# Linux — kubectl
curl -LO "https://dl.k8s.io/release/$(curl -sL https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl && sudo mv kubectl /usr/local/bin/
# Linux — kind
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.22.0/kind-linux-amd64
chmod +x kind && sudo mv kind /usr/local/bin/
Create a Multi-Node Cluster
# kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: k8s-lab
nodes:
- role: control-plane
- role: worker
- role: worker
kind create cluster --config kind-config.yaml
kubectl get nodes
# k8s-lab-control-plane Ready control-plane
# k8s-lab-worker Ready <none>
# k8s-lab-worker2 Ready <none>
Verify Health
kubectl get nodes # all nodes → Ready
kubectl get pods -n kube-system # all system pods → Running
kubectl cluster-info
▶ First application lab
# Deploy
kubectl create deployment my-nginx --image=nginx:1.25 --replicas=2
kubectl get pods -o wide
# Expose
kubectl expose deployment my-nginx --port=80 --name=my-nginx-service
kubectl describe service my-nginx-service
# Access
kubectl port-forward service/my-nginx-service 8080:80 &
curl http://localhost:8080
# Scale
kubectl scale deployment my-nginx --replicas=4
# Rolling update
kubectl set image deployment/my-nginx nginx=nginx:1.26
kubectl rollout status deployment/my-nginx
# Rollback
kubectl rollout undo deployment/my-nginx
# Clean up
kill %1
kubectl delete deployment my-nginx
kubectl delete service my-nginx-service
kind Management
kind get clusters
kind create cluster --config kind-config.yaml
kind delete cluster --name k8s-lab
kind load docker-image my-app:latest --name k8s-lab
kubectl config use-context kind-k8s-lab
Cheat Sheet
▶ Architecture & key concepts
Control Plane: kube-apiserver · etcd · kube-scheduler · controller-manager
Worker Nodes: kubelet · kube-proxy · containerd
Full flow: kubectl → apiserver → etcd/scheduler → kubelet → container
Reconciliation: desired ≠ actual → controller fixes it → repeat forever
Static pods: /etc/kubernetes/manifests/ → kubelet reads directly
kubeconfig: ~/.kube/config → cluster URLs + credentials
etcd quorum: always odd — 3 or 5 in production
Scheduler: filter → score → bind to winner
kubelet: systemd service (not a pod); bootstraps the control plane
▶ kubectl quick reference
# Get
kubectl get pods / svc / deploy / nodes
kubectl get pods -A                      # all namespaces
kubectl get pods -o wide                 # +IP +NODE
kubectl get pod <pod> -o yaml            # full spec
kubectl get pods --watch                 # live stream
kubectl describe pod <pod>               # details + Events ← debug here
kubectl explain pod.spec.containers      # built-in docs
# Apply / Delete
kubectl apply -f file.yaml
kubectl delete -f file.yaml
kubectl delete pod <pod> --force
# Logs
kubectl logs <pod> -f
kubectl logs <pod> --previous            # after crash ← important
kubectl logs <pod> -c <container>
kubectl logs -l app=my-app
# Exec / Port-forward
kubectl exec -it <pod> -- /bin/bash
kubectl port-forward svc/<name> 8080:80
# Scale / Rollout
kubectl scale deployment <name> --replicas=5
kubectl set image deployment/<name> <container>=<img>:tag
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>
# Context
kubectl config get-contexts
kubectl config use-context <context>
kubectl config set-context --current --namespace=<ns>
▶ kind quick reference
kind create cluster --config kind-config.yaml
kind get clusters
kind load docker-image <image> --name <cluster>
kind delete cluster --name <cluster>
🤖 AI-generated · Personal reference only · Not original content