Complete Kubernetes Architecture - Master Notes
From Beginner to Expert Level
Table of Contents
- Why Kubernetes? The Real Problem
- Core Concepts & Vocabulary
- Kubernetes Architecture Overview
- Worker Node Components (Data Plane)
- Control Plane Components (Master)
- Complete Deployment Flow
- Pod vs Container Deep Dive
- Real-World Analogies
- Interview Ready Answers
- Advanced Concepts
Why Kubernetes? The Real Problem {#why-kubernetes}
Before Kubernetes Era: Docker's Limitations
Imagine you're running a successful e-commerce website with microservices:
- User Service (handles login/signup)
- Product Service (manages catalog)
- Payment Service (processes transactions)
- Notification Service (sends emails)
With Docker alone, you face these problems:
Problem | Real-World Example |
---|---|
Manual Container Management | Black Friday traffic hits - you need to manually start 10 more containers |
No Auto-Healing | Payment service crashes at 2 AM - nobody knows until customers complain |
Complex Networking | How do 50+ containers find and talk to each other? |
No Load Balancing | All traffic hits one container while others sit idle |
Zero Orchestration | You manually SSH into each server to deploy updates |
Enter Kubernetes: The Solution
Kubernetes = Container Orchestration System
It's like having a smart manager who automatically:
- Deploys your containers
- Monitors their health 24/7
- Scales them up/down based on demand
- Restarts failed containers
- Routes traffic intelligently
Real Impact:
- Netflix runs thousands of microservices on Kubernetes
- Spotify handles millions of users with auto-scaling
- Companies reduce infrastructure costs by 30-50%
Core Concepts & Vocabulary {#core-concepts}
Before diving deep, let's understand the fundamental building blocks:
Essential Terms
Term | Simple Definition | Real-World Analogy |
---|---|---|
Cluster | Group of machines working together | A data center with multiple servers |
Node | Individual machine (physical/virtual) | One server in the data center |
Pod | Smallest deployable unit (1+ containers) | A shipping container with packages inside |
Container Runtime | Software that runs containers | Docker engine or similar |
Control Plane | Brain of Kubernetes | Management office of a factory |
Worker Nodes | Where actual work happens | Factory floor where products are made |
Key Relationships
Cluster
├── Control Plane (1 or more nodes)
│   ├── API Server
│   ├── etcd
│   ├── Scheduler
│   └── Controller Manager
└── Worker Nodes (multiple)
    ├── kubelet
    ├── kube-proxy
    ├── Container Runtime
    └── Pods
        └── Containers
Kubernetes Architecture Overview {#architecture-overview}
Kubernetes follows a master-worker architecture with clear separation of concerns:
Two Main Parts:
1. Control Plane (Master)
   - Role: Decision maker, coordinator, brain
   - Responsibilities: Scheduling, monitoring, storing state
   - Components: API Server, etcd, Scheduler, Controllers
2. Worker Nodes (Data Plane)
   - Role: Executor, muscle, worker
   - Responsibilities: Running containers, networking, reporting status
   - Components: kubelet, kube-proxy, Container Runtime
Communication Flow:
User/CLI → Control Plane → Worker Nodes → Containers
    ↑                           │
    └────── Status Reports ─────┘
Worker Node Components (Data Plane) {#worker-node}
Think of Worker Node as a "Smart Factory Worker"
Each worker has specific tools and responsibilities to get the job done.
Component Breakdown:
1. kubelet - The Node Agent
What it does:
- Acts as the primary agent on each worker node
- Communicates with Control Plane
- Ensures containers are running as specified
- Reports node and pod status back
Real-world analogy: Like a factory supervisor who:
- Gets instructions from management
- Ensures workers are doing their jobs
- Reports back on progress and issues
Technical Details:
- Runs as a system service on each node
- Watches for PodSpecs from API Server
- Uses Container Runtime Interface (CRI) to manage containers
- Performs health checks (liveness, readiness probes)
- Manages volumes and secrets
Example Workflow:
1. Control Plane: "Run nginx pod on Node-1"
2. kubelet receives instruction
3. kubelet β Container Runtime: "Start nginx container"
4. kubelet monitors container health
5. kubelet reports back: "nginx is running successfully"
Kubelet is an agent that runs on every Kubernetes worker node, and it is part of the data plane. Its main responsibilities are:
- Pod Lifecycle Management – It takes PodSpecs from the API Server (via the control plane) and ensures that the containers described in those specs are running and healthy on the node.
- Health Monitoring & Reporting – It continuously monitors the health of both the node and the pods, and reports their status back to the API Server.
- Interaction with Container Runtime – It doesn't directly run containers; instead, it talks to the container runtime (like Docker, containerd, CRI-O) using the Container Runtime Interface (CRI) to actually create, start, and stop containers.
- Other Responsibilities – It also manages pod logs, executes liveness/readiness probes, mounts volumes, and enforces resource limits (CPU, memory) as defined in the PodSpec.
2. kube-proxy - The Network Manager
What it does:
- Manages networking rules on each node
- Implements Kubernetes Services
- Provides load balancing across pod replicas
- Handles traffic routing
Real-world analogy: Like a smart traffic controller who:
- Directs traffic to the right destinations
- Balances load across multiple routes
- Updates routes when roads change
Technical Details:
- Runs as DaemonSet (one per node)
- Uses iptables or IPVS for traffic routing
- Maintains network rules for Services
- Handles NodePort, ClusterIP, LoadBalancer services
Example:
Service: frontend-service
├── Pod 1 (IP: 10.1.1.1)
├── Pod 2 (IP: 10.1.1.2)
└── Pod 3 (IP: 10.1.1.3)
kube-proxy creates rules:
frontend-service:80 → Round-robin to Pod IPs
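A Service like the one above is a small YAML object; here is a minimal sketch (the name, the app: frontend label, and the ports are assumptions for illustration):
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
spec:
  selector:
    app: frontend      # must match the labels on the three pods
  ports:
  - port: 80           # the Service port that kube-proxy exposes
    targetPort: 80     # the container port inside each pod
# kube-proxy watches this Service and programs iptables/IPVS rules
# that spread traffic across the matching pod IPs shown above.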
3. Container Runtime - The Executor
What it does:
- Downloads container images
- Starts/stops containers
- Manages container lifecycle
- Provides container isolation
Supported Runtimes:
- containerd (most popular, Docker's core)
- CRI-O (RedHat's runtime)
- Docker Engine (via dockershim; removed in K8s 1.24)
Technical Details:
- Implements Container Runtime Interface (CRI)
- Handles image pulling from registries
- Manages container networking and storage
- Provides resource isolation (CPU, memory, disk)
Kubernetes Node Components - Interview Explanation
Opening Statement
"I'd like to walk you through the three core components that make every Kubernetes worker node function. Think of each node as a mini data center with specialized roles working together."
1. kubelet - The Node's Brain and Hands
High-Level Explanation
"The kubelet is essentially the Kubernetes agent running on every worker node. It's the bridge between the control plane's decisions and the actual container execution."
Key Responsibilities
- Pod Lifecycle Management: Receives pod specifications from the API server and ensures they're running correctly
- Health Monitoring: Continuously checks if containers are healthy using liveness and readiness probes
- Resource Management: Manages volumes, secrets, and configmaps for pods
- Status Reporting: Sends node and pod status back to the control plane
Interview-Ready Example
"Imagine you deploy an nginx pod. The kubelet receives this instruction, pulls the nginx image, starts the container, sets up networking and storage, then continuously monitors it. If nginx crashes, kubelet restarts it automatically."
Technical Deep-Dive
kubelet workflow:
1. Watches API Server for pod assignments
2. Calls Container Runtime via CRI
3. Sets up networking via CNI
4. Mounts volumes via CSI
5. Runs health checks
6. Reports status back
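On most Linux nodes kubelet runs as a systemd service, so assuming you have SSH access to a worker node, you can watch this workflow with standard tools:
systemctl status kubelet                # is the node agent up?
journalctl -u kubelet -f                # image pulls, container starts, probe results
kubectl describe node <node-name>       # the status kubelet reports back to the API server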
2. kube-proxy - The Traffic Director
High-Level Explanation
"kube-proxy implements Kubernetes Services at the node level. It's not a traditional proxy but a network rule manager that handles traffic routing and load balancing."
Key Responsibilities
- Service Implementation: Translates Service objects into network rules
- Load Balancing: Distributes traffic across healthy pod replicas
- Network Abstraction: Provides stable networking for dynamic pods
Interview-Ready Example
"When you create a Service for 3 nginx pods, kube-proxy creates iptables rules that route traffic to service-ip:80
randomly across the 3 pod IPs. If a pod dies, it automatically removes that endpoint."
Technical Modes
- iptables mode: Uses netfilter rules (default)
- IPVS mode: Better performance for large clusters
- userspace mode: Legacy, rarely used
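If you want to confirm which mode a cluster uses, one common option on kubeadm-based clusters (where kube-proxy reads its settings from a ConfigMap) is:
kubectl -n kube-system get configmap kube-proxy -o yaml | grep mode
# an empty mode value means the default (iptables); "ipvs" means IPVS mode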
3. Container Runtime - The Execution Engine
High-Level Explanation
"The container runtime is what actually runs your containers. kubelet tells it what to do, but the runtime does the heavy lifting of image management and container execution."
Key Responsibilities
- Image Management: Pulls, stores, and manages container images
- Container Lifecycle: Creates, starts, stops, and destroys containers
- Resource Isolation: Enforces CPU, memory, and storage limits
- Security: Implements container isolation and security policies
Modern Runtime Landscape
containerd (most common)
βββ High performance
βββ Industry standard
βββ Docker's core engine
CRI-O (Red Hat ecosystem)
βββ Lightweight
βββ Kubernetes-focused
βββ OCI compliant
How They Work Together
The Complete Flow
- Control Plane schedules a pod to Node-A
- kubelet receives the pod spec from API server
- kubelet calls Container Runtime to start containers
- kube-proxy updates network rules for any new services
- kubelet monitors and reports back to control plane
Real-World Scenario
"Let's say you're deploying a web application:
- kubelet ensures your app containers are running and healthy
- kube-proxy makes sure traffic reaches your app through Services
- Container runtime handles the actual container execution and resource management"
Interview Tips
Common Questions & Answers
Q: "What happens if kubelet fails?"
A: "The node becomes unresponsive to the control plane. Existing pods keep running, but no new pods can be scheduled, and health monitoring stops."
Q: "How does kube-proxy handle service discovery?"
A: "It doesn't handle discovery directly - that's done by DNS (CoreDNS). kube-proxy implements the routing rules once a service is discovered."
Q: "Why did Kubernetes deprecate Docker?"
A: "Docker as a runtime was deprecated because kubelet needed CRI compatibility. Docker Engine includes unnecessary components for K8s. containerd (Docker's core) is still widely used."
Key Points to Emphasize
- These components work independently but collaboratively
- Each has a specific, non-overlapping responsibility
- They're essential for any Kubernetes deployment
- Understanding them helps with troubleshooting production issues
Bonus Technical Details
- kubelet runs as a systemd service (not a pod)
- kube-proxy typically runs as a DaemonSet
- Container runtime communicates via CRI API
- All components are stateless and can be restarted safely
Control Plane Components (Master) {#control-plane}
Think of Control Plane as "Corporate Headquarters"
It makes all the strategic decisions and coordinates the entire operation.
Component Breakdown:
1. API Server - The Gateway
What it does:
- Front-end for the Kubernetes control plane
- Validates and processes all API requests
- Only component that talks to etcd
- Authenticates and authorizes requests
Real-world analogy: Like a company's reception desk that:
- Handles all incoming requests
- Verifies visitor credentials
- Directs requests to appropriate departments
- Maintains security protocols
Technical Details:
- RESTful API with JSON/YAML
- Supports multiple API versions simultaneously
- Implements RBAC (Role-Based Access Control)
- Horizontally scalable for high availability
Example Request Flow:
kubectl apply -f deployment.yaml
↓
1. API Server validates YAML syntax
2. Checks user permissions (RBAC)
3. Validates resource specifications
4. Stores desired state in etcd
5. Returns success/failure response
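You can exercise the first steps of this flow yourself with two standard kubectl features (deployment.yaml is just the example file name used above):
kubectl auth can-i create deployments               # does RBAC allow this request?
kubectl apply -f deployment.yaml --dry-run=server   # runs API Server validation without persisting to etcd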
2. Scheduler - The Decision Maker
What it does:
- Selects which node should run each pod
- Considers resource requirements
- Applies scheduling policies
- Does NOT execute - only decides
Real-world analogy: Like a project manager who:
- Assigns tasks to team members
- Considers workload and skills
- Follows company policies
- But doesn't do the actual work
Technical Details:
- Two-phase process:
  - Filtering: Eliminate unsuitable nodes
  - Scoring: Rank remaining nodes
- Factors considered:
  - Resource requests (CPU, memory)
  - Node affinity/anti-affinity
  - Pod affinity/anti-affinity
  - Taints and tolerations
Example Scheduling Decision:
New Pod Request: nginx (CPU: 100m, Memory: 128Mi)
Available Nodes:
- Node-1: CPU: 50%, Memory: 70% ❌ (insufficient memory)
- Node-2: CPU: 20%, Memory: 30% ✅ (best fit)
- Node-3: CPU: 80%, Memory: 40% ✅ (acceptable)
Scheduler chooses: Node-2
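The CPU/memory numbers the scheduler compares come from the pod's resource requests. A minimal sketch of a pod requesting exactly these resources (names and image tag are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.21
    resources:
      requests:
        cpu: 100m       # used by the scheduler for filtering and scoring
        memory: 128Mi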
3. etcd - The Database
What it does:
- Stores entire cluster state
- Distributed key-value database
- Source of truth for Kubernetes
- Backup/restore point for cluster
Real-world analogy: Like a company's filing system that:
- Keeps all important documents
- Multiple copies for safety
- Everyone refers to it for truth
- Critical for business continuity
Technical Details:
- Consistent and highly-available
- Uses Raft consensus algorithm
- Only API Server can read/write
- Supports watch operations for real-time updates
What's Stored:
/registry/
├── pods/
├── services/
├── deployments/
├── secrets/
├── configmaps/
└── nodes/
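If you ever need to peek at these keys directly (normally everything should go through the API Server), etcdctl can list them. This assumes you are on a control-plane node; the certificate paths follow the common kubeadm layout and may differ in your cluster:
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/pods --prefix --keys-only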
4. Controller Manager - The Maintainer
What it does:
- Ensures desired state = actual state
- Runs multiple controllers simultaneously
- Watches for changes via API Server
- Takes corrective actions
Real-world analogy: Like quality control inspectors who:
- Continuously check if everything is as planned
- Fix issues automatically when possible
- Report problems that need human intervention
Key Controllers:
- ReplicaSet Controller: Maintains pod replicas
- Deployment Controller: Manages deployments
- Service Controller: Manages service endpoints
- Node Controller: Monitors node health
Example Controller Action:
Desired State: 3 nginx pods
Actual State: 2 nginx pods (1 crashed)
Controller Manager:
1. Detects discrepancy
2. Calls API Server to create new pod
3. Scheduler assigns it to a node
4. kubelet starts the container
5. Desired state achieved ✅
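The "desired state: 3 nginx pods" in this example is typically declared as a Deployment. A minimal sketch (names and image tag are illustrative):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3            # the desired state the controllers keep enforcing
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21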
5. Cloud Controller Manager - The Cloud Integrator
What it does:
- Integrates with cloud providers
- Manages cloud-specific resources
- Translates K8s objects to cloud APIs
- Optional (only for cloud deployments)
Cloud Provider Integrations:
- AWS: ELB, EBS, EC2
- GCP: Cloud Load Balancer, Persistent Disks
- Azure: Azure Load Balancer, Azure Disks
Complete Deployment Flow {#deployment-flow}
Let's trace a complete request from kubectl to a running container:
Step-by-Step Workflow:
kubectl apply -f nginx-deployment.yaml
Phase 1: Request Processing
- kubectl sends HTTPS request to API Server
- API Server authenticates user (certificates/tokens)
- API Server authorizes request (RBAC policies)
- API Server validates YAML syntax and schema
- API Server stores desired state in etcd
Phase 2: Scheduling
- Scheduler watches API Server for unscheduled pods
- Scheduler filters suitable nodes (resources, constraints)
- Scheduler scores and selects best node
- Scheduler updates pod spec with nodeName in etcd
Phase 3: Execution
- kubelet on selected node watches API Server
- kubelet sees new pod assignment
- kubelet calls Container Runtime (containerd)
- Container Runtime pulls image from registry
- Container Runtime creates and starts container
Phase 4: Networking
- kubelet reports pod status (IP, ready state)
- kube-proxy updates iptables rules
- Service becomes accessible via ClusterIP
Phase 5: Monitoring
- Controller Manager continuously monitors
- kubelet performs health checks
- Status updates flow back to etcd
Visual Timeline:
T0: kubectl apply
T1: API validation & etcd storage
T2: Scheduler assignment
T3: kubelet receives task
T4: Image pull begins
T5: Container starts
T6: Health checks pass
T7: Service ready ✅
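You can watch most of these phases happen with standard kubectl commands (the file name matches the example above):
kubectl apply -f nginx-deployment.yaml
kubectl get pods -o wide -w                                # Pending → ContainerCreating → Running
kubectl get events --sort-by=.metadata.creationTimestamp   # Scheduled, Pulling, Pulled, Created, Started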
Pod vs Container Deep Dive {#pod-vs-container}
Fundamental Difference:
Aspect | Docker Container | Kubernetes Pod |
---|---|---|
Unit of Work | Single container | 1+ containers |
Networking | Bridge network | Shared IP & ports |
Storage | Individual volumes | Shared volumes |
Lifecycle | Independent | Coupled lifecycle |
Scaling | Manual | Automatic via ReplicaSets |
Why Pods, Not Containers?
1. Shared Network:
Pod: frontend-pod
├── nginx container (port 80)
├── redis container (port 6379)
└── Shared IP: 10.1.1.100
2. Shared Storage:
Pod Volumes:
├── /shared-logs (both containers can read/write)
├── /config (both containers can read)
└── /tmp (ephemeral, shared)
3. Atomic Operations:
- Both containers start together
- Both containers stop together
- Both containers are scheduled on same node
Common Pod Patterns:
1. Single Container Pod (Most Common)
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: web-server
    image: nginx:1.21
2. Sidecar Pattern
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: web-server
    image: nginx:1.21
  - name: log-collector   # Sidecar
    image: fluentd:latest
3. Init Container Pattern
apiVersion: v1
kind: Pod
spec:
  initContainers:
  - name: setup-database
    image: mysql-setup:latest
  containers:
  - name: web-app
    image: my-app:latest
Real-World Analogies {#real-world-analogies}
Kubernetes = Modern Factory
Kubernetes Component | Factory Analogy | Responsibility |
---|---|---|
Control Plane | Management Office | Strategic decisions, planning |
API Server | Reception Desk | Handle all requests, security |
etcd | Filing Cabinet | Store all important documents |
Scheduler | Project Manager | Assign work to best workers |
Controller Manager | Quality Inspector | Ensure everything works as planned |
Worker Nodes | Factory Floor | Where actual production happens |
kubelet | Floor Supervisor | Manage local workers |
kube-proxy | Shipping Coordinator | Handle logistics and routing |
Pods | Production Units | Groups of workers doing related tasks |
Containers | Individual Workers | Specialized tasks within units |
Kubernetes = Orchestra
Component | Orchestra Role | Function |
---|---|---|
Control Plane | Conductor | Coordinates entire performance |
API Server | Concert Hall Manager | Manages audience requests |
Scheduler | Seating Coordinator | Assigns musicians to positions |
Worker Nodes | Orchestra Sections | Violins, brass, percussion |
Pods | Musical Groups | String quartet, brass ensemble |
Containers | Individual Musicians | Violin player, trumpet player |
Interview Ready Answers {#interview-answers}
Q1: Explain Kubernetes Architecture in 2 minutes
"Kubernetes uses a master-worker architecture. The Control Plane acts as the brain with five main components: API Server (handles all requests), etcd (stores cluster state), Scheduler (assigns pods to nodes), Controller Manager (maintains desired state), and Cloud Controller Manager (cloud integration).
Worker Nodes execute the actual work with three components: kubelet (node agent that manages pods), kube-proxy (handles networking and load balancing), and Container Runtime (runs containers).
The flow is: User request → API Server → etcd → Scheduler → kubelet → Container Runtime. Controllers continuously monitor and maintain desired state."
Q2: What happens when you run kubectl apply -f deployment.yaml?
"1. kubectl sends request to API Server
- API Server validates, authenticates, and stores in etcd
- Scheduler watches for unscheduled pods and assigns them to nodes
- kubelet on assigned node sees the task and calls Container Runtime
- Container Runtime pulls image and starts container
- kube-proxy updates networking rules
- Controller Manager monitors to ensure desired replicas are maintained"
Q3: How does Kubernetes ensure high availability?
"Multiple mechanisms: etcd runs in clusters with leader election, Control Plane components can be replicated across nodes, Controllers continuously monitor and heal failed components, Pods are distributed across nodes, and ReplicaSets maintain desired replica counts by automatically replacing failed pods."
Q4: Difference between kubelet and kube-proxy?
"kubelet is the node agent responsible for pod lifecycle - it receives pod specs, manages containers via runtime, and reports status. kube-proxy handles networking - it maintains network rules, implements Services, and provides load balancing across pod replicas. kubelet manages 'what runs', kube-proxy manages 'how traffic flows'."
Advanced Concepts {#advanced-concepts}
1. High Availability Setup
Multi-Master Configuration:
Load Balancer
├── Master-1 (API Server + etcd)
├── Master-2 (API Server + etcd)
└── Master-3 (API Server + etcd)
Worker Nodes
├── Node-1 (kubelet + kube-proxy)
├── Node-2 (kubelet + kube-proxy)
└── Node-N (kubelet + kube-proxy)
2. Container Runtime Evolution
Timeline:
- Docker (original runtime; dockershim removed in K8s 1.24)
- containerd (Docker's core, most popular)
- CRI-O (RedHat, OCI compliant)
- gVisor (Google, security focused)
3. Network Models
Pod-to-Pod Communication:
- Every pod gets unique IP
- Pods can communicate without NAT
- Implemented via CNI plugins (Calico, Flannel, Weave)
Service Types:
- ClusterIP: Internal cluster access
- NodePort: External access via node ports
- LoadBalancer: Cloud provider load balancer
- ExternalName: DNS name mapping
4. Storage Architecture
Volume Types:
- emptyDir: Temporary storage
- hostPath: Node filesystem
- PersistentVolume: Persistent storage
- ConfigMap/Secret: Configuration data
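A minimal sketch of the first volume type, an emptyDir scratch volume that lives as long as the pod (all names here are made up for illustration):
apiVersion: v1
kind: Pod
metadata:
  name: cache-demo
spec:
  containers:
  - name: app
    image: nginx:1.21
    volumeMounts:
    - name: scratch
      mountPath: /cache     # wiped when the pod is deleted
  volumes:
  - name: scratch
    emptyDir: {}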
5. Security Features
RBAC (Role-Based Access Control):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
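A Role has no effect until it is bound to a subject. A matching RoleBinding might look like this (the user name "jane" and the default namespace are assumptions for the example):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io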
Network Policies:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
Quick Reference & Cheat Sheet
Essential Commands:
# Cluster info
kubectl cluster-info
kubectl get nodes
# Pod operations
kubectl get pods
kubectl describe pod <pod-name>
kubectl logs <pod-name>
# Service operations
kubectl get services
kubectl expose deployment <name> --port=80
# Resource monitoring
kubectl top nodes
kubectl top pods
Architecture Summary:
Kubernetes Cluster
│
├── Control Plane (Master)
│   ├── kube-apiserver (API gateway)
│   ├── etcd (Data store)
│   ├── kube-scheduler (Pod placement)
│   ├── kube-controller-manager (State management)
│   └── cloud-controller-manager (Cloud integration)
│
└── Worker Nodes
    ├── kubelet (Node agent)
    ├── kube-proxy (Network proxy)
    └── container-runtime (Container execution)
Component Communication:
kubectl → API Server → etcd
              ↓
Scheduler → kubelet → Container Runtime
              ↓
kube-proxy → iptables/IPVS
Docker vs Containerd - Complete Guide
Understanding Container Runtimes & CLI Tools
Table of Contents
- The Container Evolution Story
- What is Docker Really?
- What is Containerd?
- Why Kubernetes Dropped Docker
- CLI Tools Comparison
- Practical Examples
- When to Use What
- Migration Guide
The Container Evolution Story {#evolution-story}
Phase 1: Docker Dominance (2013-2016)
Imagine the early days:
You want to run applications in containers. Docker was like the only car manufacturer in the world.
Container World in 2013:
├── Docker → Everyone used this
├── rkt (CoreOS) → Very few people
└── LXC → Even fewer people
Problems:
- Kubernetes only worked with Docker
- Other container runtimes (like rkt) couldn't be used
- Vendor lock-in - no choice but Docker
Phase 2: Standards Introduction (2016-2018)
The Industry Said: "We need standards so anyone can build container runtimes!"
Two Important Standards Created:
1. OCI Image Spec
   - How container images should be built
   - Like saying "all cars should have 4 wheels, steering wheel, brakes"
2. OCI Runtime Spec
   - How container runtimes should work
   - Like saying "all cars should start with ignition, have gears, etc."
Phase 3: CRI Introduction (2017)
Kubernetes Said: "We want to support ANY container runtime, not just Docker!"
Container Runtime Interface (CRI) was born:
- Standardized way for Kubernetes to talk to container runtimes
- Any runtime following CRI can work with Kubernetes
- Like creating a universal car interface - any car manufacturer can plug in
Before CRI:
Kubernetes ↔ Only Docker
After CRI:
Kubernetes ↔ CRI ↔ ┬── Containerd
                    ├── CRI-O
                    ├── rkt
                    └── Docker (via dockershim)
Phase 4: Docker's Compatibility Problem
The Issue: Docker was built before CRI existed, so it didn't follow CRI standards.
Kubernetes' Solution: Dockershim
- A translation layer between Kubernetes and Docker
- Like having a language translator for Docker
Kubernetes ↔ CRI ↔ ┬── Direct: Containerd, CRI-O
                    └── Via Translator: Docker (dockershim)
Why This Was Bad:
- Extra complexity
- Maintenance overhead
- Docker got special treatment
What is Docker Really? {#what-is-docker}
Docker is NOT just a container runtime!
It's a complete platform with multiple components.
Docker Architecture:
Docker Platform
├── Docker CLI (command line tool)
├── Docker API (REST API)
├── Build Tools (docker build)
├── Volume Management
├── Authentication & Security
├── Container Runtime (runC)
└── Docker Daemon (manages everything)
    └── Containerd (the actual runtime)
Real-World Analogy:
Docker = Complete Car Manufacturing Company
- Has showroom (CLI)
- Has service center (API)
- Has assembly line (build tools)
- Has financing (authentication)
- AND the actual car engine (Containerd)
Containerd = Just the Car Engine
- Can work independently
- More focused, lightweight
- Does one job very well
What is Containerd? {#what-is-containerd}
Containerd is the actual container runtime that was inside Docker all along!
Key Facts:
Aspect | Details |
---|---|
Origin | Originally part of Docker |
Status | Now independent CNCF project |
Purpose | Pure container runtime |
CRI Support | ✅ Native CRI compatible |
Size | Smaller, lighter than full Docker |
Focus | Container lifecycle management |
What Containerd Does:
- Pulls container images
- Starts/stops containers
- Manages container lifecycle
- Handles storage and networking
- Provides low-level container operations
What Containerd DOESN'T Do:
❌ Build images (no docker build)
❌ High-level CLI (only the basic ctr tool)
❌ Volume management like Docker
❌ Compose file support
❌ Registry authentication (basic only)
Why Kubernetes Dropped Docker {#kubernetes-docker}
The Timeline:
Kubernetes 1.20 (Dec 2020):
├── ⚠️ Docker deprecation warning
└── "Dockershim will be removed"
Kubernetes 1.24 (May 2022):
├── ❌ Dockershim removed
├── ❌ Direct Docker support ended
└── ✅ But Docker images still work!
Reasons for Removal:
1. Maintenance Burden
   - Dockershim was extra code to maintain
   - Only Docker needed special treatment
2. Unnecessary Complexity
   - Docker brought extra layers
   - Kubernetes only needed container runtime
3. Standardization
   - CRI became the standard
   - Docker was the exception
What This Means:
Before K8s 1.24:
kubectl → API Server → dockershim → Docker → Containerd → runC
After K8s 1.24:
kubectl → API Server → CRI → Containerd → runC
Result: Shorter path, less complexity, better performance!
Important Note:
Docker images still work perfectly!
Because Docker follows OCI Image Spec, all Docker images are compatible with Containerd.
CLI Tools Comparison {#cli-tools}
Now let's understand the different command-line tools:
1. Docker CLI
Purpose: Complete container management
Best For: Development, building images, general use
# Docker commands (traditional)
docker pull nginx
docker run -d -p 8080:80 nginx
docker build -t myapp .
docker ps
docker logs container_name
2. ctr (Containerd CLI)
Purpose: Low-level debugging of Containerd
Best For: Debugging only, NOT for production
# ctr commands (debugging only)
ctr images pull docker.io/library/nginx:latest
ctr run docker.io/library/nginx:latest my-nginx
ctr images list
ctr containers list
Limitations:
- ❌ Not user-friendly
- ❌ Limited features
- ❌ No port mapping like -p 8080:80
- ❌ No volume mounting
- ⚠️ Only for debugging
3. nerdctl (Better Containerd CLI)
Purpose: Docker-like experience for Containerd
Best For: General use, production, Docker replacement
# nerdctl commands (Docker-like)
nerdctl pull nginx
nerdctl run -d -p 8080:80 nginx # Same as Docker!
nerdctl ps
nerdctl logs container_name
nerdctl build -t myapp . # Can build images!
Advantages:
- ✅ Docker-compatible commands
- ✅ All Docker features + new Containerd features
- ✅ Production ready
- ✅ Image building support
- ✅ Advanced features (encrypted images, lazy pulling)
4. crictl (CRI CLI)
Purpose: Debugging CRI-compatible runtimes
Best For: Kubernetes troubleshooting
# crictl commands (Kubernetes debugging)
crictl pull nginx
crictl images
crictl ps # Lists containers
crictl pods # Lists pods (K8s specific)
crictl exec -it <id> /bin/sh
crictl logs <container_id>
Key Points:
- ✅ Works with all CRI runtimes (Containerd, CRI-O)
- ✅ Kubernetes-aware (shows pods)
- ⚠️ Don't create containers with crictl in production
- ⚠️ Kubelet will delete containers created outside its control
Practical Examples {#practical-examples}
Scenario 1: Running nginx with different tools
Docker Way:
docker run -d -p 8080:80 --name web-server nginx
curl http://localhost:8080
nerdctl Way (Containerd):
nerdctl run -d -p 8080:80 --name web-server nginx
curl http://localhost:8080
ctr Way (Debugging only):
ctr images pull docker.io/library/nginx:latest
ctr run -d docker.io/library/nginx:latest web-server
# No port mapping possible with ctr!
crictl Way (Not recommended):
crictl pull nginx
# Don't use crictl to create containers!
# Kubelet will delete them
Scenario 2: Building Images
Docker Way:
docker build -t myapp:v1 .
docker push myapp:v1
nerdctl Way:
nerdctl build -t myapp:v1 .
nerdctl push myapp:v1
ctr Way:
# ❌ Can't build images with ctr
# You need external tools
Scenario 3: Kubernetes Troubleshooting
Check containers in K8s cluster:
# Use crictl (designed for this)
crictl ps # All containers
crictl pods # All pods
crictl logs <pod_id> # Pod logs
# Don't use nerdctl/ctr for this
# They don't understand Kubernetes
When to Use What {#when-to-use}
Decision Matrix:
Use Case | Recommended Tool | Why? |
---|---|---|
Local Development | Docker or nerdctl | Full features, easy to use |
Production K8s | Containerd (runtime) | Lightweight, CRI compatible |
Building Images | Docker or nerdctl | Both support build command |
K8s Debugging | crictl | Kubernetes-aware |
Containerd Debugging | ctr | Low-level access |
Learning Containers | Docker | Best documentation |
Platform-Specific Recommendations:
Production Kubernetes:
Runtime: Containerd
CLI for debugging: crictl
Management: kubectl (for K8s objects)
Development Environment:
Option 1: Docker (if you need build features)
Option 2: nerdctl + Containerd (Docker alternative)
Cloud Managed Kubernetes:
EKS: Containerd (default)
GKE: Containerd (default)
AKS: Containerd (default)
CLI: crictl for debugging
Migration Guide {#migration-guide}
Migrating from Docker to Containerd:
Step 1: Install Containerd
# Ubuntu/Debian
apt-get install containerd
# Configure containerd
containerd config default > /etc/containerd/config.toml
systemctl enable --now containerd
Step 2: Install nerdctl (optional)
# Download from GitHub releases
wget https://github.com/containerd/nerdctl/releases/download/v1.0.0/nerdctl-1.0.0-linux-amd64.tar.gz
tar -xzf nerdctl-1.0.0-linux-amd64.tar.gz
sudo mv nerdctl /usr/local/bin/
Step 3: Update Kubernetes
# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
Step 4: Restart kubelet
systemctl restart kubelet
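After the restart, you can verify that the node actually switched to containerd (both are standard commands):
kubectl get nodes -o wide     # the CONTAINER-RUNTIME column should now show containerd://<version>
crictl info                   # dumps the CRI runtime status/config that kubelet is talking to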
Command Translation:
Docker Command | nerdctl Command | Notes |
---|---|---|
docker pull nginx | nerdctl pull nginx | Same syntax |
docker run -d nginx | nerdctl run -d nginx | Same syntax |
docker ps | nerdctl ps | Same syntax |
docker build -t app . | nerdctl build -t app . | Same syntax |
docker images | nerdctl images | Same syntax |
Interview Questions & Answers
Q1: Why did Kubernetes remove Docker support?
"Kubernetes removed Docker support in v1.24 because Docker required a special translation layer called dockershim, while other runtimes worked directly with CRI. This added maintenance overhead and complexity. Docker images still work because they follow OCI standards, but now Kubernetes uses Containerd directly for better performance and simplicity."
Q2: What's the difference between Docker and Containerd?
"Docker is a complete platform with CLI, API, build tools, and runtime. Containerd is just the runtime component that was inside Docker. Think of Docker as a complete car manufacturing company, while Containerd is just the engine. Containerd is lighter, CRI-compatible, and perfect for Kubernetes production environments."
Q3: When should I use crictl vs nerdctl?
"Use crictl for Kubernetes debugging - it understands pods and works with kubelet. Use nerdctl for general container management as a Docker replacement - it has all Docker features plus Containerd-specific ones. Never use crictl to create containers in production as kubelet will delete them."
Q4: Can I still use Docker images with Containerd?
"Yes! Docker images are fully compatible with Containerd because both follow OCI Image Spec standards. You can pull Docker Hub images and run them with Containerd without any changes."
Quick Reference
Runtime Comparison:
Docker:
✅ Complete platform
✅ Best for development
❌ Heavy for K8s production
❌ Requires dockershim (deprecated)
Containerd:
✅ Lightweight runtime
✅ CRI native
✅ Perfect for K8s
❌ Basic CLI only
CLI Tool Summary:
ctr: Containerd debugging only
nerdctl: Docker replacement for Containerd
crictl: Kubernetes debugging
docker: Traditional full-feature CLI
Modern Stack:
Development: Docker or nerdctl
Production K8s: Containerd + crictl for debugging
CI/CD: Docker or nerdctl for building
Key Takeaways
- Docker != Container Runtime - Docker is a platform, Containerd is the runtime
- Containerd came from Docker - It was extracted as a separate project
- K8s dropped Docker for simplicity, not because of problems with Docker
- Docker images still work - OCI standards ensure compatibility
- Use right tool for right job - nerdctl for general use, crictl for K8s debugging
- Future is Containerd - Lighter, faster, standard-compliant
Kubernetes Architecture - Interview Ready Notes
Simple explanations, real scenarios, and common interview questions
Table of Contents
- Quick Overview - The Big Picture
- ETCD - The Memory Bank
- API Server - The Reception Desk
- Controller Manager - The Manager
- Scheduler - The HR Department
- Kubelet - The Site Supervisor
- Kube Proxy - The Network Guy
- Real Interview Scenarios
- Troubleshooting Like a Pro
Quick Overview - The Big Picture {#overview}
Simple Analogy: Kubernetes = A Construction Company
Think of Kubernetes like a construction company:
- Head Office (Control Plane): Makes all decisions
- Construction Sites (Worker Nodes): Where actual work happens
- Containers: The workers doing the job
Why This Architecture?
Interview Question: "Why does Kubernetes have so many components? Isn't it complex?"
Answer with Reasoning:
"Actually, it's brilliant! Each component has ONE job and does it well:
- If scheduler fails, pods stop getting placed, but running pods continue
- If API server fails, no new changes, but existing workloads run
- This separation makes debugging easier and scaling possible"
The Components in Simple Terms:
Component | Real-World Job | What It Does |
---|---|---|
ETCD | Company Database | Remembers everything |
API Server | Reception Desk | Handles all requests |
Scheduler | HR Department | Decides who works where |
Controller Manager | Site Manager | Ensures work gets done |
Kubelet | Site Supervisor | Manages workers on-site |
Kube Proxy | Network Engineer | Connects everything |
ETCD - The Memory Bank {#etcd}
The Simple Explanation
ETCD is like your phone's contacts list - it stores everything important and everyone asks it for information.
Why Key-Value Store?
Interview Question: "Why not use MySQL or PostgreSQL for Kubernetes?"
Reasoning:
Traditional Database (Like Excel):
| Name | Age | Department | Salary |
|------|-----|------------|--------|
| John | 25 | IT | 50000 |
| Mary | 30 |    |    |        | ← Lots of empty cells!
Key-Value Store (Like JSON):
{
"employee_1": {"name": "John", "age": 25, "dept": "IT", "salary": 50000},
"student_1": {"name": "Mary", "course": "CS", "grade": "A"}
}
Benefits:
✅ Flexible structure
✅ No empty fields
✅ Fast lookups
✅ Distributed easily
What ETCD Actually Stores
Interview Tip: When asked "What's in ETCD?", say:
"Everything you see with
kubectl get
commands is stored in ETCD"
# All this data lives in ETCD:
kubectl get pods # Pod definitions and status
kubectl get nodes # Node information
kubectl get services # Service configurations
kubectl get secrets # Encrypted secrets
kubectl get configmaps # Configuration data
Common ETCD Interview Questions
Q1: "What happens if ETCD goes down?"
Answer: "The cluster becomes read-only. Running pods continue working,
but you can't create/update/delete anything. It's like losing your
phone's contacts - you can still call people you remember, but can't
look up new numbers."
Q2: "How do you backup ETCD?"
# Simple backup command
etcdctl snapshot save my-backup.db
# Why backup?
# "ETCD is your cluster's brain. Lose it = lose everything.
# It's like backing up your entire company database."
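The other half of the story is restoring from that snapshot. A minimal sketch (the data directory below is an assumption; match it to your etcd configuration):
ETCDCTL_API=3 etcdctl snapshot status my-backup.db        # sanity-check the snapshot first
ETCDCTL_API=3 etcdctl snapshot restore my-backup.db --data-dir /var/lib/etcd-restored
# then point the etcd static pod (or systemd unit) at the restored data directory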
Q3: "ETCD vs Redis - what's the difference?"
ETCD: Built for consistency and reliability (CP in CAP theorem)
Redis: Built for speed and availability (AP in CAP theorem)
Kubernetes needs consistency - can't have conflicting cluster states!
API Server - The Reception Desk {#api-server}
The Simple Explanation
API Server is like a company's reception desk - everyone has to go through it, it checks who you are, and directs you to the right department.
Why Only API Server Talks to ETCD?
Interview Question: "Why can't other components directly access ETCD?"
Reasoning:
Imagine if everyone in a company could directly access the main database:
❌ No security checks
❌ No validation
❌ Data corruption
❌ No audit trail
With API Server as gatekeeper:
✅ Authentication: "Who are you?"
✅ Authorization: "Are you allowed to do this?"
✅ Validation: "Is your request correct?"
✅ Audit: "Log everything for security"
Real-World Example: Creating a Pod
Interview Scenario: "Walk me through what happens when I run kubectl create pod"
graph TD
A[You: kubectl create pod] --> B[API Server: Who are you?]
B --> C{Valid User?}
C -->|No| D[Error: Authentication failed]
C -->|Yes| E[API Server: Is this request valid?]
E --> F{Valid YAML?}
F -->|No| G[Error: Invalid specification]
F -->|Yes| H[API Server: Save to ETCD]
H --> I[ETCD: Pod saved as 'Pending']
I --> J[API Server: Tell user 'Pod created']
J --> K[Scheduler: Hey, there's a new pod to place!]
Your Answer:
"First, API server authenticates me and validates the pod spec.
If valid, it saves the pod to ETCD with status 'Pending' and
immediately returns success to me. Meanwhile, scheduler notices
the new pod and starts finding a node for it. This async design
is why kubectl returns quickly even for complex operations."
Common API Server Interview Questions
Q1: "Why is API server the only component that talks to ETCD?"
"Security and consistency. API server acts as a bouncer - it checks
permissions, validates requests, and ensures data integrity. If every
component could access ETCD directly, it would be chaos - like letting
everyone directly modify a company's main database."
Q2: "What's the difference between authentication and authorization?"
Authentication: "Who are you?" (like showing ID at building entrance)
Authorization: "What can you do?" (like having a key card for specific floors)
Example:
- User 'john' is authenticated ✅
- But john can only read pods, not delete them ❌
- So 'kubectl delete pod' would fail with authorization error
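You can check exactly this split from the command line: kubectl auth can-i answers the authorization question, and the --as flag lets an admin ask on behalf of another user (assuming john's RBAC is set up as in this example):
kubectl auth can-i get pods --as john       # yes
kubectl auth can-i delete pods --as john    # no - so the delete would fail as Forbidden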
Controller Manager - The Manager {#controller-manager}
The Simple Explanation
Controller Manager is like a project manager who constantly checks if everything is going according to plan and fixes problems.
The Controller Concept
Interview Question: "What is a controller in Kubernetes?"
Simple Answer:
"A controller is like a thermostat:
1. It knows the desired state (temperature you set)
2. It monitors current state (actual temperature)
3. It takes action when they don't match (turn on heating/cooling)
In Kubernetes:
- Deployment Controller: 'I need 3 replicas running'
- Node Controller: 'All nodes should be healthy'
- Replication Controller: 'Replace failed pods immediately'"
Real Controllers in Action
Node Controller - The Health Monitor
Interview Scenario: "A worker node suddenly goes down. What happens?"
Step-by-step:
Time 0:00 - Node is healthy, sending heartbeats every 5 seconds
Time 0:05 - Last heartbeat received
Time 0:45 - Node Controller: "40 seconds, no heartbeat. Mark as Unknown"
Time 5:45 - Node Controller: "5 minutes passed. This node is dead!"
Time 5:46 - Node Controller: "Move all pods to healthy nodes"
Your Answer:
"The Node Controller gives the node 40 seconds to respond, then marks
it as 'Unknown'. After 5 minutes total, it assumes the node is dead
and tells other controllers to reschedule the pods elsewhere. This
prevents data loss and maintains application availability."
Deployment Controller - The Replica Manager
Interview Question: "I have a deployment with 3 replicas, and I manually delete one pod. What happens?"
1. Pod gets deleted
2. Deployment Controller notices: "Wait, I see only 2 pods, but need 3!"
3. Controller creates a new pod immediately
4. Scheduler assigns it to a node
5. Kubelet starts the pod
Why? "The controller's job is to maintain desired state, not ask why
things changed. It just fixes the gap."
Common Controller Manager Interview Questions
Q1: "What's the difference between Deployment Controller and ReplicaSet Controller?"
Think of it like management hierarchy:
- Deployment Controller: "I manage rollouts and versions"
- ReplicaSet Controller: "I just maintain the right number of pods"
When you update a deployment:
1. Deployment Controller creates new ReplicaSet
2. New ReplicaSet Controller starts creating new pods
3. Old ReplicaSet Controller scales down old pods
4. Deployment Controller manages this transition
Q2: "Can you write your own controller?"
"Yes! That's how operators work. You write code that:
1. Watches for changes in your custom resources
2. Compares current vs desired state
3. Takes action to fix differences
Example: Database Operator that automatically backs up databases"
Scheduler - The HR Department {#scheduler}
The Simple Explanation
Scheduler is like HR department - it doesn't hire people (create pods), it just decides which team (node) they should join.
The Two-Phase Process
Interview Question: "How does the scheduler decide where to place a pod?"
Phase 1: Filtering (Elimination Round)
"Like filtering job candidates:
❌ Node1: Not enough memory (like candidate without required skills)
❌ Node2: Has taint that pod can't tolerate (like location preference mismatch)
✅ Node3: Meets all requirements
✅ Node4: Meets all requirements"
Phase 2: Scoring (Final Selection)
"Rank remaining nodes (0-10 scale):
Node3: 7/10 (decent resources, but high load)
Node4: 9/10 (lots of free resources, low load)
Winner: Node4!"
Real-World Scheduling Example
Interview Scenario: "I have a pod that needs 2 CPU and 4GB RAM. Walk through the scheduling decision."
Available Nodes:
┌─────────┬─────────┬─────────┬─────────┐
│ Node    │ CPU     │ RAM     │ Status  │
├─────────┼─────────┼─────────┼─────────┤
│ Node1   │ 1 CPU   │ 8GB     │ ❌ Out  │
│ Node2   │ 4 CPU   │ 2GB     │ ❌ Out  │
│ Node3   │ 3 CPU   │ 6GB     │ ✅ In   │
│ Node4   │ 8 CPU   │ 16GB    │ ✅ In   │
└─────────┴─────────┴─────────┴─────────┘
Scoring (least-requested style: free fraction per resource × 10, averaged):
Node3: (3-2)/3 * 10 ≈ 3.3 (CPU), (6-4)/6 * 10 ≈ 3.3 (RAM) → score ≈ 3.3/10
Node4: (8-2)/8 * 10 = 7.5 (CPU), (16-4)/16 * 10 = 7.5 (RAM) → score = 7.5/10
Winner: Node4 (more free resources = higher score)
Advanced Scheduling Concepts
Interview Question: "What are taints and tolerations? Give a real example."
Real Scenario: "You have GPU nodes that are expensive. You don't want
regular pods wasting them."
Solution:
1. Taint GPU nodes: kubectl taint node gpu-node1 gpu=nvidia:NoSchedule
2. Only ML pods tolerate this taint:
tolerations:
- key: "gpu"
  operator: "Equal"
  value: "nvidia"
  effect: "NoSchedule"
Result: Only ML workloads can run on GPU nodes, saving money!
Common Scheduler Interview Questions
Q1: "What happens if the scheduler is down?"
"New pods get stuck in 'Pending' state. Running pods are unaffected
because kubelet manages them. It's like HR being on vacation - current
employees keep working, but new hires can't be assigned to teams."
Q2: "Can you have multiple schedulers?"
"Yes! Specify scheduler name in pod spec:
spec:
schedulerName: my-custom-scheduler
Use cases:
- GPU workload scheduler
- Cost-optimizing scheduler
- Compliance-aware scheduler"
Kubelet - The Site Supervisor {#kubelet}
The Simple Explanation
Kubelet is like a construction site supervisor - it manages all work happening on that specific site and reports back to headquarters.
Kubelet's Daily Routine
Interview Question: "What does kubelet actually do on each node?"
The Daily Checklist:
Every 10 seconds, kubelet asks:
1. "API Server, any new pods assigned to my node?"
2. "Are all my current pods healthy?"
3. "Do I need to pull any new container images?"
4. "Should I restart any crashed containers?"
5. "Let me report everything back to API server"
Real-World Example: Pod Lifecycle
Interview Scenario: "A pod is assigned to your node. Walk through what kubelet does."
graph TD
A[API Server: Pod assigned to this node] --> B[Kubelet: New pod detected]
B --> C[Kubelet: Pull container images]
C --> D[Kubelet: Create containers via runtime]
D --> E[Container Runtime: Start containers]
E --> F[Kubelet: Monitor pod health]
F --> G[Kubelet: Report status to API Server]
G --> H[API Server: Update ETCD]
H --> F
Your Answer:
"Kubelet first pulls the required images, then asks the container
runtime (like Docker) to create containers. It continuously monitors
the pod's health and reports status back to API server. If a container
crashes, kubelet restarts it based on the restart policy."
Container Runtime Interface (CRI)
Interview Question: "What's the difference between kubelet and Docker?"
Think of it like this:
- Kubelet: Site supervisor who gives orders
- Container Runtime (Docker/containerd): Construction crew who does the work
Kubelet says: "Create a container with this image"
Docker/containerd says: "Done! Container is running"
This separation allows Kubernetes to work with different runtimes:
- Docker
- containerd
- CRI-O
- gVisor (for security)
Health Monitoring
Interview Question: "How does kubelet know if a pod is healthy?"
The Three Types of Probes:
1. Liveness Probe: "Is the app still alive?"
- Fails: Restart the container
- Like checking if a worker is responsive
2. Readiness Probe: "Is the app ready to serve traffic?"
- Fails: Remove from service endpoints
- Like checking if a restaurant is ready for customers
3. Startup Probe: "Has the app finished starting up?"
- For slow-starting applications
- Like giving extra time for a complex setup
Real Example:
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
# Kubelet checks /health every 10 seconds
# If it fails, restart the container
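A readiness probe is declared the same way and usually sits right next to it; a minimal sketch (path and port are assumptions matching the example above):
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
# A failing readiness probe does NOT restart the container - kubelet marks
# the pod as not Ready and it is removed from Service endpoints until it passes.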
Common Kubelet Interview Questions
Q1: "What happens if kubelet goes down on a node?"
"Existing pods keep running (containers don't need kubelet to stay alive),
but:
- No new pods can be scheduled to that node
- Failed containers won't be restarted
- Pod status won't be updated in API server
- Health checks stop working
It's like a site supervisor leaving - work continues, but no management."
Q2: "How does kubelet authenticate with API server?"
"Kubelet uses a kubeconfig file with certificates, just like kubectl:
- Certificate-based authentication
- Node authorization (kubelet can only access its own node's resources)
- TLS encryption for all communications"
Kube Proxy - The Network Guy {#kube-proxy}
The Simple Explanation
Kube Proxy is like a smart receptionist who knows how to route phone calls to the right person, even when people move desks.
The Core Problem It Solves
Interview Question: "Why do we need kube-proxy? Can't pods talk directly?"
The Problem:
Pods can talk to each other directly, BUT:
❌ Pod IPs change when pods restart
❌ Multiple pods behind one service - which one to call?
❌ Load balancing needed
❌ Service discovery is hard
Example:
Frontend pod needs to call backend, but:
- Backend pod restarts → New IP address
- 3 backend replicas → Which one to choose?
- Backend pod moves to different node → Different IP
The Solution:
Service creates stable endpoint:
- Service IP: 10.96.0.100 (never changes)
- Kube-proxy creates rules: "10.96.0.100 → 10.244.1.5, 10.244.2.8, 10.244.3.2"
- Traffic gets load balanced across all backend pods
How Traffic Flow Works
Interview Scenario: "A frontend pod calls a backend service. Trace the network path."
graph LR
A[Frontend Pod<br/>10.244.1.10] --> B[Service: backend<br/>10.96.0.100:80]
B --> C[Kube-proxy rules]
C --> D[Backend Pod 1<br/>10.244.2.15:8080]
C --> E[Backend Pod 2<br/>10.244.2.16:8080]
C --> F[Backend Pod 3<br/>10.244.3.20:8080]
Step-by-step:
1. Frontend pod calls http://backend:80
2. DNS resolves 'backend' to service IP 10.96.0.100
3. Packet goes to 10.96.0.100:80
4. Kube-proxy's iptables rules intercept the packet
5. Rules randomly select one backend pod (e.g., 10.244.2.15:8080)
6. Packet gets forwarded to selected pod
7. Response comes back through same path
Kube-proxy Modes
Interview Question: "What are the different kube-proxy modes?"
1. iptables Mode (Most Common)
# Example rule created by kube-proxy
iptables -t nat -A KUBE-SERVICES -d 10.96.0.100/32 -p tcp --dport 80 -j KUBE-SVC-BACKEND
# Translation: "Traffic to 10.96.0.100:80 goes to KUBE-SVC-BACKEND chain"
2. IPVS Mode (Better Performance)
Advantages:
✅ Better for large clusters (1000+ services)
✅ More load balancing algorithms
✅ Lower latency
✅ Better debugging tools
When to use: Large production clusters
3. Userspace Mode (Legacy)
❌ Deprecated
❌ High overhead (traffic goes through userspace)
❌ Only for very old clusters
Service Types and Kube-proxy
Interview Question: "Explain different service types and how kube-proxy handles them."
ClusterIP (Default):
apiVersion: v1
kind: Service
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
  - port: 80
    targetPort: 8080
# Kube-proxy creates internal load balancing rules
# Only accessible within cluster
NodePort:
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080
# Kube-proxy creates rules on ALL nodes
# External traffic to any-node-ip:30080 → backend pods
LoadBalancer:
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
# Cloud provider creates external load balancer
# Kube-proxy handles traffic once it reaches nodes
Common Kube Proxy Interview Questions
Q1: "What happens if kube-proxy crashes on a node?"
"Services stop working on that node!
- Existing connections might continue (TCP is stateful)
- New connections to services fail
- Direct pod-to-pod communication still works
- Fix: kube-proxy runs as DaemonSet, so it restarts automatically"
Q2: "How does kube-proxy know about service changes?"
"It watches the API server for service and endpoint changes:
1. New service created → kube-proxy creates iptables rules
2. Pod added to service → kube-proxy adds pod to rules
3. Service deleted → kube-proxy removes all related rules
It's event-driven, so changes are reflected quickly."
Real Interview Scenarios {#interview-scenarios}
Scenario 1: Pod Stuck in Pending State
Interviewer: "A pod has been in Pending state for 10 minutes. How do you troubleshoot?"
Your Approach:
# Step 1: Check pod events
kubectl describe pod stuck-pod
# Common causes and solutions:
1. "Insufficient resources" β Check node capacity
kubectl top nodes
2. "No nodes available" β Check node selectors/affinity
kubectl get nodes --show-labels
3. "Taint violations" β Check tolerations
kubectl describe node problematic-node
4. "Image pull errors" β Check image name/registry access
kubectl logs stuck-pod
Interview Tip: Always explain your thought process:
"I start with 'kubectl describe pod' because it shows events chronologically.
The scheduler tries to place the pod, and any failure reason appears here.
Based on the error, I drill down to the specific component causing issues."
Scenario 2: Cluster Performance Issues
Interviewer: "Users complain that kubectl commands are slow. What do you check?"
Your Troubleshooting Path:
# 1. Check API server health
kubectl get componentstatuses
# 2. Check API server logs
kubectl logs -n kube-system kube-apiserver-master1
# 3. Check ETCD performance
etcdctl endpoint status --cluster
etcdctl endpoint health
# 4. Check resource usage
kubectl top nodes
kubectl top pods -n kube-system
Root Cause Analysis:
Common causes:
1. ETCD performance issues (slow disk, network latency)
2. API server overloaded (too many requests)
3. Network problems between components
4. Resource constraints on master nodes
Solution approach:
- Scale API server horizontally
- Optimize ETCD (SSD storage, tune parameters)
- Implement request rate limiting
- Monitor and alert on component health
Scenario 3: Node Failure Recovery
Interviewer: "A worker node suddenly becomes unreachable. Walk me through what happens and how you'd respond."
Automatic Recovery Process:
Timeline:
0:00 - Node stops sending heartbeats
0:40 - Node Controller marks node as "Unknown"
5:00 - Node Controller marks node as "NotReady"
5:01 - Pod eviction begins
5:02 - Pods rescheduled to healthy nodes
Your Actions:
1. Verify node is truly down (not network issue)
2. Check if node can be recovered
3. If permanent failure, drain and remove node
4. Monitor pod rescheduling
5. Investigate root cause
Commands to Run:
# Check node status
kubectl get nodes
kubectl describe node failed-node
# Check pod distribution
kubectl get pods -o wide
# If node is permanently dead
kubectl drain failed-node --ignore-daemonsets --delete-emptydir-data
kubectl delete node failed-node
# Add replacement node
kubeadm join <master-ip>:6443 --token <token>
Scenario 4: Service Discovery Issues
Interviewer: "Pods can't reach a service by name. How do you debug?"
Debugging Steps:
# 1. Test DNS resolution
kubectl run test-pod --image=busybox --rm -it -- nslookup my-service
# 2. Check service exists and has endpoints
kubectl get service my-service
kubectl get endpoints my-service
# 3. Check kube-proxy rules
kubectl get pods -n kube-system | grep kube-proxy
kubectl logs -n kube-system kube-proxy-xxxxx
# 4. Test direct pod access
kubectl get pods -l app=my-app -o wide
kubectl exec test-pod -- wget -qO- pod-ip:port
Common Issues and Solutions:
1. Service has no endpoints:
- Check selector labels match pod labels
- Verify pods are running and ready
2. DNS not working:
- Check CoreDNS pods are running
- Verify DNS policy in pod spec
3. Kube-proxy issues:
- Check kube-proxy is running on all nodes
- Verify iptables rules are created
4. Network policy blocking traffic:
- Check for restrictive network policies
- Test with temporary policy allowing all traffic
Troubleshooting Like a Pro {#troubleshooting}
The Systematic Approach
Interview Question: "How do you approach troubleshooting in Kubernetes?"
The SCALE Method:
S - Symptoms: What exactly is broken?
C - Components: Which components are involved?
A - Access: Can you access relevant logs/metrics?
L - Logs: What do the logs tell you?
E - Environment: Any recent changes?
Essential Troubleshooting Commands
Quick Health Check Commands:
# Overall cluster health
kubectl get componentstatuses
kubectl get nodes
kubectl get pods --all-namespaces
# Component-specific checks
kubectl logs -n kube-system kube-apiserver-master1
kubectl logs -n kube-system kube-scheduler-master1
kubectl logs -n kube-system kube-controller-manager-master1
# Node-specific checks
kubectl describe node worker1
kubectl top node worker1
Advanced Debugging:
# Check resource usage
kubectl top pods --all-namespaces --sort-by=memory
kubectl top nodes --sort-by=cpu
# Network debugging
kubectl run netshoot --image=nicolaka/netshoot --rm -it -- bash
# Inside container: ping, nslookup, traceroute, etc.
# Event monitoring
kubectl get events --sort-by=.metadata.creationTimestamp
Common Issue Patterns
Pattern 1: Cascading Failures
Root Cause: ETCD performance issue
    ↓
API server becomes slow
    ↓
Controllers can't update status
    ↓
Scheduler can't place pods
    ↓
Users see "cluster is broken"
Lesson: Always check the data layer (ETCD) first in widespread issues
Pattern 2: Resource Starvation
Root Cause: No resource limits on pods
  ↓
One pod consumes all CPU/memory
  ↓
Node becomes unresponsive
  ↓
Kubelet can't send heartbeats
  ↓
Node marked as failed
  ↓
All pods evicted
Lesson: Always set resource requests and limits
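One way to enforce this lesson at the namespace level is a LimitRange, which injects default requests and limits into containers that don't declare their own. A minimal sketch (the namespace name and the numbers are assumptions - tune them to your workloads):
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits
  namespace: team-apps            # hypothetical namespace
spec:
  limits:
  - type: Container
    defaultRequest:               # applied when a container sets no requests
      cpu: 250m
      memory: 128Mi
    default:                      # applied when a container sets no limits
      cpu: 500m
      memory: 256Mi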
Complete Kubernetes Pods Interview Guide
Table of Contents
- Prerequisites & Background
- What is a Pod?
- Pod Architecture & Design
- Pod Lifecycle
- Scaling with Pods
- Multi-Container Pods
- Pod Networking
- Pod Storage
- Pod Management
- Common Interview Questions
- Practical Examples
Prerequisites & Background
Why Kubernetes Exists
Before diving into pods, understand the problem Kubernetes solves:
Traditional Deployment Challenges:
- Manual container orchestration is complex and error-prone
- Scaling requires manual intervention
- Service discovery and load balancing are difficult
- Health monitoring and recovery need custom solutions
- Resource management across multiple machines is challenging
Container Evolution:
- Physical Servers β Single application per server, resource waste
- Virtual Machines β Better resource utilization, but heavy overhead
- Containers β Lightweight, portable, but need orchestration
- Kubernetes β Orchestrates containers at scale
Key Concepts Before Pods
- Container: Packaged application with its dependencies
- Docker Image: Template for creating containers
- Container Registry: Repository storing container images (Docker Hub, ECR, GCR)
- Kubernetes Cluster: Set of machines (nodes) running Kubernetes
- Worker Node: Machine that runs your application containers
- Master Node: Controls the cluster (API server, scheduler, controller manager)
What is a Pod?
Definition
A Pod is the smallest and simplest unit in the Kubernetes object model that you create or deploy. It represents a single instance of a running process in your cluster.
Key Characteristics
- Atomic Unit: Cannot be divided further in Kubernetes
- Ephemeral: Pods are mortal and can be created, destroyed, and recreated
- Unique IP: Each pod gets its own IP address within the cluster
- Shared Resources: Containers in a pod share network, storage, and lifecycle
Why Pods, Not Just Containers?
Interview Question: "Why doesn't Kubernetes manage containers directly?"
Answer: Kubernetes uses pods as an abstraction layer because:
- Grouping: Some applications need helper containers (sidecar pattern)
- Shared Resources: Containers in a pod need to share network and storage
- Atomic Operations: All containers in a pod are scheduled together
- Lifecycle Management: Simplified management of related containers
Pod Architecture & Design
Single Container Pod (Most Common)
βββββββββββββββββββ
β Pod β
β βββββββββββββ β
β βContainer β β β Your Application
β β App β β
β βββββββββββββ β
βββββββββββββββββββ
Multi-Container Pod (Advanced)
βββββββββββββββββββββββββββββββ
β Pod β
β βββββββββββββ βββββββββββ β
β βMain App β β Helper β β β Sidecar Pattern
β βContainer β βContainerβ β
β βββββββββββββ βββββββββββ β
βββββββββββββββββββββββββββββββ
Pod vs Container Relationship
- 1:1 Relationship: Most common (one container per pod)
- 1:Many Relationship: Advanced use cases (main + helper containers)
- Never Many:1: You cannot have multiple pods sharing one container
Pod Lifecycle
Pod States
- Pending: Pod accepted but not yet scheduled to a node
- Running: Pod bound to node, at least one container is running
- Succeeded: All containers terminated successfully
- Failed: All containers terminated, at least one failed
- Unknown: Pod state cannot be determined
Container States Within Pods
- Waiting: Container is waiting to start (pulling image, etc.)
- Running: Container is executing
- Terminated: Container has finished execution
Pod Lifecycle Flow
Create Pod → Schedule → Pull Images → Start Containers → Running → Terminate
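To see where a pod currently sits in this flow, you can query its phase and container states directly (assuming a pod named nginx-pod, as in the examples later in this guide):
# Overall pod phase (Pending, Running, Succeeded, Failed, Unknown)
kubectl get pod nginx-pod -o jsonpath='{.status.phase}'
# Per-container state (Waiting, Running, Terminated) with reasons
kubectl get pod nginx-pod -o jsonpath='{.status.containerStatuses[*].state}'
# Events that explain slow transitions (scheduling, image pulls, probes)
kubectl describe pod nginx-pod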
Scaling with Pods
Horizontal Scaling (Scale Out)
Correct Approach:
- Create more pods to handle increased load
- Each pod runs one instance of your application
- Load is distributed across multiple pods
Before Scaling: After Scaling:
βββββββ βββββββ βββββββ βββββββ
βPod 1β βPod 1β βPod 2β βPod 3β
βββββββ βββββββ βββββββ βββββββ
Incorrect Approach:
- Adding more containers to existing pod
- This violates Kubernetes design principles
Scaling Scenarios
- Same Node Scaling: Multiple pods on one node
- Cross-Node Scaling: Pods distributed across multiple nodes
- Auto-Scaling: Kubernetes can automatically create/destroy pods based on load
Interview Insight
Question: "How do you scale applications in Kubernetes?"
Answer: "You scale by creating more pods, not by adding containers to existing pods. This is because pods are the atomic unit of scaling in Kubernetes, and each pod should represent one instance of your application."
Multi-Container Pods
When to Use Multi-Container Pods
Multi-container pods are used for tightly coupled applications that need to:
- Share the same network (communicate via localhost)
- Share storage volumes
- Have synchronized lifecycles
- Work together as a single unit
Common Patterns
1. Sidecar Pattern
Main container + helper container working together
Example: Web app + Log shipping container
- Main: Nginx web server
- Sidecar: Fluentd collecting and shipping logs
2. Ambassador Pattern
Proxy container handling external communications
Example: App + Proxy container
- Main: Application
- Ambassador: Redis proxy handling connections
3. Adapter Pattern
Container that transforms data for the main container
Example: App + Monitoring adapter
- Main: Legacy application
- Adapter: Converts metrics to Prometheus format
Multi-Container Communication
- Network: All containers share same IP and port space
- Storage: Can mount same volumes
- Process: Can share process namespace (optional; see the sketch below)
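A minimal sketch of the optional shared process namespace, assuming a throwaway busybox debug sidecar; with shareProcessNamespace enabled, each container can see the other's processes, which is handy for debugging the main app from a sidecar:
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-demo
spec:
  shareProcessNamespace: true          # all containers share one PID namespace
  containers:
  - name: app
    image: nginx:1.20
  - name: debug
    image: busybox:1.35
    command: ['sh', '-c', 'sleep 3600']  # run `ps` here to see the nginx processes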
Pod Networking
Network Fundamentals
- Each pod gets a unique IP address within the cluster
- All containers in a pod share the same network namespace
- Containers communicate via localhost
- Pods communicate via their pod IP addresses
Network Model
Pod A (IP: 10.1.1.1) Pod B (IP: 10.1.1.2)
βββββββββββββββββββ βββββββββββββββββββ
β Container 1 β β Container 1 β
β Container 2 β β Container 2 β
βββββββββββββββββββ βββββββββββββββββββ
β β
βββββββββββββ¬ββββββββββββ
β
Cluster Network
Important Network Facts
- Pods are ephemeral, so their IPs change when recreated
- Services provide stable networking (covered in later topics)
- Containers in same pod cannot bind to same port
- External access requires Services (not direct pod access); a quick pod-to-pod connectivity test is sketched below
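A quick way to verify pod-to-pod reachability by IP (the webapp pod name and port 80 are assumptions based on the examples in this guide; replace <pod-ip> with the value you get back):
# Find the target pod's IP
kubectl get pod webapp -o jsonpath='{.status.podIP}'
# Hit that IP from a temporary pod
kubectl run net-test --image=busybox:1.35 --rm -it --restart=Never -- wget -qO- http://<pod-ip>:80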
Pod Storage
Volume Sharing
Containers in a pod can share storage through volumes:
apiVersion: v1
kind: Pod
spec:
containers:
- name: app
image: myapp
volumeMounts:
- name: shared-data
mountPath: /app/data
- name: helper
image: helper
volumeMounts:
- name: shared-data
mountPath: /helper/data
volumes:
- name: shared-data
emptyDir: {}
Volume Types for Pods
- emptyDir: Temporary storage, deleted when pod dies
- hostPath: Mount from host file system
- configMap: Configuration data
- secret: Sensitive data
- persistentVolumeClaim: Persistent storage
Pod Management
Creating Pods
Imperative Way (kubectl run)
kubectl run nginx-pod --image=nginx
Declarative Way (YAML manifest)
apiVersion: v1
kind: Pod
metadata:
name: nginx-pod
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.20
ports:
- containerPort: 80
Viewing Pods
# List all pods
kubectl get pods
# Detailed pod information
kubectl describe pod nginx-pod
# Pod logs
kubectl logs nginx-pod
# Execute commands in pod
kubectl exec -it nginx-pod -- /bin/bash
Pod Deletion
kubectl delete pod nginx-pod
Common Interview Questions
Q1: What is a Pod in Kubernetes?
Answer: A Pod is the smallest deployable unit in Kubernetes that represents a single instance of a running process. It encapsulates one or more containers, storage resources, a unique network IP, and options that govern how containers should run.
Q2: Why does Kubernetes use Pods instead of managing containers directly?
Answer: Kubernetes uses Pods because:
- They provide an abstraction layer for grouping related containers
- They enable resource sharing (network, storage) between containers
- They simplify lifecycle management of related containers
- They support advanced deployment patterns like sidecars
- They make the system more modular and extensible
Q3: Can a Pod contain multiple containers? Give examples.
Answer: Yes, but it's less common. Examples include:
- Sidecar: Web server + log collector
- Ambassador: App + proxy for external services
- Adapter: Legacy app + monitoring adapter. Containers in the same pod share network and storage, communicating via localhost.
Q4: How do you scale applications in Kubernetes?
Answer: You scale by creating more Pods, not by adding containers to existing Pods. This maintains the one-to-one relationship between Pods and application instances. Scaling can be manual (kubectl scale) or automatic (HorizontalPodAutoscaler).
Q5: What happens when a Pod fails?
Answer: When a Pod fails:
- Kubernetes marks it as Failed
- If managed by a controller (Deployment, ReplicaSet), a new Pod is created
- The failed Pod retains its logs until manually deleted
- Any data in non-persistent volumes is lost
- Controllers ensure desired state is maintained
Q6: How do containers in a Pod communicate?
Answer: Containers in the same Pod can communicate via:
- localhost (same network namespace)
- Shared volumes for file-based communication
- Environment variables
- Shared process namespace (if enabled)
Q7: What's the difference between a Pod and a Container?
Answer:
- Container: Runtime instance of an image with isolated processes
- Pod: Kubernetes wrapper around one or more containers with shared resources
- Key difference: Pods provide shared networking, storage, and lifecycle management
Q8: How do you troubleshoot a failing Pod?
Answer: Troubleshooting steps:
- kubectl get pods - Check status
- kubectl describe pod <name> - Check events and conditions
- kubectl logs <pod-name> - Check application logs
- kubectl exec -it <pod-name> -- /bin/bash - Debug inside container
- Check resource constraints, image pull issues, or configuration problems
Practical Examples
Example 1: Simple Web Application Pod
apiVersion: v1
kind: Pod
metadata:
name: webapp
labels:
app: webapp
spec:
containers:
- name: web
image: nginx:1.20
ports:
- containerPort: 80
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
Example 2: Multi-Container Pod with Sidecar
apiVersion: v1
kind: Pod
metadata:
name: webapp-with-sidecar
spec:
containers:
- name: webapp
image: nginx:1.20
ports:
- containerPort: 80
volumeMounts:
- name: logs
mountPath: /var/log/nginx
- name: log-collector
image: fluentd:latest
volumeMounts:
- name: logs
mountPath: /var/log
volumes:
- name: logs
emptyDir: {}
Example 3: Pod with Environment Variables and Secrets
apiVersion: v1
kind: Pod
metadata:
name: app-with-config
spec:
containers:
- name: app
image: myapp:1.0
env:
- name: DATABASE_URL
value: "postgresql://localhost:5432/mydb"
- name: API_KEY
valueFrom:
secretKeyRef:
name: api-secret
key: key
ports:
- containerPort: 8080
Pod YAML Configuration Deep Dive
Understanding Kubernetes YAML Structure
Every Kubernetes object definition follows the same basic structure with four mandatory top-level fields:
apiVersion: # Version of Kubernetes API
kind: # Type of object to create
metadata: # Data about the object
spec: # Object specifications
1. apiVersion Field
Purpose: Specifies which version of the Kubernetes API to use for creating the object.
For Pods: Always use v1
Other Common API Versions:
- apps/v1 - For Deployments, ReplicaSets, DaemonSets
- v1 - For Pods, Services, ConfigMaps, Secrets
- batch/v1 - For Jobs
- networking.k8s.io/v1 - For NetworkPolicies
Interview Tip: Different objects use different API versions. Always check the official Kubernetes API documentation.
2. kind Field
Purpose: Defines the type of Kubernetes object you want to create.
Common Values:
- Pod - Single instance of application
- Deployment - Manages ReplicaSets and Pods
- Service - Network service for Pods
- ConfigMap - Configuration data
- Secret - Sensitive data
Case Sensitivity: The kind field is case-sensitive. Use exact capitalization.
3. metadata Field
Purpose: Contains data about the object like name, labels, annotations.
Structure: Dictionary/Map with specific allowed fields:
metadata:
name: my-pod-name # Required: Object identifier
labels: # Optional: Key-value pairs for grouping
app: frontend
version: v1.0
environment: production
annotations: # Optional: Non-identifying metadata
description: "Main application pod"
owner: "team-alpha"
namespace: default # Optional: Namespace (defaults to 'default')
Key Points:
- name: Must be unique within the namespace
- labels: Used for selecting and grouping objects
- annotations: Store arbitrary metadata (not used for selection)
- Indentation matters: All metadata children must be properly indented
Labeling Best Practices:
labels:
app: myapp # Application name
version: v1.2.3 # Version
component: frontend # Component type
environment: production # Environment
tier: web # Application tier
4. spec Field
Purpose: Defines the desired state and configuration of the object.
Structure: Varies by object type. For Pods, it contains container specifications:
spec:
containers: # List of containers
- name: container-name # Container identifier
image: nginx:1.20 # Docker image
ports: # Exposed ports
- containerPort: 80
env: # Environment variables
- name: ENV_VAR
value: "some-value"
restartPolicy: Always # Pod restart policy
nodeSelector: # Node selection criteria
kubernetes.io/os: linux
Complete Pod YAML Example
apiVersion: v1
kind: Pod
metadata:
name: webapp-pod
labels:
app: webapp
version: v1.0
environment: production
annotations:
description: "Frontend web application"
spec:
containers:
- name: webapp
image: nginx:1.20
ports:
- containerPort: 80
name: http
env:
- name: ENVIRONMENT
value: "production"
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
restartPolicy: Always
YAML Syntax Rules for Kubernetes
Indentation Rules
- Use spaces, not tabs
- 2 spaces per indentation level (standard)
- Consistent indentation for sibling elements
- Children indented more than parents
# Correct indentation
metadata:
name: my-pod
labels:
app: myapp
version: v1
# Incorrect indentation
metadata:
name: my-pod # Should be indented
labels:
app: myapp # Should be indented more
Lists/Arrays in YAML
# List of containers
containers:
- name: container1 # First item (dash + space)
image: nginx
- name: container2 # Second item
image: redis
# Alternative syntax (less common)
containers: [
{name: container1, image: nginx},
{name: container2, image: redis}
]
Pod Management Commands
Creating Pods
Imperative Way (kubectl run):
# Quick pod creation
kubectl run nginx-pod --image=nginx
# With additional options
kubectl run webapp --image=nginx --port=80 --labels="app=web,env=prod"
Declarative Way (YAML manifest):
# Create from file
kubectl create -f pod-definition.yaml
# Apply (create or update)
kubectl apply -f pod-definition.yaml
Viewing and Managing Pods
Basic Pod Information:
# List all pods
kubectl get pods
# List pods with additional info
kubectl get pods -o wide
# List pods with labels
kubectl get pods --show-labels
# Filter pods by labels
kubectl get pods -l app=nginx
kubectl get pods -l environment=production,tier=frontend
Detailed Pod Information:
# Detailed pod description
kubectl describe pod pod-name
# Pod YAML output
kubectl get pod pod-name -o yaml
# Pod JSON output
kubectl get pod pod-name -o json
Pod Logs and Debugging:
# View pod logs
kubectl logs pod-name
# Follow logs (like tail -f)
kubectl logs -f pod-name
# Logs from specific container in multi-container pod
kubectl logs pod-name -c container-name
# Previous container logs (if restarted)
kubectl logs pod-name --previous
Interactive Pod Access:
# Execute command in pod
kubectl exec pod-name -- ls /app
# Interactive shell access
kubectl exec -it pod-name -- /bin/bash
kubectl exec -it pod-name -- /bin/sh
# Execute in specific container (multi-container pod)
kubectl exec -it pod-name -c container-name -- /bin/bash
Pod Deletion:
# Delete specific pod
kubectl delete pod pod-name
# Delete from YAML file
kubectl delete -f pod-definition.yaml
# Delete pods by label
kubectl delete pods -l app=nginx
# Force delete (immediate termination)
kubectl delete pod pod-name --force --grace-period=0
Advanced Pod YAML Features
Resource Management
spec:
containers:
- name: app
image: myapp:1.0
resources:
requests: # Minimum guaranteed resources
memory: "64Mi"
cpu: "250m"
limits: # Maximum allowed resources
memory: "128Mi"
cpu: "500m"
Environment Variables
spec:
containers:
- name: app
image: myapp:1.0
env:
- name: DATABASE_URL # Simple value
value: "postgresql://localhost:5432"
- name: API_KEY # From Secret
valueFrom:
secretKeyRef:
name: api-secret
key: key
- name: CONFIG_VALUE # From ConfigMap
valueFrom:
configMapKeyRef:
name: app-config
key: config-key
Volume Mounts
spec:
containers:
- name: app
image: myapp:1.0
volumeMounts:
- name: data-volume
mountPath: /app/data
- name: config-volume
mountPath: /app/config
readOnly: true
volumes:
- name: data-volume
emptyDir: {}
- name: config-volume
configMap:
name: app-config
Health Checks
spec:
containers:
- name: app
image: myapp:1.0
livenessProbe: # Restart container if fails
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe: # Remove from service if fails
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
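HTTP probes are the most common, but the same probes can also run a command or do a plain TCP check. A brief sketch (the file path, port, and timings here are assumptions):
livenessProbe:
  exec:                          # healthy as long as the command exits 0
    command: ['cat', '/tmp/healthy']
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  tcpSocket:                     # ready once the port accepts connections
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5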
Security Context
spec:
securityContext: # Pod-level security
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
containers:
- name: app
image: myapp:1.0
securityContext: # Container-level security
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
Common Pod YAML Patterns
Multi-Container Pod Example
apiVersion: v1
kind: Pod
metadata:
name: multi-container-pod
spec:
containers:
- name: main-app
image: nginx:1.20
ports:
- containerPort: 80
volumeMounts:
- name: shared-logs
mountPath: /var/log/nginx
- name: log-collector
image: fluentd:latest
volumeMounts:
- name: shared-logs
mountPath: /var/log
volumes:
- name: shared-logs
emptyDir: {}
Pod with Init Container
apiVersion: v1
kind: Pod
metadata:
name: pod-with-init-container
spec:
initContainers: # Run before main containers
- name: init-db
image: busybox:1.35
command: ['sh', '-c', 'until nc -z db 5432; do echo waiting for db; sleep 2; done;']
containers:
- name: app
image: myapp:1.0
ports:
- containerPort: 8080
Complete Kubernetes ReplicaSets Interview Guide
Table of Contents
- Background & Why ReplicaSets Exist
- Controllers in Kubernetes
- ReplicationController vs ReplicaSet
- ReplicaSet Architecture
- Labels and Selectors
- ReplicaSet YAML Configuration
- ReplicaSet Operations
- Scaling Strategies
- Troubleshooting ReplicaSets
- Best Practices
- Interview Questions
- Practical Examples
Background & Why ReplicaSets Exist
The Problem with Single Pods
In production environments, running a single Pod creates several critical problems:
1. Single Point of Failure
User Request → Single Pod → If Pod Dies → Application Down
2. No Load Distribution
100 Users → Single Pod → Overwhelmed → Poor Performance
3. Manual Recovery
Pod Fails → Manual Intervention Required → Downtime
4. No Scaling
Traffic Increases → No Automatic Scaling → Service Degradation
The Solution: ReplicaSets
ReplicaSets solve these problems by:
1. High Availability
User Request → Load Balancer → Multiple Pods → Always Available
2. Load Distribution
100 Users → 3 Pods → 33 Users/Pod → Better Performance
3. Automatic Recovery
Pod Fails → ReplicaSet Detects → Creates New Pod → No Downtime
4. Scalability
Traffic Increases → Scale ReplicaSet → More Pods → Handle Load
Real-World Scenario
Imagine an e-commerce website:
- Black Friday traffic surge: Need to scale from 3 to 50 pods instantly
- Server failure: One node crashes, ReplicaSet reschedules pods to healthy nodes
- Rolling updates: Deploy new version gradually without downtime
- Cost optimization: Scale down during low-traffic periods
Controllers in Kubernetes
What are Controllers?
Controllers are the "brain" of Kubernetes - they're control loops that:
- Watch the current state of resources
- Compare with desired state
- Take action to reconcile differences
Controller Pattern
βββββββββββββββββββ ββββββββββββββββββββ βββββββββββββββββββ
β Desired β β Controller β β Current β
β State βββββΆβ (Watch Loop) βββββΆβ State β
β (3 replicas) β β β β (2 replicas) β
βββββββββββββββββββ ββββββββββββββββββββ βββββββββββββββββββ
β
βΌ
ββββββββββββββββββββ
β Take Action β
β (Create 1 Pod) β
ββββββββββββββββββββ
Types of Controllers
- ReplicaSet: Manages Pod replicas
- Deployment: Manages ReplicaSets and rolling updates
- DaemonSet: Ensures one Pod per node
- Job: Manages batch workloads
- CronJob: Manages scheduled jobs
Controller Responsibilities
- Monitoring: Continuously watch resources
- Reconciliation: Ensure desired state matches actual state
- Self-healing: Automatically fix problems
- Lifecycle management: Handle creation, updates, deletion
ReplicationController vs ReplicaSet
Timeline and Evolution
Kubernetes v1.0 ──────▶ Kubernetes v1.2+ ─────────▶ Present
ReplicationController ReplicaSet Introduced ReplicaSet Recommended
(Legacy) (Modern) (Current)
Key Differences
Aspect | ReplicationController | ReplicaSet |
---|---|---|
API Version | v1 | apps/v1 |
Selector Support | Simple equality-based | Set-based (more flexible) |
Adoption | Can't adopt existing pods | Can adopt existing pods |
Recommended | ❌ Legacy | ✅ Current standard |
Deployment Support | ❌ Not supported | ✅ Managed by Deployments |
Selector Comparison
ReplicationController (Limited):
selector:
app: frontend
version: v1
# Only supports equality matching
ReplicaSet (Flexible):
selector:
matchLabels: # Equality-based
app: frontend
matchExpressions: # Set-based
- key: environment
operator: In
values: [prod, staging]
- key: tier
operator: NotIn
values: [cache]
Migration Path
ReplicationController → ReplicaSet → Deployment
(Legacy) (Direct) (Recommended)
Best Practice: Don't use ReplicationController or ReplicaSet directly. Use Deployments which manage ReplicaSets automatically.
ReplicaSet Architecture
Component Overview
βββββββββββββββββββββββββββββββββββββββββββββββββββ
β ReplicaSet β
β β
β βββββββββββββββ ββββββββββββββββ β
β β Spec β β Status β β
β β replicas: 3 β β replicas: 3 β β
β β selector β β ready: 2 β β
β β template β β available: 2 β β
β βββββββββββββββ ββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββ
β Pod Template β
β βββββββ βββββββ βββββββ β
β βPod 1β βPod 2β βPod 3β β
β βββββββ βββββββ βββββββ β
βββββββββββββββββββββββββββββββββββ
ReplicaSet Controller Logic
# Simplified controller logic
while True:
current_pods = get_pods_matching_selector()
desired_replicas = replicaset.spec.replicas
current_replicas = len(current_pods)
if current_replicas < desired_replicas:
# Scale up - create pods
create_pods(desired_replicas - current_replicas)
elif current_replicas > desired_replicas:
# Scale down - delete pods
delete_pods(current_replicas - desired_replicas)
sleep(reconcile_interval)
Pod Lifecycle Management
1. Pod Creation
ReplicaSet → Pod Template → Scheduler → Node → Container Runtime → Running Pod
2. Pod Failure Detection
Pod Dies → Kubelet Reports → API Server → ReplicaSet Controller → Create New Pod
3. Pod Adoption
Existing Pod + Matching Labels → ReplicaSet Adopts → Manages Lifecycle
Labels and Selectors
The Label-Selector Relationship
Labels and selectors are the foundation of Kubernetes organization:
Labels: Key-value pairs attached to objects
metadata:
labels:
app: frontend
version: v2.1
environment: production
tier: web
Selectors: Query mechanisms to select objects
selector:
matchLabels:
app: frontend
tier: web
How ReplicaSets Use Selectors
1. Pod Discovery
ReplicaSet Selector → Finds Matching Pods → Manages Count
2. Adoption Process
Orphaned Pod + Matching Labels → ReplicaSet Adopts → Includes in Count
3. Continuous Monitoring
Label Changes → Selector Reevaluation → Pod Adoption/Release
Advanced Selector Examples
Set-based Selectors:
selector:
matchLabels:
app: webapp
matchExpressions:
# Include production and staging
- key: environment
operator: In
values: [production, staging]
# Exclude cache tier
- key: tier
operator: NotIn
values: [cache]
# Must have version label
- key: version
operator: Exists
Selector Operators:
- In: Value in specified set
- NotIn: Value not in specified set
- Exists: Key exists (ignore value)
- DoesNotExist: Key doesn't exist
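The same set-based operators work on the kubectl command line, which is handy for checking what a ReplicaSet's selector would actually match (the environment/tier/version labels follow the example above):
# Equality-based selection
kubectl get pods -l app=webapp
# Set-based selection: in production or staging, but not the cache tier
kubectl get pods -l 'environment in (production, staging),tier notin (cache)'
# Pods that carry a version label at all
kubectl get pods -l 'version'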
Label Best Practices
Recommended Labels:
labels:
app.kubernetes.io/name: webapp
app.kubernetes.io/instance: webapp-prod
app.kubernetes.io/version: "v2.1.0"
app.kubernetes.io/component: frontend
app.kubernetes.io/part-of: ecommerce
app.kubernetes.io/managed-by: helm
Common Patterns:
# Environment-based
environment: production
# Application-based
app: frontend
# Version-based
version: v2.1
# Tier-based
tier: web
ReplicaSet YAML Configuration
Complete ReplicaSet Structure
apiVersion: apps/v1 # API version for ReplicaSet
kind: ReplicaSet # Object type
metadata: # ReplicaSet metadata
name: frontend-rs
labels:
app: frontend
tier: web
spec: # ReplicaSet specification
replicas: 3 # Desired number of pods
selector: # Pod selection criteria
matchLabels:
app: frontend
tier: web
template: # Pod template
metadata: # Pod metadata
labels:
app: frontend # Must match selector
tier: web
spec: # Pod specification
containers:
- name: webapp
image: nginx:1.20
ports:
- containerPort: 80
Template Section Deep Dive
The template section is a complete Pod specification:
template:
metadata:
labels: # MUST match ReplicaSet selector
app: frontend
tier: web
annotations: # Optional pod annotations
prometheus.io/scrape: "true"
spec:
containers:
- name: webapp
image: nginx:1.20
ports:
- containerPort: 80
name: http
env:
- name: ENVIRONMENT
value: production
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 30
readinessProbe:
httpGet:
path: /ready
port: 80
initialDelaySeconds: 5
restartPolicy: Always # Pod restart policy
Critical YAML Rules
1. Label Matching
# ReplicaSet selector
selector:
matchLabels:
app: frontend
# Pod template labels MUST include all selector labels
template:
metadata:
labels:
app: frontend # REQUIRED - matches selector
version: v1 # Optional - additional labels OK
2. API Version Compatibility
# Correct for ReplicaSet
apiVersion: apps/v1
kind: ReplicaSet
# Wrong - will fail
apiVersion: v1
kind: ReplicaSet
3. Indentation Precision
# Correct indentation
spec:
replicas: 3
selector:
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
ReplicationController vs ReplicaSet YAML
ReplicationController (Legacy):
apiVersion: v1
kind: ReplicationController
metadata:
name: frontend-rc
spec:
replicas: 3
selector: # Simple key-value matching
app: frontend
template:
metadata:
labels:
app: frontend
spec:
containers:
- name: webapp
image: nginx:1.20
ReplicaSet (Modern):
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: frontend-rs
spec:
replicas: 3
selector: # Advanced matching capabilities
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
spec:
containers:
- name: webapp
image: nginx:1.20
ReplicaSet Operations
Creating ReplicaSets
From YAML file:
# Create from file
kubectl create -f replicaset-definition.yaml
# Apply (create or update)
kubectl apply -f replicaset-definition.yaml
# Validate before creating
kubectl create -f replicaset-definition.yaml --dry-run=client
Imperative creation (not recommended for production):
# Generate YAML
kubectl create replicaset webapp --image=nginx --replicas=3 --dry-run=client -o yaml
# Direct creation (avoid in production)
kubectl create replicaset webapp --image=nginx --replicas=3
Viewing ReplicaSets
Basic information:
# List all ReplicaSets
kubectl get replicaset
kubectl get rs # Shorthand
# List with more details
kubectl get rs -o wide
# Show labels
kubectl get rs --show-labels
# Filter by labels
kubectl get rs -l app=frontend
Detailed information:
# Detailed description
kubectl describe replicaset frontend-rs
# YAML output
kubectl get rs frontend-rs -o yaml
# JSON output
kubectl get rs frontend-rs -o json
Sample output:
NAME DESIRED CURRENT READY AGE
frontend-rs 3 3 3 5m
Managing ReplicaSet Pods
View pods created by ReplicaSet:
# All pods
kubectl get pods
# Filter pods by ReplicaSet labels
kubectl get pods -l app=frontend
# Show pod details with node information
kubectl get pods -o wide
Pod naming convention:
ReplicaSet Name: frontend-rs
Pod Names: frontend-rs-abc12
frontend-rs-def34
frontend-rs-ghi56
Updating ReplicaSets
Method 1: Update YAML and replace:
# Edit the YAML file, then:
kubectl replace -f replicaset-definition.yaml
# Force replace if needed
kubectl replace -f replicaset-definition.yaml --force
Method 2: Edit directly:
# Edit ReplicaSet in default editor
kubectl edit replicaset frontend-rs
Method 3: Patch specific fields:
# Update replica count
kubectl patch replicaset frontend-rs -p '{"spec":{"replicas":5}}'
# Update image (affects template only, not existing pods)
kubectl patch replicaset frontend-rs -p '{"spec":{"template":{"spec":{"containers":[{"name":"webapp","image":"nginx:1.21"}]}}}}'
Deleting ReplicaSets
Delete ReplicaSet and Pods:
# Delete everything
kubectl delete replicaset frontend-rs
# Delete from file
kubectl delete -f replicaset-definition.yaml
Delete ReplicaSet but keep Pods:
# Orphan the pods (useful for debugging)
kubectl delete replicaset frontend-rs --cascade=orphan
Delete all ReplicaSets:
# Delete all in current namespace
kubectl delete replicaset --all
# Delete by label
kubectl delete replicaset -l app=frontend
Scaling Strategies
Understanding Scaling
Horizontal Pod Autoscaling vs Manual Scaling:
Manual Scaling: Admin → Scale Command → More/Fewer Pods
Auto Scaling: Metrics → HPA → Scale Decision → More/Fewer Pods
Manual Scaling Methods
Method 1: Update YAML file:
# Change replicas in YAML
spec:
replicas: 5 # Changed from 3 to 5
kubectl replace -f replicaset-definition.yaml
Method 2: kubectl scale with file:
# Scale using file reference
kubectl scale --replicas=6 -f replicaset-definition.yaml
# Note: This doesn't update the file itself
Method 3: kubectl scale with resource name:
# Scale by resource name
kubectl scale replicaset frontend-rs --replicas=8
# Scale using resource type/name format
kubectl scale rs/frontend-rs --replicas=8
Method 4: Interactive editing:
# Edit directly
kubectl edit replicaset frontend-rs
# Change replicas value, save and exit
Scaling Scenarios
Scale Up (Handle increased load):
# Gradual scaling
kubectl scale rs/webapp --replicas=5 # 3 β 5
kubectl scale rs/webapp --replicas=8 # 5 β 8
kubectl scale rs/webapp --replicas=15 # 8 β 15
Scale Down (Resource optimization):
# Gradual scale down
kubectl scale rs/webapp --replicas=10 # 15 β 10
kubectl scale rs/webapp --replicas=5 # 10 β 5
kubectl scale rs/webapp --replicas=3 # 5 β 3
Scale to Zero (Maintenance):
# Stop all pods (but keep ReplicaSet)
kubectl scale rs/webapp --replicas=0
# Restart from zero
kubectl scale rs/webapp --replicas=3
Scaling Best Practices
1. Gradual Scaling:
# Don't: 3 β 50 (sudden spike)
kubectl scale rs/webapp --replicas=50
# Do: Gradual increase
kubectl scale rs/webapp --replicas=6 # Wait and monitor
kubectl scale rs/webapp --replicas=10 # Wait and monitor
kubectl scale rs/webapp --replicas=15 # Wait and monitor
2. Resource Considerations:
# Ensure cluster has resources
resources:
requests:
memory: "256Mi" # 10 pods = 2.5GB memory needed
cpu: "250m" # 10 pods = 2.5 CPU cores needed
3. Monitor During Scaling:
# Watch scaling in real-time
kubectl get pods -w
# Check resource usage
kubectl top pods
kubectl top nodes
Horizontal Pod Autoscaler (HPA)
Basic HPA setup:
# Create HPA based on CPU
kubectl autoscale replicaset frontend-rs --cpu-percent=50 --min=1 --max=10
HPA YAML configuration:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: frontend-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: ReplicaSet
name: frontend-rs
minReplicas: 3
maxReplicas: 15
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
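Once the HPA exists, you can watch its decisions. Note that in practice HPAs usually target Deployments rather than bare ReplicaSets, since Deployments are the recommended abstraction; the commands below assume the frontend-hpa from the manifest above and a cluster with metrics-server installed:
# Current utilization, replica range, and scaling events
kubectl get hpa frontend-hpa
kubectl describe hpa frontend-hpa
# Live pod/node usage the HPA bases its decisions on (needs metrics-server)
kubectl top pods
kubectl top nodes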
Troubleshooting ReplicaSets
Common Issues and Solutions
Issue 1: ReplicaSet Not Creating Pods
Symptoms:
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
webapp-rs 3 0 0 5m
Debugging steps:
# Check ReplicaSet events
kubectl describe rs webapp-rs
# Check for common issues:
# 1. Image pull errors
# 2. Resource constraints
# 3. Node affinity issues
# 4. Security policy violations
Common causes and fixes:
# Issue: Wrong image name
containers:
- name: webapp
image: ngnix:1.20 # Typo: should be "nginx"
# Fix: Correct image name
containers:
- name: webapp
image: nginx:1.20 # Corrected
Issue 2: Pods Not Matching Selector
Symptoms:
$ kubectl get pods -l app=frontend
No resources found
Problem: Selector doesn't match pod labels
# ReplicaSet selector
selector:
matchLabels:
app: frontend # Looking for this
# Pod template (WRONG)
template:
metadata:
labels:
application: frontend # Doesn't match!
Solution: Ensure exact label matching
# Pod template (CORRECT)
template:
metadata:
labels:
app: frontend # Matches selector
Issue 3: Scaling Not Working
Symptoms:
$ kubectl scale rs/webapp --replicas=5
$ kubectl get rs webapp
NAME DESIRED CURRENT READY AGE
webapp 5 3 3 10m
Debugging:
# Check events
kubectl describe rs webapp
# Check node resources
kubectl describe nodes
kubectl top nodes
# Check resource quotas
kubectl describe resourcequota
Common causes:
- Insufficient cluster resources
- Resource quotas exceeded
- Node affinity constraints
- Pod disruption budgets
Issue 4: Unwanted Pod Adoption
Symptoms: ReplicaSet managing more pods than expected
Cause: Existing pods with matching labels being adopted
Investigation:
# Check all pods with matching labels
kubectl get pods -l app=frontend --show-labels
# Check which ReplicaSet owns each pod
kubectl get pods -o wide
Solution: Fix label conflicts
# Remove conflicting labels from unwanted pods
kubectl label pod unwanted-pod app-
# Or change ReplicaSet selector to be more specific
Debugging Commands Reference
ReplicaSet status:
# Detailed information
kubectl describe rs <name>
# Watch changes
kubectl get rs <name> -w
# Events across namespace
kubectl get events --field-selector involvedObject.kind=ReplicaSet
Pod investigation:
# Pods owned by ReplicaSet
kubectl get pods --selector=<labels>
# Pod events
kubectl describe pod <pod-name>
# Pod logs
kubectl logs <pod-name>
Resource analysis:
# Resource usage
kubectl top pods
kubectl top nodes
# Resource quotas
kubectl describe quota
kubectl describe limitrange
Best Practices
ReplicaSet Design Principles
1. Don't Use ReplicaSets Directly
❌ Direct ReplicaSet usage
  ↓
✅ Use Deployments (which manage ReplicaSets)
Why Deployments are better:
- Rolling updates without downtime
- Rollback capabilities
- Revision history
- Declarative update strategy
2. Proper Label Strategy
# Good: Specific and meaningful labels
labels:
app: ecommerce-frontend
version: v2.1.0
environment: production
# Bad: Generic or conflicting labels
labels:
app: app
name: pod
3. Resource Management
# Always specify resources
containers:
- name: webapp
image: nginx:1.20
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
Operational Best Practices
1. Monitoring and Observability
# Add observability labels
metadata:
labels:
app.kubernetes.io/name: webapp
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/component: frontend
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
2. Health Checks
# Always include health checks
containers:
- name: webapp
livenessProbe:
httpGet:
path: /health
port: 8080
readinessProbe:
httpGet:
path: /ready
port: 8080
3. Security Considerations
# Security context
securityContext:
runAsNonRoot: true
runAsUser: 1000
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
4. Graceful Scaling
# Monitor during scaling operations
kubectl get pods -w &
kubectl scale rs/webapp --replicas=10
# Check resource impact
kubectl top nodes
kubectl top pods
Interview Questions
Q1: What is a ReplicaSet and why is it needed?
Answer: A ReplicaSet is a Kubernetes controller that ensures a specified number of pod replicas are running at all times. It's needed for:
- High Availability: If one pod fails, others continue serving traffic
- Load Distribution: Multiple pods share the workload
- Self-healing: Automatically replaces failed pods
- Scalability: Easy horizontal scaling by adjusting replica count
Q2: What's the difference between ReplicationController and ReplicaSet?
Answer:
- API Version: ReplicationController uses v1, ReplicaSet uses apps/v1
- Selector Support: ReplicationController has simple equality-based selectors; ReplicaSet supports set-based selectors with matchLabels and matchExpressions
- Adoption: ReplicaSet can adopt existing pods with matching labels
- Recommendation: ReplicaSet is the modern standard, ReplicationController is legacy
- Deployment Integration: Only ReplicaSets work with Deployments
Q3: Explain the role of labels and selectors in ReplicaSets.
Answer:
- Labels are key-value pairs attached to pods for identification
- Selectors define which pods the ReplicaSet should manage
- ReplicaSet uses selectors to:
- Find existing pods to manage
- Determine if scaling is needed
- Adopt orphaned pods with matching labels
- The selector must match the labels in the pod template
- This loose coupling allows flexibility in pod management
Q4: How does ReplicaSet scaling work?
Answer: ReplicaSet scaling works through reconciliation:
- Current State Check: Controller counts pods matching selector
- Desired State Compare: Compares with spec.replicas
- Action:
  - If current < desired: Create new pods using template
  - If current > desired: Delete excess pods
- Methods: Scale via kubectl scale, editing YAML, or HPA
Q5: What happens if you delete a pod managed by ReplicaSet?
Answer: The ReplicaSet controller detects the pod deletion and immediately creates a new pod to maintain the desired replica count. This ensures:
- No service disruption
- Desired state is maintained
- Self-healing behavior The new pod uses the same template but gets a new name and IP.
Q6: How do you troubleshoot a ReplicaSet that's not creating pods?
Answer: Follow these steps:
- kubectl describe rs <name> - Check events section
- Common issues to check:
- Image pull errors: Wrong image name/tag, private registry access
- Resource constraints: Insufficient CPU/memory on nodes
- Label mismatches: Selector doesn't match template labels
- Node affinity: Pods can't be scheduled on available nodes
- Security policies: Pod security standards blocking creation
Q7: Can you update the container image in a ReplicaSet?
Answer: Yes, but with limitations:
- You can update the template in ReplicaSet specification
- Existing pods won't be updated - only new pods use the new image
- To update existing pods, you must delete them manually (ReplicaSet recreates with new image)
- Better approach: Use Deployments which handle rolling updates automatically
- Command:
kubectl set image replicaset/my-rs container=newimage:tag
Q8: What's the difference between ReplicaSet and Deployment?
Answer:
- ReplicaSet: Low-level controller managing pod replicas
- Deployment: Higher-level controller that manages ReplicaSets
- Deployment advantages:
- Rolling updates without downtime
- Rollback capabilities
- Revision history
- Declarative updates
- Pause/resume functionality
ReplicaSet Controller
The ReplicaSet controller ensures that the specified number of pod replicas is always running.
If a pod dies, it creates a new one; if too many are running, it deletes the extras.
BUT: a ReplicaSet only handles replication. It does not give you rolling updates, rollbacks, or versioning.
Example:
You ask for 3 replicas of nginx:1.14.
The ReplicaSet ensures that 3 pods are always running.
If you want nginx:1.16, you have to create a whole new ReplicaSet manually.
Deployment Controller
A Deployment is internally a wrapper around ReplicaSets.
It not only maintains replicas, but also handles rollouts (updates), rollbacks, and the update strategy (rolling update, recreate).
When you update the image, the Deployment automatically creates a new ReplicaSet and gradually replaces the old one.
Example:
You create a Deployment with 3 replicas of nginx:1.14.
Then you update the image to nginx:1.16.
The Deployment creates a new ReplicaSet for nginx:1.16.
It gradually terminates the old pods (1.14) and launches the new pods (1.16).
If something goes wrong, there is also a rollback option.
- Relationship: Deployment β ReplicaSet β Pods
- Best Practice: Always use Deployments, not ReplicaSets directly
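A brief sketch of that rollout/rollback flow with kubectl; the Deployment name my-app and container name nginx are assumptions for illustration:
# Update the image; the Deployment creates a new ReplicaSet for nginx:1.16
kubectl set image deployment/my-app nginx=nginx:1.16
# Watch the rolling update progress
kubectl rollout status deployment/my-app
# Review previous revisions
kubectl rollout history deployment/my-app
# Roll back to the previous revision if something breaks
kubectl rollout undo deployment/my-app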
Q9: How do you perform a zero-downtime update with ReplicaSets?
Answer: You cannot achieve zero-downtime updates with ReplicaSets alone because:
- Updating ReplicaSet template doesn't update existing pods
- Manual pod deletion causes temporary capacity reduction
- Solution: Use Deployments which:
- Create new ReplicaSet with updated template
- Gradually scale up new ReplicaSet
- Gradually scale down old ReplicaSet
- Ensure minimum number of pods always available
Q10: What are the limitations of ReplicaSets?
Answer:
- No rolling updates: Can't update existing pods automatically (e.g., a container image change)
- No rollback: No built-in rollback mechanism
- No update strategies: No control over update process
- Manual intervention: Requires manual steps for updates
- No pause/resume: Can't pause/resume operations
- Limited lifecycle management: Basic replica management only
- Solution: Use Deployments for production workloads
Q11: How does ReplicaSet handle node failures?
Answer: When a node fails:
- Detection: Kubelet stops reporting, node marked as NotReady
- Pod Eviction: Pods on failed node are marked for eviction
- Rescheduling: ReplicaSet controller creates new pods on healthy nodes
- Timeline: Default timeout is ~5 minutes before rescheduling
- Considerations: Network partitions may cause temporary over-provisioning
Q12: Can a pod belong to multiple ReplicaSets?
Answer: No. A pod is managed by a single controller at a time - Kubernetes records one controller ownerReference on the pod. If two ReplicaSets have overlapping selectors, the pod is counted only by the ReplicaSet that owns it, which is why selectors should be kept unique and specific.
Comprehensive Kubernetes Notes: Deployments & Services
Table of Contents
- Kubernetes Deployments
- Understanding Kubernetes Services
- Service Types Deep Dive
- Interview Preparation Points
Kubernetes Deployments
Why Do We Need Deployments?
Background Problem: Imagine you're running a web application in production. You face several challenges:
- You need multiple instances running (high availability)
- You need to update your application without downtime
- Sometimes updates break things - you need to rollback quickly
- You want to make multiple changes together, not one by one
The Solution: Kubernetes Deployments provide a higher-level abstraction that handles all these production concerns automatically.
Deployment Hierarchy (Critical Understanding)
Deployment (Highest Level - Your Management Interface)
      ↓ Creates and Manages
ReplicaSet (Middle Level - Ensures Pod Count)
      ↓ Creates and Manages
Pods (Lowest Level - Actual Running Containers)
Key Insight: You don't directly manage ReplicaSets or Pods in production. You work with Deployments, and they handle the complexity below.
Deployment Capabilities
- Rolling Updates: Updates pods one by one, ensuring zero downtime
- Rollback: Can undo changes if something goes wrong
- Pause/Resume: Make multiple changes, then apply them together
- Scaling: Easily increase/decrease the number of replicas
Creating a Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app-deployment
labels:
app: my-app
spec:
replicas: 3
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
containers:
- name: my-app-container
image: my-app:v1
ports:
- containerPort: 80
What Happens: Deployment → Creates ReplicaSet → Creates 3 Pods
Understanding Kubernetes Services
The Core Problem Services Solve
Background: In Kubernetes, every Pod gets its own IP address, but these IPs are:
- Temporary: Pods die and get recreated with new IPs
- Internal Only: You can't access them from outside the cluster
- Unpredictable: When you have multiple pods, which one should handle a request?
Real-World Analogy: Think of Services like a receptionist at a company. Instead of visitors trying to find specific employees (Pods) who might have moved desks (changed IPs), they go to the receptionist (Service) who always knows how to route them to the right person.
Service Types - CLEARING YOUR CONFUSION!
Confusion between the three types is common, so let's clarify each service type with its specific use cases:
WHY These Service Types Exist - Real World Problems & Solutions
The Evolution Story (Why 3 Different Types?)
Imagine you're building a food delivery app like Zomato:
Your App Architecture:
Frontend (React) β Users access this
β
Backend API β Frontend calls this
β
Database β Backend calls this
β
Redis Cache β Backend calls this
Different parts need DIFFERENT types of access!
Service Types Deep Dive
1. ClusterIP - "Office Intercom System" π’
Real Problem: Your database and backend should not be reachable from outside the cluster - that's a security risk!
Example: Zomato's database should be accessible only from the backend, never from the internet.
apiVersion: v1
kind: Service
metadata:
name: database-service
spec:
type: ClusterIP # Internal only!
selector:
app: mysql
ports:
- port: 3306
targetPort: 3306
What happens:
- Database gets an internal address: database-service:3306
- Backend pods can connect with: mysql://database-service:3306
- Internet users CANNOT access this ❌ (security!)
- Even you can't access it from your laptop (that's the point!) - a quick in-cluster check is shown below
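To confirm the service is resolvable from inside the cluster only, you can run a throwaway pod; the busybox image is an assumption, and database-service comes from the manifest above:
# DNS resolution works from inside the cluster...
kubectl run dns-check --image=busybox:1.35 --rm -it --restart=Never -- nslookup database-service
# ...but the same name means nothing from your laptop, because the
# ClusterIP exists only on the cluster's internal network.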
Real Use Cases:
- Database connections (MySQL, MongoDB)
- Cache services (Redis, Memcached)
- Internal APIs (microservices talking to each other)
- Message queues (RabbitMQ, Kafka)
Why this exists: security and organization. Some services are meant for internal use only.
2. NodePort - "Direct Phone Numbers" βοΈ
Real Problem: Now users need to reach your frontend, but they can't do that through a ClusterIP!
Example: You've built Zomato - how will users access it? They need external access!
apiVersion: v1
kind: Service
metadata:
name: zomato-frontend-nodeport
spec:
type: NodePort
selector:
app: zomato-frontend
ports:
- port: 80
targetPort: 3000 # React app runs on 3000
nodePort: 30080 # External access port
What happens:
- Gets ClusterIP functionality (internal communication) ✅
- PLUS opens port 30080 on ALL cluster nodes
- Users can access: http://192.168.1.10:30080 (any node IP)
Real Scenario:
Your Kubernetes Cluster:
Node 1: 192.168.1.10:30080 β User can access
Node 2: 192.168.1.11:30080 β User can access
Node 3: 192.168.1.12:30080 β User can access
All point to same app! π―
When to use:
- Development/testing environments
- On-premise setups (no cloud provider)
- Small applications
- When you're okay with ugly URLs like 192.168.1.10:30080
Problems with NodePort:
- Ugly URLs (IP addresses + weird ports)
- Users need to remember multiple IPs
- No automatic load balancing between nodes
3. LoadBalancer - "Professional Reception Desk" π¨
Real Problem: NodePort gives you ugly URLs! In production you are not going to hand users 192.168.1.10:30080!
Example: Zomato is in production. Users want zomato.com, not 192.168.1.10:30080!
apiVersion: v1
kind: Service
metadata:
name: zomato-production
spec:
type: LoadBalancer # Cloud magic! β¨
selector:
app: zomato-frontend
ports:
- port: 80
targetPort: 3000
What happens:
- Gets NodePort + ClusterIP functionality ✅
- PLUS asks the cloud provider: "Give me an external IP!"
- Cloud provider creates: AWS ELB / GCP Load Balancer / Azure Load Balancer
- You get a clean IP: 34.102.136.180
- Point your domain: zomato.com → 34.102.136.180
Real Magic:
User types: zomato.com
β
DNS resolves: 34.102.136.180 (AWS Load Balancer)
β
AWS Load Balancer distributes to:
βββ Node 1 (If alive)
βββ Node 2 (If alive)
βββ Node 3 (If alive)
When to use:
- Production applications
- AWS/GCP/Azure environments
- When you want professional URLs
- Automatic load balancing + health checks
What if no cloud?:
- On local machine: Behaves exactly like NodePort
- Shows "EXTERNAL-IP: " forever
Complete Real-World Example: E-commerce Platform π
Let's build a complete e-commerce site and see WHY each service type exists:
Architecture:
π Internet Users
β
π± Frontend (React) - Users see this
β
π§ Backend API - Frontend calls this
β
ποΈ Database - Backend calls this
π Redis Cache - Backend calls this
π§ Email Service - Backend calls this
Service Configuration:
1. Database (ClusterIP - Internal Only)
apiVersion: v1
kind: Service
metadata:
name: database-service
spec:
type: ClusterIP # NO external access!
selector:
app: mysql
ports:
- port: 3306
Why ClusterIP?: The database must never be reachable from outside - that would be a security breach!
2. Redis Cache (ClusterIP - Internal Only)
apiVersion: v1
kind: Service
metadata:
name: redis-service
spec:
type: ClusterIP # NO external access!
selector:
app: redis
ports:
- port: 6379
Why ClusterIP?: The cache is also an internal service. Users never need direct access to it.
3. Backend API (ClusterIP - Internal Only)
apiVersion: v1
kind: Service
metadata:
name: backend-api-service
spec:
type: ClusterIP # Frontend will call this internally
selector:
app: backend-api
ports:
- port: 8080
Why ClusterIP?: Users don't get direct access to the backend; it is reached only through the frontend.
4. Frontend (LoadBalancer - Public Access)
apiVersion: v1
kind: Service
metadata:
name: ecommerce-frontend
spec:
type: LoadBalancer # Users access this!
selector:
app: frontend
ports:
- port: 80
targetPort: 3000
Why LoadBalancer?: Users need a clean URL: mystore.com
Traffic Flow:
User: mystore.com
β (External - LoadBalancer)
Frontend Pod
β (Internal - ClusterIP)
Backend Pod: http://backend-api-service:8080
β (Internal - ClusterIP)
Database Pod: mysql://database-service:3306
Service Decision Tree - The Right Way π―
What kind of component is this?
βββ ποΈ Database/Cache/Queue?
β βββ ClusterIP (Keep it internal & secure)
β
βββ π§ Internal API/Microservice?
β βββ ClusterIP (Only other services should call it)
β
βββ π± User-facing Frontend/Public API?
βββ Are you on cloud (AWS/GCP/Azure)?
βββ YES β LoadBalancer (Professional setup)
βββ NO β NodePort (Development/On-premise)
Why Not Just Use LoadBalancer for Everything? π€
Beginner Thinking: "LoadBalancer is the most powerful type, so let's just use LoadBalancer everywhere!"
Why This is WRONG:
1. Security Nightmare:
# β NEVER DO THIS!
apiVersion: v1
kind: Service
metadata:
name: database-exposed # DON'T!
spec:
type: LoadBalancer # Database external exposed!
selector:
app: mysql
ports:
- port: 3306
Result: Your database is now reachable from the internet - an open invitation to hackers!
2. Cost Issues:
- Each LoadBalancer service = 1 Cloud Load Balancer
- AWS ELB costs ~$20/month per load balancer
- 10 services = $200/month extra cost! πΈ
3. Complexity:
- More external IPs to manage
- More DNS entries
- More firewall rules
Right Approach:
- 1 LoadBalancer for user-facing services
- Multiple ClusterIPs for internal communication
- Clean, secure, cost-effective β
Common Mistakes & How to Avoid Them
Mistake 1: "I'll just make the database a LoadBalancer"
# β WRONG - Security risk!
type: LoadBalancer
selector:
app: mysql
Fix: Use ClusterIP for databases, always!
Mistake 2: "I'll use LoadBalancer in development"
# β Won't work on local machine
type: LoadBalancer # Shows <pending> forever
Fix: Use NodePort for local development
Mistake 3: "NodePort is ugly, so I won't use it"
Reality: NodePort is perfect for:
- Development environments
- On-premise production
- CI/CD pipelines
- Testing
Mistake 4: "So I can't get external access through a ClusterIP?"
Answer: Exactly! That's the point! It's a security feature, not a bug!
Interview Preparation Points
Deployments Interview Questions
Q: What's the difference between ReplicaSet and Deployment?
A: ReplicaSet ensures a specific number of pods are running but doesn't handle updates well. Deployment wraps ReplicaSet and adds rolling updates, rollback capabilities, and pause/resume functionality. In production, you always use Deployments.
Q: How does rolling update work?
A: Deployment creates a new ReplicaSet with updated pods while gradually scaling down the old ReplicaSet. This ensures zero downtime during updates.
Q: What happens when you run kubectl create -f deployment.yaml?
A:
- Deployment object is created
- Deployment creates a ReplicaSet
- ReplicaSet creates the specified number of Pods
- You can see all of them with kubectl get all
Interview Questions - Real Answers That Impress π‘
Q: "Why not use LoadBalancer for everything?"
Your Answer: "LoadBalancer har service ke liye use karna security risk hai aur costly bhi. Database ko external expose karna means hackers ka invitation dena. Plus, har LoadBalancer cloud provider pe $20/month cost karta hai. Smart approach is: 1 LoadBalancer for frontend, ClusterIP for internal services."
Q: "ClusterIP ka kya faayda hai agar external access hi nahi kar sakte?"
Your Answer: "Yahi to main benefit hai! ClusterIP security boundary create karta hai. Real production mein 80% services internal hoti hai - database, cache, internal APIs. Unhe external access nahi chahiye. ClusterIP ensures ki sirf authorized internal services hi access kar sake."
Q: "NodePort kab use karenge?"
Your Answer: "NodePort perfect hai jab aap on-premise environment mein ho ya development kar rahe ho. Cloud provider nahi hai to LoadBalancer work nahi karega. NodePort direct node IPs use karta hai, which is fine for internal teams or development environments."
Q: "Service mesh concept pata hai?"
Your Answer: "Ha! Service mesh like Istio advanced level hai. Wo ClusterIP services ke beech mein security, monitoring, traffic management add karta hai. But basic Kubernetes services samjhna zaroori hai pehle."
Key Commands to Remember
# Deployments
kubectl create -f deployment.yaml
kubectl get deployments
kubectl get replicasets
kubectl get pods
kubectl get all
# Services
kubectl create -f service.yaml
kubectl get services
kubectl describe service <service-name>
# Testing connectivity
kubectl exec -it <pod-name> -- curl <service-name>:<port>
Common Mistakes to Avoid
- Mixing up service types: Remember ClusterIP = internal only, NodePort = external via nodes, LoadBalancer = cloud external
- Forgetting selectors: Services need selectors to find their target pods
- Port confusion: targetPort (pod), port (service), nodePort (external) - see the annotated example after this list
- Assuming LoadBalancer works everywhere: Only works on supported cloud platforms
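A small annotated snippet to keep the three port fields straight (the service name, label, and numbers are assumptions; only the roles matter):
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  type: NodePort
  selector:
    app: example-app
  ports:
  - port: 80          # Service port: other pods call example-service:80
    targetPort: 8080  # Container port the traffic is forwarded to inside the pod
    nodePort: 30080   # Port opened on every node for external access (NodePort/LoadBalancer only)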
Pro Tips for Interviews
- Always mention the hierarchy: Deployment β ReplicaSet β Pods
- Explain the "why": Don't just say what services do, explain the problems they solve
- Use real-world analogies: Compare services to receptionists, load balancers to traffic cops
- Show understanding of networking: Explain why pod IPs are unreliable and how services solve this
Summary
Deployments manage the lifecycle of your applications - updates, scaling, rollbacks.
Services solve networking problems - stable endpoints, load balancing, external access.
The key insight is that Kubernetes is solving real production problems, not just being complex for the sake of it. Once you understand the problems each component solves, the architecture makes perfect sense.
The Cross-Node Magic πͺ
Scenario: Pod is on Different Node
What happens when:
- You hit 192.168.56.70:30035 (Node 1)
- But the voting-app Pod is actually running on Node 3 (192.168.56.72)
The Flow:
User Request: 192.168.56.70:30035
β
Node 1 kube-proxy: "I got a request for voting-app"
β
kube-proxy: "Let me check... Pod is on Node 3"
β
Network Plugin (Flannel/Calico): "I'll route this to Node 3's Pod"
β
Pod on Node 3: Processes the request
β
Response flows back to user
Who Does What?
kube-proxy's job:
- Decides which Pod should get the request
- Creates routing rules (iptables/IPVS)
- Triggers cross-node forwarding
Network Plugin's job (Flannel/Calico/Weave):
- Makes cross-node Pod communication possible
- Creates overlay network
- Handles actual packet forwarding between nodes
Real-World Udemy Example Breakdown π
The Setup:
- 4-node Kubernetes cluster
- 2 applications: Voting app + Result app
- Each app: Multiple Pods spread across nodes
- Each app: One NodePort service
The Services:
Voting App Service:
apiVersion: v1
kind: Service
metadata:
name: voting-service
spec:
type: NodePort
selector:
app: voting-app
ports:
- nodePort: 30035
port: 80
targetPort: 80
Result App Service:
apiVersion: v1
kind: Service
metadata:
name: result-service
spec:
type: NodePort
selector:
app: result-app
ports:
- nodePort: 31061
port: 80
targetPort: 80
The Access Pattern:
For Voting App (all work the same):
- 192.168.56.70:30035 → Service → Any voting Pod
- 192.168.56.71:30035 → Service → Any voting Pod
- 192.168.56.72:30035 → Service → Any voting Pod
- 192.168.56.73:30035 → Service → Any voting Pod
For Result App (all work the same):
- 192.168.56.70:31061 → Service → Any result Pod
- 192.168.56.71:31061 → Service → Any result Pod
- 192.168.56.72:31061 → Service → Any result Pod
- 192.168.56.73:31061 → Service → Any result Pod
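A quick way to verify this behaviour from outside the cluster is to hit the same NodePort on every node and confirm that each one answers; the node IPs and port below are just the ones from this example, so substitute your own:
# Every node should answer for the voting app, regardless of where its Pods run
for ip in 192.168.56.70 192.168.56.71 192.168.56.72 192.168.56.73; do
  curl -s -o /dev/null -w "$ip:30035 -> HTTP %{http_code}\n" "http://$ip:30035"
done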
Key Concepts to Remember π§
1. Port Assignment
- NodePort range: 30000-32767 (the Kubernetes default range)
- Same port on all nodes: If you assign 30035, it opens on ALL nodes
- Unique per service: Each service gets its own NodePort
2. Load Balancing
- Automatic: Service automatically balances across all matching Pods
- Algorithm: Round-robin or random (you don't control this)
- Cross-node: Pods can be on any node, service finds them
3. High Availability
- Multiple entry points: Any node can receive traffic
- Node failure: If one node fails, others still work
- Pod failure: Service automatically removes failed Pods
4. Network Requirements
- Cluster nodes must be reachable: From where you're accessing
- Firewall rules: NodePort range must be open
- Network plugin: Required for cross-node communication
Common Misconceptions β
Misconception 1: "I can use any IP"
Reality: IP must be a real node IP in your cluster
Misconception 2: "Each node has different ports"
Reality: NodePort assigns the SAME port to ALL nodes
Misconception 3: "Pod must be on the node I'm hitting"
Reality: kube-proxy + network plugin handle cross-node routing
Misconception 4: "NodePort is only for development"
Reality: NodePort is used in production for on-premise setups
When to Use NodePort? π€·ββοΈ
Perfect For:
- Development environments
- On-premise clusters (no cloud load balancer)
- Testing setups
- Internal tools (where ugly IPs are okay)
Not Great For:
- Public-facing production apps (ugly URLs)
- When you need SSL termination
- When you have many services (port management nightmare)
NodePort vs Other Service Types π
Feature | ClusterIP | NodePort | LoadBalancer |
---|---|---|---|
External Access | No | Yes | Yes |
Clean URLs | N/A | No | Yes |
Cloud Integration | N/A | No | Yes |
Cost | Free | Free | Cloud charges |
Port Management | N/A | Manual | Automatic |
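As a quick illustration of the three types (assuming a Deployment named my-app already exists; the names are placeholders), the same workload can be exposed with kubectl expose:
# ClusterIP (default) - internal access only
kubectl expose deployment my-app --port=80 --target-port=8080
# NodePort - reachable on every node IP within the 30000-32767 range
kubectl expose deployment my-app --name=my-app-nodeport --port=80 --type=NodePort
# LoadBalancer - cloud provider provisions an external load balancer
kubectl expose deployment my-app --name=my-app-lb --port=80 --type=LoadBalancer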
Interview Questions & Answers πΌ
Q: How does NodePort work internally?
A: NodePort opens the same port on all cluster nodes. kube-proxy on each node creates iptables/IPVS rules to route traffic to service endpoints. If a Pod is on a different node, the network plugin (like Flannel) handles cross-node routing.
Q: Why can I access the same app from multiple node IPs?
A: Because NodePort opens the assigned port on EVERY node in the cluster, not just the nodes running the Pods. This provides high availability - if one node fails, you can still access via other nodes.
Q: What happens if I hit a node that doesn't have the Pod?
A: kube-proxy on that node will forward the request to a node that does have the Pod. The network plugin enables this cross-node communication transparently.
Q: Can I specify which nodes get the NodePort?
A: No, NodePort always opens on ALL nodes in the cluster. If you need selective exposure, you'd use other mechanisms like Ingress controllers or external load balancers.
Q: What's the difference between kube-proxy and network plugins in NodePort?
A: kube-proxy handles service routing and load balancing decisions. Network plugins (Flannel/Calico) provide the underlying network infrastructure that makes cross-node Pod communication possible.
Commands to Test NodePort π§
# Create NodePort service
kubectl apply -f nodeport-service.yaml
# Check service details
kubectl get services
kubectl describe service <service-name>
# See which nodes have the port open
kubectl get nodes -o wide
# Test from outside cluster
curl http://<node-ip>:<nodeport>
# Check kube-proxy logs
kubectl logs -n kube-system <kube-proxy-pod>
# See iptables rules (on node)
sudo iptables -t nat -L | grep <nodeport>
Kubernetes Namespaces - Essential Guide
π― What are Namespaces?
Simple Definition: Virtual clusters within a physical Kubernetes cluster for resource isolation.
Building Analogy:
Kubernetes Cluster = Building
├── Floor 1 (production namespace)
│   ├── web-app service
│   └── database service
├── Floor 2 (staging namespace)
│   └── api service
π Default Namespaces
default # Your resources go here by default
kube-system # System components (DNS, dashboard) - DON'T TOUCH
kube-public # Publicly readable
kube-node-lease # Node management - DON'T TOUCH
π¨ Creating & Managing Namespaces
Create Namespace
# Command
kubectl create namespace my-app
kubectl create ns my-app # short form
# YAML
apiVersion: v1
kind: Namespace
metadata:
name: my-app
Basic Operations
# List namespaces
kubectl get ns
# Set default namespace
kubectl config set-context --current --namespace=my-app
# Check current namespace
kubectl config get-contexts
π§ Working with Resources
Apply Resources to Namespace
# Specific namespace
kubectl apply -f app.yaml -n my-app
# List resources
kubectl get pods -n my-app
kubectl get all -n my-app
# All namespaces
kubectl get pods -A
π Service Discovery (Most Important!)
Same Namespace Communication
# Both services in same namespace
curl http://database:3306 # Simple name works
Cross-Namespace Communication
# Full DNS name required
curl http://database.shared.svc.cluster.local:3306
# Short form
curl http://database.shared:3306
DNS Format: service-name.namespace.svc.cluster.local
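To watch this DNS resolution happen, you can run a throwaway pod and resolve the service name; database and shared are the hypothetical service and namespace used in the examples above:
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup database.shared.svc.cluster.local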
π Resource Management
Resource Quota
apiVersion: v1
kind: ResourceQuota
metadata:
name: my-quota
namespace: my-app
spec:
hard:
requests.cpu: "2"
requests.memory: 4Gi
limits.cpu: "4"
limits.memory: 8Gi
pods: "10"
Check Usage
kubectl describe quota -n my-app
π Security (RBAC)
Create Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: my-app
name: pod-reader
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
Create RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: my-app
subjects:
- kind: User
  name: developer@company.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
π Troubleshooting
Common Issues
1. Service can't connect across namespaces
# Wrong
curl http://database:3306
# Correct
curl http://database.shared.svc.cluster.local:3306
2. Namespace stuck in "Terminating"
# Remove finalizers
kubectl patch namespace stuck-ns -p '{"metadata":{"finalizers":[]}}' --type=merge
3. Resource quota exceeded
kubectl describe quota -n my-namespace
π― Interview Questions & Detailed Answers
Basic Level Questions
Q1: What is a Kubernetes namespace and why do we need it?
A: A namespace is a virtual cluster within a physical Kubernetes cluster that provides:
- Resource Isolation: Separate resources logically
- Multi-tenancy: Multiple teams can share same cluster
- Name scoping: Same resource names can exist in different namespaces
- Security boundaries: Apply different RBAC policies per namespace
- Resource management: Set quotas and limits per namespace
Q2: What are the default namespaces in Kubernetes? Explain each.
A:
- default: Where resources go when no namespace is specified
- kube-system: Contains system components like CoreDNS, kube-proxy, etcd
- kube-public: Publicly readable by all users, contains cluster info
- kube-node-lease: Contains lease objects for the node heartbeat mechanism
Q3: How do you create a namespace? Show multiple ways.
A:
# Method 1: Direct command
kubectl create namespace production
# Method 2: YAML file
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
environment: prod
# Method 3: Declarative
kubectl apply -f namespace.yaml
# Method 4: With dry-run
kubectl create namespace production --dry-run=client -o yaml > ns.yaml
Intermediate Level Questions
Q4: How do services communicate across namespaces? Provide examples.
A: Services use DNS resolution:
- Same namespace: curl http://database:3306
- Cross-namespace: curl http://database.shared.svc.cluster.local:3306
- Short form: curl http://database.shared:3306
DNS Format: <service>.<namespace>.svc.<cluster-domain>
Example:
# Frontend in 'web' namespace accessing backend in 'api' namespace
apiVersion: v1
kind: ConfigMap
metadata:
name: frontend-config
namespace: web
data:
API_URL: "http://backend-service.api.svc.cluster.local:8080"
Q5: What are resource quotas? Why and how to implement them?
A: Resource quotas limit resource consumption per namespace to prevent resource exhaustion.
Why needed:
- Prevent one team from using all cluster resources
- Ensure fair resource distribution
- Control costs in cloud environments
- Maintain cluster stability
Implementation:
apiVersion: v1
kind: ResourceQuota
metadata:
name: production-quota
namespace: production
spec:
hard:
requests.cpu: "10" # Total CPU requests
requests.memory: 20Gi # Total memory requests
limits.cpu: "20" # Total CPU limits
limits.memory: 40Gi # Total memory limits
pods: "50" # Max pod count
services: "10" # Max service count
persistentvolumeclaims: "20" # Max PVC count
Q6: Can you move resources between namespaces? If not, how to achieve it?
A: No, you cannot directly move resources between namespaces.
Workaround:
# 1. Export resource
kubectl get deployment myapp -n old-ns -o yaml > deployment.yaml
# 2. Edit namespace in YAML
sed -i 's/namespace: old-ns/namespace: new-ns/g' deployment.yaml
# 3. Delete from old namespace
kubectl delete deployment myapp -n old-ns
# 4. Create in new namespace
kubectl apply -f deployment.yaml
Q7: What happens when you delete a namespace? How to prevent accidental deletion?
A: All resources within the namespace are permanently deleted including:
- Pods, Services, Deployments
- ConfigMaps, Secrets
- PersistentVolumeClaims (PVs may remain)
Prevention methods:
# 1. Add finalizers
kubectl patch namespace production -p '{"metadata":{"finalizers":["custom-finalizer"]}}'
# 2. RBAC restrictions
# Don't give namespace delete permissions to regular users
# 3. Use admission controllers
# Implement OPA Gatekeeper policies to prevent deletion
Advanced Level Questions
Q8: How do you handle a namespace stuck in "Terminating" state?
A: This happens when finalizers prevent deletion or resources are stuck.
Diagnosis:
# Check namespace status
kubectl get namespace stuck-ns -o yaml
# Check remaining resources
kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n stuck-ns
Solutions:
# Method 1: Remove finalizers
kubectl patch namespace stuck-ns -p '{"metadata":{"finalizers":[]}}' --type=merge
# Method 2: Force delete specific resources
kubectl delete pods --all -n stuck-ns --grace-period=0 --force
# Method 3: Edit namespace directly
kubectl edit namespace stuck-ns
# Remove finalizers manually
Q9: Explain RBAC in context of namespaces with practical example.
A: RBAC (Role-Based Access Control) provides fine-grained permissions per namespace.
Complete Example:
# 1. Create Role (namespace-specific permissions)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: production
name: developer-role
rules:
- apiGroups: [""]
resources: ["pods", "services", "configmaps"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: ["apps"]
resources: ["deployments", "replicasets"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
---
# 2. Create RoleBinding (assign role to users)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: developer-binding
namespace: production
subjects:
- kind: User
name: john@company.com
apiGroup: rbac.authorization.k8s.io
- kind: Group
name: developers
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: Role
name: developer-role
apiGroup: rbac.authorization.k8s.io
Verification:
# Check permissions
kubectl auth can-i create pods -n production --as=john@company.com
kubectl auth can-i delete namespaces --as=john@company.com
Q10: What are the limitations of namespaces?
A:
- Cannot nest namespaces (no hierarchical structure)
- Some resources are cluster-scoped: Nodes, ClusterRoles, PersistentVolumes
- No cross-namespace secret/configmap references directly
- Network isolation requires additional tools (Network Policies)
- DNS overhead: Full FQDN required for cross-namespace communication
- API server overhead: Each namespace adds metadata overhead
Scenario-Based Questions
Q11: Design namespace strategy for a company with 3 teams (frontend, backend, data) and 3 environments (dev, staging, prod).
A:
# Strategy 1: Team-Environment Matrix
frontend-dev, frontend-staging, frontend-prod
backend-dev, backend-staging, backend-prod
data-dev, data-staging, data-prod
# Strategy 2: Shared Services
frontend-dev, frontend-staging, frontend-prod
backend-dev, backend-staging, backend-prod
data-dev, data-staging, data-prod
shared-monitoring # Shared across all teams
shared-logging # Shared across all teams
# Resource Quota Example:
# Production: Higher limits
# Staging: Medium limits
# Dev: Lower limits
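A minimal sketch of bootstrapping the team-environment matrix above (namespace names as in the strategy; quotas would then be applied per environment):
for team in frontend backend data; do
  for env in dev staging prod; do
    kubectl create namespace "$team-$env"
  done
done
kubectl create namespace shared-monitoring
kubectl create namespace shared-logging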
Q12: How would you ensure team-A cannot access team-B's resources?
A: Implement namespace-based RBAC isolation:
# Team A Role - only access team-a namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: team-a
name: team-a-full-access
rules:
- apiGroups: ["*"]
resources: ["*"]
verbs: ["*"]
---
# Team A RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: team-a-binding
namespace: team-a
subjects:
- kind: Group
name: team-a-members
roleRef:
kind: Role
name: team-a-full-access
Additional Security:
# Network Policy - block cross-namespace traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-cross-namespace
namespace: team-a
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: team-a
Q13: How do you monitor resource usage across namespaces?
A:
# Resource usage by namespace
kubectl top pods -A
kubectl top nodes
# Quota usage
kubectl describe quota -n production
# Resource requests/limits across namespaces
kubectl get pods -A -o=jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.containers[*].resources.requests.memory}{"\n"}{end}'
# Using metrics server
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/production/pods
Q14: What's the difference between Role vs ClusterRole in context of namespaces?
A:
- Role: Namespace-scoped permissions, only works within specific namespace
- ClusterRole: Cluster-wide permissions, can access resources across all namespaces
# Role - Limited to specific namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: production # Only works in production namespace
name: pod-reader
# ClusterRole - Works across all namespaces
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-pod-reader # No namespace field
Usage:
# RoleBinding with ClusterRole (limits ClusterRole to specific namespace)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: production-admin
namespace: production
roleRef:
kind: ClusterRole # Using ClusterRole
name: admin # Built-in admin ClusterRole
# ClusterRoleBinding (cluster-wide access)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cluster-admin-binding
roleRef:
kind: ClusterRole
name: cluster-admin # Full cluster access
π Essential Commands
# Namespace operations
kubectl create ns my-app
kubectl get ns
kubectl delete ns my-app # β οΈ Deletes everything inside
# Resource operations
kubectl apply -f app.yaml -n my-app
kubectl get pods -n my-app
kubectl get all -A
# Context switching
kubectl config set-context --current --namespace=my-app
# Debugging
kubectl describe ns my-app
kubectl describe quota -n my-app
Kubernetes Manual Pod Scheduling
Core Concept: How Scheduling Works
Default Kubernetes Scheduling Process
- Pod Creation: Pod is created without a nodeName field
- Scheduler Detection: Scheduler finds pods with nodeName = null
- Node Selection: Scheduling algorithm selects the optimal node
- Binding: Scheduler sets nodeName and creates a binding object
- Pod Placement: Pod gets scheduled on the assigned node
What Happens Without a Scheduler?
- Pods remain in Pending state indefinitely
- No automatic node assignment occurs
- Manual intervention required
You can also run a pod on a specific node manually by using the nodeName field, but this is only possible while you are creating the pod. In other words, you can bypass the scheduler and directly specify which node the pod should run on.
Once a pod is already running on a node, changing its node assignment is not allowed. Kubernetes does not let you move a running pod directly to another node. If you need the pod to run on a different node, you first have to delete it and then create a new pod with nodeName or a nodeSelector specified.
If, on the other hand, the pod is still in the Pending state and the scheduler has not assigned it yet, you can use the Kubernetes Binding API. Through this API you can manually send a POST request that assigns the pod to any node. However, this only works for pods that have not been scheduled yet.
So overall, there is no direct way to change the node assignment of a running pod. For that, you have to delete the pod and recreate it targeted at the desired node. For pending pods, you can perform a manual binding.
Manual Scheduling Methods
Method 1: Set nodeName During Pod Creation
Best Practice: Specify node during pod creation
apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
containers:
- name: nginx
image: nginx
nodeName: node01 # Manual node assignment
Key Points:
- Simple and straightforward approach
- Must be done at creation time only
- Cannot modify nodeName after pod creation
Method 2: Binding Object for Existing Pods
When pod already exists and needs manual assignment:
apiVersion: v1
kind: Binding
metadata:
name: nginx
target:
apiVersion: v1
kind: Node
name: node02
Process:
- Create binding object (YAML)
- Convert YAML to JSON format
- Send POST request to pod's binding API
- Mimics actual scheduler behavior
API Call Example:
curl --header "Content-Type: application/json" \
--request POST \
--data '{"apiVersion":"v1","kind":"Binding",...}' \
http://localhost:8001/api/v1/namespaces/default/pods/nginx/binding
Key Limitations & Rules
nodeName Field Restrictions
- Read-only after creation: Cannot edit nodeName on existing pods
- Creation-time only: Must specify during kubectl create
- No validation: Kubernetes doesn't verify if node exists
- Direct assignment: Bypasses scheduler completely
When Manual Scheduling is Needed
- No scheduler present in cluster
- Custom scheduling logic required
- Testing scenarios for specific node placement
- Troubleshooting scheduler issues
Common Interview Questions & Answers
Q1: What happens to pods when there's no scheduler?
Answer: Pods remain in Pending state indefinitely because no component assigns them to nodes. The nodeName field stays empty, and pods cannot run without being placed on a node.
Q2: Can you change a pod's node assignment after creation?
Answer: No, you cannot modify the nodeName field of an existing pod. If you need to move a pod, you must:
- Delete the existing pod
- Create a new pod with the desired nodeName
- Or use a binding object with the binding API
Q3: What's the difference between scheduler assignment and manual assignment?
Answer:
- Scheduler: Uses algorithms to find optimal node based on resources, constraints, etc.
- Manual: Direct assignment bypassing all scheduling logic and resource checks
Q4: How does the binding object work?
Answer: The binding object mimics the scheduler's behavior by:
- Creating a binding between pod and target node
- Sending POST request to pod's binding API
- Setting the nodeName field programmatically
- Allowing assignment to existing pods
Q5: What are the risks of manual scheduling?
Answer:
- Resource conflicts: May assign to nodes without sufficient resources
- No constraint checking: Bypasses node selectors, taints, tolerations
- Poor distribution: No load balancing across nodes
- Maintenance overhead: Manual tracking required
Practical Examples
Scenario 1: Create Pod on Specific Node
apiVersion: v1
kind: Pod
metadata:
name: web-app
spec:
containers:
- name: web
image: nginx
nodeName: worker-node-01 # Direct assignment
Scenario 2: Emergency Pod Placement
# When scheduler is down, create pod manually
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
name: emergency-pod
spec:
containers:
- name: app
image: busybox
nodeName: node02
EOF
Scenario 3: Binding Existing Pod
# binding.yaml
apiVersion: v1
kind: Binding
metadata:
name: stuck-pod
target:
apiVersion: v1
kind: Node
name: worker-node-02
Best Practices & Tips
For Production
- Avoid manual scheduling unless absolutely necessary
- Use node selectors instead for controlled placement
- Implement proper monitoring for scheduler health
- Document manual assignments for troubleshooting
For Exams/Interviews
- Know both methods: nodeName and binding object
- Understand limitations: creation-time vs runtime assignment
- Practice YAML to JSON conversion for binding objects
- Remember API endpoints for binding operations
Troubleshooting Steps
- Check if scheduler is running:
kubectl get pods -n kube-system
- Verify node availability:
kubectl get nodes
- Check pod status:
kubectl describe pod <pod-name>
- Apply manual scheduling if needed
Key Commands & Operations
# Check scheduler status
kubectl get pods -n kube-system | grep scheduler
# Check nodes available for scheduling
kubectl get nodes
# Check pod scheduling status
kubectl get pods -o wide
# Describe pod for scheduling details
kubectl describe pod <pod-name>
# Create pod with manual node assignment
kubectl apply -f pod-with-nodename.yaml
Real-World Use Cases
When Manual Scheduling is Useful
- Hardware-specific workloads (GPU pods on specific nodes)
- Data locality requirements (pods near data storage)
- Testing and development environments
- Scheduler debugging and troubleshooting
- Emergency situations when scheduler fails
Alternatives to Consider
- Node Selectors: Label-based node selection
- Node Affinity: Advanced node selection rules
- Taints and Tolerations: Node exclusion mechanisms
- Custom Schedulers: Implementing specific scheduling logic
Kubernetes Taints & Tolerations - Complete Guide
π― Core Concepts
What Problem Do They Solve?
Imagine you have a Kubernetes cluster with different types of nodes:
- GPU nodes (expensive, for ML workloads)
- High-memory nodes (for databases)
- Regular nodes (for web apps)
- Master nodes (for cluster management)
Without taints/tolerations, Kubernetes might put a simple web app on your expensive GPU node - wasteful!
The Rule
- Taints = "Keep away unless you have permission"
- Tolerations = "I have permission to ignore this taint"
π§ Syntax Deep Dive
Taint Structure
key=value:effect
Components:
- Key: Category (e.g., gpu, memory, app)
- Value: Specific identifier (e.g., nvidia-v100, high, database)
- Effect: What happens to non-tolerating pods
Toleration Structure
tolerations:
- key: "gpu"
operator: "Equal" # or "Exists"
value: "nvidia-v100"
effect: "NoSchedule"
π Taint Effects Explained
1. NoSchedule
- What it does: Prevents NEW pods from being scheduled
- Existing pods: Stay running
- Use case: Gradual migration
2. PreferNoSchedule
- What it does: "Soft" restriction - avoid if possible
- Fallback: If no other nodes available, still schedule here
- Use case: Cost optimization
3. NoExecute
- What it does: Evicts existing pods + prevents new ones
- Immediate action: Kicks out non-tolerating pods
- Use case: Emergency maintenance, security isolation
ποΈ Real-World Scenarios
Scenario 1: GPU Node Dedication
Problem: You have 1 GPU node and 3 regular nodes. Want only ML workloads on GPU.
# Taint the GPU node
kubectl taint nodes gpu-node-1 hardware=gpu:NoSchedule
Pod without toleration (web app):
apiVersion: v1
kind: Pod
metadata:
name: web-app
spec:
containers:
- name: nginx
image: nginx
Result: Scheduled on regular nodes only β
Pod with toleration (ML workload):
apiVersion: v1
kind: Pod
metadata:
name: ml-training
spec:
tolerations:
- key: "hardware"
operator: "Equal"
value: "gpu"
effect: "NoSchedule"
containers:
- name: pytorch
image: pytorch/pytorch
Result: Can be scheduled on GPU node β
Scenario 2: Database Node Isolation
Setup: Dedicate node-2 for database workloads only.
kubectl taint nodes node-2 workload=database:NoSchedule
Regular app (gets rejected):
apiVersion: v1
kind: Pod
metadata:
name: frontend
spec:
containers:
- name: react-app
image: react-frontend
# No toleration = rejected from node-2
Database pod (gets accepted):
apiVersion: v1
kind: Pod
metadata:
name: postgres-db
spec:
tolerations:
- key: "workload"
operator: "Equal"
value: "database"
effect: "NoSchedule"
containers:
- name: postgres
image: postgres:13
Scenario 3: Emergency Node Evacuation
Situation: Node-3 has hardware issues, need to evacuate all pods immediately.
kubectl taint nodes node-3 status=maintenance:NoExecute
What happens:
- All existing pods without matching toleration get evicted immediately
- No new pods can be scheduled
- Only pods with a status=maintenance toleration can stay
Scenario 4: Multi-Environment Setup
Setup: Same cluster for dev, staging, prod environments.
# Taint nodes by environment
kubectl taint nodes prod-node-1 env=production:NoSchedule
kubectl taint nodes staging-node-1 env=staging:NoSchedule
kubectl taint nodes dev-node-1 env=development:NoSchedule
Production deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: prod-api
spec:
template:
spec:
tolerations:
- key: "env"
value: "production"
effect: "NoSchedule"
operator: "Equal"
containers:
- name: api
image: myapp:v1.0
π Operator Types
Equal Operator
tolerations:
- key: "app"
operator: "Equal"
value: "database"
effect: "NoSchedule"
Matches: app=database:NoSchedule
Exists Operator
tolerations:
- key: "special-node"
operator: "Exists"
effect: "NoSchedule"
Matches: Any taint with key special-node, regardless of value
Empty Key (Universal Toleration)
tolerations:
- operator: "Exists"
Matches: ALL taints on any node (use carefully!)
π― Advanced Patterns
Pattern 1: Graceful Eviction with Timeout
# Taint with toleration seconds
kubectl taint nodes node-1 maintenance=true:NoExecute
tolerations:
- key: "maintenance"
operator: "Equal"
value: "true"
effect: "NoExecute"
tolerationSeconds: 300 # Stay for 5 minutes then leave
Pattern 2: Multi-Taint Node
# Multiple taints on same node
kubectl taint nodes special-node hardware=gpu:NoSchedule
kubectl taint nodes special-node memory=high:NoSchedule
kubectl taint nodes special-node cost=expensive:PreferNoSchedule
Pod needs ALL tolerations:
tolerations:
- key: "hardware"
value: "gpu"
effect: "NoSchedule"
operator: "Equal"
- key: "memory"
value: "high"
effect: "NoSchedule"
operator: "Equal"
- key: "cost"
value: "expensive"
effect: "PreferNoSchedule"
operator: "Equal"
Pattern 3: Conditional Scheduling
Use case: Schedule pods only on nodes with SSD storage.
kubectl taint nodes fast-node storage=ssd:NoSchedule
apiVersion: v1
kind: Pod
metadata:
name: fast-database
spec:
tolerations:
- key: "storage"
value: "ssd"
effect: "NoSchedule"
containers:
- name: db
image: postgres
π¨ Common Pitfalls & Solutions
Pitfall 1: Forgetting Quotes in YAML
Wrong:
tolerations:
- key: app
value: database
Correct:
tolerations:
- key: "app"
value: "database"
Pitfall 2: Mismatched Taint/Toleration Values
Taint: kubectl taint nodes node-1 app=web:NoSchedule
Toleration: value: "webapp"
β
Must match exactly: value: "web"
β
Pitfall 3: Wrong Effect
Taint: app=db:NoSchedule
Toleration: effect: "NoExecute"
β
Must match: effect: "NoSchedule"
β
π Debugging Commands
View Node Taints
kubectl describe node <node-name>
# Look for "Taints" section
# Or get all taints across cluster
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
Check Why Pod Isn't Scheduled
kubectl describe pod <pod-name>
# Check "Events" section for taint-related messages
Remove Taint
# Add minus (-) at end
kubectl taint nodes node-1 app=database:NoSchedule-
πͺ Fun Examples
Example 1: "VIP Only" Node
kubectl taint nodes vip-node access=premium:NoSchedule
Example 2: "Night Shift" Workloads
kubectl taint nodes batch-node schedule=night:PreferNoSchedule
Example 3: "Experimental" Features
kubectl taint nodes test-node stability=experimental:NoSchedule
Kubernetes Node Scheduling: Node Selectors & Node Affinity
Problem Statement
- Challenge: By default, pods can be scheduled on any node in the cluster
- Need: Control which pods run on which nodes based on node capabilities
- Example: Run heavy data processing workloads only on high-resource nodes
1. Node Selectors (Simple Method)
What is Node Selector?
- Simple way to constrain pods to specific nodes
- Uses labels and selectors to match pods with nodes
- Easy to implement but limited functionality
Step-by-Step Implementation
Step 1: Label Your Nodes
kubectl label nodes <node-name> <key>=<value>
# Example:
kubectl label nodes node-1 size=large
kubectl label nodes node-2 size=small
kubectl label nodes node-3 size=medium
Step 2: Add Node Selector to Pod Definition
apiVersion: v1
kind: Pod
metadata:
name: data-processing-pod
spec:
containers:
- name: app
image: data-processor
nodeSelector:
size: large # This pod will only run on nodes labeled "size=large"
Node Selector Limitations
β Cannot use complex logic (OR, NOT conditions)
β Cannot say "large OR medium nodes"
β Cannot say "NOT small nodes"
β Only supports simple equality matching
2. Node Affinity (Advanced Method)
What is Node Affinity?
- Advanced way to control pod placement
- Supports complex expressions and conditions
- More flexible but more complex syntax
Basic Node Affinity Syntax
apiVersion: v1
kind: Pod
metadata:
name: advanced-pod
spec:
containers:
- name: app
image: my-app
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: size
operator: In
values:
- large
Node Affinity Operators
Operator | Description | Example |
---|---|---|
In | Value must be in the list | size In [large, medium] |
NotIn | Value must NOT be in the list | size NotIn [small] |
Exists | Key must exist (ignore value) | gpu Exists |
DoesNotExist | Key must NOT exist | maintenance DoesNotExist |
Advanced Examples
Example 1: Multiple Options (OR Logic)
# Place pod on large OR medium nodes
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: size
operator: In
values:
- large
- medium
Example 2: Exclude Specific Nodes
# Place pod on any node that is NOT small
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: size
operator: NotIn
values:
- small
Example 3: Check Label Existence
# Place pod only on nodes that have GPU label
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: gpu
operator: Exists
3. Node Affinity Types
Understanding the Lifecycle States
- During Scheduling: When pod is first created
- During Execution: When pod is already running
Available Types
1. requiredDuringSchedulingIgnoredDuringExecution
- During Scheduling: MUST find matching node or pod won't be scheduled
- During Execution: If node labels change, pod continues running
- Use Case: Critical placement requirements
2. preferredDuringSchedulingIgnoredDuringExecution
- During Scheduling: TRY to find matching node, but schedule anywhere if not found
- During Execution: If node labels change, pod continues running
- Use Case: Nice-to-have placement preferences
3. requiredDuringSchedulingRequiredDuringExecution (Planned)
- During Scheduling: MUST find matching node
- During Execution: If node labels change, pod will be evicted
- Use Case: Strict enforcement throughout pod lifecycle
Practical Examples
Required Affinity (Strict)
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: disktype
operator: In
values:
- ssd
# Pod will NOT be scheduled if no SSD nodes available
Preferred Affinity (Flexible)
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: disktype
operator: In
values:
- ssd
# Pod prefers SSD nodes but will run on any node if needed
4. Combining Taints, Tolerations, and Node Affinity
The Complete Solution
To fully control pod placement and prevent unwanted scheduling:
Step 1: Apply Taints (Prevent unwanted pods)
kubectl taint nodes blue-node color=blue:NoSchedule
kubectl taint nodes red-node color=red:NoSchedule
kubectl taint nodes green-node color=green:NoSchedule
Step 2: Add Tolerations (Allow specific pods)
# Blue pod tolerates blue node
tolerations:
- key: "color"
operator: "Equal"
value: "blue"
effect: "NoSchedule"
Step 3: Add Node Affinity (Ensure correct placement)
# Blue pod prefers blue node
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: color
operator: In
values:
- blue
Why Use Both?
- Taints + Tolerations: Prevent other pods from running on your nodes
- Node Affinity: Prevent your pods from running on other nodes
- Together: Complete isolation and control
Quick Reference
When to Use What?
Scenario | Best Solution |
---|---|
Simple label matching | Node Selectors |
Complex logic (OR, NOT) | Node Affinity |
Critical placement | Required Node Affinity |
Preferred but flexible | Preferred Node Affinity |
Complete isolation | Taints + Tolerations + Node Affinity |
Common Commands
# Label a node
kubectl label nodes <node-name> <key>=<value>
# View node labels
kubectl get nodes --show-labels
# Remove a label
kubectl label nodes <node-name> <key>-
# View node taints
kubectl describe node <node-name>
Best Practices
- Start Simple: Use Node Selectors for basic requirements
- Plan Labels: Create a consistent labeling strategy
- Test Thoroughly: Verify pod placement after configuration
- Document Labels: Keep track of your node labeling scheme
- Monitor: Watch for scheduling failures in your cluster
- Combine Wisely: Use Taints + Node Affinity for complete control
Interview Questions & Answers
Conceptual Questions
Q1: What's the difference between Node Selector and Node Affinity?
A:
- Node Selector: Simple key-value matching, limited to equality only
- Node Affinity: Advanced matching with operators (In, NotIn, Exists), supports complex logic
- Use Node Selector for simple cases, Node Affinity for complex requirements
Q2: What happens if a pod with required node affinity can't find a matching node?
A: The pod remains in Pending state indefinitely. The scheduler won't place it on any node until a matching node becomes available. This is different from "preferred" affinity where the pod would be scheduled on any available node.
Q3: Can you combine Node Selector and Node Affinity in the same pod?
A: Yes, but it's redundant. If both are specified, BOTH conditions must be satisfied. However, it's better practice to use only Node Affinity as it can handle all Node Selector use cases and more.
Scenario-Based Questions
Q4: You have a cluster with 3 nodes: 2 CPU-only nodes and 1 GPU node. How would you ensure ML workloads only run on the GPU node?
A:
# Step 1: Label the GPU node
kubectl label nodes gpu-node-1 hardware=gpu
# Step 2: Use node affinity in ML pod
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: hardware
operator: In
values:
- gpu
Q5: Your company has dev, staging, and prod nodes in the same cluster. How would you ensure prod pods never run on dev/staging nodes and vice versa?
A: Use Taints + Tolerations + Node Affinity:
# Step 1: Taint nodes
kubectl taint nodes prod-node env=production:NoSchedule
kubectl taint nodes dev-node env=development:NoSchedule
kubectl taint nodes staging-node env=staging:NoSchedule
# Step 2: Label nodes
kubectl label nodes prod-node environment=production
kubectl label nodes dev-node environment=development
kubectl label nodes staging-node environment=staging
# Step 3: Production pod configuration
spec:
tolerations:
- key: "env"
value: "production"
effect: "NoSchedule"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: environment
operator: In
values:
- production
Q6: You need to run a pod on either large or xlarge nodes, but never on small nodes. How would you configure this?
A:
# Option 1: Using In operator
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: size
operator: In
values:
- large
- xlarge
# Option 2: Using NotIn operator
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: size
operator: NotIn
values:
- small
- medium
Troubleshooting Questions
Q7: Your pod is stuck in Pending state. How would you troubleshoot node affinity issues?
A:
# Step 1: Check pod status and events
kubectl describe pod <pod-name>
# Step 2: Check if matching nodes exist
kubectl get nodes --show-labels | grep <your-label>
# Step 3: Check node capacity and resources
kubectl describe nodes
# Step 4: Verify node affinity syntax in pod spec
kubectl get pod <pod-name> -o yaml
Q8: After labeling a node, existing pods didn't move to it. Why?
A: Node affinity only affects new pod scheduling. Existing running pods are not moved automatically. The "IgnoredDuringExecution" part means changes to node labels don't affect already running pods. To move existing pods, you need to delete and recreate them.
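For pods owned by a Deployment, a practical way to force rescheduling against the new labels is to restart the rollout; the deployment and pod names here are placeholders:
# New pods are created and scheduled against the current labels/affinity rules
kubectl rollout restart deployment my-app
# Bare pods have no controller, so they must be deleted and re-applied
kubectl delete pod my-pod
kubectl apply -f my-pod.yaml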
Advanced Scenarios
Q9: You want to run a backup job on the least loaded node. How would you achieve this?
A: Use preferredDuringSchedulingIgnoredDuringExecution with multiple preferences:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: workload
operator: In
values:
- light
- weight: 50
preference:
matchExpressions:
- key: zone
operator: In
values:
- us-east-1a
Q10: How would you ensure a pod runs on nodes in specific availability zones during disaster recovery?
A:
# For multi-zone deployment
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- us-east-1a
- us-east-1b
# Prefer primary zone but allow secondary
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- us-east-1a
Performance & Best Practices
Q11: What's the performance impact of complex node affinity rules?
A:
- Complex affinity rules increase scheduling time
- Each matchExpression is evaluated for every node
- Best Practices:
- Use fewer, broader labels when possible
- Combine multiple conditions in single matchExpressions
- Use "preferred" over "required" when flexibility is acceptable
- Monitor scheduler performance with complex rules
Q12: When would you use "Exists" operator instead of "In"?
A:
- Use "Exists" when you only care that a label is present, not its value
- Example: Checking if a node has GPU (regardless of GPU type)
- key: nvidia.com/gpu
operator: Exists
# vs
- key: nvidia.com/gpu
operator: In
values: ["tesla-v100", "tesla-k80", "rtx-3090"]
Real-world Implementation
Q13: Your application needs to be close to a database for low latency. How would you co-locate them?
A:
# Step 1: Label the node where database runs
kubectl label nodes db-node-1 database=mysql-primary
# Step 2: Configure application pod
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: database
operator: In
values:
- mysql-primary
Q14: How would you gradually migrate workloads from old nodes to new nodes?
A:
# Phase 1: Prefer new nodes but allow old ones
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: node-generation
operator: In
values:
- new
# Phase 2: Require new nodes only (after testing)
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-generation
operator: In
values:
- new
Kubernetes Resource Management - Interview Notes
1. Core Concepts
Resource Fundamentals
- Each node has CPU and memory resources available
- Every pod requires resources to run
- Kubernetes scheduler decides pod placement based on resource availability
- If insufficient resources exist, pod remains in PENDING state
- Error visible via kubectl describe pod: "Insufficient CPU"
Scheduler Behavior
- Takes into consideration pod resource requirements vs node availability
- Places pod on node with sufficient resources
- Holds back scheduling if no node has adequate resources
2. Resource Requests
Definition
- Minimum amount of CPU/memory requested by container
- Used by scheduler to identify suitable nodes
- Guarantees that amount of resources will be available
YAML Configuration
resources:
requests:
memory: "4Gi"
cpu: "2"
CPU Units
- 1 CPU = 1 vCPU (AWS) = 1 core (GCP/Azure) = 1 hyperthread
- Can specify decimal values: 0.1 or millicores: 100m
- Minimum value: 1m (0.001 CPU)
- Examples: 0.1, 100m, 0.5, 500m, 2, 5
Memory Units
- Decimal units (1000-based): K, M, G, T
- Binary units (1024-based): Ki, Mi, Gi, Ti
- Examples: 256Mi, 1Gi (1024 MiB), 500M (500 MB)
- Important: G ≠ Gi (1G = 10^9 bytes, while 1Gi = 2^30 = 1,073,741,824 bytes)
3. Resource Limits
Definition
- Maximum resources a container can consume
- Prevents resource starvation of other pods/processes
- Set per container within a pod
YAML Configuration
resources:
requests:
memory: "1Gi"
cpu: "1"
limits:
memory: "2Gi"
cpu: "2"
Behavior When Limits Exceeded
CPU Limits:
- System throttles CPU usage
- Container cannot exceed CPU limit
- No pod termination occurs
Memory Limits:
- Container can temporarily exceed memory limit
- If consistently exceeded: Pod terminated with OOM Kill
- OOM = Out of Memory Kill
4. Default Behavior and Configuration Scenarios
Default Kubernetes Behavior
- No requests or limits set by default
- Any pod can consume unlimited resources
- Can lead to resource starvation
Configuration Scenarios
No Requests, No Limits:
- Problem: One pod can consume all resources
- Other pods may be starved of resources
No Requests, Limits Set:
- Kubernetes automatically sets requests = limits
- Each pod gets guaranteed resources equal to limits
- More restrictive than necessary
Requests and Limits Both Set:
- Guaranteed minimum (requests) + maximum cap (limits)
- Good for predictable workloads
- May not utilize available extra resources efficiently
Requests Set, No Limits (Recommended):
- Best practice for most scenarios
- Guaranteed minimum resources via requests
- Can consume additional available resources when needed
- Critical: ALL pods must have requests set
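A minimal sketch of this recommended pattern, with requests set and limits deliberately omitted (the image and values are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: requests-only-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "250m"       # guaranteed minimum, used by the scheduler
        memory: "256Mi"
      # no limits: the container may burst into spare node capacity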
When to Use Limits
- Multi-tenant environments (prevent resource abuse)
- Public/shared platforms
- Security concerns (prevent cryptocurrency mining, etc.)
- Predictable workloads with known resource patterns
5. Memory vs CPU Behavior Differences
CPU Resource Management
- Can be throttled when limit reached
- Pods can share available CPU cycles
- Requests guarantee minimum CPU availability
- No limits allows using extra cycles when available
Memory Resource Management
- Cannot be throttled like CPU
- Once assigned, only way to free memory is to kill pod
- If pod exceeds memory limits persistently: OOM Kill
- Memory cannot be easily reclaimed without termination
6. LimitRange
Purpose
- Sets default requests/limits for containers without explicit values
- Namespace-level object
- Defines minimum and maximum boundaries
YAML Example
apiVersion: v1
kind: LimitRange
metadata:
name: cpu-resource-constraint
spec:
limits:
- default:
cpu: "500m"
memory: "512Mi"
defaultRequest:
cpu: "500m"
memory: "512Mi"
max:
cpu: "1"
memory: "1Gi"
min:
cpu: "100m"
memory: "128Mi"
type: Container
Important Notes
- Only affects newly created pods
- Existing pods remain unchanged
- Applied per namespace
7. ResourceQuota
Purpose
- Sets total resource limits at namespace level
- Controls aggregate resource consumption across all pods
YAML Example
apiVersion: v1
kind: ResourceQuota
metadata:
name: compute-quota
spec:
hard:
requests.cpu: "4"
requests.memory: "4Gi"
limits.cpu: "10"
limits.memory: "10Gi"
8. Interview Questions and Answers
Q: What happens if no resources are specified for a pod?
A: Pod can consume unlimited resources, potentially starving other pods. Kubernetes has no default limits.
Q: What's the difference between requests and limits?
A: Requests are minimum guaranteed resources used for scheduling. Limits are maximum resources a container can consume.
Q: How do CPU and memory limits behave differently?
A: CPU is throttled when limit reached (no termination). Memory causes pod termination if consistently exceeded (OOM Kill).
Q: What's the best practice for production workloads?
A: Set requests for all containers to guarantee resources. Set limits only when necessary to prevent resource abuse.
Q: How do you set default resources for all pods in a namespace?
A: Use LimitRange object to set defaults for pods without explicit resource specifications.
Q: What happens if you set limits but no requests?
A: Kubernetes automatically sets requests equal to limits.
Q: Why is "requests only, no limits" often recommended?
A: Provides guaranteed resources while allowing pods to use extra available resources when needed.
Q: What is OOM Kill?
A: Out of Memory Kill - when a pod consistently exceeds memory limits and gets terminated.
Q: Do LimitRange changes affect existing pods?
A: No, LimitRange only affects newly created pods.
Q: How do you limit total cluster resource usage?
A: Use ResourceQuota at namespace level to set hard limits on aggregate requests and limits.
9. Essential Commands
# Check pod resource usage and events
kubectl describe pod <pod-name>
# View node resource capacity
kubectl describe node <node-name>
# Check LimitRange in namespace
kubectl get limitrange
# Check ResourceQuota
kubectl get resourcequota
# View pod resource specifications
kubectl get pod <pod-name> -o yaml
Kubernetes DaemonSets - Complete Study Notes
What is a DaemonSet?
A DaemonSet is a Kubernetes controller that ensures a copy of a specific pod runs on every node in the cluster.
Key Characteristics:
- Runs one copy of a pod on each node
- Automatically adds pods to new nodes when they join the cluster
- Automatically removes pods when nodes are removed from the cluster
- Maintains exactly one pod per node (no more, no less)
DaemonSet vs ReplicaSet
Feature | ReplicaSet | DaemonSet |
---|---|---|
Pod Distribution | Spreads pods across multiple nodes | One pod per node |
Replica Count | Fixed number of replicas | One replica per node |
Scaling | Manual scaling by changing replica count | Scales automatically with cluster nodes |
Purpose | High availability of applications | System-level services on every node |
Common Use Cases
1. Monitoring Agents
- Deploy monitoring tools (like Prometheus Node Exporter) on every node
- Collect metrics from each worker node
- Examples: DataDog agents, New Relic agents
2. Log Collectors
- Deploy log collection agents on every node
- Collect logs from all containers running on each node
- Examples: Fluentd, Filebeat, Logstash agents
3. System Components
- kube-proxy: Required on every node for network routing
- Essential Kubernetes components that must run on all nodes
4. Networking Solutions
- Deploy network plugins and agents
- Examples: Calico, Flannel, Weave Net agents
- Ensure networking functionality on every node
5. Security Agents
- Deploy security monitoring tools
- Vulnerability scanners
- Compliance agents
DaemonSet Definition File Structure
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: monitoring-daemon
labels:
app: monitoring
spec:
selector:
matchLabels:
app: monitoring
template:
metadata:
labels:
app: monitoring
spec:
containers:
- name: monitoring-agent
image: monitoring-agent:latest
resources:
limits:
memory: "128Mi"
cpu: "100m"
Key Sections Explained:
- apiVersion: apps/v1 (standard for DaemonSets)
- kind: DaemonSet (specifies the resource type)
- metadata: Name and labels for the DaemonSet
- spec.selector: Links DaemonSet to pods using labels
- spec.template: Pod specification that will be created on each node
Important: Labels in selector.matchLabels must match labels in template.metadata.labels
Essential kubectl Commands
Create DaemonSet
kubectl create -f daemonset-definition.yaml
View DaemonSets
kubectl get daemonsets
kubectl get ds # Short form
View DaemonSet Details
kubectl describe daemonset <daemonset-name>
View Pods Created by DaemonSet
kubectl get pods -l app=<label-name>
Delete DaemonSet
kubectl delete daemonset <daemonset-name>
How DaemonSets Work Internally
Historical Approach (Before Kubernetes v1.12):
- Used nodeName property in pod specification
- Bypassed the Kubernetes scheduler completely
- Directly assigned pods to specific nodes
Modern Approach (Kubernetes v1.12+):
- Uses the default Kubernetes scheduler
- Implements Node Affinity rules
- More integrated with cluster scheduling mechanisms
- Better resource management and constraints handling
Node Affinity in DaemonSets
DaemonSets automatically set node affinity rules to ensure pods are scheduled on appropriate nodes:
spec:
template:
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- target-node-name
Important Considerations
Resource Management
- DaemonSet pods consume resources on every node
- Plan resource allocation carefully
- Set resource limits and requests
Node Selectors and Taints
- Use nodeSelector to target specific nodes
- Handle node taints and tolerations
- Exclude master nodes if necessary
Updates and Rollouts
- DaemonSets support rolling updates
- Update strategy can be configured
- Monitor rollout status during updates
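For reference, a hedged sketch of the rolling update strategy fields on a DaemonSet (apps/v1 API; the maxUnavailable value is only an example):
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # replace one node's pod at a time
Progress can then be watched with kubectl rollout status ds/<daemonset-name>.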
Networking
- DaemonSet pods can use hostNetwork for system-level access
- Be careful with port conflicts
- Consider security implications
Best Practices
- Resource Limits: Always set CPU and memory limits
- Security Context: Use appropriate security contexts
- Health Checks: Implement readiness and liveness probes
- Logging: Ensure proper logging configuration
- Monitoring: Monitor DaemonSet pod health across all nodes
- Updates: Plan for rolling updates and rollback strategies
Troubleshooting Common Issues
Pod Not Scheduled on All Nodes
- Check node taints and tolerations
- Verify resource availability on nodes
- Check node selectors and affinity rules
Resource Constraints
- Monitor node resource usage
- Adjust resource requests and limits
- Consider node capacity planning
Network Issues
- Verify hostNetwork settings
- Check for port conflicts
- Validate network policies
Summary
DaemonSets are essential for deploying system-level services that need to run on every node in a Kubernetes cluster. They automatically handle node additions and removals, making them perfect for monitoring, logging, networking, and security agents. Understanding DaemonSets is crucial for managing cluster-wide services and maintaining consistent system-level functionality across all worker nodes.
Static Pods in Kubernetes - Complete Guide
What are Static Pods? π€
Think of static Pods like self-sufficient containers that can run without needing the main Kubernetes control system.
Simple Analogy
Imagine you're a ship captain (kubelet) alone at sea:
- Normal situation: You get orders from headquarters (kube-apiserver) about what to do
- Static Pod situation: You're completely alone, but you still need to run the ship - so you follow pre-written instructions you keep in your cabin
How Normal Pods Work vs Static Pods
Normal Pods (Regular Process):
- kube-scheduler decides which node should run a pod
- kube-apiserver stores this decision in ETCD
- kubelet gets instructions from kube-apiserver
- kubelet creates the pod
Static Pods (Independent Process):
- kubelet reads pod definition files from a local folder
- kubelet creates pods directly (no API server needed!)
- That's it - much simpler!
Key Concepts
What Can You Create with Static Pods?
Only Pods - that's it!
You cannot create:
- ReplicaSets
- Deployments
- Services
- ConfigMaps
- etc.
Why? Because kubelet only understands Pods. All other Kubernetes objects need the control plane components.
How to Set Up Static Pods
Step 1: Configure the kubelet
You need to tell kubelet where to look for pod definition files:
Method 1: Direct configuration
# In kubelet.service file
--pod-manifest-path=/etc/kubernetes/manifests
Method 2: Config file (more common)
# In kubelet.service file
--config=/path/to/config.yaml
# In config.yaml file
staticPodPath: /etc/kubernetes/manifests
Step 2: Create Pod Definition Files
Put your pod YAML files in the configured directory:
# Example: /etc/kubernetes/manifests/my-static-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: my-static-app
spec:
containers:
- name: app
image: nginx
ports:
- containerPort: 80
Step 3: kubelet Does the Magic!
- Monitors the manifest folder continuously
- Creates pods from any YAML files it finds
- Restarts pods if they crash
- Updates pods if you modify the files
- Deletes pods if you remove the files
Viewing Static Pods
When kubelet is standalone (no cluster):
# Use docker commands
docker ps
When kubelet is part of a cluster:
# Use kubectl (they appear like normal pods!)
kubectl get pods
Important Behaviors
Mirror Objects πͺ
When kubelet is part of a cluster, it creates a "mirror object" in the API server:
- You can see the static pod via kubectl get pods
- You cannot edit or delete it via kubectl
- Pod name gets the node name appended (e.g., my-pod-node01)
- To modify/delete: change the actual file in the manifest folder
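A short sketch of what this looks like in practice, assuming the kubeadm default manifest path and the example file from above:
kubectl get pods                              # shows my-static-app-node01 (the mirror object)
kubectl delete pod my-static-app-node01       # no lasting effect - kubelet recreates the mirror object
sudo rm /etc/kubernetes/manifests/my-static-pod.yaml   # removing the file is what actually deletes the pod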
Real-World Use Case: Control Plane Components
The Big Question: How does Kubernetes bootstrap itself?
- To run Kubernetes, you need the control plane (API server, scheduler, etc.)
- But control plane components are also containers
- Chicken and egg problem! ππ₯
Solution: Static Pods to the Rescue!
- Install kubelet on master nodes
- Create pod definition files for control plane components:
- kube-apiserver.yaml
- kube-controller-manager.yaml
- kube-scheduler.yaml
- etcd.yaml
- Place these files in kubelet's manifest folder
- kubelet automatically runs the control plane as static pods!
This is exactly how kubeadm sets up Kubernetes clusters.
Static Pods vs DaemonSets
Aspect | Static Pods | DaemonSets |
---|---|---|
Created by | kubelet directly | DaemonSet controller via API server |
Dependency | No API server needed | Requires full control plane |
Scheduling | Bypasses the scheduler (handled by kubelet) | Uses the default scheduler (since v1.12) |
Use case | Control plane components | System services on all nodes |
Management | File-based | API-based (kubectl) |
Practical Tips for Labs π‘
Finding the manifest folder:
- Check kubelet service: Look for --pod-manifest-path
- Check config file: Look for the --config option, then find staticPodPath
- Common locations: /etc/kubernetes/manifests or /etc/kubelet/manifests
- Pod not appearing? Check if kubelet service is running
- Pod not updating? Verify file syntax and kubelet logs
- Can't delete via kubectl? Remove/modify the source file instead
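On systemd-based nodes, the usual commands for this troubleshooting are:
systemctl status kubelet        # is the kubelet service running?
journalctl -u kubelet -f        # follow kubelet logs for manifest parse errors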
Summary π
Static Pods are:
- Pods managed directly by kubelet
- Independent of Kubernetes control plane
- Perfect for running control plane components
- Managed through files, not API calls
- Automatically restarted if they crash
Remember: Static Pods are like having a reliable assistant who follows written instructions even when the boss (API server) isn't around!
Kubernetes Admission Controllers - Complete Study Notes
π Table of Contents
- Overview & Context
- Request Flow in Kubernetes
- Authentication vs Authorization vs Admission Control
- What are Admission Controllers?
- Types of Admission Controllers
- Built-in Admission Controllers
- Configuration & Management
- Real-world Examples
- Interview Questions & Answers
- Practical Commands
π Overview & Context
Admission Controllers are plugins that act as gatekeepers in Kubernetes, intercepting requests to the API server after authentication and authorization but before the object is persisted in etcd.
π― Key Purpose
- Security Enhancement: Enforce policies beyond basic RBAC
- Configuration Validation: Ensure objects meet specific requirements
- Request Modification: Automatically modify or enrich requests
- Operational Enforcement: Apply organizational standards
π Request Flow in Kubernetes
kubectl create pod β API Server β Authentication β Authorization β Admission Controllers β etcd
β β β
Certificates RBAC Rules Policy Validation
User Identity Permissions Configuration Checks
Step-by-Step Flow:
- User Request: kubectl command sent to API server
- Authentication: Verify user identity (certificates, tokens)
- Authorization: Check permissions using RBAC
- Admission Control: Apply policies and validations
- Storage: Persist object in etcd database
π Authentication vs Authorization vs Admission Control
Phase | Purpose | Example | Focus |
---|---|---|---|
Authentication | Who are you? | Certificate validation | Identity verification |
Authorization | What can you do? | RBAC roles and permissions | API-level access control |
Admission Control | How should you do it? | Image registry restrictions | Configuration and policy enforcement |
RBAC Limitations (Solved by Admission Controllers):
- ❌ Cannot validate image sources
- ❌ Cannot enforce tag policies (no "latest" tags)
- ❌ Cannot check security contexts
- ❌ Cannot mandate labels/annotations
- ❌ Cannot modify requests automatically
π‘οΈ What are Admission Controllers?
Definition
Admission controllers are pieces of code that intercept requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized.
Two Types of Operations:
- Validating: Check if request meets criteria (Accept/Reject)
- Mutating: Modify the request before processing
Capabilities:
- ✅ Validate configuration files
- ✅ Reject non-compliant requests
- ✅ Modify/enrich requests automatically
- ✅ Perform additional operations
- ✅ Enforce organizational policies
π Types of Admission Controllers
1. Validating Admission Controllers
- Purpose: Validate requests against policies
- Action: Accept or Reject (no modifications)
- Examples:
  - SecurityContextDeny
  - ResourceQuota
  - PodSecurityPolicy
2. Mutating Admission Controllers
- Purpose: Modify requests before validation
- Action: Change request content
- Examples:
  - DefaultStorageClass
  - NamespaceAutoProvision
  - DefaultTolerationSeconds
3. Custom Admission Controllers
- Admission Webhooks: External services for validation/mutation
- Types:
  - ValidatingAdmissionWebhook
  - MutatingAdmissionWebhook
ποΈ Built-in Admission Controllers
Always Enabled (Default):
- NamespaceLifecycle
  - Prevents deletion of system namespaces
  - Rejects requests to non-existent namespaces
  - Replaces the deprecated NamespaceExists and NamespaceAutoProvision controllers
- NodeRestriction
  - Restricts the kubelet's ability to modify Node/Pod objects
  - Security enhancement for node permissions
- ServiceAccount
  - Implements automation for service accounts
  - Adds the default service account to pods
Commonly Used:
- AlwaysPullImages
  - Forces an image pull on every pod creation
  - Prevents using cached images
  - Security benefit: always gets the latest image
- DefaultStorageClass
  - Automatically adds the default storage class to PVCs
  - Simplifies persistent volume management
- EventRateLimit
  - Limits the rate of requests to the API server
  - Prevents API server flooding
- ResourceQuota
  - Enforces resource quotas in namespaces
  - Prevents resource exhaustion
- LimitRanger
  - Enforces min/max resource limits on pods
  - Sets default resource requests/limits
Security-Focused:
- PodSecurityPolicy (deprecated; replaced by Pod Security Standards)
  - Controls security-sensitive aspects of pods
  - Enforces security contexts, capabilities, volumes
- ImagePolicyWebhook
  - External validation of container images
  - Can enforce approved image registries
βοΈ Configuration & Management
View Enabled Admission Controllers:
# For kubeadm clusters
kubectl exec -n kube-system kube-apiserver-<node-name> -- kube-apiserver -h | grep enable-admission-plugins
# For binary installations
kube-apiserver -h | grep enable-admission-plugins
Enable Additional Controllers:
# In kube-apiserver manifest (/etc/kubernetes/manifests/kube-apiserver.yaml)
spec:
containers:
- command:
- kube-apiserver
- --enable-admission-plugins=NodeRestriction,ResourceQuota,NamespaceLifecycle,DefaultStorageClass
Disable Controllers:
# Add to kube-apiserver configuration
- --disable-admission-plugins=DefaultStorageClass,AlwaysPullImages
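If exec-ing into the API server pod is awkward, a quick alternative on a kubeadm node (where the API server runs as a static pod, which is an assumption about your setup) is to read the flags from the running process or its manifest:
# Inspect the running process for admission plugin flags
ps -ef | grep kube-apiserver | grep -o -- '--enable-admission-plugins=[^ ]*'
# Or grep the static pod manifest directly
grep admission-plugins /etc/kubernetes/manifests/kube-apiserver.yaml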
π Real-world Examples
Example 1: Namespace Auto-Creation
Scenario: Creating pod in non-existent namespace
Without NamespaceAutoProvision:
kubectl run test-pod --image=nginx -n blue
# Error: namespace "blue" not found
With NamespaceAutoProvision:
kubectl run test-pod --image=nginx -n blue
# Success: Namespace "blue" created automatically
# Pod created in new namespace
Example 2: Image Policy Enforcement
Policy: Only allow images from internal registry
Configuration Example:
apiVersion: v1
kind: Pod
spec:
containers:
- name: app
image: docker.internal.com/myapp:v1.2 # ✅ Allowed
# image: nginx:latest # ❌ Rejected
Example 3: Security Context Enforcement
Policy: Containers cannot run as root
apiVersion: v1
kind: Pod
spec:
securityContext:
runAsNonRoot: true # ✅ Required
runAsUser: 1000
containers:
- name: app
image: myapp:latest
securityContext:
runAsUser: 0 # ❌ Would be rejected
π€ Interview Questions & Answers
Q1: What is the difference between Authentication, Authorization, and Admission Control?
Answer:
- Authentication verifies who you are (certificates, tokens)
- Authorization determines what you can do (RBAC permissions)
- Admission Control enforces how you should do it (configuration policies)
They work sequentially: Authentication β Authorization β Admission Control β etcd storage.
Q2: Why can't RBAC handle image registry restrictions?
Answer:
RBAC works at the API level - it can control if you can CREATE a pod, but it cannot inspect the pod's configuration to validate image sources, tags, security contexts, or labels. Admission controllers operate on the actual object content and can enforce configuration-level policies.
Q3: What's the difference between Validating and Mutating admission controllers?
Answer:
- Mutating: modify/enrich the request (run first)
  - Example: DefaultStorageClass adds a storage class to a PVC
- Validating: accept or reject without modification (run after mutating)
  - Example: ResourceQuota checks if the request exceeds limits
Q4: How would you implement a policy to reject pods with "latest" image tags?
Answer:
Use a ValidatingAdmissionWebhook that:
- Intercepts pod creation requests
- Inspects each container's image field
- Rejects if any image uses "latest" tag
- Returns appropriate error message
Q5: What happens if an admission controller fails?
Answer:
- Request is rejected by default
- No object is created in etcd
- User receives error message
- This fail-safe approach ensures policies are always enforced
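For webhook-based admission controllers this behavior is configurable: the failurePolicy field on the webhook configuration decides whether a request is rejected (Fail) or allowed through (Ignore) when the webhook itself is unreachable. A minimal sketch; the webhook name and backing service are placeholders:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy                 # placeholder name
webhooks:
  - name: policy.example.com
    failurePolicy: Fail                # reject requests if the webhook is unreachable (the default)
    sideEffects: None
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: policy-webhook           # placeholder service
        namespace: default
        path: /validate
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]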
Q6: How do you troubleshoot admission controller issues?
Answer:
- Check API server logs: kubectl logs -n kube-system kube-apiserver-<node>
- Verify the admission controller configuration
- Test with simple objects first
- Use kubectl auth can-i to rule out authorization issues
- Check webhook endpoints if using custom controllers
Q7: What's the difference between PodSecurityPolicy and Pod Security Standards?
Answer:
- PodSecurityPolicy: Deprecated admission controller, complex configuration
- Pod Security Standards: newer built-in approach with three levels:
  - Privileged: unrestricted
  - Baseline: minimally restrictive
  - Restricted: heavily restricted
Q8: Can admission controllers modify requests?
Answer:
Yes, Mutating Admission Controllers can:
- Add default values (DefaultStorageClass)
- Inject sidecar containers
- Add labels/annotations
- Modify security contexts
- Set resource limits
Q9: How do you create a custom admission controller?
Answer:
- Build an admission webhook: an external HTTP(S) service that validates or mutates requests
- Register it in the cluster with a ValidatingWebhookConfiguration or MutatingWebhookConfiguration resource
- The webhook receives AdmissionReview requests from the API server
- It returns an AdmissionReview response (allow/deny plus optional patches) - see the example response below
- Point the webhook configuration at your service and scope it with rules
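For reference, the body the webhook sends back is a plain AdmissionReview JSON object; a minimal "allow" response looks roughly like this (the uid must echo the one from the incoming request):
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<uid-from-the-request>",
    "allowed": true
  }
}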
Q10: What's the order of admission controller execution?
Answer:
- Mutating admission controllers (run first, can modify)
- Object schema validation
- Validating admission controllers (run last, validate only)
Within each phase, controllers run in alphabetical order by name.
π» Practical Commands
View Current Configuration:
# List enabled admission controllers
kubectl exec -n kube-system kube-apiserver-master -- kube-apiserver -h | grep enable-admission-plugins
# Check API server configuration
kubectl get pod -n kube-system kube-apiserver-master -o yaml
Test Namespace Creation:
# Test namespace auto-creation
kubectl run test-pod --image=nginx -n nonexistent-namespace
# Verify namespace was created
kubectl get namespaces
Resource Quota Testing:
# Create namespace with quota
kubectl create namespace test-quota
kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
name: compute-quota
namespace: test-quota
spec:
hard:
requests.cpu: "1"
requests.memory: 1Gi
limits.cpu: "2"
limits.memory: 2Gi
persistentvolumeclaims: "1"
EOF
# Test quota enforcement
kubectl run nginx --image=nginx -n test-quota --requests=cpu=500m,memory=512Mi
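To see how much of the quota is already consumed (and why a new pod might be rejected), describe the quota object. Note that the --requests flag used above has been removed from kubectl run in recent releases, so on newer clusters you may need a pod manifest with resource requests instead.
# Shows Used vs Hard for every resource the quota covers
kubectl describe resourcequota compute-quota -n test-quota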
Custom Admission Webhook Example:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: image-policy-webhook
webhooks:
- name: image-policy.example.com
clientConfig:
service:
name: image-policy-webhook
namespace: default
path: "/validate"
rules:
- operations: ["CREATE", "UPDATE"]
apiGroups: [""]
apiVersions: ["v1"]
resources: ["pods"]
admissionReviewVersions: ["v1", "v1beta1"]
sideEffects: None
π― Key Takeaways for Interviews
- Understand the Flow: Authentication β Authorization β Admission Control β etcd
- Know the Differences: RBAC vs Admission Controllers capabilities
- Practical Examples: Be able to explain namespace lifecycle, image policies
- Security Focus: Admission controllers are primarily security tools
- Configuration Knowledge: How to enable/disable controllers
- Troubleshooting: Know how to debug admission controller issues
- Modern Replacements: PSP β Pod Security Standards, webhook approaches
Kubernetes Multiple Schedulers - Complete Notes
π― What & Why Multiple Schedulers?
Default Scheduler Limitations
- Default behavior: Distributes pods evenly across nodes
- Built-in features: Taints, tolerations, node affinity
- Problem: What if you need custom scheduling logic?
- Solution: Create your own scheduler with custom conditions and checks
Key Concept
- Kubernetes is extensible - you can write custom schedulers
- Multiple schedulers can run simultaneously in one cluster
- Each application can choose which scheduler to use
ποΈ Architecture Overview
Kubernetes Cluster
βββ Default Scheduler (default-scheduler)
βββ Custom Scheduler 1 (my-custom-scheduler)
βββ Custom Scheduler 2 (ml-scheduler)
βββ Applications choose which scheduler to use
π Configuration Fundamentals
Scheduler Naming
- Every scheduler must have a unique name
- Default scheduler: default-scheduler
- Custom schedulers: any unique name (e.g., my-custom-scheduler)
Configuration File Structure
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: my-custom-scheduler
leaderElection:
leaderElect: false # true for HA setups
π Deployment Methods
Method 1: Binary Deployment (Rarely Used)
# Download and run kube-scheduler binary
./kube-scheduler --config=/path/to/custom-config.yaml
Method 2: Pod Deployment
apiVersion: v1
kind: Pod
metadata:
name: my-custom-scheduler
namespace: kube-system
spec:
containers:
- name: kube-scheduler
image: k8s.gcr.io/kube-scheduler:v1.28.0
command:
- kube-scheduler
- --config=/etc/kubernetes/scheduler-config.yaml
volumeMounts:
- name: config-volume
mountPath: /etc/kubernetes
volumes:
- name: config-volume
configMap:
name: scheduler-config
Method 3: Deployment (Recommended)
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-custom-scheduler
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: my-custom-scheduler
template:
metadata:
labels:
app: my-custom-scheduler
spec:
serviceAccountName: my-scheduler-sa
containers:
- name: kube-scheduler
image: k8s.gcr.io/kube-scheduler:v1.28.0
command:
- kube-scheduler
- --config=/etc/kubernetes/scheduler-config.yaml
volumeMounts:
- name: config-volume
mountPath: /etc/kubernetes
volumes:
- name: config-volume
configMap:
name: scheduler-config
π§ Using ConfigMaps for Configuration
Create ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: scheduler-config
namespace: kube-system
data:
scheduler-config.yaml: |
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: my-custom-scheduler
leaderElection:
leaderElect: false
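You can also generate this ConfigMap straight from the configuration file instead of writing the YAML by hand (assuming the file is saved locally as scheduler-config.yaml):
kubectl create configmap scheduler-config \
  -n kube-system \
  --from-file=scheduler-config.yaml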
π― Using Custom Schedulers
Specify Scheduler in Pod
apiVersion: v1
kind: Pod
metadata:
name: nginx-custom
spec:
schedulerName: my-custom-scheduler # Key field!
containers:
- name: nginx
image: nginx
Specify Scheduler in Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: webapp
spec:
template:
spec:
schedulerName: my-custom-scheduler # Key field!
containers:
- name: webapp
image: webapp:latest
π Troubleshooting & Verification
Check Pod Status
kubectl get pods
# If pending, scheduler might not be configured correctly
Describe Pod for Details
kubectl describe pod <pod-name>
# Look for scheduling events and errors
View Events with Scheduler Info
kubectl get events -o wide
# Look for "Scheduled" events
# Source column shows which scheduler was used
Check Scheduler Logs
kubectl logs -n kube-system <scheduler-pod-name>
β‘ Leader Election (HA Setup)
What is Leader Election?
- Used in multi-master setups
- Prevents conflicts when multiple scheduler replicas run
- Only one scheduler instance is active at a time
Configuration
leaderElection:
leaderElect: true
leaseDuration: 15s
renewDeadline: 10s
retryPeriod: 2s
resourceLock: leases
resourceName: my-custom-scheduler
resourceNamespace: kube-system
π Prerequisites for Custom Schedulers
Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-scheduler-sa
namespace: kube-system
ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: my-scheduler-role
rules:
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
# ... other required permissions
ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: my-scheduler-binding
subjects:
- kind: ServiceAccount
name: my-scheduler-sa
namespace: kube-system
roleRef:
kind: ClusterRole
name: my-scheduler-role
apiGroup: rbac.authorization.k8s.io
π Common Interview Questions & Answers
Q1: Why would you need multiple schedulers?
Answer: When default scheduling logic doesn't meet specific requirements like:
- Custom resource allocation algorithms
- Application-specific placement rules
- Integration with external systems
- Specialized workload requirements (ML, batch processing)
Q2: How does Kubernetes know which scheduler to use?
Answer: Through the schedulerName field in the pod spec. If it is not specified, the default-scheduler is used.
Q3: What happens if custom scheduler is not available?
Answer: The pod remains in the Pending state. Check with kubectl describe pod for events and errors.
Q4: Can you run multiple instances of same custom scheduler?
Answer: Yes, but need leader election enabled to prevent conflicts. Only one instance will be active.
Q5: How to verify which scheduler scheduled a pod?
Answer: Use kubectl get events -o wide and look for "Scheduled" events; the Source column shows the scheduler name.
π― Quick Command Reference
# Deploy custom scheduler
kubectl apply -f custom-scheduler.yaml
# Check schedulers running
kubectl get pods -n kube-system | grep scheduler
# Create pod with custom scheduler
kubectl apply -f pod-with-custom-scheduler.yaml
# Check events
kubectl get events -o wide
# View scheduler logs
kubectl logs -n kube-system <scheduler-pod-name>
# Describe pod for scheduling info
kubectl describe pod <pod-name>
Kubernetes Deployments: Updates & Rollbacks - Complete Notes
π― Fundamentals: Rollouts & Versioning
What is a Rollout?
- Rollout: Process of deploying or updating an application
-
Triggered when:
- First deployment creation
- Container image updates
- Configuration changes
- Creates: New deployment revision each time
Deployment Revisions
Deployment Creation β Rollout β Revision 1
Application Update β Rollout β Revision 2
Another Update β Rollout β Revision 3
Key Commands for Rollout Management
# Check rollout status
kubectl rollout status deployment/myapp-deployment
# View rollout history
kubectl rollout history deployment/myapp-deployment
# View specific revision details
kubectl rollout history deployment/myapp-deployment --revision=2
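Illustrative output after two rollouts (the change-cause text only appears if --record or the change-cause annotation was used; the values here are assumed):
kubectl rollout history deployment/myapp-deployment
# deployment.apps/myapp-deployment
# REVISION  CHANGE-CAUSE
# 1         <none>
# 2         kubectl set image deployment/myapp-deployment myapp=myapp:v2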
π Deployment Strategies
1. Recreate Strategy
How it works:
- Destroy all old pods first
- Then create all new pods
- Downtime: Yes (application unavailable during update)
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-deployment
spec:
strategy:
type: Recreate # Explicit recreate strategy
replicas: 5
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: myapp:v1
Visual Flow:
Old Pods: [P1] [P2] [P3] [P4] [P5]
β (all destroyed)
Old Pods: [ ] [ ] [ ] [ ] [ ]
β (all created)
New Pods: [P1] [P2] [P3] [P4] [P5]
2. Rolling Update Strategy (Default)
How it works:
- Update pods one by one (or in small batches)
- Always maintain minimum available pods
- Downtime: None (zero-downtime deployment)
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-deployment
spec:
strategy:
type: RollingUpdate # Default strategy
rollingUpdate:
maxSurge: 1 # Max pods above desired count
maxUnavailable: 1 # Max pods unavailable during update
replicas: 5
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: myapp:v2
Visual Flow:
Step 1: [P1-v1] [P2-v1] [P3-v1] [P4-v1] [P5-v1]
Step 2: [P1-v2] [P2-v1] [P3-v1] [P4-v1] [P5-v1]
Step 3: [P1-v2] [P2-v2] [P3-v1] [P4-v1] [P5-v1]
...
Final: [P1-v2] [P2-v2] [P3-v2] [P4-v2] [P5-v2]
π Update Methods
Method 1: Using kubectl apply (Recommended)
# 1. Edit deployment YAML file
vim myapp-deployment.yaml
# 2. Update image version
# containers:
# - name: myapp
# image: myapp:v2 # Changed from v1 to v2
# 3. Apply changes
kubectl apply -f myapp-deployment.yaml
Method 2: Using kubectl set image
# Direct command to update image
kubectl set image deployment/myapp-deployment \
myapp=myapp:v2 --record
# Note: --record is deprecated in newer kubectl releases
# Note: This doesn't update your YAML file!
Method 3: Using kubectl edit
# Edit deployment directly
kubectl edit deployment myapp-deployment
# Change image version in the editor
Method 4: Using kubectl patch
# Patch specific fields
kubectl patch deployment myapp-deployment \
-p '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","image":"myapp:v2"}]}}}}'
ποΈ Under the Hood: How Deployments Work
ReplicaSet Management
Deployment
βββ ReplicaSet v1 (old) β 0 pods
βββ ReplicaSet v2 (new) β 5 pods
Step-by-Step Process
- Create Deployment β Creates ReplicaSet-1 β Creates Pods
- Update Deployment β Creates ReplicaSet-2 β Gradually scales up new pods
- Scaling Process β Scales down old ReplicaSet β Scales up new ReplicaSet
- Completion β Old ReplicaSet has 0 pods β New ReplicaSet has desired pods
Viewing ReplicaSets
# List all ReplicaSets
kubectl get replicasets
# Example output:
# NAME DESIRED CURRENT READY AGE
# myapp-deployment-old 0 0 0 10m
# myapp-deployment-new 5 5 5 2m
βͺ Rollbacks
Why Rollback?
- New version has bugs
- Performance issues
- Failed health checks
- Business requirements changed
Rollback Commands
# Rollback to previous version
kubectl rollout undo deployment/myapp-deployment
# Rollback to specific revision
kubectl rollout undo deployment/myapp-deployment --to-revision=2
# Check rollback status
kubectl rollout status deployment/myapp-deployment
Rollback Process
Before Rollback:
ReplicaSet v1 β 0 pods
ReplicaSet v2 β 5 pods
After Rollback:
ReplicaSet v1 β 5 pods (restored)
ReplicaSet v2 β 0 pods (scaled down)
π Monitoring & Troubleshooting
Essential Commands
# Check deployment status
kubectl get deployments
# Detailed deployment info
kubectl describe deployment myapp-deployment
# Check pods
kubectl get pods
# View rollout history
kubectl rollout history deployment/myapp-deployment
# Check events
kubectl get events --sort-by=.metadata.creationTimestamp
# View logs
kubectl logs deployment/myapp-deployment
Deployment Status States
# Healthy deployment
status:
availableReplicas: 5
readyReplicas: 5
replicas: 5
updatedReplicas: 5
conditions:
- type: Available
status: "True"
- type: Progressing
status: "True"
π― Complete Example Workflow
Initial Deployment
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.16
ports:
- containerPort: 80
Deploy and Monitor
# 1. Create deployment
kubectl apply -f deployment.yaml
# 2. Check status
kubectl rollout status deployment/nginx-deployment
# 3. View deployment
kubectl get deployments
kubectl get replicasets
kubectl get pods
Update Application
# Method 1: Edit YAML and apply
# Change image: nginx:1.16 β nginx:1.17
kubectl apply -f deployment.yaml
# Method 2: Direct command
kubectl set image deployment/nginx-deployment \
nginx=nginx:1.17 --record
# Monitor update
kubectl rollout status deployment/nginx-deployment
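Since --record is deprecated, on newer clusters you can populate the CHANGE-CAUSE column yourself by setting the annotation that kubectl rollout history reads:
# Record why this rollout happened (shows up in rollout history)
kubectl annotate deployment/nginx-deployment \
  kubernetes.io/change-cause="update image to nginx:1.17"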
Rollback if Needed
# Check history
kubectl rollout history deployment/nginx-deployment
# Rollback to previous version
kubectl rollout undo deployment/nginx-deployment
# Verify rollback
kubectl rollout status deployment/nginx-deployment
π Advanced Configuration
Rolling Update Parameters
spec:
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 25% # Max 25% extra pods during update
maxUnavailable: 25% # Max 25% pods can be unavailable
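Quick worked example of how those percentages translate into pod counts (maxSurge rounds up, maxUnavailable rounds down):
# replicas = 4, maxSurge = 25%, maxUnavailable = 25%
#   maxSurge:       ceil(4 * 0.25)  = 1  -> at most 4 + 1 = 5 pods exist at once
#   maxUnavailable: floor(4 * 0.25) = 1  -> at least 4 - 1 = 3 pods stay ready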
Revision History Limit
spec:
revisionHistoryLimit: 10 # Keep only 10 old ReplicaSets
Progress Deadline
spec:
progressDeadlineSeconds: 600 # Fail if update takes >10 minutes
π€ Common Interview Questions & Answers
Q1: What's the difference between Recreate and Rolling Update strategies?
Answer:
- Recreate: Destroys all old pods first, then creates new ones. Causes downtime.
- Rolling Update: Updates pods gradually, maintaining availability. Zero downtime. Default strategy.
Q2: How do you rollback a deployment to a specific revision?
Answer:
kubectl rollout undo deployment/myapp --to-revision=3
Q3: What happens to old ReplicaSets after an update?
Answer: Old ReplicaSets are kept (scaled to 0) for rollback purposes. The number kept is controlled by revisionHistoryLimit.
Q4: How do you check the history of deployments?
Answer:
kubectl rollout history deployment/myapp
Q5: What's maxSurge and maxUnavailable in rolling updates?
Answer:
- maxSurge: Maximum number of pods that can be created above desired replica count
- maxUnavailable: Maximum number of pods that can be unavailable during update
Q6: How do you monitor a deployment update in real-time?
Answer:
kubectl rollout status deployment/myapp -w
π― Quick Command Reference
# Deployment Management
kubectl create deployment nginx --image=nginx:1.16
kubectl apply -f deployment.yaml
kubectl get deployments
kubectl describe deployment myapp
# Updates
kubectl set image deployment/myapp app=myapp:v2 --record
kubectl edit deployment myapp
# Rollouts
kubectl rollout status deployment/myapp
kubectl rollout history deployment/myapp
kubectl rollout undo deployment/myapp
kubectl rollout undo deployment/myapp --to-revision=2
# Scaling
kubectl scale deployment myapp --replicas=5
# Debugging
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl logs deployment/myapp
kubectl describe pods -l app=myapp
Docker Commands & Arguments - Super Simple Notes π₯
π€ First, Let's Understand the Basic Problem
Ubuntu Container Problem
# Run this command
docker run ubuntu
# The container exits immediately! Why??
Reason: when the Ubuntu container starts, it runs the bash command, but no terminal is attached, so bash exits, and the container exits with it!
π― How Does a Container Work?
Simple Rule:
A container stays alive only as long as its main process runs
Process stops = container stops
Examples:
- NGINX container: runs the nginx command → web server running → container alive
- MySQL container: runs the mysqld command → database running → container alive
- Ubuntu container: runs the bash command → no terminal attached → bash exits → container exits
π‘ CMD vs ENTRYPOINT - Simple Difference
1. CMD Instruction
What it does: sets the default command for the container
# Dockerfile
FROM ubuntu
CMD sleep 5
# Build it
docker build -t my-ubuntu .
# Run it
docker run my-ubuntu
# Result: sleeps 5 seconds, then exits
# You can override the command
docker run my-ubuntu sleep 10
# Result: sleeps 10 seconds (CMD completely replaced!)
CMD Rule: whatever you pass on the command line replaces the entire CMD!
2. ENTRYPOINT Instruction
What it does: sets a fixed command; command-line parameters are appended to it
# Dockerfile
FROM ubuntu
ENTRYPOINT ["sleep"]
# Build it
docker build -t sleeper .
# Run it
docker run sleeper 10
# Result: runs "sleep 10"
# Without a parameter
docker run sleeper
# Result: error! "sleep" needs an operand
ENTRYPOINT Rule: command-line parameters are appended to the ENTRYPOINT!
π Real Examples - Step by Step
Example 1: Simple CMD
FROM ubuntu
CMD ["echo", "Hello World"]
# Build
docker build -t hello .
# Run
docker run hello
# Output: Hello World
# Override CMD
docker run hello echo "Bye World"
# Output: Bye World
Example 2: Simple ENTRYPOINT
FROM ubuntu
ENTRYPOINT ["echo"]
# Build
docker build -t echo-app .
# Run
docker run echo-app "Hello Bhai"
# Output: Hello Bhai (echo + "Hello Bhai")
# Without a parameter - no error, just an empty echo
docker run echo-app
# Output: (empty line)
Example 3: ENTRYPOINT + CMD Combo π₯
FROM ubuntu
ENTRYPOINT ["sleep"]
CMD ["5"]
# Build
docker build -t smart-sleeper .
# Default behavior
docker run smart-sleeper
# Result: "sleep 5" (ENTRYPOINT + CMD)
# With parameter
docker run smart-sleeper 10
# Result: "sleep 10" (ENTRYPOINT + your parameter, CMD ignored)
π Quick Comparison Table
Scenario | CMD | ENTRYPOINT | ENTRYPOINT + CMD |
---|---|---|---|
No parameters | Uses CMD | Error (usually) | Uses both |
With parameters | Replaces CMD | Appends to ENTRYPOINT | Replaces CMD part |
Use case | Default command | Fixed command + flexible params | Best of both! |
π― JSON vs Shell Format
Shell Format (Don't use!)
# Wrong way - shell format
CMD echo hello world
ENTRYPOINT echo hello
# Problem: Creates extra shell process
JSON Format (Correct way!)
# Right way - JSON array format
CMD ["echo", "hello", "world"]
ENTRYPOINT ["echo", "hello"]
# Benefits: Direct process execution, no shell overhead
Rule: always use the JSON array format! ["command", "param1", "param2"]
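To check which ENTRYPOINT and CMD an image actually ended up with, inspect it:
# Print the configured Entrypoint and Cmd of an image
docker inspect --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' smart-sleeper
# Example output: [sleep] [5]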
π¨ Common Mistakes & Solutions
Mistake 1: Wrong JSON Format
# ❌ Wrong
CMD ["sleep 5"]
# ✅ Correct
CMD ["sleep", "5"]
Mistake 2: Mixing Formats
# ❌ Wrong
ENTRYPOINT sleep
CMD ["5"]
# ✅ Correct
ENTRYPOINT ["sleep"]
CMD ["5"]
Mistake 3: No Default Value
# ❌ Problem: error if no parameter is passed
ENTRYPOINT ["sleep"]
# ✅ Solution: add a default with CMD
ENTRYPOINT ["sleep"]
CMD ["5"]
π€ Interview Questions & Answers
Q1: What is the difference between CMD and ENTRYPOINT?
Answer:
- CMD: completely replaced by command-line parameters
- ENTRYPOINT: command-line parameters are appended; the fixed command stays
Q2: How do ENTRYPOINT + CMD work together?
Answer:
- No parameters: both ENTRYPOINT and CMD are used
- With parameters: ENTRYPOINT + your parameters (CMD is ignored)
Q3: Why should you use the JSON format?
Answer: The shell format spawns an extra shell process; the JSON format runs the process directly - faster and cleaner!
Q4: Why does a container exit immediately?
Answer: A container stays alive only as long as its main process is running. Process exit = container exit.
π‘ Key Takeaways
- Container = process - if the process is alive, the container is alive
- CMD = replaceable - can be overridden from the command line
- ENTRYPOINT = fixed - parameters are appended to it
- Best practice: use the ENTRYPOINT + CMD combination
- Format: always the JSON array format ["cmd", "param"]
- Real world: ENTRYPOINT for the executable, CMD for the default parameters
π― Commands & Arguments in Kubernetes
Docker β Kubernetes Mapping
Docker | Kubernetes Pod | Purpose |
---|---|---|
ENTRYPOINT | command | Fixed executable |
CMD | args | Default parameters |
Simple Rule:
- Kubernetes command overrides the Docker ENTRYPOINT
- Kubernetes args overrides the Docker CMD
π Real Examples - Kubernetes Pods
Example 1: Override CMD (args)
# Docker image has: ENTRYPOINT ["sleep"], CMD ["5"]
apiVersion: v1
kind: Pod
metadata:
name: ubuntu-sleeper
spec:
containers:
- name: sleeper
image: ubuntu-sleeper
args: ["10"] # Override CMD: now sleeps for 10 seconds
# Result: Container runs "sleep 10"
# ENTRYPOINT (sleep) + args (10) = sleep 10
Example 2: Override ENTRYPOINT (command)
# Docker image has: ENTRYPOINT ["sleep"], CMD ["5"]
apiVersion: v1
kind: Pod
metadata:
name: ubuntu-printer
spec:
containers:
- name: printer
image: ubuntu-sleeper
command: ["echo"] # Override ENTRYPOINT
args: ["Hello World"] # Override CMD
# Result: Container runs "echo Hello World"
# command (echo) + args (Hello World) = echo Hello World
Example 3: Only Override CMD
# Keep ENTRYPOINT, change CMD
apiVersion: v1
kind: Pod
metadata:
name: long-sleeper
spec:
containers:
- name: sleeper
image: ubuntu-sleeper
args: ["100"] # Sleep for 100 seconds instead of 5
Example 4: Override Both
# Change both ENTRYPOINT and CMD
apiVersion: v1
kind: Pod
metadata:
name: custom-app
spec:
containers:
- name: app
image: ubuntu-sleeper
command: ["python3"] # New executable
args: ["app.py", "--port", "8080"] # New parameters