🎯 Goal
By the end, you will viscerally know:
- why some data disappears
- why some data survives pod restarts
- when to use each volume
- why databases require PVCs
🧠 The 3 volume types we will test
| Volume | Data survives pod restart? | Use case |
|---|---|---|
| emptyDir | ❌ NO | temp files, cache |
| hostPath | ⚠️ SOMETIMES | node-specific (dangerous) |
| PVC | ✅ YES | databases, user data |
📂 Project structure
k8s-volumes-lab/
├── emptydir.yaml
├── hostpath.yaml
├── pvc.yaml
└── pod-pvc.yaml
🧩 PART 1 – emptyDir (DATA DIES WITH POD)
📄 emptydir.yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    emptyDir: {}
▶️ Apply
kubectl apply -f emptydir.yaml
kubectl exec -it emptydir-demo -- sh
✍️ Create data
echo "HELLO FROM EMPTYDIR" > /data/file.txt
cat /data/file.txt
❌ Kill the pod
kubectl delete pod emptydir-demo
kubectl apply -f emptydir.yaml
kubectl exec -it emptydir-demo -- cat /data/file.txt
💥 Result
No such file or directory
🧠 Lesson
emptyDir lives only as long as the pod lives
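Where did that emptyDir actually live? The kubelet keeps it inside the pod's own directory on the node, keyed by the pod UID, so a new pod always gets a brand-new empty directory. Here is a quick way to peek, as a sketch that assumes a kind cluster (where each node is a Docker container) and that the pod landed on kind-worker:

```bash
# Which node is the (recreated) pod on, and what is its UID?
kubectl get pod emptydir-demo -o wide
POD_UID=$(kubectl get pod emptydir-demo -o jsonpath='{.metadata.uid}')

# kind nodes are Docker containers; swap "kind-worker" for the node shown above.
# The directory is created fresh for every new pod UID, which is exactly why the old file is gone.
docker exec kind-worker ls /var/lib/kubelet/pods/$POD_UID/volumes/kubernetes.io~empty-dir/data
```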
🚀 PROJECT: Understanding hostPath in Kubernetes
What is a Node?
A Node is a real machine (VM or physical server) with its own disk.
- Examples: kind-worker, kind-worker2
- Each node has its own filesystem
- Nodes do NOT share disks
What is a Pod?
A Pod is a temporary runtime object that runs containers on a node.
- Pods are ephemeral
- Pods can be deleted and recreated
- Pods can move between nodes
📌 Key Rule
Pods move. Nodes don't share storage.
🖼️ Visual Diagram – Where hostPath Lives
Node A (kind-worker)
└── /tmp/hostpath-demo/data.txt  ← REAL FILE (on disk)
Pod
└── /data  ──────────────→  mounts the node path above
Another node:
Node B (kind-worker2)
└── /tmp/hostpath-demo/  ← EMPTY (different disk)
🧩 Why does hostPath exist?
Why Kubernetes allows hostPath
hostPath exists for special cases only:
- Log collectors (DaemonSets)
- Node-level monitoring agents
- Debugging
- Single-node clusters
- Learning / labs
👉 It gives containers direct access to the node's disk.
⚠️ Important Warning
hostPath bypasses Kubernetes storage safety.
Kubernetes does NOT protect, replicate, or move this data.
🧪 HANDS-ON LAB (FULL PROJECT)
STEP 0 – Preconditions
- kind or any multi-node cluster
- At least 2 worker nodes
Verify:
kubectl get nodes
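If you don't have a multi-node cluster yet, here is a minimal sketch of a kind config (assuming kind and Docker are installed) that creates one control-plane and two workers:

```bash
cat <<'EOF' > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF

kind create cluster --config kind-config.yaml
kubectl get nodes   # expect kind-control-plane, kind-worker, kind-worker2
```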
STEP 1 – Create Pod with hostPath
📄 hostpath.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: host
      mountPath: /data
  volumes:
  - name: host
    hostPath:
      path: /tmp/hostpath-demo
      type: DirectoryOrCreate
Who creates this directory?
👉 kubelet on the node creates:
/tmp/hostpath-demo
STEP 2 – Find the RIGHT NODE
kubectl apply -f hostpath.yaml
kubectl get pod hostpath-demo -o wide
Example:
NODE: kind-worker
This node's disk is where the data will be written.
STEP 3 – Write Data (Inside Pod)
kubectl exec -it hostpath-demo -- sh
Inside container:
echo "HOSTPATH DATA" > /data/data.txt
cat /data/data.txt
exit
Where is the file REALLY stored?
kind-worker:/tmp/hostpath-demo/data.txt
Not in Kubernetes.
Not in etcd.
Not in the Pod.
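You can prove it from outside Kubernetes. On a kind cluster each node is a Docker container, so (assuming the Pod landed on kind-worker, as in the example above) you can read the file straight from the node:

```bash
# Bypass Kubernetes entirely and read the node's filesystem
docker exec kind-worker cat /tmp/hostpath-demo/data.txt
```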
STEP 4 – Delete the Pod (DATA STAYS)
kubectl delete pod hostpath-demo
kubectl apply -f hostpath.yaml
Check node again:
kubectl get pod hostpath-demo -o wide
🔁 In this example the Pod lands on the same node again (check the NODE column).
Verify:
kubectl exec -it hostpath-demo -- cat /data/data.txt
✅ Result:
HOSTPATH DATA
Deleting a Pod does NOT delete node files.
The data stays because the node did not change.
STEP 5 – Delete / Drain the NODE (DATA IS LOST)
Drain the node where the Pod is running:
kubectl drain kind-worker \
--ignore-daemonsets \
--delete-emptydir-data \
--force
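Draining also cordons the node, so nothing new can be scheduled on it. Worth checking before recreating the Pod:

```bash
kubectl get nodes
# kind-worker should now show STATUS: Ready,SchedulingDisabled
```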
STEP 6 – Recreate Pod
kubectl apply -f hostpath.yaml
kubectl get pod hostpath-demo -o wide
Now Pod runs on:
kind-worker2
STEP 7 – Check Data Again (FAIL EXPECTED)
kubectl exec -it hostpath-demo -- cat /data/data.txt
❌ Output:
No such file or directory
🧠 WHY DATA IS GONE (Explain Clearly)
- Data still exists on old node disk
- New node has a different disk
- Kubernetes does not copy hostPath data
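The file has not vanished; it is still sitting on the old node's disk, where the new Pod simply cannot reach it. A quick check (assuming kind and the node names used above), plus the cleanup step that makes the drained node schedulable again:

```bash
# The original file is still on the drained node's disk
docker exec kind-worker cat /tmp/hostpath-demo/data.txt

# Allow scheduling on kind-worker again
kubectl uncordon kind-worker
```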
📊 Pod vs Node (CLEAR DIFFERENCE TABLE)
| Concept | Pod | Node |
|---|---|---|
| What it is | Runtime container wrapper | Real machine |
| Lifetime | Short | Long |
| Can be deleted | Yes | Yes |
| Moves around | Yes | No |
| Has disk | ❌ | ✅ |
| hostPath stored here | ❌ | ✅ |
❌ Why hostPath is NOT for Production
- Node failure = data loss
- Pod reschedule = data loss
- No replication
- No HA
- No backups
- Security risk
hostPath = node disk, not Kubernetes storage
✅ FINAL TAKEAWAYS
- hostPath data lives on the node filesystem
- Pod deletion does not delete node data
- Node deletion or Pod movement loses data
- hostPath is unsafe for databases
- Use PVC for real applications
🎤 Interview-Ready One-Liner
"hostPath mounts node-local storage into a Pod. Data survives Pod deletion but is lost when the Pod moves to another node, which breaks high availability."
1️⃣ The Core Problem (Why PV & PVC Exist)
Pods are ephemeral.
If a Pod:
- restarts
- gets rescheduled
- its node dies
👉 ALL data inside the container filesystem is LOST
Databases, uploads, logs, stateful apps cannot survive this.
So Kubernetes introduced Persistent Storage Abstraction.
2️⃣ Mental Model (THIS is what to remember)
Think like DevOps, not YAML first.
| Role | Think of it as | Who owns it |
|---|---|---|
| PV (PersistentVolume) | Actual disk | Cluster / Infra |
| PVC (PersistentVolumeClaim) | Disk request | Application |
| Pod | Uses disk | Runtime |
3️⃣ PersistentVolume (PV) – What It Really Is
PV = Real Storage
A PV represents:
- EBS
- EFS
- NFS
- Local disk
- Cloud disk
Created by:
- Admin
- StorageClass (dynamic)
Key properties:
- Capacity (10Gi, 100Gi)
- Access mode (RWO, RWX)
- Reclaim policy (Retain / Delete)
- Backed by real storage
🔴 Apps NEVER talk to PV directly
4️⃣ PersistentVolumeClaim (PVC) – What Apps Use
PVC = Request for Storage
PVC says:
"I need 10Gi, ReadWriteOnce, fast disk"
Kubernetes:
- Finds a matching PV
- Binds PVC ↔ PV
- Locks it (exclusive)
Pod only knows:
- volume name
- mount path
❌ Pod does NOT know:
- disk type
- cloud provider
- node location
5️⃣ Binding Flow (VERY IMPORTANT)
Order of events
- PV exists (or StorageClass ready)
- PVC is created
- Kubernetes binds PVC ↔ PV
- Pod mounts PVC
- Pod reads/writes data
❌ If the PVC is not bound → the Pod is stuck Pending
🔥 FULL HANDS-ON PROJECT
🎯 Goal
Prove that:
- Pod dies ✅
- Data survives ✅
📂 Project Structure
pv-pvc-lab/
├── pv.yaml
├── pvc.yaml
└── pod.yaml
STEP 1️⃣ Create a PersistentVolume (PV)
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/demo-data
Apply:
kubectl apply -f pv.yaml
kubectl get pv
Expected:
STATUS: Available
STEP 2️⃣ Create a PersistentVolumeClaim (PVC)
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply:
kubectl apply -f pvc.yaml
kubectl get pvc
Expected:
STATUS: Bound
🔴 If not bound → size / accessMode mismatch
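If the PVC stays Pending, the Events section of a describe usually names the exact mismatch. Two standard checks (the StorageClass note below is an assumption about clusters like kind that ship a default class):

```bash
kubectl describe pvc demo-pvc   # read the Events section at the bottom
kubectl get pv                  # compare capacity and accessModes with the claim

# Note: on clusters with a default StorageClass (kind ships one named "standard"),
# a PVC without storageClassName may be dynamically provisioned instead of binding
# to demo-pv. Setting storageClassName: "" in the PVC forces static binding.
```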
STEP 3️⃣ Create Pod Using the PVC
# pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - mountPath: /data
      name: storage
  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: demo-pvc
Apply:
kubectl apply -f pod.yaml
kubectl get pod
STEP 4️⃣ Write Data into the Volume
kubectl exec -it pvc-demo-pod -- sh
Inside Pod:
echo "Kubernetes storage works" > /data/test.txt
cat /data/test.txt
STEP 5️⃣ DELETE THE POD (Important)
kubectl delete pod pvc-demo-pod
Now recreate:
kubectl apply -f pod.yaml
Check again:
kubectl exec -it pvc-demo-pod -- cat /data/test.txt
🎉 DATA IS STILL THERE
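The new Pod simply reattached the same claim. You can confirm the binding never changed:

```bash
kubectl get pvc demo-pvc   # still Bound to the same VOLUME with the same capacity
```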
STEP 6️⃣ Delete PVC & Observe Reclaim Policy
kubectl delete pvc demo-pvc
kubectl get pv
Because:
persistentVolumeReclaimPolicy: Retain
Result:
STATUS: Released
💡 Data still exists on disk
An admin must manually clean it up or reuse it
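Two common ways an admin handles a Released PV; both are manual actions, which is the whole point of Retain. The claimRef patch is a widely used trick, not a formal "undelete":

```bash
# Option 1: remove the PV object (any data in the backing hostPath directory stays on the node)
kubectl delete pv demo-pv

# Option 2: clear the old claim reference so the PV becomes Available for a new PVC
kubectl patch pv demo-pv -p '{"spec":{"claimRef":null}}'
kubectl get pv   # STATUS should switch back to Available
```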
6️⃣ What DevOps MUST Know (Interview GOLD)
❌ Why not a Deployment for a DB?
- Pod identity changes
- Volume attachment breaks
- Ordering not guaranteed
👉 Use StatefulSet + PVC
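As a rough sketch of what that looks like (hypothetical names, busybox standing in for a real database engine), a StatefulSet requests storage through volumeClaimTemplates, so every replica gets its own PVC that sticks to it across restarts:

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo-db
spec:
  serviceName: demo-db
  replicas: 1
  selector:
    matchLabels:
      app: demo-db
  template:
    metadata:
      labels:
        app: demo-db
    spec:
      containers:
      - name: db
        image: busybox
        command: ["sh", "-c", "sleep 3600"]
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF
```

Each replica gets a PVC named data-demo-db-0, data-demo-db-1, and so on, and the same PVC is reattached whenever that Pod is recreated.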
✅ Why PVC instead of direct disk?
- Decouples app from infra
- Enables portability
- Enables dynamic provisioning (see the sketch below)
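Dynamic provisioning means nobody hand-creates PVs at all: a PVC that names a StorageClass gets a PV provisioned for it automatically. A minimal sketch (kind ships a default class called standard; the name may differ in your cluster):

```bash
kubectl get storageclass

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  storageClassName: standard
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

kubectl get pvc dynamic-pvc
# With a WaitForFirstConsumer class (kind's default), the PVC stays Pending
# until a Pod actually mounts it; then a PV is created and bound automatically.
```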
❌ Why Pod Pending?
Most common reasons:
- PVC not bound
- No matching PV
- Wrong access mode
- StorageClass missing
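When a Pod is stuck Pending because of storage, the scheduler spells the reason out in its events:

```bash
kubectl describe pod pvc-demo-pod   # the Events section explains why scheduling failed
kubectl get pvc                     # confirm every claim the Pod mounts is actually Bound
```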
7️⃣ Access Modes (Critical)
| Mode | Meaning | Example |
|---|---|---|
| RWO (ReadWriteOnce) | Read-write from one node | EBS |
| RWX (ReadWriteMany) | Read-write from many nodes | EFS / NFS |
| ROX (ReadOnlyMany) | Read-only from many nodes | Shared config |
8️⃣ Production Mapping (REAL WORLD)
| Kubernetes | AWS |
|---|---|
| PV | EBS / EFS |
| PVC | Disk request |
| StorageClass | Disk template |
| StatefulSet | Database |









