DEV Community

Aisalkyn Aidarova

Project: Understanding Kubernetes Volume Types

🎯 Goal

By the end, you will viscerally know:

  • why some data disappears
  • why some data survives pod restarts
  • when to use each volume
  • why databases require PVCs

🧠 The 3 volume types we will test

| Volume | Survives pod deletion? | Use case |
|---|---|---|
| emptyDir | ❌ NO | temp files, cache |
| hostPath | ⚠️ SOMETIMES | node-specific (dangerous) |
| PVC | ✅ YES | databases, user data |

πŸ“ Project structure

k8s-volumes-lab/
β”œβ”€β”€ emptydir.yaml
β”œβ”€β”€ hostpath.yaml
β”œβ”€β”€ pvc.yaml
└── pod-pvc.yaml
Enter fullscreen mode Exit fullscreen mode

🧩 PART 1 - emptyDir (DATA DIES WITH POD)

📄 emptydir.yaml

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      emptyDir: {}
```

▶️ Apply

```bash
kubectl apply -f emptydir.yaml
kubectl exec -it emptydir-demo -- sh
```

✍️ Create data

```bash
echo "HELLO FROM EMPTYDIR" > /data/file.txt
cat /data/file.txt
```

❌ Kill the pod

```bash
kubectl delete pod emptydir-demo
kubectl apply -f emptydir.yaml
kubectl exec -it emptydir-demo -- cat /data/file.txt
```

💥 Result

```
No such file or directory
```

🧠 Lesson

emptyDir lives only as long as the pod lives
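
Two emptyDir fields are worth knowing even in a lab. This is a hedged sketch, separate from the demo above; the sizeLimit value is an arbitrary example:

```yaml
volumes:
  - name: cache
    emptyDir:
      medium: Memory    # back the volume with tmpfs (RAM); contents still die with the pod
      sizeLimit: 256Mi  # kubelet evicts the pod if the volume grows past this limit
```

With medium: Memory, writes count against the container's memory usage, so pair it with a sizeLimit.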

📘 PROJECT: Understanding hostPath in Kubernetes


What is a Node?

A Node is a real machine (VM or physical server) with its own disk.

  • Examples: kind-worker, kind-worker2
  • Each node has its own filesystem
  • Nodes do NOT share disks

What is a Pod?

A Pod is a temporary runtime object that runs containers on a node.

  • Pods are ephemeral
  • Pods can be deleted and recreated
  • Pods can move between nodes

🔑 Key Rule

Pods move. Nodes don’t share storage.


πŸ–ΌοΈ Visual Diagram β€” Where hostPath Lives


```
Node A (kind-worker)
└── /tmp/hostpath-demo/data.txt   ← REAL FILE (on disk)

Pod
└── /data   ← mounts the node directory above
```

Another node:

```
Node B (kind-worker2)
└── /tmp/hostpath-demo/   ← EMPTY (different disk)
```

🧩 Why does hostPath exist?

Why Kubernetes allows hostPath

hostPath exists for special cases only:

  • Log collectors (DaemonSets)
  • Node-level monitoring agents
  • Debugging
  • Single-node clusters
  • Learning / labs

👉 It gives containers direct access to the node's disk.


⚠️ Important Warning

hostPath bypasses Kubernetes storage safety.
Kubernetes does NOT protect, replicate, or move this data.


🧪 HANDS-ON LAB (FULL PROJECT)


STEP 0 - Preconditions

  • kind or any multi-node cluster
  • At least 2 worker nodes

Verify:

```bash
kubectl get nodes
```

STEP 1 - Create Pod with hostPath

📄 hostpath.yaml

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: host
          mountPath: /data
  volumes:
    - name: host
      hostPath:
        path: /tmp/hostpath-demo
        type: DirectoryOrCreate
```

Who creates this directory?

👉 The kubelet on the node creates:

```
/tmp/hostpath-demo
```

STEP 2 - Find the RIGHT NODE

```bash
kubectl apply -f hostpath.yaml
kubectl get pod hostpath-demo -o wide
```

Example:

```
NODE: kind-worker
```

This node’s disk is where data will be written.


STEP 3 - Write Data (Inside Pod)

```bash
kubectl exec -it hostpath-demo -- sh
```

Inside container:

```bash
echo "HOSTPATH DATA" > /data/data.txt
cat /data/data.txt
exit
```

Where is the file REALLY stored?

```
kind-worker:/tmp/hostpath-demo/data.txt
```

Not in Kubernetes.
Not in etcd.
Not in the Pod.
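
With kind, each node is a Docker container, so you can confirm this from your host machine, completely outside Kubernetes. This assumes the default kind node name and that the Pod landed on kind-worker:

```shell
# Read the file straight off the node's own filesystem, bypassing Kubernetes
docker exec kind-worker cat /tmp/hostpath-demo/data.txt
# If the Pod ran on this node, this prints the HOSTPATH DATA line
```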


STEP 4 - Delete the Pod (DATA STAYS)

```bash
kubectl delete pod hostpath-demo
kubectl apply -f hostpath.yaml
```

Check node again:

```bash
kubectl get pod hostpath-demo -o wide
```

👉 The Pod usually lands on the same node again (nothing forced it to move).

Verify:

```bash
kubectl exec -it hostpath-demo -- cat /data/data.txt
```

✅ Result:

```
HOSTPATH DATA
```

Deleting a Pod does NOT delete node files.
The data stays because the node did not change.


STEP 5 - Delete / Drain the NODE (DATA IS LOST)

Drain the node where Pod is running

```bash
kubectl drain kind-worker \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --force
```
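
Note that draining also cordons the node. Once you finish the experiment, make it schedulable again:

```shell
kubectl uncordon kind-worker
```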

STEP 6 - Recreate Pod

```bash
kubectl apply -f hostpath.yaml
kubectl get pod hostpath-demo -o wide
```

Now Pod runs on:

```
kind-worker2
```

STEP 7 - Check Data Again (FAIL EXPECTED)

```bash
kubectl exec -it hostpath-demo -- cat /data/data.txt
```

❌ Output:

```
No such file or directory
```

🧠 WHY DATA IS GONE (Explain Clearly)

  • Data still exists on old node disk
  • New node has a different disk
  • Kubernetes does not copy hostPath data

🆚 Pod vs Node (CLEAR DIFFERENCE TABLE)

| Concept | Pod | Node |
|---|---|---|
| What it is | Runtime container wrapper | Real machine |
| Lifetime | Short | Long |
| Can be deleted | Yes | Yes |
| Moves around | Yes | No |
| Has disk | ❌ | ✅ |
| hostPath stored here | ❌ | ✅ |

❌ Why hostPath is NOT for Production

  • Node failure = data loss
  • Pod reschedule = data loss
  • No replication
  • No HA
  • No backups
  • Security risk

hostPath = node disk, not Kubernetes storage


✅ FINAL TAKEAWAYS

  1. hostPath data lives on the node filesystem
  2. Pod deletion does not delete node data
  3. Node deletion or Pod movement loses data
  4. hostPath is unsafe for databases
  5. Use PVC for real applications

🎤 Interview-Ready One-Liner

“hostPath mounts node-local storage into a Pod. Data survives Pod deletion but is lost when the Pod moves to another node, which breaks high availability.”

1️⃣ The Core Problem (Why PV & PVC Exist)

Pods are ephemeral.

If a Pod:

  • restarts
  • gets rescheduled
  • loses its node

👉 ALL data inside the container filesystem is LOST

Databases, uploads, logs, stateful apps cannot survive this.

So Kubernetes introduced Persistent Storage Abstraction.


2️⃣ Mental Model (THIS is what to remember)

Think like DevOps, not YAML first.

| Role | Think of it as | Who owns it |
|---|---|---|
| PV (PersistentVolume) | Actual disk | Cluster / Infra |
| PVC (PersistentVolumeClaim) | Disk request | Application |
| Pod | Uses disk | Runtime |

3️⃣ PersistentVolume (PV) - What It Really Is


PV = Real Storage

A PV represents:

  • EBS
  • EFS
  • NFS
  • Local disk
  • Cloud disk

Created by:

  • Admin
  • StorageClass (dynamic)

Key properties:

  • Capacity (10Gi, 100Gi)
  • Access mode (RWO, RWX)
  • Reclaim policy (Retain / Delete)
  • Backed by real storage

🔴 Apps NEVER talk to PV directly


4️⃣ PersistentVolumeClaim (PVC) - What Apps Use


PVC = Request for Storage

PVC says:

“I need 10Gi, ReadWriteOnce, fast disk”

Kubernetes:

  • Finds a matching PV
  • Binds PVC → PV
  • Locks it (exclusive)

Pod only knows:

  • volume name
  • mount path

✅ Pod does NOT know:

  • disk type
  • cloud provider
  • node location

5️⃣ Binding Flow (VERY IMPORTANT)


Order of events

  1. PV exists (or StorageClass ready)
  2. PVC is created
  3. Kubernetes binds PVC → PV
  4. Pod mounts PVC
  5. Pod reads/writes data

❗ If the PVC is not bound → the Pod is stuck Pending


🔥 FULL HANDS-ON PROJECT

🎯 Goal

Prove that:

  • Pod dies ❌
  • Data survives ✅

πŸ“ Project Structure

pv-pvc-lab/
β”œβ”€β”€ pv.yaml
β”œβ”€β”€ pvc.yaml
β”œβ”€β”€ pod.yaml
Enter fullscreen mode Exit fullscreen mode

STEP 1️⃣ Create a PersistentVolume (PV)

```yaml
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/demo-data
```

Apply:

```bash
kubectl apply -f pv.yaml
kubectl get pv
```

Expected:

```
STATUS: Available
```

STEP 2️⃣ Create a PersistentVolumeClaim (PVC)

```yaml
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Apply:

```bash
kubectl apply -f pvc.yaml
kubectl get pvc
```

Expected:

```
STATUS: Bound
```

🔴 If not bound → check for a size or accessMode mismatch
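
When a claim stays Pending, the quickest diagnosis is to compare it against the available PVs and read the claim's events:

```shell
# Compare capacity, access modes, and storage class side by side
kubectl get pv
kubectl get pvc demo-pvc
# The Events section states exactly why the claim cannot bind
kubectl describe pvc demo-pvc
```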


STEP 3️⃣ Create Pod Using the PVC

```yaml
# pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - mountPath: /data
      name: storage
  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: demo-pvc
```

Apply:

```bash
kubectl apply -f pod.yaml
kubectl get pod
```

STEP 4️⃣ Write Data into the Volume

```bash
kubectl exec -it pvc-demo-pod -- sh
```

Inside Pod:

```bash
echo "Kubernetes storage works" > /data/test.txt
cat /data/test.txt
```

STEP 5️⃣ DELETE THE POD (Important)

```bash
kubectl delete pod pvc-demo-pod
```

Now recreate:

```bash
kubectl apply -f pod.yaml
```

Check again:

```bash
kubectl exec -it pvc-demo-pod -- cat /data/test.txt
```

🎉 DATA IS STILL THERE


STEP 6️⃣ Delete PVC & Observe Reclaim Policy

```bash
kubectl delete pvc demo-pvc
kubectl get pv
```

Because:

```
persistentVolumeReclaimPolicy: Retain
```

Result:

```
STATUS: Released
```

💡 Data still exists on disk
An admin must manually clean it up or reuse the PV
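
A Released PV will not bind to a new claim on its own: the old claim reference is still recorded on it. After cleaning up the data yourself, one common way to make the PV Available again is to remove that reference. A sketch using the PV from this lab:

```shell
# Clear the stale binding; the PV status returns to Available
kubectl patch pv demo-pv --type json \
  -p '[{"op": "remove", "path": "/spec/claimRef"}]'
kubectl get pv
```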


6️⃣ What DevOps MUST Know (Interview GOLD)

❓ Why not Deployment for DB?

  • Pod identity changes
  • Volume attachment breaks
  • Ordering not guaranteed

👉 Use StatefulSet + PVC
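
The usual shape of that pairing is a StatefulSet with a volumeClaimTemplate, which creates one PVC per replica with a stable name (data-db-0, data-db-1, ...). A minimal sketch; the image, names, and sizes are illustrative:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: example        # illustrative only; use a Secret in real use
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:             # one PVC per replica, kept across restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

Because the PVC name is derived from the replica's stable identity, a restarted db-0 reattaches to exactly its own volume.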


❓ Why PVC instead of direct disk?

  • Decouples app from infra
  • Enables portability
  • Enables dynamic provisioning
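
Dynamic provisioning is what removes the manual PV step entirely: a PVC that names a StorageClass gets a PV created for it on demand. A sketch; the provisioner is cluster-specific (this one is the AWS EBS CSI driver, while kind ships rancher.io/local-path):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: ebs.csi.aws.com   # cluster-specific
parameters:
  type: gp3
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: auto-pvc
spec:
  storageClassName: fast       # triggers dynamic provisioning; no pre-created PV needed
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```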

❓ Why Pod Pending?

Most common reasons:

  • PVC not bound
  • No matching PV
  • Wrong access mode
  • StorageClass missing

7️⃣ Access Modes (Critical)

| Mode | Meaning | Example |
|---|---|---|
| RWO (ReadWriteOnce) | One node | EBS |
| RWX (ReadWriteMany) | Many nodes | EFS / NFS |
| ROX (ReadOnlyMany) | Read-only, many nodes | Shared config |

8️⃣ Production Mapping (REAL WORLD)

| Kubernetes | AWS |
|---|---|
| PV | EBS / EFS |
| PVC | Disk request |
| StorageClass | Disk template |
| StatefulSet | Database |
