Stateful workloads need storage that outlives pods. In Kubernetes, that means Persistent Volumes (PV) and Persistent Volume Claims (PVC) — a PV is the actual storage, a PVC is a pod's request for it. Kubernetes matches them and handles the binding. The interesting question is what backs those PVs.
I started with Longhorn, realized it was too heavy for my cluster, benchmarked alternatives, and switched to OpenEBS. Here's the full story with numbers.
## Longhorn: Good, But Overkill
Longhorn is easy to install and comes with a solid UI, snapshots, backups, and synchronous replication across nodes. I installed it with Helm:
```shell
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --version 1.7.1
```
It worked. But on a 3-node cluster with limited resources, Longhorn consumes around 1.5GB of memory just for its own components — Instance Manager, CSI plugins, Longhorn Manager, and the UI.
The bigger issue: my stateful apps (PostgreSQL, ScyllaDB) already handle their own replication. ScyllaDB replicates across nodes at the application level. PostgreSQL does the same. Adding storage-level replication on top is redundant — double the replication overhead, double the latency, for no benefit.
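To put a number on the overhead: the total physical copies per logical write is the product of the two replication factors. A quick sketch, where both factors of 3 are illustrative assumptions (a typical ScyllaDB keyspace RF and Longhorn's default `numberOfReplicas`):

```shell
# Copies written per logical write when replication is stacked.
# app_rf: application-level replication factor (assumed 3 here).
# storage_rf: Longhorn's numberOfReplicas (default 3).
app_rf=3
storage_rf=3
echo "copies per write: $((app_rf * storage_rf))"
```

Nine copies of every write, when three already gave the durability the application was designed for.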
I set replicas to 1 to avoid redundant replication:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-single-replica
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "1"
  dataLocality: "best-effort"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
Even with single replica, the 1.5GB memory overhead remained. For a small cluster where every GB matters, that's hard to justify.
## Benchmarking the Options
Before switching, I ran proper benchmarks using FIO on my actual cluster — 3-node CentOS VMs, the same hardware running everything else.
### FIO Pod
Same pod spec used across all three storage options, just swapping the PVC:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fio-test
spec:
  restartPolicy: Never
  containers:
    - name: fio
      image: ljishen/fio
      command: ["fio"]
      args:
        - --name=pg-test
        - --filename=/data/testfile
        - --size=200M
        - --bs=8k
        - --rw=randrw
        - --rwmixread=70
        - --ioengine=libaio
        - --iodepth=16
        - --runtime=60
        - --numjobs=1
        - --time_based
        - --group_reporting
      resources:
        requests:
          cpu: "1"
          memory: "256Mi"
        limits:
          cpu: "2"
          memory: "512Mi"
      volumeMounts:
        - mountPath: /data
          name: testvol
  volumes:
    - name: testvol
      persistentVolumeClaim:
        claimName: longhorn-pvc # swap for local-pvc or openebs-pvc
```
The FIO config simulates a database-like workload — 8k block size, 70/30 read/write mix, random I/O, 16 queue depth.
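The same profile can also be written as a fio job file, which is handy for running an identical baseline directly on a node, outside Kubernetes. A sketch (the file name and run command are my own, not from the pod spec):

```shell
# Write the pod's fio arguments out as an equivalent job file.
# Run it on a node with: fio pg-test.fio (requires fio installed).
cat > pg-test.fio <<'EOF'
[pg-test]
filename=/data/testfile
size=200M
bs=8k
rw=randrw
rwmixread=70
ioengine=libaio
iodepth=16
runtime=60
numjobs=1
time_based
group_reporting
EOF
```

Running it on the bare node first gives you a ceiling to compare the in-cluster numbers against.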
### Longhorn Setup
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-single-replica
  resources:
    requests:
      storage: 1Gi
```
### Local PV Setup
More manual — create the directory on the node first:
```shell
sudo mkdir -p /mnt/disks/localdisk1
sudo chmod 777 /mnt/disks/localdisk1
```
Then create the PV and PVC manually:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  local:
    path: /mnt/disks/localdisk1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s3
  persistentVolumeReclaimPolicy: Delete
  volumeMode: Filesystem
```
When using local PVs, the pod also needs node affinity to land on the right node:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s3
```
This is the problem with local PVs at scale — every new volume needs manual directory creation, a manually written PV manifest, and node affinity on every pod that uses it. No dynamic provisioning. Painful to manage.
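You can script the toil, but that only underlines the point: every volume still needs its own directory and its own hand-written PV manifest. A minimal sketch of that per-volume ritual (node name and path are illustrative, matching the example above):

```shell
# Generate one local PV manifest per node/directory pair.
# Repeat for every volume on every node -- this is the part that
# a dynamic provisioner does for you.
node=k8s3
dir=/mnt/disks/localdisk1
cat > local-pv-${node}.yaml <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-${node}
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  local:
    path: ${dir}
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - ${node}
EOF
```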
### Results
| Metric | Longhorn | Local PV | OpenEBS |
|---|---|---|---|
| Read IOPS | 811 | 7757 | 7401 |
| Read Bandwidth | 6.3 MiB/s | 60.6 MiB/s | 57.8 MiB/s |
| Read Latency (avg) | 14,189 µs | 1,467 µs | 1,539 µs |
| Write IOPS | 346 | 3328 | 3177 |
| Write Bandwidth | 2.7 MiB/s | 26.0 MiB/s | 24.8 MiB/s |
| Write Latency (avg) | 12,913 µs | 1,377 µs | 1,440 µs |
| CPU Usage (sys) | 4.71% | 26.25% | 26.05% |
| Memory Overhead | ~1.5 GB | none | ~180 MB |
| Backend | User-space | Kernel block device | Kernel block device |
Longhorn's numbers are significantly worse — 10x higher latency, ~10x lower IOPS. That's the cost of going through a user-space storage layer for every I/O operation. Local PV and OpenEBS both go through the kernel block device directly, which is why they're close to each other.
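The table is also internally consistent, which is a useful sanity check on any fio run: bandwidth should equal IOPS times the 8 KiB block size, and the read share of total IOPS should track `rwmixread=70`. Checking the Local PV row:

```shell
# Bandwidth check: 7757 read IOPS * 8 KiB = 62056 KiB/s ~= 60.6 MiB/s,
# matching the table (integer math rounds down).
read_iops=7757
bs_kib=8
echo "read bw: $(( read_iops * bs_kib / 1024 )) MiB/s (approx)"

# Mix check: reads as a share of total IOPS should be ~70%.
write_iops=3328
echo "read share: $(( 100 * read_iops / (read_iops + write_iops) ))%"
```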
Local PV wins on raw performance but loses on everything else — no dynamic provisioning, manual node affinity management, manual directory creation on each node. It doesn't scale.
## OpenEBS: The Sweet Spot
OpenEBS with the hostpath provisioner gives performance close to a local PV with actual automation: it handles provisioning, metrics, and lifecycle. Memory overhead is ~180MB for the whole stack — roughly 8x less than Longhorn.
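That "8x" is just the ratio of the two footprints, in round numbers:

```shell
# ~1.5 GiB (Longhorn components) vs ~180 MiB (OpenEBS hostpath stack).
echo "ratio: $(( 1536 / 180 ))x"
```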
k3s has a built-in local-path provisioner that's similar, but it also requires manually creating directories on each node and gives less control over the storage lifecycle. OpenEBS handles that automatically.
Install:
```shell
helm repo add openebs https://openebs.github.io/openebs
helm repo update
helm install openebs --namespace openebs openebs/openebs \
  --set engines.replicated.mayastor.enabled=false \
  --create-namespace
```
`--set engines.replicated.mayastor.enabled=false` disables Mayastor, OpenEBS's replicated storage engine. I don't need it — my apps handle their own replication. Disabling it keeps the footprint small.
Create the base directory once on each node:
```shell
sudo mkdir -p /var/openebs/local
```
Then PVCs just reference the `openebs-hostpath` storage class:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: openebs-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: openebs-hostpath
  resources:
    requests:
      storage: 1Gi
```
No manual PV creation, no node affinity on pods, no directory management per volume. OpenEBS handles it.
## Current State
Everything stateful on the cluster — PostgreSQL, ScyllaDB, Redis, NATS — uses OpenEBS with openebs-hostpath. Longhorn is gone.