DEV Community

Guatu

Posted on • Originally published at guatulabs.dev

Kubernetes Storage on Bare Metal: Longhorn in Practice

I spent nearly a week trying to get Kubernetes storage working on my bare metal cluster before I finally figured out the right combination of Longhorn settings, node labels, and storage classes that made it stable. Turns out, most of the pain came from assumptions I made about how storage should work—not the tool itself.

If you're running Kubernetes on physical hardware and need persistent storage—especially for databases, media servers, or AI agents that need state—this is for you. You’ll find the gotchas I hit, the config I used, and why it works better than most other bare-metal storage setups I've tried.

My setup is a small 3-node Kubernetes cluster on my own hardware (technically Proxmox VMs, one per physical host, deployed with Kubespray). I wanted to run PostgreSQL, some AI agents, and a few stateful services like Nextcloud and MinIO. I tried hostPath for a while, but that was a nightmare whenever a node died or a disk filled up.

I tried a few other tools first: Rook-Ceph, plain Ceph RBD, and even some DIY iSCSI solutions. None of them clicked the way I needed them to. Rook-Ceph was too heavy for three nodes, plain RBD meant operating a full Ceph cluster myself, and iSCSI was flaky and wanted a dedicated storage node I didn't want to add. I needed something simple, self-hosted, and easy to manage, and that's when I turned to Longhorn.

I assumed Longhorn would be a drop-in solution. I followed a few tutorials and deployed it as-is. The first problem was that my nodes weren't labeled for storage scheduling: I hadn't applied node-role.kubernetes.io/worker=true or Longhorn's node.longhorn.io/create-default-disk=true label, so the manager wouldn't place volumes correctly.

Then I created a PVC and it failed with ProvisioningFailed. I assumed it was a storage class issue, but it turned out my disks weren't configured correctly in Longhorn. I had three drives on each node, but I hadn't enabled scheduling on them, so Longhorn wasn't considering them for volume placement.

I also tried numberOfReplicas: 3 on a 3-node cluster, which sounded logical. But when I wrote data to a volume, it started failing because Longhorn couldn't keep replicas in sync across all three nodes. I had to adjust the replica count manually and found that a single replica was actually more stable in my setup.

Here’s how I got Longhorn working reliably. This is based on a 3-node Kubernetes cluster with Proxmox VMs, each with a single storage disk. You can adjust the number of replicas and storage classes based on your cluster size and reliability needs.

Step 1: Install Longhorn via Helm

First, install Longhorn with Helm. I used the official chart from the Longhorn repo.

helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn --create-namespace

This will deploy the Longhorn manager, engine, and UI components.
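Before moving on, it's worth confirming everything came up (assuming the longhorn namespace from the install command above):

```shell
# Watch the Longhorn components come up; this can take a minute or two.
kubectl -n longhorn get pods

# The longhorn-manager, longhorn-driver-deployer, instance-manager,
# and csi-* pods should all reach Running before you create volumes.
kubectl -n longhorn get daemonset longhorn-manager
```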

Step 2: Label Your Nodes

Longhorn can use node labels to decide which nodes should host storage. If you enable the Create Default Disk on Labeled Nodes setting, Longhorn only creates its default disk on nodes carrying the right label, so make sure your storage nodes are labeled before creating volumes.

kubectl label nodes <node-name> node.longhorn.io/create-default-disk=true
kubectl label nodes <node-name> node-role.kubernetes.io/worker=true

Repeat for all worker nodes that should host storage, so Longhorn knows which nodes have disks available for volume scheduling.

Step 3: Configure Storage Disks in Longhorn UI

Open the Longhorn UI and navigate to the Node section. By default the frontend is a ClusterIP service, so either expose it as a NodePort (e.g. http://<your-cluster-ip>:30000) or reach it with kubectl port-forward. For each node you'll see a list of disks; make sure every disk you want to use has scheduling enabled.

You can toggle scheduling for each disk directly in the UI under Edit Node and Disks. This tells Longhorn the disk is available for replica placement.
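If you'd rather not click through the UI, the same toggle lives on Longhorn's Node custom resource, which you can edit with kubectl. This is only a sketch: the node name, disk key, and path are assumptions, so check your actual objects first with kubectl -n longhorn get nodes.longhorn.io.

```yaml
# Sketch only: node name, disk key, and path are assumptions.
# Longhorn keeps one Node CR per cluster node in its install namespace.
apiVersion: longhorn.io/v1beta2
kind: Node
metadata:
  name: worker-1
  namespace: longhorn
spec:
  disks:
    default-disk:                # key matches the disk name shown in the UI
      path: /var/lib/longhorn    # Longhorn's default data path
      allowScheduling: true      # the "enable scheduling" toggle
      storageReserved: 0
      tags: []
```

Managing disks declaratively like this is handy if you rebuild nodes often, since the UI state is really just a view over these CRs.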

Step 4: Create a Storage Class

Now create a storage class in Kubernetes. I used the following single-replica setup. It gives better performance on a small cluster, but note that one replica means no redundancy: if the node holding it dies, that data is gone unless you have backups.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-slow
parameters:
  numberOfReplicas: "1"
  staleReplicaTimeout: "2880" # 48 hours
  fromBackup: ""
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

Apply it with:

kubectl apply -f longhorn-slow.yaml

You can also create a longhorn-fast class with numberOfReplicas: 2 if you want more redundancy.
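That class is just the one above with a different name and replica count; a sketch:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"        # one extra copy for redundancy
  staleReplicaTimeout: "2880"  # minutes (48 hours)
reclaimPolicy: Delete
volumeBindingMode: Immediate
```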

Step 5: Create a PVC

Now create a PersistentVolumeClaim that uses the storage class you just defined.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: longhorn-slow

Apply it:

kubectl apply -f longhorn-pvc.yaml

And then you can use it in a pod like this:

apiVersion: v1
kind: Pod
metadata:
  name: longhorn-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: longhorn-volume
          mountPath: /usr/share/nginx/html
  volumes:
    - name: longhorn-volume
      persistentVolumeClaim:
        claimName: longhorn-pvc

Apply that and you should have a pod with a persistent volume backed by Longhorn.
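A quick way to confirm everything is wired up, using the names from the manifests above:

```shell
# The PVC should show STATUS Bound once Longhorn provisions the volume
kubectl get pvc longhorn-pvc

# The pod should reach Running with the volume attached
kubectl get pod longhorn-pod

# And the Longhorn block device should be visible inside the container
kubectl exec longhorn-pod -- df -h /usr/share/nginx/html
```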

Longhorn works well on bare metal because it abstracts away the need for a full storage cluster. It doesn’t require a shared filesystem or a dedicated storage node—just one or more storage disks per node. The key to stability is labeling your nodes correctly and configuring the storage class with the right number of replicas.

Longhorn uses a distributed engine to manage volume replication and snapshotting. When you write data to a volume, Longhorn creates a block device that mirrors the data across replicas. The more replicas you have, the more redundant the data is—but also the more resources it uses.

By setting numberOfReplicas: 1, I was able to get better performance and avoid the issues I had with multi-replica setups in a small cluster. Longhorn’s snapshot system is also great—it allows you to take snapshots and roll back to them if needed, which is especially useful for databases or stateful apps.
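If you have the external snapshot CRDs and snapshot controller installed in the cluster, you can also drive Longhorn snapshots through the standard CSI objects instead of the UI. This is a sketch; the class and snapshot names are my own, and the type: snap parameter asks Longhorn for an in-cluster snapshot rather than a backup:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: snap                 # in-cluster snapshot ("bak" would push a backup)
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: longhorn-pvc-snap
spec:
  volumeSnapshotClassName: longhorn-snapshot
  source:
    persistentVolumeClaimName: longhorn-pvc   # the PVC created earlier
```

Going through CSI means restores work the usual Kubernetes way: create a new PVC with a dataSource pointing at the VolumeSnapshot.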

Another thing that helps is the staleReplicaTimeout setting, which tells Longhorn how long to wait (in minutes) before removing a stale replica that's no longer in sync. I set it to 2880 (48 hours) to give myself time to investigate issues before cleanup.

Here’s what I learned after running this setup for a few weeks:

  • Don’t assume all nodes can host storage: I had one node that wasn’t labeled properly, and that caused volumes to fail to schedule. Make sure all nodes are labeled correctly.
  • Replicas = redundancy, not performance: Setting numberOfReplicas: 3 in a 3-node cluster didn't improve performance—it just caused more overhead and failures. Stick to 1 or 2 replicas unless you really need it.
  • Storage classes matter: I had a few PVCs that failed because I didn’t use the right storage class. Make sure your pods are using the correct class for their use case.
  • Snapshots are powerful but not automatic: Longhorn snapshots work well, but you have to take them manually unless you configure Longhorn's recurring jobs or wire up an external backup tool like Velero.
  • Disk scheduling is important: If your disks aren't marked schedulable in Longhorn, it won't place replicas on them. That was a big gotcha I hit early on.

Overall, I’m happy with how Longhorn works on bare metal. It’s lightweight, reliable, and doesn’t require a full storage cluster. It’s not perfect—there are some quirks regarding scheduling and snapshot management—but for a small to medium-sized cluster, it’s one of the best options I’ve found.
