DEV Community

12ww1160

Posted on • Originally published at confdroid.com

Kubernetes - Adventures - Pilot

My Kubernetes Journey: From a Fragile Single-Node Setup to a Stable GitOps-Powered Homelab Cluster

A few years back, I dipped my toes into Kubernetes by spinning up a simple lab cluster: one control-plane node, one worker, and a handful of basic services. To my surprise, that minimal setup ran reliably with almost zero maintenance for about two and a half years. Then it suddenly broke.

That failure became the catalyst for a complete rebuild. I set out to create a proper development cluster—one control-plane node and three workers—using a vanilla kubeadm installation. Over the next three months, I methodically reproduced every common pitfall I could find, reinstalling repeatedly until the process felt rock-solid and repeatable.

Early on, I shifted toward GitOps practices. I built CI/CD pipelines that handled every cluster change and installation step, relying heavily on kubectl to apply manifests directly. This approach gave me confidence and speed, turning deployments into automated, version-controlled events.
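Concretely, a deploy stage in such a pipeline can be as small as one kubectl step. This is a hedged sketch rather than my actual pipeline: the GitLab-CI-style layout and the `manifests/` directory name are placeholders.

```yaml
# Hypothetical CI job (GitLab-CI-style); stage names and paths are placeholders.
deploy:
  stage: deploy
  script:
    # Validate manifests server-side before touching the cluster
    - kubectl apply --dry-run=server -f manifests/
    # Apply the version-controlled manifests directly, as described above
    - kubectl apply -f manifests/
  only:
    - main
```

The server-side dry run catches schema and admission errors before anything is changed, which is what makes "every change goes through the pipeline" feel safe.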

The CNI Rollercoaster

Networking proved to be the biggest source of headaches. I started with Weave Net, which worked nicely at first—but the project is no longer actively maintained, so it eventually became unsustainable.

I tried several other CNIs, and each came with its own quirks. Flannel was straightforward but fell short in certain scenarios. Kube-router performed decently overall, yet I could never get Ingress resources to behave reliably. Worse, one day it began overwriting my firewall rules, bringing the entire cluster down hard.

After multiple experiments, I landed on Calico. With some BGP tweaks to fit my environment, it delivered the stability I needed. Today, Calico remains my go-to CNI in the lab.
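To give a rough idea of the kind of BGP tweaks involved, here is a hypothetical Calico configuration; the peer address and AS numbers below are placeholders, not the values from my environment.

```yaml
# Illustrative Calico BGP setup; peerIP and AS numbers are placeholders.
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default            # must be "default" for cluster-wide settings
spec:
  nodeToNodeMeshEnabled: true
  asNumber: 64512
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-router
spec:
  peerIP: 192.0.2.1        # placeholder router address
  asNumber: 64513
```

Resources like these are applied with calicoctl or kubectl (depending on how Calico is installed), so they fit naturally into the same manifest-driven workflow as everything else.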

Handling Ingress and External Traffic

Ingress troubles pushed me toward an alternative: HAProxy running as an external load balancer on a separate VM. HAProxy handles TLS termination and forwards traffic to NodePort services. Because all my VMs (including the HAProxy host) are managed with Puppet, I can easily maintain proxy rules, renew certificates via Certbot, and keep everything consistent.

This hybrid setup let me migrate many workloads off older standalone VMs and onto Kubernetes, which saved noticeable hosting costs.
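For illustration, a minimal haproxy.cfg fragment along these lines might look like this; the IPs, the NodePort, and the certificate path are placeholders, not my real values.

```haproxy
# Hypothetical fragment: TLS terminates at HAProxy, plain HTTP goes to a NodePort.
frontend https_in
    bind *:443 ssl crt /etc/haproxy/certs/   # certs kept fresh via Certbot
    default_backend k8s_nodeport

backend k8s_nodeport
    balance roundrobin
    # The same NodePort is exposed on every worker, so any node can receive traffic
    server worker1 10.0.0.11:30080 check
    server worker2 10.0.0.12:30080 check
    server worker3 10.0.0.13:30080 check
```

Health checks (`check`) mean a drained or failed worker drops out of rotation automatically, which is most of what an in-cluster load balancer would buy you anyway.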

What Fits Kubernetes—and What Doesn't (Yet)

Not every service thrives in containers under my current constraints, especially on Hetzner Cloud, where storage options are limited and network-attached volumes add noticeable latency.

Stateful databases like PostgreSQL and MariaDB, along with Prometheus (which demands low-latency writes), struggled enough that I moved them back to dedicated VMs. The same went for Jenkins: the main controller performs better outside Kubernetes, though its agents run happily as pods.

Storage Solutions That Actually Work

For persistent storage, I first looked at Rook/Ceph—but it proved far too resource-heavy for my low-budget setup at Hetzner. Longhorn turned out to be the sweet spot: lightweight, feature-rich, and equipped with solid backup capabilities.

For ultra-low-cost volumes I sometimes fall back to SSHFS mounts. While SSHFS has clear performance and reliability limits, pairing it with Longhorn covers most use cases reasonably well.
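For reference, a Longhorn StorageClass for these volumes can be as small as the sketch below; the replica count and timeout are illustrative choices, not necessarily what I run.

```yaml
# Illustrative Longhorn StorageClass; parameter values are assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2replica
provisioner: driver.longhorn.io
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"      # trade capacity for redundancy on a small cluster
  staleReplicaTimeout: "30"  # minutes before a stale replica is cleaned up
```

Keeping the replica count at two rather than three is a typical concession on a budget cluster: you still survive one node failure without tripling your storage bill.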

The Cluster Today

Right now the cluster runs 18 internal services. Infrastructure components include Keycloak for identity, OpenBao for secrets, Trivy for vulnerability scanning, Longhorn itself, and PgBouncer for connection pooling. Application workloads cover SonarQube, Gitea, OpenProject, Wiki.js, and several others.

The combination delivers excellent stability. Deployments feel effortless thanks to CI/CD pipelines, while Puppet keeps the surrounding VM layer tidy. Puppet doesn't play nicely with container orchestration, though, so Ansible is steadily taking on more responsibility in the environment.

Tools like Argo CD (which I deployed just yesterday and already love), Kustomize, and plain YAML manifests round out the picture. I still lean toward raw manifests for simplicity and direct control, but GitOps patterns are clearly the future.
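Even the "plain manifests" approach usually grows a thin Kustomize entry point per service, which is all Argo CD needs to track it. A minimal hypothetical example (namespace and file names are placeholders):

```yaml
# Hypothetical kustomization.yaml for one service; names are placeholders.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: wikijs
resources:
  - deployment.yaml
  - service.yaml
  - ingress.yaml
```

This keeps the raw YAML readable while giving a single path that `kubectl apply -k` or a GitOps controller can point at.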

Looking Ahead: GitOps, Argo CD, and Beyond

With agentic development gaining momentum, I'm now looking into building applications designed to run as future SaaS offerings inside production-grade clusters. Kubernetes has proven to be the right foundation—even if there's still plenty left to master.

The lessons from this homelab will fuel an ongoing series. The next post dives straight into migrating to Argo CD and embracing proper GitOps workflows, which are becoming essential in modern software delivery.

If you're running Kubernetes at home, experimenting in the cloud, or just curious about real-world trade-offs, stay tuned. There's a lot more to share.

What challenges have you hit with Kubernetes in your own setups? Drop a comment—I'd love to hear your stories.


Did you find this post helpful? You can support me.

Hetzner Referral

Substack

ConfDroid Feedback Portal
