<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: ces0712</title>
    <description>The latest articles on DEV Community by ces0712 (@ces0712).</description>
    <link>https://dev.to/ces0712</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3889790%2F49c35916-1545-4c64-b7d0-8b8e122026fd.png</url>
      <title>DEV Community: ces0712</title>
      <link>https://dev.to/ces0712</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ces0712"/>
    <language>en</language>
    <item>
      <title>Code In, Cluster Out: Building Reproducible Edge Kubernetes with NixOS, K3s, and Forgejo</title>
      <dc:creator>ces0712</dc:creator>
      <pubDate>Sat, 25 Apr 2026 18:09:23 +0000</pubDate>
      <link>https://dev.to/ces0712/code-in-cluster-out-building-reproducible-edge-kubernetes-with-nixos-k3s-and-forgejo-i08</link>
      <guid>https://dev.to/ces0712/code-in-cluster-out-building-reproducible-edge-kubernetes-with-nixos-k3s-and-forgejo-i08</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1txvoal9k11f3s3jxyne.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1txvoal9k11f3s3jxyne.png" alt="Cover" width="396" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What if your entire Kubernetes edge cluster, from the kernel to the workload, was a single reproducible function?&lt;/p&gt;

&lt;p&gt;No drift. No snowflakes. No "this node got fixed manually six months ago, and nobody remembers how." Just code in, cluster out.&lt;/p&gt;

&lt;p&gt;That question led me into a project that combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;infrastructure-nixos&lt;/code&gt; for the Raspberry Pi-hosted Forgejo control path&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;edge-cluster-infra&lt;/code&gt; for Oracle networking, compute, and block storage&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;infrastructure-secrets&lt;/code&gt; for the shared SOPS-managed secret layer&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nix-k3s-edge-cluster&lt;/code&gt; for the NixOS + K3s runtime and workload layer&lt;/li&gt;
&lt;li&gt;RustDesk as a real workload proof point&lt;/li&gt;
&lt;li&gt;A Raspberry Pi-hosted Forgejo instance, a Mac mini runner, and an Oracle edge node as the deployed target&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post is the practical version of that story: what I built, what actually worked, what hurt, and why I think the most interesting thing here is not Nix syntax, but where the source of truth lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  At a glance
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;I split the system into four repositories with clear boundaries: control plane, infrastructure, secrets, and runtime.&lt;/li&gt;
&lt;li&gt;Forgejo on a Raspberry Pi is the canonical GitOps path. GitHub is only a public push mirror.&lt;/li&gt;
&lt;li&gt;Oracle infrastructure is provisioned separately, then handed off to a NixOS + K3s runtime repo.&lt;/li&gt;
&lt;li&gt;RustDesk is the end-to-end workload proof because it forces real decisions around networking, persistence, and access.&lt;/li&gt;
&lt;li&gt;The system only started to feel trustworthy once backup and restore became visible, testable, and boring.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why edge gets weird fast
&lt;/h2&gt;

&lt;p&gt;Edge environments are where nice Kubernetes assumptions go to die.&lt;/p&gt;

&lt;p&gt;They run on constrained hardware. Networks are unreliable. Access is awkward. Sometimes direct SSH is temporary, risky, or unavailable. The failure mode is not just "my pod crashed." The failure mode is "this node became special."&lt;/p&gt;

&lt;p&gt;That is what I wanted to avoid.&lt;/p&gt;

&lt;p&gt;Traditional approaches can get you pretty far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cloud-init&lt;/li&gt;
&lt;li&gt;shell scripts&lt;/li&gt;
&lt;li&gt;Ansible&lt;/li&gt;
&lt;li&gt;golden images&lt;/li&gt;
&lt;li&gt;imperative &lt;code&gt;kubeadm&lt;/code&gt; or post-install hand tuning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the deeper I went, the more obvious the problem became: &lt;strong&gt;cluster-level reproducibility starts too late&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If the node itself is still imperative, you are already carrying drift before Kubernetes even starts.&lt;/p&gt;

&lt;p&gt;That was the core problem I wanted to solve.&lt;/p&gt;

&lt;h2&gt;
  
  
  The model I wanted instead
&lt;/h2&gt;

&lt;p&gt;I wanted the deployment artifact to be bigger than a container image.&lt;/p&gt;

&lt;p&gt;I wanted a declared system that includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Operating system state&lt;/li&gt;
&lt;li&gt;Kubernetes runtime state&lt;/li&gt;
&lt;li&gt;Workload intent&lt;/li&gt;
&lt;li&gt;Secret wiring&lt;/li&gt;
&lt;li&gt;Backup and restore readiness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is where NixOS and K3s became interesting together.&lt;/p&gt;

&lt;p&gt;K3s gives me a lightweight Kubernetes distribution that fits the edge reality much better than a heavyweight control plane.&lt;/p&gt;

&lt;p&gt;NixOS gives me a declarative host where packages, services, users, storage, networking, and system behavior are all expressed as code and activated atomically.&lt;/p&gt;
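&lt;p&gt;As a sketch of what that looks like in practice (the &lt;code&gt;services.k3s&lt;/code&gt; options are real NixOS module options; the host name and package choice are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: one NixOS module declares host and runtime together,
# and the whole thing activates atomically.
{ config, pkgs, ... }:
{
  networking.hostName = "cloud-edge-1";

  # The Kubernetes runtime is just another declared service.
  services.k3s = {
    enable = true;
    role = "server";
  };

  # SSH posture and tooling come from the same source of truth.
  services.openssh.enable = true;
  environment.systemPackages = [ pkgs.k9s ];
}
&lt;/code&gt;&lt;/pre&gt;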

&lt;p&gt;The result is not "immutable infrastructure" in the buzzword sense. It is something more practical:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;the host, the cluster, and the workload converge from the same source of truth&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The architecture
&lt;/h2&gt;

&lt;p&gt;This is the deployed shape I ended up with:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50egpdqiyfbnyja65qjd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50egpdqiyfbnyja65qjd.png" alt="System overview" width="800" height="231"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The important part is the split:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Forgejo on Raspberry Pi with NixOS&lt;/strong&gt; is the control path&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mac mini runner&lt;/strong&gt; executes the workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Oracle edge node&lt;/strong&gt; is the NixOS + K3s target&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RustDesk&lt;/strong&gt; is the workload used to prove the system end-to-end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not really a story about "installing a cluster." It is a story about building an operational path that starts in Git and ends in a reproducible edge node.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I split it across four repositories
&lt;/h2&gt;

&lt;p&gt;One of the more useful lessons from this work is that the system became easier to reason about when I stopped trying to force everything into a single repo.&lt;/p&gt;

&lt;p&gt;The split is intentional:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;infrastructure-nixos&lt;/code&gt; owns the Raspberry Pi running Forgejo on NixOS&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;edge-cluster-infra&lt;/code&gt; owns the Oracle infrastructure only&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;infrastructure-secrets&lt;/code&gt; owns the encrypted secret material&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nix-k3s-edge-cluster&lt;/code&gt; owns the runtime host, K3s, apps, and validation logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gave me cleaner boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infra does not pretend it knows about workloads&lt;/li&gt;
&lt;li&gt;Secrets do not get buried inside runtime repos&lt;/li&gt;
&lt;li&gt;The Git control path has its own declared host&lt;/li&gt;
&lt;li&gt;The edge runtime repo can focus on host + cluster + workload&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For me, that was a better shape than one giant repo full of mixed concerns.&lt;/p&gt;

&lt;p&gt;It also made the system easier to explain to other engineers, which is usually a good sign that the boundaries are doing real work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What each repository is really doing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;infrastructure-nixos&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is the Raspberry Pi side of the story.&lt;/p&gt;

&lt;p&gt;It is not just "the box that runs Forgejo." It is a declared NixOS host with its own deploy, validate, backup, and restore flow. That matters because the GitOps engine is also part of the reproducible system, not an external assumption.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;edge-cluster-infra&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This repo owns only the Oracle layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Networking&lt;/li&gt;
&lt;li&gt;Compute&lt;/li&gt;
&lt;li&gt;Block storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means infra planning and applying stay explicit, and the handoff to the runtime repo is intentional instead of magical.&lt;/p&gt;

&lt;p&gt;Technically, this repo is the place where I wanted all cloud-specific concerns to stop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenTofu and Terramate orchestration&lt;/li&gt;
&lt;li&gt;OCI networking, storage, and compute&lt;/li&gt;
&lt;li&gt;Centralized local state&lt;/li&gt;
&lt;li&gt;Runner-local var files&lt;/li&gt;
&lt;li&gt;Pre-apply state backup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This prevents the runtime repo from turning into a hidden infrastructure repo.&lt;/p&gt;

&lt;p&gt;It also let me be explicit about something people often blur away in demos: the infrastructure state is local on purpose.&lt;/p&gt;

&lt;p&gt;For a single-operator system, I preferred a centralized local OpenTofu state path on the runner, with timestamped pre-apply archives and off-machine copies, over prematurely pretending I needed a remote backend just to look more cloud-native.&lt;/p&gt;

&lt;p&gt;That gave me a simpler operational model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One clear operator state location&lt;/li&gt;
&lt;li&gt;Automatic state backup before real infra changes&lt;/li&gt;
&lt;li&gt;An explicit restore path for the state itself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not a forever design for every team, but it was the right design for this phase of the system.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;infrastructure-secrets&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is the shared SOPS repository.&lt;/p&gt;

&lt;p&gt;It keeps the secret model stable across both the Forgejo Pi and the Oracle edge node, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tailscale auth material&lt;/li&gt;
&lt;li&gt;Backup credentials&lt;/li&gt;
&lt;li&gt;Service secrets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That consistency ended up mattering more than I expected.&lt;/p&gt;

&lt;p&gt;It also kept the secret transport story stable across repos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Edit once in the SOPS repo&lt;/li&gt;
&lt;li&gt;Decrypt at deploy time through &lt;code&gt;sops-nix&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Stage the age key on the target where required&lt;/li&gt;
&lt;li&gt;Reuse the same secret names and mental model between hosts&lt;/li&gt;
&lt;/ul&gt;
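&lt;p&gt;The &lt;code&gt;sops-nix&lt;/code&gt; side of that wiring can be sketched like this (the secret name and file paths are placeholders; the option names come from &lt;code&gt;sops-nix&lt;/code&gt; and the NixOS Tailscale module):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: the host consumes encrypted material from the shared SOPS repo.
{ config, ... }:
{
  # Age key staged on the target out of band.
  sops.age.keyFile = "/var/lib/sops-nix/key.txt";

  # Same secret names reused across hosts.
  sops.secrets.tailscale-authkey = {
    sopsFile = ./secrets/common.yaml;   # from infrastructure-secrets
  };

  # Decrypted at activation time, consumed by the service.
  services.tailscale.authKeyFile =
    config.sops.secrets.tailscale-authkey.path;
}
&lt;/code&gt;&lt;/pre&gt;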

&lt;h3&gt;
  
  
  &lt;code&gt;nix-k3s-edge-cluster&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is the runtime repo.&lt;/p&gt;

&lt;p&gt;It owns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Host definitions&lt;/li&gt;
&lt;li&gt;Reusable modules&lt;/li&gt;
&lt;li&gt;App declarations&lt;/li&gt;
&lt;li&gt;K3s manifests&lt;/li&gt;
&lt;li&gt;Deploy and validate scripts&lt;/li&gt;
&lt;li&gt;Backup and restore checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where the edge node stops being a one-off machine and starts becoming a platform.&lt;/p&gt;

&lt;p&gt;That repo owns the behaviors that actually define the edge box:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bootstrap mode&lt;/li&gt;
&lt;li&gt;Tailscale-first access&lt;/li&gt;
&lt;li&gt;K3s server enablement&lt;/li&gt;
&lt;li&gt;RustDesk manifest generation&lt;/li&gt;
&lt;li&gt;Backup validation&lt;/li&gt;
&lt;li&gt;Restore checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, this is the repo that answers the question "what is this node supposed to be?"&lt;/p&gt;

&lt;h2&gt;
  
  
  Cross-repo control flow
&lt;/h2&gt;

&lt;p&gt;The architecture became much easier to explain once I started treating the handoffs as first-class design elements.&lt;/p&gt;

&lt;p&gt;The actual control flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;edge-cluster-infra&lt;/code&gt; provisions Oracle resources and produces runtime handoff values.&lt;/li&gt;
&lt;li&gt;Forgejo, hosted through &lt;code&gt;infrastructure-nixos&lt;/code&gt;, stores the Git history and workflow definitions.&lt;/li&gt;
&lt;li&gt;The Mac mini runner executes the workflows using runner-local config and state.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nix-k3s-edge-cluster&lt;/code&gt; consumes the host/runtime intent and converges the Oracle node through Colmena.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;infrastructure-secrets&lt;/code&gt; provides the encrypted secret layer consumed through &lt;code&gt;sops-nix&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
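&lt;p&gt;Concretely, the handoff from step 1 to step 4 is just data: the infra repo exports cloud facts, and the runtime repo reads them as plain Nix values. The attribute names and values here are illustrative, not the actual export:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: cloud facts produced by edge-cluster-infra,
# consumed by the host definition in nix-k3s-edge-cluster.
{
  hostName    = "cloud-edge-1";
  privateIp   = "10.0.1.10";
  blockDevice = "/dev/oracleoci/oraclevdb";
  region      = "eu-frankfurt-1";
}
&lt;/code&gt;&lt;/pre&gt;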

&lt;p&gt;That split gave me a system where each repo has a clear question it answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What should exist in the cloud?&lt;/li&gt;
&lt;li&gt;What stores and runs the Git control plane?&lt;/li&gt;
&lt;li&gt;What secrets are allowed into the system?&lt;/li&gt;
&lt;li&gt;What should the runtime node actually do?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a better architecture conversation than "here is my monorepo."&lt;/p&gt;

&lt;p&gt;More importantly, it created &lt;strong&gt;explicit interfaces&lt;/strong&gt; between layers instead of hidden coupling.&lt;/p&gt;

&lt;p&gt;At a high level, the interfaces look like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;edge-cluster-infra&lt;/code&gt; exports cloud facts

&lt;ul&gt;
&lt;li&gt;host identity&lt;/li&gt;
&lt;li&gt;private IP&lt;/li&gt;
&lt;li&gt;block storage attachment details&lt;/li&gt;
&lt;li&gt;region and subnet context&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;infrastructure-secrets&lt;/code&gt; exports secret facts

&lt;ul&gt;
&lt;li&gt;Tailscale auth material&lt;/li&gt;
&lt;li&gt;backup credentials&lt;/li&gt;
&lt;li&gt;service secrets&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;nix-k3s-edge-cluster&lt;/code&gt; consumes those facts and turns them into runtime behavior&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;infrastructure-nixos&lt;/code&gt; provides the self-hosted Git control path that stores the automation driving everything else&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;That may sound obvious, but it changed the system's maintainability. Once the handoffs are explicit, you can reason about changes without reloading the whole stack into your head.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the runtime repo actually looks like
&lt;/h2&gt;

&lt;p&gt;I tried hard to avoid a magical repo where every concern is buried in nested abstractions.&lt;/p&gt;

&lt;p&gt;The runtime repo is intentionally legible:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9cccei6f2iawdbe5538g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9cccei6f2iawdbe5538g.png" alt="Runtime repo structure" width="800" height="579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That shape matters because it keeps the boundaries visible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;hosts/&lt;/code&gt; describes the target machine&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;modules/&lt;/code&gt; captures reusable NixOS behavior&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;apps/&lt;/code&gt; declares workload intent&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scripts/&lt;/code&gt; handles the operational glue&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the kind of structure that helps when you come back six months later and need to answer: &lt;em&gt;where does this behavior actually come from?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It also fits the repo split cleanly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Oracle provisioning lives elsewhere&lt;/li&gt;
&lt;li&gt;Secrets live elsewhere&lt;/li&gt;
&lt;li&gt;Forgejo lives elsewhere&lt;/li&gt;
&lt;li&gt;This repo focuses on the runtime contract for the edge node&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Technically, that contract is a composition of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;host-specific config in &lt;code&gt;hosts/cloud-edge-1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;reusable platform modules in &lt;code&gt;modules/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;app-specific Nix modules in &lt;code&gt;apps/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;operator wrappers in &lt;code&gt;scripts/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;workflows in &lt;code&gt;.forgejo/workflows/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That structure let me keep the slide-sized code excerpts honest. I was not cherry-picking from a giant, ambiguous config file. The repo actually has that shape.&lt;/p&gt;
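&lt;p&gt;In flake terms, that composition is roughly the following shape (the module file names, system type, and input wiring are assumptions for illustration, not the actual flake):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: the host is a composition of layers, each from its own directory.
nixosConfigurations.cloud-edge-1 = nixpkgs.lib.nixosSystem {
  system = "aarch64-linux";
  modules = [
    ./hosts/cloud-edge-1           # host-specific config
    ./modules/tailscale-access.nix # reusable platform behavior
    ./apps/rustdesk.nix            # workload intent
    sops-nix.nixosModules.sops     # secret wiring
  ];
};
&lt;/code&gt;&lt;/pre&gt;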

&lt;p&gt;This also improves failure analysis.&lt;/p&gt;

&lt;p&gt;When something breaks, the search space is narrower:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cloud provisioning bug?

&lt;ul&gt;
&lt;li&gt;look in &lt;code&gt;edge-cluster-infra&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;secret resolution bug?

&lt;ul&gt;
&lt;li&gt;look in &lt;code&gt;infrastructure-secrets&lt;/code&gt; and &lt;code&gt;sops-nix&lt;/code&gt; wiring&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;host convergence bug?

&lt;ul&gt;
&lt;li&gt;look in &lt;code&gt;nix-k3s-edge-cluster/modules&lt;/code&gt; or deploy scripts&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Git/control-plane bug?

&lt;ul&gt;
&lt;li&gt;look in &lt;code&gt;infrastructure-nixos&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;That kind of fault isolation is hard to achieve when every concern shares the same repo and abstractions.&lt;/p&gt;

&lt;p&gt;For a platform that spans cloud provisioning, secrets, host convergence, and workload deployment, that reduction in search space is a real operational advantage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Host declaration vs cluster declaration
&lt;/h2&gt;

&lt;p&gt;One subtle advantage of this model is that NixOS and K3s give you two distinct but adjacent declaration layers.&lt;/p&gt;

&lt;p&gt;At the host layer, I can declare things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bootstrap behavior&lt;/li&gt;
&lt;li&gt;Root login policy&lt;/li&gt;
&lt;li&gt;Tailscale access&lt;/li&gt;
&lt;li&gt;Local state directories&lt;/li&gt;
&lt;li&gt;Backup paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At the Kubernetes layer, I can declare workload intent through &lt;code&gt;services.k3s.manifests&lt;/code&gt;.&lt;/p&gt;
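&lt;p&gt;A minimal example of that second layer, using a hypothetical nginx Deployment just to show the shape of &lt;code&gt;services.k3s.manifests&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: workload intent declared in Nix, rendered into the
# K3s auto-deploy directory as a manifest.
services.k3s.manifests.hello = {
  content = {
    apiVersion = "apps/v1";
    kind = "Deployment";
    metadata = { name = "hello"; namespace = "default"; };
    spec = {
      replicas = 1;
      selector.matchLabels.app = "hello";
      template = {
        metadata.labels.app = "hello";
        spec.containers = [{
          name = "hello";
          image = "nginx:1.27";
        }];
      };
    };
  };
};
&lt;/code&gt;&lt;/pre&gt;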

&lt;p&gt;In many setups, the control plane is where declarative intent starts. In this setup, declarative intent starts one layer lower:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Disk and filesystem assumptions&lt;/li&gt;
&lt;li&gt;Bootloader behavior&lt;/li&gt;
&lt;li&gt;SSH posture&lt;/li&gt;
&lt;li&gt;VPN access plane&lt;/li&gt;
&lt;li&gt;K3s control-plane role&lt;/li&gt;
&lt;li&gt;Workload manifests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That does not replace Kubernetes. It gives Kubernetes a more deterministic substrate to stand on.&lt;/p&gt;

&lt;p&gt;The difference is subtle, but it changes how you think about reliability. Container reproducibility matters. Host reproducibility matters too.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real workload, not a hello world
&lt;/h2&gt;

&lt;p&gt;I did not want to prove this with a toy deployment.&lt;/p&gt;

&lt;p&gt;RustDesk was a better example because it forces me to care about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Network exposure&lt;/li&gt;
&lt;li&gt;Host integration&lt;/li&gt;
&lt;li&gt;Persistent data&lt;/li&gt;
&lt;li&gt;Real end-to-end usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this model, workload intent is still declared in Nix and handed to K3s as manifests.&lt;/p&gt;

&lt;p&gt;That is the key idea:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;not "Nix installs K3S."&lt;/li&gt;
&lt;li&gt;but "Nix declares what the cluster should run."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this case, the workload design also forced a few concrete runtime decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;hostNetwork = true&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;host-backed persistent state&lt;/li&gt;
&lt;li&gt;explicit RustDesk server containers&lt;/li&gt;
&lt;li&gt;a private access model through Tailscale&lt;/li&gt;
&lt;/ul&gt;
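&lt;p&gt;Those decisions show up directly in the declared workload. A trimmed sketch of the pod spec in Nix attrset form (image tag, mount paths, and state directory are illustrative, not the exact config):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: the RustDesk servers as a host-networked pod
# with host-backed persistent state.
spec = {
  hostNetwork = true;              # RustDesk wants real host ports
  containers = [
    { name = "hbbs";               # ID/rendezvous server
      image = "rustdesk/rustdesk-server:latest";
      args = [ "hbbs" ];
      volumeMounts = [{ name = "data"; mountPath = "/root"; }];
    }
    { name = "hbbr";               # relay server
      image = "rustdesk/rustdesk-server:latest";
      args = [ "hbbr" ];
      volumeMounts = [{ name = "data"; mountPath = "/root"; }];
    }
  ];
  volumes = [{
    name = "data";
    hostPath.path = "/var/lib/rustdesk";  # survives pod rebuilds
  }];
};
&lt;/code&gt;&lt;/pre&gt;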

&lt;p&gt;That is exactly why RustDesk was useful here. It is opinionated enough to reveal whether the system is real.&lt;/p&gt;

&lt;p&gt;Architecturally, it also exercised the kinds of assumptions that usually get postponed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How the host state is mounted into the workload runtime&lt;/li&gt;
&lt;li&gt;Which ports and network model the workload expects&lt;/li&gt;
&lt;li&gt;Whether the access plane is public, private, or tunneled&lt;/li&gt;
&lt;li&gt;How workload identity and persistent state survive rebuilds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That made it a much better proving ground than a simple HTTP deployment would have been.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I used Colmena for runtime convergence
&lt;/h2&gt;

&lt;p&gt;I wanted runtime deployment to be obviously Nix-native.&lt;/p&gt;

&lt;p&gt;Colmena fits because it keeps the host convergence model close to the flake and module structure, rather than introducing a second orchestration abstraction for the runtime layer.&lt;/p&gt;

&lt;p&gt;That gave me a clean separation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenTofu/Terramate for cloud infra&lt;/li&gt;
&lt;li&gt;Colmena for host/runtime convergence&lt;/li&gt;
&lt;li&gt;K3s manifests for workload state&lt;/li&gt;
&lt;/ul&gt;
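&lt;p&gt;The Colmena side stays close to the same module structure. A sketch of the hive (the deployment target hostname is an assumption; the &lt;code&gt;deployment&lt;/code&gt; options are real Colmena options):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: Colmena converges the host over its Tailscale address,
# reusing the same host modules as the flake.
colmena = {
  meta.nixpkgs = import nixpkgs { system = "aarch64-linux"; };

  cloud-edge-1 = {
    deployment = {
      targetHost = "cloud-edge-1.tailnet.example";  # Tailscale-first access
      targetUser = "root";
      buildOnTarget = false;   # build on the runner, push closures
    };
    imports = [ ./hosts/cloud-edge-1 ];
  };
};
&lt;/code&gt;&lt;/pre&gt;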

&lt;p&gt;I like that split because each tool owns its own distinct layer, rather than competing for the same responsibilities.&lt;/p&gt;

&lt;p&gt;That tool separation gave me a layered control model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenTofu/Terramate answer: "What cloud resources should exist?"&lt;/li&gt;
&lt;li&gt;Colmena answers: "What should this host converge to?"&lt;/li&gt;
&lt;li&gt;K3s answers: "What should run inside the cluster?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because those questions stay distinct, the implementation stays more legible.&lt;/p&gt;

&lt;h2&gt;
  
  
  GitOps path: visible, staged, and boring
&lt;/h2&gt;

&lt;p&gt;One of the best parts of the final setup is that the GitOps path is visible.&lt;/p&gt;

&lt;p&gt;Not hidden in shell history. Not living in a one-off laptop script. Not dependent on a human remembering the right sequence.&lt;/p&gt;

&lt;p&gt;The workflow surface in Forgejo looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh4e02w1or8ox3x01oirj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh4e02w1or8ox3x01oirj.png" alt="Forgejo workflow overview" width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the infra apply path is explicit:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsys7nmx6kb8766uy180.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsys7nmx6kb8766uy180.png" alt="Forgejo apply details" width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What I like about this is not just that it is automated. It is that the automation has shape:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;checks&lt;/li&gt;
&lt;li&gt;plan&lt;/li&gt;
&lt;li&gt;apply&lt;/li&gt;
&lt;li&gt;handoff values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can explain it to another engineer without having to narrate a shell session.&lt;/p&gt;

&lt;p&gt;And because Forgejo itself lives in &lt;code&gt;infrastructure-nixos&lt;/code&gt;, the GitOps path is also part of the same self-hosted story.&lt;/p&gt;

&lt;p&gt;The important detail here is that the runner is not pretending to be stateless.&lt;/p&gt;

&lt;p&gt;The Mac mini is deliberately the trust anchor for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SSH identities&lt;/li&gt;
&lt;li&gt;Local OCI auth&lt;/li&gt;
&lt;li&gt;Centralized OpenTofu state&lt;/li&gt;
&lt;li&gt;The handoff between infra and runtime repos&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did not try to turn that into fake cloud-native purity. I made it explicit and automated around it.&lt;/p&gt;

&lt;p&gt;That tradeoff is important.&lt;/p&gt;

&lt;p&gt;I am not claiming that the Mac mini is an ideal universal model. I am saying that making the trust anchor explicit was better than pretending I had a stateless control plane while quietly depending on a stateful operator machine anyway.&lt;/p&gt;

&lt;p&gt;In practice, that produced a more honest system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runner-local SSH identities&lt;/li&gt;
&lt;li&gt;Runner-local OCI auth&lt;/li&gt;
&lt;li&gt;Centralized local OpenTofu state&lt;/li&gt;
&lt;li&gt;Explicit workflow-to-runtime handoff&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That honesty made the automation simpler, not weaker.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-hosted control plane, public mirror
&lt;/h2&gt;

&lt;p&gt;One detail that matters a lot in practice is that I did &lt;strong&gt;not&lt;/strong&gt; want GitHub to become the canonical source of truth just because I wanted public visibility.&lt;/p&gt;

&lt;p&gt;Forgejo, running on the Raspberry Pi through &lt;code&gt;infrastructure-nixos&lt;/code&gt;, remains the primary remote.&lt;/p&gt;

&lt;p&gt;GitHub is a downstream push mirror.&lt;/p&gt;

&lt;p&gt;Architecturally, that distinction matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Workflows live in the self-hosted control plane&lt;/li&gt;
&lt;li&gt;Operational history lives in the self-hosted control plane&lt;/li&gt;
&lt;li&gt;The mirror gives public visibility without taking control away from the self-hosted path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my setup, that mirror is configured as a native Forgejo push mirror over HTTPS using a fine-grained GitHub token scoped to the destination repository.&lt;/p&gt;

&lt;p&gt;That is a boring detail, but a useful one. It means I get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A self-hosted GitOps path on NixOS&lt;/li&gt;
&lt;li&gt;Public repository visibility on GitHub&lt;/li&gt;
&lt;li&gt;No need to pretend GitHub is the control plane when it is not&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For me, that turned out to be the right compromise between operational ownership and public sharing.&lt;/p&gt;

&lt;p&gt;It also kept the public story aligned with the operational one.&lt;/p&gt;

&lt;p&gt;People can discover the work on GitHub, but the real automation path still begins with a self-hosted Forgejo instance on NixOS. That means I am not maintaining one narrative for public code and another for actual deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hard part was not Nix syntax
&lt;/h2&gt;

&lt;p&gt;Nix has a learning curve. That part is real.&lt;/p&gt;

&lt;p&gt;But the hardest part of this project was not writing Nix expressions. The hardest part was &lt;strong&gt;operational sequencing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the part that took the most real engineering:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3c5m333treqem06sqbo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3c5m333treqem06sqbo.png" alt="Bootstrap flow" width="800" height="566"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tricky bits were things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Temporary OCI bootstrap SSH&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nixos-infect&lt;/code&gt; and bootloader behavior&lt;/li&gt;
&lt;li&gt;First deployment while bootstrap access still exists&lt;/li&gt;
&lt;li&gt;Reboot validation&lt;/li&gt;
&lt;li&gt;Switching to steady-state Tailscale-first operations&lt;/li&gt;
&lt;li&gt;Keeping the repo boundaries clean even while the bootstrap path was still awkward&lt;/li&gt;
&lt;/ul&gt;
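&lt;p&gt;One way to keep that sequencing inside the declared system is to model bootstrap as an ordinary NixOS option, so the transition to steady state is a one-line change in Git rather than a manual step. The option name here is hypothetical; the &lt;code&gt;openssh&lt;/code&gt; and &lt;code&gt;tailscale&lt;/code&gt; options are real NixOS options:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: one flag flips the host from bootstrap access to
# steady-state Tailscale-first access.
{ config, lib, ... }:
{
  options.edge.bootstrapMode = lib.mkOption {
    type = lib.types.bool;
    default = false;
    description = "Keep temporary SSH access open during first deploy.";
  };

  config = {
    # Permissive root SSH only while bootstrapping; key-only afterwards.
    services.openssh.settings.PermitRootLogin =
      if config.edge.bootstrapMode then "yes" else "prohibit-password";

    # Steady-state access plane.
    services.tailscale.enable = true;
  };
}
&lt;/code&gt;&lt;/pre&gt;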

&lt;p&gt;That is where the real learning was.&lt;/p&gt;

&lt;p&gt;The host was not "done" when the config evaluated. The host was done when the recovery and access model made sense after reboot.&lt;/p&gt;

&lt;p&gt;That sequencing point is worth emphasizing because it is easy to miss in declarative systems. A config can be correct but still not operationally safe if the transition order is wrong.&lt;/p&gt;

&lt;p&gt;That is the sort of problem that shows up all the time in real infrastructure work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The final state is reasonable&lt;/li&gt;
&lt;li&gt;The transition path is what bites you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For edge systems, those transition paths matter even more because recovery access, bootstrap access, and steady-state access are often different systems.&lt;/p&gt;

&lt;p&gt;That is also why I no longer think of bootstrap as a small implementation detail. It is part of the architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Proof that the workload really landed
&lt;/h2&gt;

&lt;p&gt;I wanted the talk and this post to prove real runtime state, not just repo beauty.&lt;/p&gt;

&lt;p&gt;Inside the cluster:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40imner80uaip7r76qx2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40imner80uaip7r76qx2.png" alt="RustDesk containers in k9s" width="800" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That screenshot is important because it shows the two expected RustDesk server containers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;hbbs&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;hbbr&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That ties directly back to the declared workload.&lt;/p&gt;

&lt;p&gt;And from the user side:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qtwrxzy64xtmzj7bfzb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qtwrxzy64xtmzj7bfzb.png" alt="RustDesk client sees target device" width="800" height="1050"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jaulc2ex9uwues7heff.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jaulc2ex9uwues7heff.png" alt="RustDesk remote session live" width="357" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is the level of proof I wanted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Declared in code&lt;/li&gt;
&lt;li&gt;Deployed through the workflow&lt;/li&gt;
&lt;li&gt;Visible in the cluster&lt;/li&gt;
&lt;li&gt;Usable by a real client&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For me, that combination mattered more than a polished demo. It tied together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo intent&lt;/li&gt;
&lt;li&gt;Deploy path&lt;/li&gt;
&lt;li&gt;Cluster state&lt;/li&gt;
&lt;li&gt;User-visible result&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Backups are layered, not singular
&lt;/h2&gt;

&lt;p&gt;One thing I do not want to undersell is that "backups" here are not one checkbox.&lt;/p&gt;

&lt;p&gt;The platform ended up with multiple backup types because each layer has a different failure mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  Forgejo control plane backups
&lt;/h3&gt;

&lt;p&gt;On the Raspberry Pi side, &lt;code&gt;infrastructure-nixos&lt;/code&gt; uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Restic -&amp;gt; BorgBase&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;daily append-only backups for Forgejo repositories, custom files, and database snapshots&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Rclone -&amp;gt; pCloud&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;weekly backups of Forgejo LFS objects&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;That split is intentional.&lt;/p&gt;

&lt;p&gt;The Forgejo host is both the Git control plane and the workflow surface, so protecting it means protecting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repository data&lt;/li&gt;
&lt;li&gt;The database state&lt;/li&gt;
&lt;li&gt;Large file storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Treating those as a single undifferentiated blob would have been simpler on paper, but worse in practice.&lt;/p&gt;
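&lt;p&gt;Expressed as a NixOS module, that split can be sketched roughly like this. This is a minimal illustration, not the actual module from &lt;code&gt;infrastructure-nixos&lt;/code&gt;: the repository URL, filesystem paths, and secret names are placeholders, and it assumes sops-nix for credentials (BorgBase enforces append-only on the server side).&lt;/p&gt;

```nix
{ config, pkgs, ... }:
{
  # Daily Restic backups to BorgBase (illustrative values throughout).
  services.restic.backups.forgejo = {
    repository = "rest:https://example.repo.borgbase.com";
    passwordFile = config.sops.secrets."restic/password".path;
    paths = [
      "/var/lib/forgejo/repositories"
      "/var/lib/forgejo/custom"
      "/var/backup/forgejo-db.sql"  # database snapshot produced by a pre-backup dump
    ];
    timerConfig = { OnCalendar = "daily"; Persistent = true; };
  };

  # Weekly Rclone push of Forgejo LFS objects to pCloud.
  systemd.services.forgejo-lfs-backup = {
    serviceConfig.Type = "oneshot";
    path = [ pkgs.rclone ];
    script = "rclone sync /var/lib/forgejo/data/lfs pcloud:forgejo-lfs";
  };
  systemd.timers.forgejo-lfs-backup = {
    wantedBy = [ "timers.target" ];
    timerConfig.OnCalendar = "weekly";
  };
}
```

&lt;p&gt;The point of the sketch is the shape: two different tools, two different schedules, two different failure modes, all declared in one place.&lt;/p&gt;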

&lt;h3&gt;
  
  
  Edge runtime backups
&lt;/h3&gt;

&lt;p&gt;On the Oracle edge node side, &lt;code&gt;nix-k3s-edge-cluster&lt;/code&gt; reuses the shared Restic credentials from &lt;code&gt;infrastructure-secrets&lt;/code&gt; and backs up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RustDesk runtime state&lt;/li&gt;
&lt;li&gt;the K3s server token&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is deliberately narrower than "back up the whole cluster filesystem."&lt;/p&gt;

&lt;p&gt;The current goal is to protect the runtime state that matters for rebuild and access continuity, without pretending I have a fully solved embedded-etcd disaster-recovery story yet.&lt;/p&gt;
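&lt;p&gt;The narrower edge-side job might look something like the following sketch. The RustDesk state path is a guess, the repository URL is a placeholder, and the password file is assumed to come from the shared &lt;code&gt;infrastructure-secrets&lt;/code&gt; sops material; only the K3s token path is the standard server location.&lt;/p&gt;

```nix
{ config, ... }:
{
  # Deliberately narrow scope: only the state needed for rebuild and access continuity.
  services.restic.backups.edge = {
    repository = "rest:https://example.repo.borgbase.com";
    passwordFile = config.sops.secrets."restic/password".path;  # shared credentials
    paths = [
      "/var/lib/rustdesk"                  # RustDesk runtime state (illustrative path)
      "/var/lib/rancher/k3s/server/token"  # K3s server token
    ];
    timerConfig.OnCalendar = "daily";
  };
}
```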

&lt;h3&gt;
  
  
  Infrastructure state backups
&lt;/h3&gt;

&lt;p&gt;The infra layer has a different recovery model again.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;edge-cluster-infra&lt;/code&gt; creates timestamped OpenTofu state archives, writes a checksum alongside them, and uploads them to pCloud before running apply operations that change the saved state.&lt;/p&gt;

&lt;p&gt;That means the state for Oracle provisioning is also recoverable without turning the Git repo into a fake state backend.&lt;/p&gt;
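&lt;p&gt;That archive-and-checksum step could be packaged as a small helper, sketched here as a Nix-wrapped script. The remote name, destination folder, and state filename are illustrative, not the real values from the repo.&lt;/p&gt;

```nix
{ pkgs, ... }:
{
  # Hypothetical helper: archive OpenTofu state with a checksum before any apply.
  environment.systemPackages = [
    (pkgs.writeShellApplication {
      name = "tofu-state-backup";
      runtimeInputs = [ pkgs.rclone pkgs.gnutar pkgs.coreutils ];
      text = ''
        ts=$(date +%Y%m%d-%H%M%S)
        tar -czf "tofu-state-$ts.tar.gz" terraform.tfstate
        sha256sum "tofu-state-$ts.tar.gz" > "tofu-state-$ts.tar.gz.sha256"
        rclone copy "tofu-state-$ts.tar.gz" pcloud:tofu-state/
        rclone copy "tofu-state-$ts.tar.gz.sha256" pcloud:tofu-state/
      '';
    })
  ];
}
```

&lt;p&gt;The checksum matters as much as the archive: it turns "the upload probably worked" into something a restore path can verify.&lt;/p&gt;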

&lt;p&gt;So the backup model is really three different recovery contracts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forgejo control-plane backups&lt;/li&gt;
&lt;li&gt;Runtime/workload backups&lt;/li&gt;
&lt;li&gt;Infrastructure state backups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is closer to how the system actually behaves than saying "I have backups."&lt;/p&gt;

&lt;p&gt;More importantly, those layers are recoverable in different ways for different reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Control-plane recovery protects the Git and workflow surface&lt;/li&gt;
&lt;li&gt;Runtime recovery protects workload continuity and host-side state&lt;/li&gt;
&lt;li&gt;Infra-state recovery protects the declarative record of what exists in Oracle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a much healthier model than hoping one backup mechanism will magically cover every layer.&lt;/p&gt;

&lt;p&gt;If there is one pattern I would reuse immediately in another platform, it is this one: separate the backup story by failure domain, then make each one operational.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recovery is part of the story
&lt;/h2&gt;

&lt;p&gt;A reproducible system that cannot prove its backup readiness remains fragile.&lt;/p&gt;

&lt;p&gt;That became another important boundary for the project: backup and restore could not remain undocumented "future work."&lt;/p&gt;

&lt;p&gt;This is the recovery proof I ended up using:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg2bkoowssdh36smq5t4s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg2bkoowssdh36smq5t4s.png" alt="Backup validate workflow" width="800" height="251"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why that matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backup readiness is visible in automation&lt;/li&gt;
&lt;li&gt;Repository access is validated before the failure day&lt;/li&gt;
&lt;li&gt;Restore confidence is operational, not tribal knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the point where the system starts to feel trustworthy.&lt;/p&gt;
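&lt;p&gt;A hedged sketch of what such a scheduled validation can look like as a NixOS unit. The repository URL and secret name are assumptions; the restic flags are real and do the actual work of proving access and sampling pack data.&lt;/p&gt;

```nix
{ config, pkgs, ... }:
{
  # Periodic proof that the repository is reachable and internally consistent.
  systemd.services.restic-validate = {
    serviceConfig.Type = "oneshot";
    path = [ pkgs.restic ];
    environment.RESTIC_REPOSITORY = "rest:https://example.repo.borgbase.com";
    script = ''
      export RESTIC_PASSWORD_FILE=${config.sops.secrets."restic/password".path}
      restic snapshots --latest 1         # proves access and recency
      restic check --read-data-subset=1%  # samples real pack data, not just metadata
    '';
  };
  systemd.timers.restic-validate = {
    wantedBy = [ "timers.target" ];
    timerConfig.OnCalendar = "weekly";
  };
}
```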

&lt;p&gt;This is another place where reusing the proven patterns from &lt;code&gt;infrastructure-nixos&lt;/code&gt; helped. The edge node did not need a completely different recovery philosophy. It needed the same discipline carried into a new runtime.&lt;/p&gt;

&lt;p&gt;That is also why I kept backup and restore in the runtime repo instead of leaving them as external runbook folklore. If the runtime contract matters, the recovery contract matters too.&lt;/p&gt;

&lt;p&gt;That lesson came directly from the Forgejo Pi work.&lt;/p&gt;

&lt;p&gt;The control-plane host was not "done" when backups existed. It was only done once backup validation, restore checks, and a real restore path had been exercised end-to-end. In the Forgejo case, that meant testing the restore flow on spare media, not just trusting that the timer had run.&lt;/p&gt;

&lt;p&gt;That distinction changed how I treated the Oracle edge node as well. Backup configuration was not enough. I wanted recovery to be visible, testable, and boring.&lt;/p&gt;

&lt;p&gt;From an architectural perspective, backup validation became another declarative boundary.&lt;/p&gt;

&lt;p&gt;It was not enough to say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The host is declared&lt;/li&gt;
&lt;li&gt;The cluster is declared&lt;/li&gt;
&lt;li&gt;The workload is declared&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also wanted to be able to say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backup readiness is testable&lt;/li&gt;
&lt;li&gt;Restore prerequisites are testable&lt;/li&gt;
&lt;li&gt;Recovery is not a purely manual ritual&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a different reliability bar than "we probably have a backup."&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would improve next
&lt;/h2&gt;

&lt;p&gt;If I keep evolving this design, the next things I want to deepen are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More explicit restore rehearsal for the K3s control-plane side&lt;/li&gt;
&lt;li&gt;Broader workload examples beyond RustDesk&lt;/li&gt;
&lt;li&gt;Clearer multi-node stories using the same runtime model&lt;/li&gt;
&lt;li&gt;Continued reduction of bootstrap awkwardness at the host boundary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The current system is already useful, but it also makes the next engineering questions very visible, which I see as a good sign.&lt;/p&gt;

&lt;p&gt;Reproducibility is not real until recovery is boring.&lt;/p&gt;

&lt;p&gt;That sentence sounds simple, but it changed the way I evaluated the whole system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters beyond my own lab
&lt;/h2&gt;

&lt;p&gt;I do not find this interesting just because it worked on my hardware.&lt;/p&gt;

&lt;p&gt;I find it interesting because the pattern generalizes to environments where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nodes are remote or operationally awkward&lt;/li&gt;
&lt;li&gt;Rebuilding confidence is more important than convenience&lt;/li&gt;
&lt;li&gt;Infrastructure teams need tighter control over host drift&lt;/li&gt;
&lt;li&gt;Kubernetes alone is not enough to describe the real system boundary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That includes a lot of practical scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;remote edge nodes&lt;/li&gt;
&lt;li&gt;air-gapped or semi-connected environments&lt;/li&gt;
&lt;li&gt;regulated systems where configuration provenance matters&lt;/li&gt;
&lt;li&gt;smaller platform teams that need deterministic rebuilds without building a giant internal platform product&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The implementation here is specific. The architectural lesson is broader.&lt;/p&gt;

&lt;h2&gt;
  
  
  When this approach makes sense
&lt;/h2&gt;

&lt;p&gt;I do not think NixOS + K3s is a universal answer.&lt;/p&gt;

&lt;p&gt;I think it is a strong fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nodes are remote&lt;/li&gt;
&lt;li&gt;Rebuilding confidence matters&lt;/li&gt;
&lt;li&gt;You want host + cluster + workload under one model&lt;/li&gt;
&lt;li&gt;Multi-arch or odd hardware is part of the story&lt;/li&gt;
&lt;li&gt;Operational consistency matters more than lowest-friction onboarding&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I think it is a weaker fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your team only needs app-level GitOps&lt;/li&gt;
&lt;li&gt;The node itself is disposable&lt;/li&gt;
&lt;li&gt;You need the lowest-complexity path for a broad team quickly&lt;/li&gt;
&lt;li&gt;Nix would become the main project instead of supporting the real project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the tradeoff I would be most honest about with any team considering this approach: the determinism is real, but so is the complexity budget.&lt;/p&gt;

&lt;h2&gt;
  
  
  The part I find most interesting
&lt;/h2&gt;

&lt;p&gt;The most interesting thing here is not Nix by itself.&lt;/p&gt;

&lt;p&gt;It is moving the source of truth &lt;strong&gt;down to the node boundary&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you can declare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The host&lt;/li&gt;
&lt;li&gt;The cluster&lt;/li&gt;
&lt;li&gt;The workload&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a single coherent system, you get a different kind of reliability than you do from container reproducibility alone.&lt;/p&gt;

&lt;p&gt;That, to me, is the real promise of this model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Repos
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/ces0712/infrastructure-nixos" rel="noopener noreferrer"&gt;infrastructure-nixos&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ces0712/edge-cluster-infra" rel="noopener noreferrer"&gt;edge-cluster-infra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ces0712/infrastructure-secrets" rel="noopener noreferrer"&gt;infrastructure-secrets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ces0712/nix-k3s-edge-cluster" rel="noopener noreferrer"&gt;nix-k3s-edge-cluster&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If there is interest, I may publish a follow-up focused just on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The bootstrap/reboot sequence&lt;/li&gt;
&lt;li&gt;How I split infra vs runtime repos&lt;/li&gt;
&lt;li&gt;The backup and restore workflow&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>nixos</category>
      <category>k3s</category>
      <category>kubernetes</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
