Mohamed Zrouga
I Got Tired of Docker Eating My Raspberry Pi's RAM — So I Built My Own Container Orchestrator

A few months ago I noticed something that genuinely annoyed me.

I was running tiny services across my Raspberry Pi lab:

  • webhook workers
  • monitoring agents
  • lightweight APIs
  • ETL processing tasks

Small, focused workloads. The kind of things a Pi is actually good at.

But the infrastructure stack underneath them? It looked like this:

Docker Engine
 └─ containerd
     └─ runc
         └─ CNI plugins (all of them, even the ones I'd never touch)
             └─ orchestration layer
                 └─ service networking
                     └─ monitoring sidecars
                         └─ my actual 8MB process

At some point I sat down and ran ps aux and free -h on a freshly booted node before deploying anything.

The infrastructure was already using more RAM than my applications.

That felt wrong. So I started pulling threads.
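
If you want to reproduce that check on a node of your own, it's two commands:

# Idle memory before deploying any workloads
free -h

# Largest resident processes; on a stock Docker host this is mostly the container stack
ps aux --sort=-rss | head -n 15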


What Actually Runs a Container?

Strip everything back. What does "running a container" actually require?

  1. An OCI image unpacked to disk
  2. A rootfs (overlayfs layers)
  3. Linux namespaces (pid, net, mount, uts, ipc)
  4. cgroup resource limits (cgroup v2)
  5. A process in that environment

That's it. That's the whole thing.
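
You can prove that to yourself with nothing but an OCI runtime and a rootfs tarball. Here's a rough sketch, assuming crun is installed; the tarball name is a placeholder (any Alpine mini-rootfs or a docker export will do):

# 1. Unpack an image's filesystem into a bundle
mkdir -p bundle/rootfs
tar -xf alpine-minirootfs.tar.gz -C bundle/rootfs   # placeholder tarball name

# 2. Generate a default OCI config (namespaces, mounts, capabilities)
cd bundle
crun spec

# 3. Run a process in that environment
crun run demo   # drops you into sh inside fresh pid/net/mount/uts/ipc namespaces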

Everything else — the daemon, the socket, the plugin ecosystem, the CNI chain, the abstraction layers — exists to make that easier to manage at scale.

Scale I don't have on a Raspberry Pi.

So I asked: what's the absolute minimum secure OCI stack that can do this properly?

That question became nyxd (https://github.com/zrougamed/nyxd.git).

nyx in action


nyxd: What It Is and What It Isn't

nyxd is a lightweight container daemon I built in Go. It uses:

  • crun as the OCI runtime (not runc — more on this shortly)
  • overlayfs mounted directly via syscall for rootfs (the equivalent mount is sketched below)
  • no CNI plugins
  • nftables for NAT and port mapping
  • seccomp + NoNewPrivileges enforced on every container

It is not:

  • Kubernetes
  • another Docker replacement
  • "another Podman"
  • trying to be any of those things

The philosophy is reduction. Every component you remove is an attack surface that disappears, a dependency you don't have to update, a CVE you'll never have to patch.
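
To make the overlayfs bullet above concrete: each container rootfs is one mount built from a lower directory (the image layers), an upper directory (the container's writes), and a work directory. nyxd issues that mount from Go via the mount(2) syscall; the hand-rolled equivalent is a single command:

mkdir -p lower upper work merged
sudo mount -t overlay overlay \
  -o lowerdir=lower,upperdir=upper,workdir=work \
  merged
# 'merged' is now the writable rootfs view a container pivots into
sudo umount merged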


Why crun Instead of runc

This is the first question people ask.

Here's the honest answer: crun has a remarkably clean security history.

crun CVE history (complete, as of 16-05-2026):

| CVE | Impact / Type | Affected | Fixed in |
| --- | --- | --- | --- |
| CVE-2026-30892 | Local Privilege Escalation via crun exec -u 1 root parsing flaw | 1.19 – 1.26 | 1.27 |
| CVE-2025-24965 | Host Filesystem Escape via the krun architecture handler | < 1.20 | 1.20 |
| CVE-2022-27650 | Privilege jump inside container via leaked Inheritable Capabilities | < 1.4.4 | 1.4.4 |
| CVE-2019-18837 | Host Directory Traversal via malicious symlinks in a crafted image | < 0.10.5 | 0.10.5 |

Very few CVEs. In its entire history.

Compare that to November 2025, when runc had three container-escape vulnerabilities disclosed in a single week: CVE-2025-31133, CVE-2025-52565, and CVE-2025-52881. All critical. All allowing container escape to host.

Beyond security, crun has practical advantages for edge/ARM workloads:

  • Lower memory footprint per container
  • Faster startup (C runtime, not Go)
  • Excellent cgroup v2 support from day one
  • Smaller binary (~800KB vs runc's several MB)

On a Pi with 1-4GB RAM running 20+ containers, that adds up fast.
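
The binary-size claim is easy to check on a node that has both runtimes installed:

# Compare runtime binary sizes on your own node
ls -lh "$(command -v crun)" "$(command -v runc)" 2>/dev/null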


The CVE Surface Nobody Benchmarks

Here's something I started thinking about that I hadn't seen anyone measure properly:

People benchmark CPU and RAM constantly. Almost nobody benchmarks security surface area.

But when you're running edge infrastructure — systems with limited ops coverage, longer uptimes, harder recovery paths — the CVE surface is arguably the most important operational metric.

So let me be direct about the current state as of 16-05-2026:

containernetworking/plugins CVE history (recent):

| CVE | Plugin | Issue | Fixed |
| --- | --- | --- | --- |
| CVE-2025-67499 | portmap | nftables backend intercepts unintended traffic | v1.9.0 |
| CVE-2025-52881 | selinux dep | container escape via procfs write misdirection | v1.9.1 |
| CVE-2024-34156 | all | Go stdlib encoding/gob | v1.4.0-6 |
| CVE-2023-45290 | all | Go stdlib net/http | v1.4.0-3 |

And this is just the CNI plugin layer — the binaries a lot of stacks still exec on every container ADD. nyxd's default path doesn't run those plugins at all; I still think the table matters as a reminder of what you inherit when you opt into the full CNI distribution. Add Docker Engine, containerd, runc, BuildKit, and their respective dependency trees and you're tracking dozens of CVEs per year across a stack that most people never fully audit.

The rough security surface comparison:

| Stack | Binary count | External deps | Rough CVE exposure/yr |
| --- | --- | --- | --- |
| nyxd + crun | 2 | 3 Go modules | Very low |
| Podman + crun | ~8 | Large | Low-medium |
| Docker Engine (full) | 15+ | Very large | Medium-high |
| containerd + runc + full CNI | 20+ | Massive | High |
| Kubernetes node | 30+ | Enormous | Very high |

The numbers aren't precise science. But the trend is real.

Simplicity is a security feature. I genuinely believe that now.


The Networking Decision

Most of the container networking ecosystem exists to solve problems at scale. VXLAN overlays, BGP route propagation, multi-cluster service meshes.

None of that applies to a Raspberry Pi lab.

nyxd's default networking is implemented inside the daemon, in Go, using raw netlink — not by exec'ing the usual CNI plugin chain under /opt/cni/bin.

What that stack actually does:

  • Bridge + veth — brings up nyxbr0, creates the pair, moves the peer into the container netns, names it eth0
  • File-backed IPAM — hands out addresses from the same 10.88.0.0/16-style slice you'd expect from a tiny lab bridge
  • Loopback — brings lo up inside the netns
  • Port publishing — host→container maps via nftables (we shell out to nft for the rules; that's the one small networking helper we deliberately keep external). A hand-rolled equivalent is sketched just below.
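
What the nftables side of that boils down to is one DNAT rule per published port plus a masquerade rule for the container subnet. A hand-rolled equivalent (table and chain names here are illustrative, not nyxd's actual ones) looks like this:

# NAT table and hook chains (names are illustrative)
sudo nft add table ip nyx_nat
sudo nft 'add chain ip nyx_nat prerouting  { type nat hook prerouting  priority -100 ; }'
sudo nft 'add chain ip nyx_nat postrouting { type nat hook postrouting priority  100 ; }'

# Publish host port 8080 to a container at 10.88.0.2:80
sudo nft add rule ip nyx_nat prerouting tcp dport 8080 dnat to 10.88.0.2:80

# Masquerade outbound traffic from the container subnet
sudo nft add rule ip nyx_nat postrouting ip saddr 10.88.0.0/16 masquerade

One honest caveat if you copy this: traffic generated on the host itself never traverses the prerouting hook, so host-local access to a published port needs an extra rule in an output chain. That's exactly the kind of detail it's nice to let the daemon own.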

What we don't use (on that default path):

  • The bridge / host-local / loopback / portmap binaries from containernetworking/plugins
  • Flannel
  • Calico
  • kube-proxy
  • Weave
  • Cilium (for this use case)
  • Any overlay mesh networking

If you want the traditional CNI exec path — because you already ship a conflist or you're mirroring another environment — nyxd -net-driver=cni is still there. Same supervisor and API; you bring /opt/cni/bin and your plugins. That's optional complexity, not what you get out of the box.

Zero CNI plugin binaries on the default path means:

  • Nothing to install under /opt/cni/bin on a fresh Pi
  • Nothing to keep updated in that directory for homelab-sized deployments
  • No binary-level CVE surface in that layer for the stack I'm actually running day to day
  • Faster container startup (no fork/exec chain per CNI ADD)

Kernel Requirements (the honest list)

Running nyxd requires these kernel modules:

overlay          # overlayfs for container rootfs
bridge           # nyxbr0 bridge interface
veth             # virtual ethernet pairs
br_netfilter     # iptables/nftables sees bridged traffic
ip_tables        # iptables core
iptable_nat      # NAT table
nf_nat           # connection tracking NAT
nf_conntrack     # stateful packet tracking
nft_masq         # nftables masquerade
seccomp          # syscall filtering

And these sysctls:

net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.conf.all.rp_filter = 0

All of this works on a standard Raspberry Pi OS kernel (6.1 LTS). No custom kernel needed.
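
If you're provisioning a fresh node, the standard way to make both stick across reboots is a modules-load.d entry plus a sysctl.d drop-in (the file names below are just my convention):

# Load the modules now
sudo modprobe -a overlay bridge veth br_netfilter nf_nat nf_conntrack

# Load them on every boot
printf '%s\n' overlay bridge veth br_netfilter | sudo tee /etc/modules-load.d/nyxd.conf

# Persist the sysctls
sudo tee /etc/sysctl.d/99-nyxd.conf <<'EOF'
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.conf.all.rp_filter = 0
EOF
sudo sysctl --system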


Benchmarking This Honestly

Here's how I'm measuring whether nyxd actually delivers on its promises.

Startup latency:

hyperfine --warmup 3 \
  'docker run --rm alpine true' \
  'nyx run alpine true'

On my Pi 5: nyx vs Docker (benchmark chart)

Attack surface check:

# On a Docker host: how many CNI plugins are sitting on disk
ls -la /opt/cni/bin/ 2>/dev/null | wc -l

# Compare to nyxd default: no plugin dir required
lsof -p $(pgrep -x nyxd) | wc -l   # open file descriptors
ss -tlnp | grep nyxd                 # listening sockets (usually just the Unix control socket)

Docker opens more sockets, more file descriptors, and maintains more persistent background connections than a Pi workload typically justifies.
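
Daemon memory footprint: the resident memory of the long-running daemons, before any container starts, is the overhead you pay just to be able to run containers later.

# RSS (in KB) of the long-running daemons before any containers start
ps -o pid,rss,comm -C dockerd,containerd 2>/dev/null
ps -o pid,rss,comm -C nyxd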

Dependency tree:

# nyxd go.mod external deps
cat go.mod | grep -v "^//"
# 3 modules: image-spec, runtime-spec, golang.org/x/sys

Three. External. Modules. That's the entire dependency graph for the runtime and networking layer.


What the Raspberry Pi Actually Taught Me

The Pi has an interesting property: it forces you to care about things cloud infrastructure lets you ignore.

Thermal throttling — when your CPU is running hot because your container daemon is doing background work, your actual workloads slow down. Less daemon overhead means cooler, more consistent performance.

SD card / SSD wear — fewer writes from logging, state management, and plugin communication extends storage life meaningfully on embedded deployments.

Boot time — when a power cut hits an edge node, boot-to-operational time matters. A lighter stack comes up faster and can rejoin the network sooner.

Debugging under pressure — when something breaks at 2AM on a remote node you can't physically access, a simpler stack is dramatically easier to reason about. Fewer layers means fewer places to look.

Power consumption — I've measured ~1.5W difference in idle power between a full Docker stack and nyxd on a Pi 5. Across 10 nodes running 24/7, that's ~130kWh/year (1.5 W × 10 nodes × 8,760 hours). Not enormous, but real.


Where nyxd Is Right Now

This section separates what the daemon and nyx client exercise end-to-end from what exists mainly as libraries, stubs, or unfinished wiring.

Working (end-to-end in the daemon + nyx client)

  • OCI distribution pull for public images: raw HTTP against registries, layer blobs, and SHA-256 digest verification on ingest (digest mismatch fails the pull).
  • Overlayfs upper/work/merged layout and teardown on container exit paths.
  • Container lifecycle via crun from the supervisor: create/run (including the foreground attach path), stop, kill, delete, and state polling; the control API supports nyx exec.
  • Restart policies: always, on-failure, unless-stopped, and never (empty or unrecognized API values are normalized to conservative defaults in the control layer).
  • Structured JSON lines per container under the daemon data directory (log collector plus GET /v1/containers/{id}/logs), with optional plain-text decoding for attached clients.
  • Default in-process networking (-net-driver=native): bridge, veth, file-locked IPAM, nftables-based -p / publish, and host-side NAT — no CNI plugin binaries on disk.
  • Optional CNI exec path: -net-driver=cni with -cni-bin-dir, -cni-conf-dir, and -network for a traditional plugin-driven stack instead of native.

Implemented but not product-complete (nuance)

  • Health checks (exec / HTTP / TCP) — internal/health implements a full checker (intervals, timeouts, retries, start period), but it is not hooked into the supervisor: no checker goroutine is started with each container, and there is no unhealthy → stop/restart policy wiring. The code is library-ready, not operator-ready.
  • Prometheus-style metrics — internal/telemetry implements a text OpenMetrics/Prometheus-style /metrics handler, but cmd/nyxd does not call ServeMetrics, so counters and gauges are not exposed unless another binary wires the package in. The format exists; the default daemon does not listen for scrape traffic.

Still being hardened / incomplete

  • Registry auth beyond anonymous public pulls — the current path centers on the Docker Hub anonymous token flow; private registries, stored credentials, and arbitrary OAuth/OCI auth flows are not first-class yet.
  • Compose — YAML parsing is real (gopkg.in/yaml.v3): internal/compose reads a strict subset of compose-shaped fields, validates stacks, and can compute dependency order. What is missing is daemon-side orchestration (no nyx compose up-style command in the shipped mains): a parser is not a multi-service scheduler.
  • Control API and nyx CLI — Unix-socket HTTP for run, ps, logs, stop, remove, pull, images, and exec is usable, but error messages, edge cases, and long-term API stability are still evolving.
  • Seccomp — generated bundles set capabilities, noNewPrivileges, masked paths, and related hardening fields; there is no curated, versioned seccomp JSON checked into the bundle generator yet. Anything beyond the explicit JSON is whatever crun and the host apply by default.
  • systemd-notify — example units may use Type=notify, but nyxd does not emit READY=1 (or reload state) via sd_notify. Production units should use Type=simple until sd_notify support is added to the daemon.
  • NFT / port publishing on unusual host sysctl values, exotic dual-stack setups, and odd bridge topologies — the native backend is real code, but it benefits from more soak time outside typical laptop and homelab bridges.

Bottom line

nyxd is not positioned as a drop-in, production-grade Docker replacement today. The largest gaps between documentation and reality are health automation (implemented package, not integrated) and metrics (handler exists, not served by default). Registry authentication and compose orchestration remain intentionally narrow.

The stack does run real workloads in lab settings along the paths above: pull, overlay, crun, native networking, structured logs, and restart policies. The architectural bet — native networking by default, optional CNI when operators want plugin ecosystems — remains coherent even where polish and operational breadth still lag.


The Bigger Question

I keep coming back to this:

Are we over-engineering edge infrastructure?

The modern container ecosystem was designed to solve orchestration at Google-scale. Kubernetes, CNI, CRI, OCI — these are all excellent standards that solved real problems.

But those standards got adopted at every layer of the stack, including layers where the complexity isn't warranted.

A Raspberry Pi running a Go API doesn't need the same infrastructure as a 10,000-node Kubernetes cluster.

An industrial IoT gateway doesn't need BuildKit.

A homelab monitoring stack doesn't need containerd's full plugin system.

The standards are fine. The problem is using the full weight of the enterprise implementation everywhere, including at the edge where resource constraints and operational simplicity matter most.

nyxd is my attempt to find where the floor is. How small can a correct, secure, production-capable OCI runtime stack actually be?

I don't think we've found it yet.


Try It / Follow Along

nyxd is being developed openly. The codebase is Go 1.26.

If you're running:

  • Raspberry Pi clusters
  • ARM64 edge nodes
  • Self-hosted systems
  • VMs, QEMU, Proxmox ...
  • Any environment where RAM and attack surface actually matter

I'd genuinely love to hear what you're running and what constraints you're working within.

A few questions for the comments:

  • What does your container stack look like on ARM systems today?
  • What's your idle memory baseline before deploying any workloads?
  • Have you ever audited the CVE history of your CNI plugins?
  • Would you trade orchestration features for a meaningfully smaller attack surface?
  • Is your container stack heavier than your actual workloads?

If there's enough interest, follow-up posts will cover:

  • Full architecture walkthrough with the native network stack (and when I'd still flip on CNI)
  • Memory profiling methodology for container daemons
  • Security surface comparison methodology
  • Deploying nyxd on a Pi cluster from scratch
  • The case for writing your own IPAM in 200 lines of Go

nyxd is made by the community, for the community, and is free for personal use.
