
Osmium

Posted on • Originally published at osmiumsilver.github.io

Turn your OpenWrt router into a quorum device for Proxmox VE cluster

Why Bother

I have a two-node Proxmox VE cluster, and two-node clusters have an inherent distributed systems problem: if either machine goes down, the surviving node can't tell whether it should keep running or not. Quorum requires a majority of votes, which normally means at least three voters. With only two machines at one vote each, losing one leaves a single vote out of two. That is not a majority, so the cluster loses quorum and stops.

The solution is a quorum device. Proxmox supports corosync-qnetd, a lightweight daemon from the corosync project that acts as a third-party tiebreaker. A qnetd doesn't participate in storage or compute; it just casts the deciding vote when the two nodes can't see each other. With three votes in total, a majority is two, so if either node fails, the surviving node plus the router's vote still hold quorum and the cluster keeps running.
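The vote arithmetic above can be sketched in a few lines of shell. This is only an illustration of the majority rule, not how corosync computes quorum internally; the function names quorum_needed and has_quorum are made up for this example.

```shell
#!/bin/sh
# Votes needed for quorum: a strict majority of the expected votes.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) if the votes still present reach quorum.
# $1 = total expected votes, $2 = votes currently present.
has_quorum() {
    total=$1
    present=$2
    [ "$present" -ge "$(quorum_needed "$total")" ]
}

# Two nodes, one vote each: losing one node loses quorum.
if has_quorum 2 1; then echo "2 voters, 1 left: quorate"; else echo "2 voters, 1 left: not quorate"; fi

# Two nodes plus a qdevice vote: one node can fail.
if has_quorum 3 2; then echo "3 voters, 2 left: quorate"; else echo "3 voters, 2 left: not quorate"; fi
```

With two voters the majority is two, so a single survivor is stuck; with three voters the majority is still two, which is exactly what the router's extra vote buys.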

The ideal host for something like this is a device in a home network that basically never gets turned off, like a router.

I had a Xiaomi Mini router running OpenWrt, MT7620 chip, 128MB DDR2 memory, 16MB SPI Flash, mipsel_24kc architecture. It runs 24/7, draws a few watts, perfect for the job. The problem: there's no corosync-qnetd package for mipsel in the official OpenWrt feeds. :(

An online image of Xiaomi Mini Router

Finding a Starting Point

After some searching, I found someone on GitHub who had compiled corosync-qnetd for aarch64 (ARM 64-bit) OpenWrt, targeting GL.iNet routers and using the old opkg/ipk package format. A good starting point.

I had essentially zero experience with C, Makefiles, or cross-compilation packaging. I knew C programs need a Makefile to build, but that was about it. So my approach was simple: throw it at Claude, hit an error, ask Claude about the error, hit the next error, ask about that one, without much thought in between. After six hours of tweaking compile parameters over and over, it eventually compiled successfully (phew). I installed it on the router, and it worked!

The Dependency Investigation

But I needed to understand why it works. Time to do a proper investigation.

I started digging into the Makefile syntax, the build parameters, and the dependency structure. But mastering every parameter would be impossible because many of them are deeply encapsulated, documented in their own specific docs with their own quirks.

My approach is upstream-first: I wanted to use official packages whenever possible to keep the result portable. The original project compiled custom versions of NSS, NSPR, libknet, and libqb, plus a homebrewed corosync-nss-tools. That's a lot of custom packages to maintain.

To verify what was actually necessary, I used a trick from Claude during our earlier sessions: running ldd against a binary shows you exactly which shared libraries it actually links to at runtime. So I ran it against the corosync-qnetd binary on the router:

ldd /usr/sbin/corosync-qnetd
  /lib/ld-musl-mipsel-sf.so.1
  libnss3.so => /usr/lib/libnss3.so
  libssl3.so => /usr/lib/libssl3.so
  libnspr4.so => /usr/lib/libnspr4.so
  libgcc_s.so.1 => /lib/libgcc_s.so.1
  libc.so => /lib/ld-musl-mipsel-sf.so.1
  libnssutil3.so => /usr/lib/libnssutil3.so
  libplc4.so => /usr/lib/libplc4.so
  libplds4.so => /usr/lib/libplds4.so

And that's pretty much it, NSS and NSPR. Nothing else.

Which means that of the four dependencies bundled in the original project, two of them, libknet and libqb, were completely unnecessary for qnetd.

libknet handles inter-node network communication for the corosync cluster daemon itself; libqb provides logging and IPC for the corosync main process.

But the qnetd server is an independent program and doesn't use either of them. The original project's README said it was for qnetd, yet it compiled and bundled libraries that only the cluster daemon needs. My best guess is that the author looked at corosync's upstream README, which lists all dependencies for the full corosync project (including the qdevice client), and just compiled everything at once.
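The same check can be scripted so it is easy to repeat after each rebuild. The following is a minimal sketch: the helper names linked_libs and links_against are invented for this example, and the sample text is an excerpt of the ldd output shown earlier rather than a live run.

```shell
#!/bin/sh
# Print the library names from ldd output read on stdin
# (the field before "=>"), one per line.
linked_libs() {
    awk '$2 == "=>" { print $1 }'
}

# Succeeds if the ldd output on stdin lists a library whose
# name starts with the given prefix.
links_against() {
    linked_libs | grep -q "^$1"
}

# On the router this would be: ldd /usr/sbin/corosync-qnetd | linked_libs
# Here an excerpt of the output shown earlier stands in for it.
sample='  libnss3.so => /usr/lib/libnss3.so
  libnspr4.so => /usr/lib/libnspr4.so
  libc.so => /lib/ld-musl-mipsel-sf.so.1'

for lib in libknet libqb libnss3; do
    if printf '%s\n' "$sample" | links_against "$lib"; then
        echo "$lib: linked"
    else
        echo "$lib: not linked"
    fi
done
```

Anything reported as not linked is a candidate for dropping from the package's dependency list.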

The Portability Principle

My goal became: compile only corosync-qnetd itself, and let the package manager handle everything else through official repositories.

The final dependency chain is clean:

corosync-qnetd (custom-compiled, the only custom package)
├── libnss    ← official OpenWrt package
├── nspr   ← official OpenWrt package (pulled in by libnss)
├── nss-utils ← official OpenWrt package (includes `certutil`, `pk12util` used for certificate management)
└── openssl-util ← official OpenWrt package (needed by the `certutil` signing script)

The Technical Challenges

With the "what" and "why" covered, here are the significant technical problems I encountered and how they were solved, in roughly the order I hit them.

NSS Cross-Compilation: The Deepest Pit

corosync-qnetd depends on Mozilla NSS (Network Security Services) for TLS. NSS doesn't use autoconf; it has its own build system called coreconf, which relies on environment variables to determine the target platform. The key variables are defined in coreconf/arch.mk:

OS_ARCH := $(subst /,_,$(shell uname -s))   # e.g. "Linux"
OS_TEST := $(shell uname -m)      # e.g. "x86_64", "mipsel"

When OS_ARCH equals Linux, the build system loads coreconf/Linux.mk for platform-specific flags. The problem is that in a cross-compilation environment, uname -s and uname -m return the host machine's values, not the target's. If you don't override these variables, NSS happily detects the x86_64 build server, uses the host's cc, and produces x86 object files. The link stage then explodes because it is trying to link x86 objects against mipsel libraries.

The original aarch64 project had set OS_ARCH=aarch64 with USE_64=1, which is non-standard: the official OpenWrt NSS Makefile uses OS_ARCH=Linux and OS_TEST=$(ARCH) instead. The aarch64 build likely worked despite the non-standard OS_ARCH value because other variables compensated, but that approach couldn't be carried over to mipsel.

Diagnosing this took a while. The build log showed a wall of link errors that initially looked like missing libraries. The real clue came from checking the object files with the file command: they were all x86 binaries, not mipsel.
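That check is easy to automate. Below is a rough sketch that scans a build directory for object files that are not MIPS; the helper names obj_arch and find_wrong_arch are hypothetical, and the strings matched against file's output are assumptions based on what file typically prints for ELF objects.

```shell
#!/bin/sh
# Classify one line of `file` output by the architecture it names.
# The match strings are assumptions based on typical `file` output.
obj_arch() {
    case "$1" in
        *MIPS*)             echo mips ;;
        *x86-64*|*x86_64*)  echo x86_64 ;;
        *)                  echo unknown ;;
    esac
}

# Report every object file under a directory that is not MIPS.
# Needs the `file` utility on the build host.
find_wrong_arch() {
    find "$1" -name '*.o' | while read -r obj; do
        arch=$(obj_arch "$(file -b "$obj")")
        [ "$arch" = mips ] || echo "$obj: $arch"
    done
}

# Example (path is whatever your package build directory is):
#   find_wrong_arch "$PKG_BUILD_DIR"
```

An empty result means every object was built for the target; any x86_64 line points at a sub-makefile that is still picking up the host compiler.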

The fix was to follow the same pattern as OpenWrt's official NSS package, explicitly tell NSS everything it needs to know about the target:

OS_ARCH=Linux        # load coreconf/Linux.mk
OS_TARGET=Linux      # what NSS actually checks internally
OS_TEST=mipsel       # equivalent to uname -m on the target
CPU_ARCH=mipsel      # target CPU architecture
# USE_64 not set    # mipsel_24kc is 32-bit

Through all of this I also learned some C compilation basics: every dependency's headers and libraries need to be present in the build environment before anything will compile. I also learned how OpenWrt's feed system pulls in those dependencies automatically.

NSPR Header File Mess

corosync-qnetd depends on NSS, and NSS depends on NSPR (Netscape Portable Runtime). NSS's sub-makefiles hardcode the search paths for NSPR headers and libraries, with paths like dist/<OBJDIR>/include and dist/public/nspr. Setting the NSPR_INCLUDE_DIR environment variable isn't enough, because different NSS build modules look for NSPR in different places using different methods.

The fix was to manually stage NSPR's headers and libraries into the directory structure NSS expects before compilation:

mkdir -p $(PKG_BUILD_DIR)/dist/target.OBJ/include
cp -fpRL $(STAGING_DIR)/usr/include/nspr/. \
    $(PKG_BUILD_DIR)/dist/target.OBJ/include/
mkdir -p $(PKG_BUILD_DIR)/dist/public/nspr
cp -fpRL $(STAGING_DIR)/usr/include/nspr/. \
    $(PKG_BUILD_DIR)/dist/public/nspr/
mkdir -p $(PKG_BUILD_DIR)/dist/target.OBJ/lib
for lib in libnspr4.so libplc4.so libplds4.so; do \
    cp -fpL $(STAGING_DIR)/usr/lib/$$lib \
        $(PKG_BUILD_DIR)/dist/target.OBJ/lib/ 2>/dev/null || true; \
done

A host-native version of nsinstall (a build tool NSS needs to run on the host during compilation) also had to be compiled first, otherwise it would try to run a mipsel binary on the host machine.

The ar Parameter

All .c files compiled correctly with the cross-compiler, but linking produced an error:

mipsel-openwrt-linux-musl-ar cr target.OBJ/arena.o target.OBJ/error.o ...
mipsel-openwrt-linux-musl-ar: target.OBJ/arena.o: file format not recognized

ar was treating the first .o file as an existing archive. The cause: NSS's makefile adds cr flags when invoking ar, but I had also set AR="$(TARGET_CROSS)ar cr" in the OpenWrt Makefile. Two crs stacked together, mangling the command-line arguments entirely. The fix was changing AR from $(TARGET_CROSS)ar cr to just $(TARGET_CROSS)ar.

Trimming NSS: Cut to the Bone

A full NSS build is painfully slow and drags in sqlite3 and a bunch of other dependencies, while corosync-qnetd only uses a small subset of NSS functionality. I kept only the six .so files qnetd actually needs: libnss3, libssl3, libsmime3, libnssutil3, libsoftokn3, and libfreebl3. The original project already had the right idea here: it used NSS_DISABLE_DBM=1 to eliminate the sqlite3 dependency and NSS_DISABLE_LIBPKIX=1 to skip PKIX certificate path validation, and I kept both flags. The package relationship required careful handling, though, so I created a separate package called nss-qnetd instead of reusing the original's NSS build, for two reasons.

First, naming. The package needed to coexist with the official libnss in OpenWrt's package system without conflicting. nss-qnetd declares PROVIDES:=libnss so OpenWrt's package manager can resolve the libnss3.so dependency at compile time. But at runtime on the router, corosync-qnetd's runtime dependency points to the official libnss from the feeds — compile with the stripped version for headers and libraries, run with the package from upstream.

Second, improved robustness. The original project used patch files to modify NSS source files like nss/cmd/manifest.mn. Patches are tied to the exact lines and context they were generated against, so when I upgraded to NSS 3.112 the line counts no longer matched and the patch failed with malformed patch at line 3. Replacing all patches with direct shell commands in Build/Compile gave me a more robust way of injecting the parameters.
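To illustrate why a content-matching edit survives version bumps better than a line-numbered patch, here is a small sketch. It is not the command used in the actual Makefile: the drop_dir helper, the single-line DIRS format, and the directory names are all hypothetical, and the real manifest.mn may spread its lists over several continuation lines.

```shell
#!/bin/sh
# Remove one entry from a single-line "DIRS = a b c" assignment by
# matching its name, wherever that line sits in the file.
# Prints the edited text on stdout; the input file is left untouched.
# $1 = file to read, $2 = entry to drop.
drop_dir() {
    awk -v name="$2" '
        $1 == "DIRS" && $2 == "=" {
            out = "DIRS ="
            for (i = 3; i <= NF; i++)
                if ($i != name) out = out " " $i
            print out
            next
        }
        { print }
    ' "$1"
}

# Demo on a throwaway file with made-up contents.
tmp=$(mktemp)
printf '# hypothetical manifest\n\nDIRS = lib certutil signtool\n' > "$tmp"
drop_dir "$tmp" signtool
rm -f "$tmp"
```

Because the edit keys on the variable name rather than a line number, adding or removing unrelated lines upstream does not break it.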

The custom corosync-nss-tools

Then there was corosync-nss-tools, a package the original author created that doesn't exist upstream. It bundled certutil and pk12util (NSS command-line tools for managing certificate databases) together with custom shell wrapper scripts that reimplemented the certificate initialisation logic. But OpenWrt's official feeds already have an nss-utils package that provides certutil and pk12util. And the upstream corosync-qdevice project already ships its own corosync-qnetd-certutil script. There was no need to repackage binaries that are already available or to rewrite scripts that upstream already maintains. Removing corosync-nss-tools entirely and depending on the official nss-utils was the straightforward improvement.

The musl dladdr Trap

This fix was inherited from the original project and is worth documenting. NSS's libsoftokn3.so calls dladdr() at runtime to find its own file path, then uses that path to locate and load libfreebl3.so. This works fine on glibc, but musl's dladdr() implementation behaves differently — in certain cases it can't correctly resolve the path of an already-loaded library, causing softokn to fail to find freebl. Mozilla bug tracker Bug 511312 confirmed this.

The original author's fix was to use patchelf after compilation to add an explicit dependency on libfreebl3.so to libsoftokn3.so:

find $(PKG_BUILD_DIR)/dist -name libsoftokn3.so -exec \
    $(STAGING_DIR_HOST)/bin/patchelf --add-needed libfreebl3.so {} \;

This makes the dynamic linker automatically load freebl when loading softokn, bypassing the dladdr() issue entirely. Notably, the OpenWrt official NSS Makefile uses a related approach — it sets FREEBL_NO_DEPEND=1 to handle the same softokn-freebl loading relationship.

OpenWrt 25.12: Big Changes to Package Management

With cross-compilation sorted, I ran into a series of changes in the latest OpenWrt 25.12.

The biggest change was the package manager switch from opkg to apk (Alpine Package Keeper). Output format changed from .ipk to .apk. Fortunately, the OpenWrt SDK 25.12 natively outputs apk format, I just needed to update all .ipk references to .apk in the build scripts.

But after installing the corosync-qnetd apk, more insidious problems appeared at runtime. Running the certificate initialization script corosync-qnetd-certutil -i produced a barrage of errors:

stat: applet not found
chown: unrecognized option '--reference'
sha1sum: command not found
ps: unrecognized option: e

OpenWrt's userspace tools are mostly BusyBox applets, stripped-down versions with limited options. BusyBox's chown doesn't support --reference, the stat applet wasn't compiled into this firmware's BusyBox, and ps doesn't support the -e flag. The certutil script was written for a full GNU/Linux environment and fell apart in BusyBox land.

Full GNU versions of these tools were needed. But here was another trap: OpenWrt 25.12 splits coreutils into dozens of individual sub-packages (coreutils-chown, coreutils-stat, coreutils-sha1sum, ...). The main coreutils package is an empty shell — installing it installs nothing. apk info -L coreutils outputs exactly one line: lib/apk/packages/coreutils.list.

The final list of runtime dependencies that needed to be installed individually: bash, coreutils-chown, coreutils-stat, coreutils-sha1sum, procps-ng (for a full ps), openssh-sftp-server (PVE's scp defaults to SFTP mode — without sftp-server on the router, certificate copying fails), and openssl-util (the certutil signing script needs the openssl command-line tool).

All of these were added to the Makefile's DEPENDS, so an apk add corosync-qnetd pulls in everything automatically.
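Before trusting a script written for GNU userland, it helps to see which of its tools are missing or are only BusyBox applets. The sketch below is one way to do that; missing_tools and is_busybox_applet are names invented here, and the applet check assumes the usual OpenWrt layout where applets are symlinks to a binary named busybox.

```shell
#!/bin/sh
# Print every command from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Succeeds if the given command resolves to the busybox binary.
# Assumes applets are installed as symlinks to a file named busybox.
is_busybox_applet() {
    path=$(command -v "$1") || return 1
    case "$(readlink -f "$path")" in
        */busybox) return 0 ;;
        *)         return 1 ;;
    esac
}

# Tools the certutil script tripped over, plus bash and openssl.
missing_tools bash stat sha1sum chown ps openssl
```

A tool that exists but is a BusyBox applet (chown and ps in this case) still needs its full replacement installed, which is why the presence check alone is not enough.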

The Final Result

After all the pitfalls were navigated, the end product is one clean OpenWrt package. Installation only takes a single command and all dependencies are pulled automatically. Initialise the certificate database (key generation takes about five to ten minutes on mipsel hardware), start the service, then run pvecm qdevice setup <router IP> on any PVE node. Certificates exchange automatically, and the cluster recognises the Qdevice.

The core changes in the final Makefile: corrected NSS architecture parameters from aarch64 to the proper mipsel 32-bit combination; trimmed NSS to compile only the libraries qnetd needs; added all the runtime dependencies and removed unnecessary libknet and libqb; migrated from ipk to apk.

Porting corosync-qnetd to mipsel means thousands of MIPS-based routers running OpenWrt can now serve as quorum arbitrators for Proxmox clusters.

A full day's work, morning to night. Worth it.

GitHub Releases: https://github.com/osmiumsilver/corosync-qnetd-openwrt-mipsel/releases

More Links

NSS Build System

  • NSS coreconf/arch.mk — Where OS_ARCH, OS_TEST, and OS_TARGET are resolved. Shows that Linux.mk is included whenever OS_ARCH=Linux, regardless of CPU architecture.
  • NSS coreconf/Linux.mk — Platform-specific flags for Linux targets, including the AR definition that causes the double-cr issue.
  • NSS coreconf/detect_host_arch.py — Host architecture detection script; recognizes mips/mips64 but only for host detection, not cross-compilation.
  • OpenWrt official NSS Makefile — Reference implementation for cross-compiling NSS on OpenWrt. Uses OS_ARCH=Linux, OS_TEST=$(ARCH), and FREEBL_NO_DEPEND=1.

