Chris

Posted on May 30 • Originally published at mpdc.dev

The Habit That Was The Bug

#security #selfhosted #docker #linux

The first time it happened, I lost 45 minutes diagnosing it. By the tenth or so, I was muscle-memory typing the fix before the symptoms finished registering.

That's the moment I should have flagged. I didn't. I just kept typing.

THE BITE

Ghost went down for 45 minutes. The CMS running mpdc.dev. Site unreachable. docker inspect showed the container healthy. docker logs showed a clean boot. curl localhost:2368 returned nothing. From the LAN, nothing. Browser, nothing.

The container was up. The port was bound. ss -tlnp confirmed docker-proxy was listening. Traffic just didn't reach the application.

That returns-nothing failure mode is one of the worst symptoms in self-hosted infrastructure. It looks identical to ten unrelated problems. No error code. No log line. No popup. The thing you're talking to acts like the thing you're talking to doesn't exist, except you can prove it exists. The kernel says yes, the daemon says yes, your eyes say no.

After 45 minutes of poking I found it. The cause was specific. It bites every self-hoster running an application firewall on a Docker host, and almost nobody documents it.

WHY THIS HAPPENS

Docker doesn't handle published ports with kernel routing alone. When you write ports: - "127.0.0.1:2368:2368" in your compose file, Docker (with its userland proxy enabled, which is the default) spawns a tiny helper process called docker-proxy for each published port. The proxy listens on the host-side address and shovels bytes between the host socket and the container's internal port. It's a userspace process. It runs as root. It has a PID. The kernel sees it like any other process.
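To see those helper processes on a running host, one option is to parse their command lines. This is a minimal sketch, assuming docker-proxy is invoked with -host-ip, -host-port, -container-ip, and -container-port flags (confirm with pgrep -a docker-proxy on your own box). The function name summarize_proxies is made up for illustration.

```shell
#!/usr/bin/env bash
# Summarize docker-proxy command lines read from stdin, one process per line,
# in the shape printed by `pgrep -a docker-proxy` (PID followed by argv).
# ASSUMPTION: the proxy is started with -host-ip/-host-port/-container-ip/
# -container-port flags; adjust the case arms if your version differs.
summarize_proxies() {
    local pid rest word prev host_ip host_port ctr_ip ctr_port
    while read -r pid rest; do
        [ -n "$pid" ] || continue          # skip blank lines
        host_ip="" host_port="" ctr_ip="" ctr_port="" prev=""
        for word in $rest; do
            # Each value follows its flag, so look at the previous word.
            case "$prev" in
                -host-ip)        host_ip=$word ;;
                -host-port)      host_port=$word ;;
                -container-ip)   ctr_ip=$word ;;
                -container-port) ctr_port=$word ;;
            esac
            prev=$word
        done
        printf '%s %s:%s -> %s:%s\n' \
            "$pid" "$host_ip" "$host_port" "$ctr_ip" "$ctr_port"
    done
}

# Usage on a Docker host:
#   pgrep -a docker-proxy | summarize_proxies
```

Each output line pairs one PID with one published port, which is the unit OpenSnitch ends up making decisions about.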

OpenSnitch is a host-level application firewall. It inserts an nftables hook in the inet mangle output chain that diverts packets to a userspace queue. The OpenSnitch daemon reads the queue, looks up the process that owns each connection, evaluates the matching rule, and decides allow or deny. The killer detail: OpenSnitch attributes connections to processes by PID. Process identification is the central feature of the tool. That's why anyone runs it instead of plain iptables.

Default policy on my box is deny. Every connection that doesn't match an explicit allow rule gets dropped silently. No log unless verbose mode is on. No popup unless interactive UI is connected. The packet just disappears at the netfilter layer.

The allow rule for Ghost is permanent: process path /usr/bin/docker-proxy, destination port 2368, allow forever. The rule matches against process path, not PID. So in principle a new docker-proxy spawned for the same port should match the same rule.

In practice, OpenSnitch caches the PID-to-rule binding at the moment a connection is first evaluated. When the container restarts, Docker kills the old docker-proxy and spawns a new one with a new PID. The new PID has no cached binding yet. The first connection through the new PID hits the rule evaluation cold. There's a race between OpenSnitch's process attribution and the packet's transit through the queue. Depending on the kernel version, the eBPF probe state, and a half-dozen other factors that change between releases, the new PID gets attributed reliably or it doesn't. When it doesn't, the connection falls through to default-deny.
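The PID rotation itself is easy to observe. A small sketch, assuming you snapshot the docker-proxy PIDs before and after a restart; rotated_pids is a hypothetical helper name, and it only shows that the PIDs changed, not what OpenSnitch did with them.

```shell
#!/usr/bin/env bash
# Print every PID that appears in the "after" snapshot but not in "before".
# Each argument is a space- or newline-separated list of PIDs.
rotated_pids() {
    local before pid
    # Normalize the first list to " pid pid " so whole-word matching works.
    before=" $(printf '%s ' $1)"
    for pid in $2; do
        case "$before" in
            *" $pid "*) ;;                      # PID survived the restart
            *) printf '%s\n' "$pid" ;;          # PID is new
        esac
    done
}

# Usage on a Docker host:
#   before=$(pgrep -x docker-proxy)
#   docker compose restart ghost
#   after=$(pgrep -x docker-proxy)
#   rotated_pids "$before" "$after"
```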

Result: container restart kills the site. No logs. No alarms. Just a curl returning nothing.

THE PATTERN

That should have been the end of it. It wasn't.

Every container restart on the box did the same thing. Vaultwarden after a power-loss reboot. The MCP server after a force-recreate. Ghost any time the compose file got touched. The PID rotates. OpenSnitch defaults to denying the new process. The service goes silent. Manual kill -HUP. Service comes back.

Once a week, sometimes more, the same dance.

kill -HUP $(pgrep opensnitchd) works because SIGHUP forces opensnitchd to reload its rules and re-walk the current process table. Every running docker-proxy gets its allow-rule binding reapplied against its current PID. From the next packet forward, the rule matches. The site comes back without a container touch.

I built a reflex for it. Container restart, run the HUP. Within ten minutes of every restart I was reaching for that command without conscious thought. The diagnosis got faster. The fix got faster. The whole loop tightened into a routine.

That tightening is the moment I missed.

THE OBVIOUS FIXES THAT DON'T WORK

Every reader hitting this for the first time tries the same wrong things in the same order. Here are the ones that don't work.

Disable OpenSnitch. Yes, the proxy connectivity returns. No, you're not running a firewall anymore. You went from "occasionally inconvenienced by a structural bug" to "fully exposed application surface, by choice." Bad trade.

Switch the container to --network host. The container shares the host's network namespace, no docker-proxy involved, no PID rotation problem. You also collapsed your network segmentation. Every container on host networking can reach every other host-bound port without going through any of your inter-container ACLs. If your threat model includes lateral movement, you just gave up the perimeter you spent a year building.

Add a wildcard process allow rule. Match any process named docker-proxy, allow. Works. Also means anything that can spawn a process called docker-proxy gets a free pass through your firewall. There are non-malicious binaries with that name. There are malicious ones. You don't want to find out which by allowing all of them.

Run a cron job that HUPs opensnitchd every minute. Functional. Now your firewall reloads its entire rule state sixty times an hour whether anything changed or not. The HUP is cheap but not free. The bigger cost is that you've taught yourself the wrong lesson: that the answer to event-driven problems is brute-force polling.

Each of those was a fast path. Each ended in a worse posture than the failure they fixed.

THE FIX THAT ALMOST WORKED

The first structural attempt was a polling-based systemd service. After boot, wait for docker-proxy processes to appear, send a HUP, sleep, repeat for two minutes, exit. Solves the boot-time PID rotation cleanly because containers always come up at boot.
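That boot-time poller can be sketched roughly as follows. The two-minute window comes from the description above; the five-second interval and the function name poll_and_hup are assumptions for illustration, not the original service.

```shell
#!/usr/bin/env bash
# Poll for docker-proxy processes and HUP opensnitchd while any exist.
# $1: total seconds to keep polling (default 120)
# $2: seconds between polls (default 5, an assumed value)
# Prints how many HUPs were attempted.
poll_and_hup() {
    local total=${1:-120} interval=${2:-5} elapsed=0 sent=0
    while [ "$elapsed" -lt "$total" ]; do
        if pgrep -x docker-proxy >/dev/null 2>&1; then
            # Ignore the case where opensnitchd is not running yet.
            pkill -HUP -x opensnitchd 2>/dev/null || true
            sent=$((sent + 1))
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "$sent"
}

# Usage (as root, from a oneshot systemd service after boot):
#   poll_and_hup 120 5
```

The loop has a fixed lifetime: once the window closes, nothing is listening for later restarts.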

It does not solve mid-life PID rotation. If you docker compose restart ghost at 3pm, the boot-time poller already exited four hours ago. Site goes down. Manual HUP back in the rotation.

I ran that polling version for a few weeks. It cut the boot-time outages cleanly and convinced me I'd solved the whole problem. I hadn't. I'd solved one half and built a blind spot for the other half. Polling-on-a-timer is a fix shaped like the problem at one moment in time, not over the full lifecycle of the system.

The right shape is event-driven.

THE FIX THAT ACTUALLY WORKED

Two files.

/usr/local/bin/opensnitch-docker-watchdog.sh
/etc/systemd/system/opensnitch-docker-watchdog.service

The script subscribes to the Docker event stream and reacts to container start and restart events. When one fires, it sends SIGHUP to opensnitchd. OpenSnitch re-reads its rules against the current process table. The new docker-proxy is recognized. Traffic flows from the moment the container is healthy, not after a manual intervention I might or might not be awake for.

Roughly:

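#!/usr/bin/env bash
# Re-sync OpenSnitch whenever a container starts or restarts.
set -euo pipefail

# Stream container start/restart events, one line per event.
docker events \
  --filter type=container \
  --filter event=start \
  --filter event=restart \
  --format '{{.Time}} {{.Actor.Attributes.name}}' | \
while read -r event; do
    logger -t opensnitch-watchdog "Container event: $event"
    # pkill signals every match; quoting pgrep's output would collapse
    # multiple PIDs into one invalid argument to kill.
    pkill -HUP -x opensnitchd 2>/dev/null || true
done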

The systemd unit ties it to the boot lifecycle and to the Docker service.

[Unit]
Description=OpenSnitch docker-proxy PID rotation watchdog
After=docker.service opensnitch.service
Requires=docker.service opensnitch.service

[Service]
Type=simple
ExecStart=/usr/local/bin/opensnitch-docker-watchdog.sh
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Boot order matters. After=docker.service opensnitch.service means systemd brings up Docker and OpenSnitch first. If the watchdog starts before either of them is ready, the events stream returns nothing useful and OpenSnitch isn't there to HUP. The whole thing does nothing for the rest of the uptime. Requires= means that if either prerequisite fails to come up, the watchdog isn't started, and if either is stopped later, the watchdog is stopped with it. Better than running blind.

The systemd unit name for OpenSnitch on Debian is opensnitch.service, not opensnitchd.service. systemctl status opensnitchd returns "could not be found." If your Requires= clause names the wrong unit, the watchdog never starts and you discover it the next time a container restart kills your site. Check with systemctl list-unit-files | grep snitch before trusting any unit-file copy you find on the internet, including this one.
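If you'd rather script that check than eyeball it, a small filter over the unit list works. This is a sketch; find_snitch_units is a hypothetical helper, and it assumes the unit name sits in the first column of systemctl list-unit-files output.

```shell
#!/usr/bin/env bash
# Read `systemctl list-unit-files` output on stdin and print the names of
# service units whose name contains "snitch".
find_snitch_units() {
    awk '$1 ~ /snitch/ && $1 ~ /\.service$/ { print $1 }'
}

# Usage:
#   systemctl list-unit-files --type=service --no-legend | find_snitch_units
```

Whatever name it prints is the one that belongs in After= and Requires=.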

VERIFY IT WORKS

Service up:

systemctl status opensnitch-docker-watchdog

Should show active (running). Should NOT show recent restarts in the log. If it's crashlooping, your After= or your script path is wrong.

Live watch:

journalctl -t opensnitch-watchdog -f

Now restart any container with a localhost port binding. You should see a "Container event:" line land in the log within a second.

Prove it from the failure side. Stop the watchdog. Restart Ghost. Hit the site from the LAN. It'll fail. Start the watchdog again. Restart Ghost. Hit the site. It'll work. That's the receipt that you didn't just deploy something. You deployed the right thing.
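That two-sided test can be scripted so it gets run every time, not just once. A rough sketch under several assumptions: the container is named ghost, the site answers on 127.0.0.1:2368, and ten seconds is enough for the container to boot. All three are placeholders to adjust. Because the underlying failure is a race, the first half may occasionally pass on its own; rerun before trusting a single result.

```shell
#!/usr/bin/env bash
# Default steps. ASSUMPTIONS: container name "ghost", port 2368, 10s boot wait.
set_watchdog()      { systemctl "$1" opensnitch-docker-watchdog; }
restart_container() { docker restart ghost >/dev/null && sleep 10; }
probe()             { curl -fsS --max-time 5 -o /dev/null "http://127.0.0.1:2368/"; }

# Run the failure-side test, then the success-side test.
run_ritual() {
    set_watchdog stop
    restart_container
    if probe; then
        echo "FAIL: site reachable without the watchdog"
        return 1
    fi
    set_watchdog start
    restart_container
    if probe; then
        echo "PASS"
    else
        echo "FAIL: site unreachable with the watchdog"
        return 1
    fi
}

# Usage (as root):
#   run_ritual
```

Keeping the three steps as separate functions makes it easy to swap in a different container or a LAN-side probe.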

Most people never do that second test. They deploy the fix, see no symptoms, declare victory, and find out six months later it was broken the whole time.

Build the verification into the deployment. Without it, you don't have a fix. You have a hope.

WHAT THIS DOESN'T FIX

The watchdog handles container start and restart. That's all it handles.

It doesn't help if opensnitchd itself crashes or deadlocks (which it can, on this kernel, under specific power-event recovery conditions). Separate problem, separate fix.

It doesn't help if the OpenSnitch rule file gets corrupted or deleted out from under the daemon. A rule that doesn't exist can't be reloaded.

It doesn't help if the underlying compose file changes the port binding from 127.0.0.1:PORT to 0.0.0.0:PORT, which spawns a completely different docker-proxy invocation that may need its own rule.

It doesn't help if Docker itself stops emitting events properly. Rare on stable releases, possible on edge builds.

Be honest about scope. A fix that solves one class of failure cleanly is worth more than a fix that pretends to solve all of them.

THE PATTERN GENERALIZES

The shape is broader than OpenSnitch.

Anywhere you have a long-running daemon that caches state about other processes or resources, and those processes or resources can restart out of the daemon's view, you have the same problem. Certificate auto-renewal that doesn't reload the dependent service after the renewal completes. Log rotation that doesn't notify the writers, leaving them holding deleted file descriptors. DNS cache that doesn't invalidate after a record changes upstream. Reverse proxy whose backend resolution gets stuck on a stale DNS lookup. Any time you've muttered "I just need to restart X every once in a while to clear it out," you're describing the same pattern.

The watchdog template:

Find the event source that announces the change (Docker events, inotify, dbus, udev, the daemon's own pub-sub if it has one).
Find the cheap signal that re-syncs the consumer (SIGHUP, a reload endpoint, a control socket command).
Tie them together in the smallest possible script.
Put it under systemd with explicit After= and Requires= to its prerequisites.
Add the verification ritual that proves it works.
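Stripped of the Docker specifics, the middle of that template is one loop. A minimal sketch; on_each_event is a hypothetical name, and the inotifywait line in the comments is an illustrative placeholder rather than a tested recipe.

```shell
#!/usr/bin/env bash
# Run the given re-sync command once for every event line read from stdin.
# The command's stdin is redirected so it cannot swallow the event stream,
# and its failures are ignored so one bad re-sync does not kill the loop.
on_each_event() {
    local event
    while IFS= read -r event; do
        "$@" </dev/null || true
    done
}

# Usage examples:
#   docker events --filter type=container --filter event=start \
#       --format '{{.Actor.Attributes.name}}' \
#       | on_each_event pkill -HUP -x opensnitchd
#
#   # Hypothetical: reload a service when files in a watched directory change.
#   inotifywait -m -e close_write /path/to/watched/dir \
#       | on_each_event systemctl reload some-service
```

The event source and the re-sync command are the only parts that change between failure classes.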

That's it. The whole thing is fifteen lines per failure class. The hard part isn't the writing. The hard part is recognizing the recurring failure as a class instead of as a series of unrelated incidents.

THE TRADE

There's a pattern I keep finding in self-hosted operations. Every recurring failure has two repair paths in front of it. The fast path is the manual fix you already know works. The slow path is the structural fix that requires investigation, design, and testing. The fast path closes the immediate ticket. The slow path closes the underlying class of ticket forever.

Both feel productive in the moment. Only one of them frees you.

I picked the fast path on this one for weeks. I HUP'd opensnitchd dozens of times before I sat down to write the watchdog. The fast path is cheaper per occurrence. The slow path is cheaper in aggregate after about five repeats, and it gets cheaper from there forever. I was past the break-even point long before I built the fix.

Reflexes like that one feel like competence. They train you to be the workaround instead of fixing the workaround.

Reflexes built around recurring failures are the single highest-ROI target in any operations stack you're maintaining solo. Every reflex you eliminate is hours you reclaim, attention you redirect, and one less load-bearing dependency on you being awake and at the keyboard.

If you have a kill -HUP reflex of your own running anywhere in your stack, that's the work order. The reflex is the smell. The workaround is the bug. The fix is the absence of you having to do it next time.

The watchdog hasn't required intervention since the day it shipped. The reflex it replaced has finally died.

I'll take that trade.
