Every few months someone on Reddit asks "how do I run a Node.js process in the background." The answers are always PM2, forever, or systemd. All fine. But if you're shipping a CLI tool that users install on their own machines, you can't assume any of those exist.
I have a CLI that starts a local HTTP daemon. my-tool start forks into the background, user closes their terminal, daemon keeps running. About 700 lines for the whole thing, 150 of which are just the fork/PID/signal plumbing. Fastify 5 for the HTTP layer.
Fork, detach, forget
import { fork } from "node:child_process";
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";
const DATA_DIR = path.join(os.homedir(), ".my-tool");
const LOG_PATH = path.join(DATA_DIR, "daemon.log");
function startDaemon(): void {
  fs.mkdirSync(DATA_DIR, { recursive: true });
  const logFd = fs.openSync(LOG_PATH, "a");
  const child = fork(__filename, ["start", "--_daemon"], {
    detached: true,
    stdio: ["ignore", logFd, logFd, "ipc"],
    env: { ...process.env },
  });
  child.disconnect();
  child.unref();
  console.log(`Daemon started (PID ${child.pid}).`);
}
detached: true gives the child its own process group. child.unref() lets the parent exit. child.disconnect() drops the IPC channel.
stdout/stderr go straight to a log file via the fd. Append mode so restarts don't clobber history.
The --_daemon flag is how the child process knows it's the daemon and not the CLI. Underscore prefix to keep it out of --help.
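How the flag stays out of --help depends on the argument parser. As a rough sketch with no parser at all, assuming internal flags all share the `--_` prefix (`parseInvocation` and `visibleFlags` are hypothetical names, not part of the tool):

```typescript
// Hypothetical helpers; a real CLI may delegate this to its argument parser.
type Invocation = { mode: "daemon" | "cli"; args: string[] };

const INTERNAL_PREFIX = "--_";

// Split argv into the run mode and the user-facing arguments.
function parseInvocation(argv: string[]): Invocation {
  const userArgs = argv.slice(2); // drop the node binary and the script path
  const mode = userArgs.includes("--_daemon") ? "daemon" : "cli";
  return {
    mode,
    args: userArgs.filter((arg) => !arg.startsWith(INTERNAL_PREFIX)),
  };
}

// Filter internal flags out of whatever list the help text is built from.
function visibleFlags(allFlags: string[]): string[] {
  return allFlags.filter((flag) => !flag.startsWith(INTERNAL_PREFIX));
}
```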
When the child starts with that flag, it writes a PID file and hooks signals:
const PID_FILE = path.join(DATA_DIR, "daemon.pid");
if (process.argv.includes("--_daemon")) {
  writePid(process.pid);
  process.on("SIGTERM", () => {
    removePid();
    process.exit(0);
  });
  process.on("SIGINT", () => {
    removePid();
    process.exit(0);
  });
  startServer();
}
SIGTERM comes from stop. SIGINT is ctrl-c if you run it in foreground for debugging. I don't handle SIGHUP. Some daemons use it for config reload. Mine reads config at startup. Change config, restart.
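If the server has cleanup hooks, the handler needs to close the server before exiting, because process.exit() ends the process without waiting for anything asynchronous. A minimal sketch of that ordering, where `makeShutdownHandler` is a hypothetical helper and `close` stands in for something like `() => app.close()` on a Fastify instance:

```typescript
// Sketch: build a signal handler that closes the server before exiting.
// `cleanup` and `exit` are injected so the ordering is explicit and testable.
function makeShutdownHandler(
  close: () => Promise<unknown>,
  cleanup: () => void,
  exit: (code: number) => void = (code) => process.exit(code),
): () => Promise<void> {
  let shuttingDown = false;
  return async () => {
    if (shuttingDown) return; // ignore a second SIGTERM/SIGINT
    shuttingDown = true;
    let code = 0;
    try {
      await close(); // with Fastify, this is what triggers onClose hooks
    } catch (err) {
      console.error("[daemon] shutdown error:", err);
      code = 1;
    } finally {
      cleanup(); // e.g. remove the PID file
    }
    exit(code);
  };
}

// Assumed wiring inside the daemon branch:
//   const onSignal = makeShutdownHandler(() => app.close(), removePid);
//   process.on("SIGTERM", onSignal);
//   process.on("SIGINT", onSignal);
```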
PID files and stale processes
function writePid(pid: number): void {
  fs.mkdirSync(path.dirname(PID_FILE), { recursive: true });
  fs.writeFileSync(PID_FILE, String(pid), "utf-8");
}
function readPid(): number | null {
  try {
    const pid = parseInt(
      fs.readFileSync(PID_FILE, "utf-8").trim(),
      10,
    );
    process.kill(pid, 0); // signal 0 = just check if alive
    return pid;
  } catch {
    return null;
  }
}
function removePid(): void {
  try { fs.unlinkSync(PID_FILE); } catch { /* gone already */ }
}
process.kill(pid, 0) doesn't send a signal. It checks if the process exists. If the PID file says 12345 but that process is dead, kill throws and readPid() returns null.
Stale PID files happen constantly. Hard crash, kill -9, OOM killer, power loss. Without the signal-0 check, start would refuse to run because of a leftover file from a daemon that died three days ago.
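Two edge cases of the signal-0 check are worth knowing. An EPERM error means the process exists but belongs to another user, and non-positive PIDs have special meanings (0 targets the whole process group). A slightly stricter sketch, with `isAlive` as a hypothetical helper name; it still cannot tell whether the PID was reused by an unrelated process:

```typescript
// Sketch: liveness check that separates "no such process" from "not permitted".
function isAlive(pid: number): boolean {
  // Reject NaN, fractions, 0 and negatives before they reach process.kill().
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is delivered
    return true;
  } catch (err) {
    // EPERM: the process exists but is owned by someone else.
    // Anything else (typically ESRCH) is treated as "not running".
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}
```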
Fastify as the daemon core
import Fastify from "fastify";
// Module scope so the CLI's start command can reuse the same port for its health check.
const PORT = parseInt(process.env.MY_TOOL_PORT ?? "17890", 10);
const HOST = process.env.MY_TOOL_HOST ?? "127.0.0.1";
async function startServer(): Promise<void> {
  const app = Fastify({ logger: false });
  app.get("/health", async () => ({
    status: "ok",
    pid: process.pid,
    uptime: process.uptime(),
    memory: Math.round(process.memoryUsage().rss / 1024 / 1024), // resident set size in MB
  }));
  app.addHook("onClose", async () => {
    // stop things that produce events first
    clearInterval(healthTimer);
    connector?.disconnect();
    // then flush storage
    await store.close();
  });
  await app.listen({ port: PORT, host: HOST });
  console.log(`Listening on ${HOST}:${PORT}`);
}
logger: false because stdout already goes to the log file via fd redirect. Fastify's pino would double-log everything. I just use console.log with a [component] prefix. Not pretty, works fine for a local tool.
127.0.0.1 not 0.0.0.0. Local daemon, no reason to expose it to the network.
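One gap in the setup above: parseInt accepts trailing garbage ("17890abc" parses as 17890) and returns NaN for non-numeric input, which only surfaces later as a listen error in the log. A stricter sketch, where `resolvePort` is a hypothetical helper and not part of the tool:

```typescript
// Sketch: validate a port read from an environment variable.
function resolvePort(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const text = raw.trim();
  // Digits only, so "80abc", "0x50" and "1e3" are all rejected.
  if (!/^\d+$/.test(text)) {
    throw new Error(`Invalid port "${raw}": expected digits only`);
  }
  const port = Number(text);
  if (port < 1 || port > 65535) {
    throw new Error(`Invalid port "${raw}": expected a value from 1 to 65535`);
  }
  return port;
}

// Assumed usage: const PORT = resolvePort(process.env.MY_TOOL_PORT, 17890);
```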
Health check after fork
/health does double duty. It's a monitoring endpoint, but it's also how start confirms the daemon actually booted:
async function cmdStart(): Promise<void> {
  const existing = readPid();
  if (existing) {
    console.log(`Already running (PID ${existing}).`);
    return;
  }
  startDaemon();
  // ugly but necessary
  await new Promise((r) => setTimeout(r, 1000));
  try {
    const res = await fetch(`http://127.0.0.1:${PORT}/health`, {
      signal: AbortSignal.timeout(3000),
    });
    if (res.ok) {
      const data = (await res.json()) as { pid: number };
      console.log(`Daemon running. PID ${data.pid}, port ${PORT}.`);
    } else {
      // something answered, but not with a healthy status
      console.log(`Health check returned HTTP ${res.status}. Check the logs.`);
    }
  } catch {
    console.log("Daemon forked but not responding yet. Check the logs.");
  }
}
The 1-second sleep is ugly. The child needs time to import modules and bind the port. Without it you get ECONNREFUSED every time.
I tried IPC ("child sends 'ready' to parent") but that means keeping the IPC channel open, which means the parent can't exit cleanly. Sleep + HTTP is dumber. Works.
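A middle ground between a fixed sleep and IPC is to poll the endpoint until it answers or a deadline passes. This is a sketch, not what the tool does; `pollUntil` and `healthProbe` are hypothetical names, and comparing the reported PID against the forked child's PID is an added assumption that rejects an older instance still holding the port.

```typescript
// Retry an async probe until it returns true or the deadline passes.
async function pollUntil(
  probe: () => Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // A rejected probe (ECONNREFUSED while the child boots) counts as "not yet".
    if (await probe().catch(() => false)) return true;
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Probe for a /health endpoint that reports the daemon's PID.
function healthProbe(port: number, expectedPid?: number): () => Promise<boolean> {
  return async () => {
    const res = await fetch(`http://127.0.0.1:${port}/health`, {
      signal: AbortSignal.timeout(1000),
    });
    if (!res.ok) return false;
    const data = (await res.json()) as { pid?: number };
    return expectedPid === undefined || data.pid === expectedPid;
  };
}

// Assumed usage in the start command:
//   const ok = await pollUntil(healthProbe(PORT, child.pid), 5000, 100);
```

The parent still exits as soon as the probe succeeds, so the common case is faster than a one-second sleep and a slow boot gets more than one chance.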
I learned the hard way why this health check matters. Early version didn't have it. Two instances on the same port:
$ my-tool start
Daemon started (PID 48291).
$ cat ~/.my-tool/daemon.log
Error: listen EADDRINUSE: address already in use 127.0.0.1:17890
Parent already printed success and exited. Daemon is actually dead. User thinks it's running. Now the health check catches this.
Shutdown ordering matters
This is where I wasted actual time. First version of the onClose hook:
// wrong
app.addHook("onClose", async () => {
  await store.close();         // flush to disk
  clearInterval(healthTimer);  // stop health checks
  connector?.disconnect();     // drop WebSocket
});
store.close() flushes pending writes. But the health checker was still running and could trigger a write during the flush. Race condition. Store got corrupted about once a week. Always on shutdown, always a half-written JSON file.
Took me three corrupted files to connect the dots. Added a --foreground flag to run the daemon in the current terminal, caught it within an hour.
Fixed version is in the Fastify setup above. Stop producers first, then flush consumers.
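Ordering fixes the race. A separate, complementary guard against half-written files is to write to a temporary file and rename it into place, since on POSIX filesystems a rename within the same directory replaces the target in one step. The article does not say the store works this way; `writeJsonAtomic` is a hypothetical helper shown as a sketch.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Sketch: write JSON so readers see either the old file or the new one.
function writeJsonAtomic(file: string, data: unknown): void {
  // Temp file lives next to the target so the rename stays on one filesystem.
  const tmp = path.join(
    path.dirname(file),
    `.${path.basename(file)}.${process.pid}.tmp`,
  );
  const fd = fs.openSync(tmp, "w");
  try {
    fs.writeSync(fd, JSON.stringify(data));
    fs.fsyncSync(fd); // push the bytes to disk before the rename
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmp, file); // swap the complete file into place
}
```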
start / stop / status / restart
async function cmdStop(): Promise<void> {
  const pid = readPid();
  if (!pid) {
    console.log("Daemon is not running.");
    return;
  }
  try {
    process.kill(pid, "SIGTERM");
    removePid();
    console.log(`Stopped (PID ${pid}).`);
  } catch {
    removePid();
    console.log("Process already gone. Cleaned up PID file.");
  }
}
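No wait for exit. SIGTERM fires the handler, the handler cleans up and exits, done. One caveat: Fastify's onClose hooks only run when app.close() is called, so the handler has to await app.close() before process.exit(0); exiting directly skips them. If that chain takes more than a second or two, something else is broken.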
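If stop should confirm the daemon is actually gone before reporting success, polling with the same signal-0 check is one option. A sketch under that assumption; `waitForExit` and `stopAndWait` are hypothetical names, and the SIGKILL escalation is an addition the tool does not have:

```typescript
// Signal-0 liveness check; EPERM means "exists, but owned by another user".
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Resolve true once the process is gone, false if it is still alive at the deadline.
async function waitForExit(
  pid: number,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (isAlive(pid)) {
    if (Date.now() >= deadline) return false;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return true;
}

// Send SIGTERM, wait, and escalate to SIGKILL only if the daemon ignores it.
async function stopAndWait(
  pid: number,
  timeoutMs = 5000,
): Promise<"exited" | "killed"> {
  try {
    process.kill(pid, "SIGTERM");
  } catch {
    return "exited"; // signal could not be delivered; treated as already gone
  }
  if (await waitForExit(pid, timeoutMs)) return "exited";
  try {
    process.kill(pid, "SIGKILL");
  } catch {
    /* exited in the meantime */
  }
  return "killed";
}
```

With this in place, restart would not need a fixed sleep for port release: it can start as soon as `stopAndWait` resolves.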
restart = stop, sleep 500ms (port release), start. status = read PID + hit /health. If the PID file exists but health check fails, the daemon crashed without cleanup:
$ my-tool status
PID file says 48291 but daemon not responding.
Interval callbacks will kill your daemon
Most daemons run periodic tasks. Health checks, cache cleanup, token refresh.
const CHECK_INTERVAL = 30_000;
class HealthChecker {
  private timer: ReturnType<typeof setInterval> | null = null;
  start(): void {
    this.stop();
    this.timer = setInterval(() => {
      this.checkAll().catch(console.error);
    }, CHECK_INTERVAL);
  }
  stop(): void {
    if (this.timer) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }
}
That .catch(console.error) is load-bearing. Without it, a rejected promise inside the interval is an unhandled rejection. Node has crashed the process on those by default since v15, Node 22 included.
My daemon ran fine for a day, then a DNS timeout in the health checker produced an unhandled rejection. Dead process, stale PID file, nobody noticed until the next morning. Added the .catch, hasn't died since.
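The same guard can be pulled into a small wrapper so every periodic task gets it. This is a sketch, not the tool's code: `startSafeInterval` is a hypothetical helper, and skipping a tick while the previous run is still pending is an extra assumption about what the tasks want.

```typescript
// Run an async task on an interval without letting a rejection escape and
// without starting a new run while the previous one is still in flight.
function startSafeInterval(
  task: () => Promise<void>,
  intervalMs: number,
  onError: (err: unknown) => void = console.error,
): () => void {
  let running = false;
  const timer = setInterval(() => {
    if (running) return; // previous run still pending; skip this tick
    running = true;
    // Promise.resolve().then(task) also catches a task that throws synchronously.
    Promise.resolve()
      .then(task)
      .catch(onError)
      .finally(() => {
        running = false;
      });
  }, intervalMs);
  return () => clearInterval(timer); // call the returned function to stop
}
```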
Why not PM2 / systemd / Docker
PM2 adds a dependency and has its own process management (PID files, logs, restart policies) that can conflict with yours.
systemd is great if you control the box. But this runs on developer laptops. I'm not going to ask macOS users to write a launchd plist.
Docker assumes Docker is installed. On a lot of dev machines, it's not.
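Fork works everywhere Node runs. macOS, Linux, Windows (add windowsHide: true to the fork options or you get a console window flash). One Windows caveat: there are no POSIX signals there, so process.kill(pid, "SIGTERM") terminates the daemon immediately and the SIGTERM handler never runs.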
Log rotation
One more thing I didn't think about until a test machine had a 200MB log file.
function rotateLog(): void {
  try {
    const stats = fs.statSync(LOG_PATH);
    if (stats.size > 10 * 1024 * 1024) {
      const backup = LOG_PATH + ".1";
      if (fs.existsSync(backup)) fs.unlinkSync(backup);
      fs.renameSync(LOG_PATH, backup);
    }
  } catch { /* first run, no log yet */ }
}
Call this before opening the log fd in startDaemon(). One backup, 10MB cap. Could use logrotate on Linux but again, can't assume it's configured.
Fastify 5 boots in under 50ms, which matters when the user is staring at a terminal. The fork + PID + signal + health check pattern has been running on about a dozen machines for a couple months now with zero babysitting. That's the whole point of a daemon, I guess.