<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Leonce Medewanou</title>
    <description>The latest articles on DEV Community by Leonce Medewanou (@leonce_medewanou_0775d42c).</description>
    <link>https://dev.to/leonce_medewanou_0775d42c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3916113%2F784ebe3f-e438-439d-9a92-61585be49a18.jpg</url>
      <title>DEV Community: Leonce Medewanou</title>
      <link>https://dev.to/leonce_medewanou_0775d42c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/leonce_medewanou_0775d42c"/>
    <language>en</language>
    <item>
      <title>I redirected laravel/nightwatch to my own Postgres and hit 13,400 payloads/s on a single instance</title>
      <dc:creator>Leonce Medewanou</dc:creator>
      <pubDate>Wed, 06 May 2026 13:48:05 +0000</pubDate>
      <link>https://dev.to/leonce_medewanou_0775d42c/i-redirected-laravelnightwatch-to-my-own-postgres-and-hit-13400-payloadss-on-a-single-instance-43ch</link>
      <guid>https://dev.to/leonce_medewanou_0775d42c/i-redirected-laravelnightwatch-to-my-own-postgres-and-hit-13400-payloadss-on-a-single-instance-43ch</guid>
      <description>&lt;p&gt;If you run a Laravel app on a hosted observability platform like Nightwatch, you've probably sampled your telemetry down to keep the bill manageable. I wanted to keep all of it.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;laravel/nightwatch&lt;/code&gt; is Laravel's official observability SDK and the instrumentation itself is genuinely good. It's the hosted side that bothered me. Ingestion is usage-priced, throughput is bounded by what you're willing to pay for, and your telemetry lives in someone else's warehouse. Plenty of teams are happy with that trade.&lt;/p&gt;

&lt;p&gt;Others aren't: high-traffic apps that don't want to sample, regulated stacks where stack traces can't leave the perimeter, smaller teams whose Postgres already has the headroom to absorb the writes. They want the same SDK pointed somewhere else.&lt;/p&gt;

&lt;p&gt;So I wrote an agent that intercepts Nightwatch's ingest binding and redirects payloads to a local TCP socket, then drains them into a Postgres database I provision. On a single instance it sustains around &lt;strong&gt;13,400 payloads/s&lt;/strong&gt;. That's enough headroom for an app doing 2,000-5,000 req/s without sampling.&lt;/p&gt;

&lt;h2&gt;The architecture&lt;/h2&gt;

&lt;p&gt;Three layers, each chosen to solve a specific bottleneck.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;laravel/nightwatch
    │
   TCP
    │
    ▼
ReactPHP listener
    │
    ▼
SQLite WAL buffer
    │
    ▼
Postgres (COPY protocol)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ingest path and the drain path are decoupled. Ingest must never block on Postgres. Drain must never lose data if Postgres goes away.&lt;/p&gt;

&lt;h2&gt;Layer 1: non-blocking ingest with ReactPHP&lt;/h2&gt;

&lt;p&gt;The TCP listener is a &lt;code&gt;ReactPHP\Socket\TcpServer&lt;/code&gt; running on a single event loop. One process, accepting payloads from many concurrent connections and pushing them into the buffer. PHP-FPM workers don't enter the picture. Nightwatch's ingest binding is hijacked at request shutdown to write to the local TCP socket instead of phoning home to Laravel Cloud.&lt;/p&gt;

&lt;p&gt;The wire protocol is deliberately minimal: &lt;code&gt;[length]:[version]:[tokenHash]:[payload]&lt;/code&gt;, with gzip detected by magic byte (&lt;code&gt;0x1f 0x8b&lt;/code&gt;) and the &lt;code&gt;xxh128&lt;/code&gt; token hash truncated to 7 chars. The reason it stays that minimal is that the agent never re-encodes the payload. Nightwatch sends JSON, the buffer stores it as-is, and the drain worker is the first process that parses it, only because it needs to route fields to the right columns. Skipping a &lt;code&gt;json_decode&lt;/code&gt;/&lt;code&gt;json_encode&lt;/code&gt; round-trip on the hot path was worth roughly 30-50µs per payload in profiling, which is a meaningful chunk of the per-payload budget at this rate.&lt;/p&gt;
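
&lt;p&gt;As a sketch of that framing (the function names are mine, not the agent's actual API; &lt;code&gt;xxh128&lt;/code&gt; needs PHP 8.1+):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Sketch of the wire format above: [length]:[version]:[tokenHash]:[payload].
// Function names are illustrative, not the agent's actual API.
function frame(string $payload, string $token, int $version = 1): string
{
    // xxh128 of the ingest token, truncated to 7 chars (PHP 8.1+).
    $tokenHash = substr(hash('xxh128', $token), 0, 7);
    $body = $version . ':' . $tokenHash . ':' . $payload;
    return strlen($body) . ':' . $body;
}

function parseFrame(string $frame): array
{
    [$len, $rest] = explode(':', $frame, 2);
    // Payload is the last field, so a limit-3 explode tolerates ':' inside it.
    [$version, $tokenHash, $payload] = explode(':', $rest, 3);
    return [
        'version'   =&gt; (int) $version,
        'tokenHash' =&gt; $tokenHash,
        // Gzip detected by magic bytes, never by re-encoding.
        'gzipped'   =&gt; strncmp($payload, "\x1f\x8b", 2) === 0,
        'payload'   =&gt; $payload,
    ];
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
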

&lt;h2&gt;Layer 2: SQLite WAL as the buffer&lt;/h2&gt;

&lt;p&gt;Why SQLite for a buffer? Because it's the only embedded database that gives you crash-safe writes at the speed of a memory-mapped file, with zero ops overhead.&lt;/p&gt;

&lt;p&gt;The pragma sequence matters and broke me once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;PRAGMA&lt;/span&gt; &lt;span class="n"&gt;busy_timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;PRAGMA&lt;/span&gt; &lt;span class="n"&gt;journal_mode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WAL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;PRAGMA&lt;/span&gt; &lt;span class="n"&gt;synchronous&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NORMAL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;PRAGMA&lt;/span&gt; &lt;span class="n"&gt;cache_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;64000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;-- ~64 MB&lt;/span&gt;
&lt;span class="n"&gt;PRAGMA&lt;/span&gt; &lt;span class="n"&gt;mmap_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;268435456&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;-- 256 MB&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;busy_timeout&lt;/code&gt; has to be set &lt;strong&gt;before&lt;/strong&gt; &lt;code&gt;journal_mode = WAL&lt;/code&gt;. If you do it the other way, the first concurrent write under load races and one of the writers gets &lt;code&gt;SQLITE_BUSY&lt;/code&gt; immediately instead of waiting. I lost an afternoon to this.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;synchronous = NORMAL&lt;/code&gt; on the buffer is fine because Postgres is the durable store. The buffer just needs to survive a process crash, not a kernel panic.&lt;/p&gt;
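
&lt;p&gt;Wired up through PDO, a minimal sketch looks like this (the buffer path is illustrative); the final query confirms WAL actually took, since &lt;code&gt;journal_mode&lt;/code&gt; echoes the active mode back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Open the SQLite buffer and apply the pragmas in order; the path is
// illustrative. busy_timeout is set before journal_mode on purpose.
$pdo = new PDO('sqlite:/tmp/nightowl-buffer.sqlite');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

foreach ([
    'PRAGMA busy_timeout = 5000',   // first, so concurrent writers wait
    'PRAGMA journal_mode = WAL',
    'PRAGMA synchronous = NORMAL',  // Postgres is the durable store
    'PRAGMA cache_size = -64000',   // ~64 MB
    'PRAGMA mmap_size = 268435456', // 256 MB
] as $pragma) {
    $pdo-&gt;exec($pragma);
}

// journal_mode echoes the active mode back; confirm WAL actually took.
$mode = strtolower((string) $pdo-&gt;query('PRAGMA journal_mode')-&gt;fetchColumn());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
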

&lt;p&gt;Rows get a single &lt;code&gt;synced&lt;/code&gt; column with three states: &lt;code&gt;0&lt;/code&gt; (pending), &lt;code&gt;100+workerId&lt;/code&gt; (claimed by drain worker N), &lt;code&gt;1&lt;/code&gt; (drained). Drain workers atomically mark a batch with their own claim value, then SELECT it. The UPDATE is the atomic part; the SELECT just hands the rows to the worker. If a worker dies mid-batch, the parent's &lt;code&gt;SIGCHLD&lt;/code&gt; handler releases its claimed rows back to pending.&lt;/p&gt;
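
&lt;p&gt;A minimal sketch of that claim dance against an in-memory table (the &lt;code&gt;synced&lt;/code&gt; states mirror the post; the schema and function names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Three-state `synced` column: 0 = pending, 100 + workerId = claimed,
// 1 = drained. Schema is a minimal stand-in for the real buffer table.
$db = new PDO('sqlite::memory:');
$db-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db-&gt;exec('CREATE TABLE buffer (id INTEGER PRIMARY KEY, payload TEXT, synced INTEGER DEFAULT 0)');
$db-&gt;exec("INSERT INTO buffer (payload) VALUES ('a'), ('b'), ('c')");

function claimBatch(PDO $db, int $workerId, int $limit): array
{
    $claim = 100 + $workerId;
    // The UPDATE is the atomic step: tag a batch of pending rows in one statement.
    $db-&gt;prepare(
        'UPDATE buffer SET synced = ? WHERE id IN
           (SELECT id FROM buffer WHERE synced = 0 LIMIT ?)'
    )-&gt;execute([$claim, $limit]);
    // The SELECT just hands the claimed rows over.
    $stmt = $db-&gt;prepare('SELECT id, payload FROM buffer WHERE synced = ?');
    $stmt-&gt;execute([$claim]);
    return $stmt-&gt;fetchAll(PDO::FETCH_ASSOC);
}

function markDrained(PDO $db, int $workerId): void
{
    // On worker death the parent would instead reset these rows to 0.
    $db-&gt;prepare('UPDATE buffer SET synced = 1 WHERE synced = ?')
       -&gt;execute([100 + $workerId]);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
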

&lt;h2&gt;Layer 3: Postgres COPY for the drain&lt;/h2&gt;

&lt;p&gt;The drain worker uses &lt;code&gt;pgsqlCopyFromArray()&lt;/code&gt; for the 10 high-volume tables (requests, queries, jobs, logs, cache events, mail, notifications, outgoing requests, scheduled tasks, commands). COPY is roughly 5-10x faster than equivalent multi-row INSERTs at this batch size; the parse-plan overhead per statement disappears, and the wire format is denser.&lt;/p&gt;

&lt;p&gt;INSERT survives for the exception path (which upserts a grouped issue row by fingerprint) and for per-user counters. COPY can't do upserts, so those stay on the slower path. They're also the lowest-volume tables, so it doesn't matter.&lt;/p&gt;

&lt;p&gt;The biggest single-line change for throughput:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;synchronous_commit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;off&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the 2-5x win. The agent drops &lt;code&gt;synchronous_commit&lt;/code&gt; on the drain connection because durability is already guaranteed upstream by SQLite WAL. Worst case under crash is that the same batch gets COPY'd twice. Acceptable for a monitoring product.&lt;/p&gt;

&lt;p&gt;Batch size is 5,000 rows per COPY call. I tested 1k, 5k, 10k, 50k. Past 5k, Postgres write latency dominates and the buffer fills up faster than the drain can clear it.&lt;/p&gt;
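
&lt;p&gt;The shape of the drain, sketched: the row-building below is concrete, while the COPY call itself needs &lt;code&gt;pdo_pgsql&lt;/code&gt; and a live Postgres, so it's shown commented with an illustrative table and DSN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Build the tab-separated lines pgsqlCopyFromArray() expects: one string
// per row, backslash-escaped, with \N for NULL.
function buildCopyRows(array $rows): array
{
    return array_map(function (array $row): string {
        return implode("\t", array_map(
            function ($v): string {
                if ($v === null) {
                    return '\\N';
                }
                // Escape the backslash first so the \t and \n escapes survive.
                return str_replace(["\\", "\t", "\n"], ["\\\\", "\\t", "\\n"], (string) $v);
            },
            $row
        ));
    }, $rows);
}

// Drain sketch (table name and DSN are illustrative; not run here):
//   $pg = new PDO('pgsql:host=127.0.0.1;dbname=telemetry');
//   $pg-&gt;exec('SET synchronous_commit = off'); // the 2-5x win
//   $pg-&gt;pgsqlCopyFromArray('requests', buildCopyRows($batch));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
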

&lt;h2&gt;The fork-safety landmine&lt;/h2&gt;

&lt;p&gt;This took me an entire weekend.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pcntl_fork()&lt;/code&gt; is how the agent spawns N drain workers. Each child needs its own SQLite handle and its own Postgres handle. The naive approach (open both in the parent, fork, and let the children inherit) corrupts the SQLite WAL when the first child exits.&lt;/p&gt;

&lt;p&gt;The fix is unintuitive: &lt;strong&gt;close the parent's SQLite PDO immediately before fork, and recreate it in both the parent and each child after fork.&lt;/strong&gt; PDO sets up file locks and per-connection state that get partially cloned by &lt;code&gt;fork(2)&lt;/code&gt;'s copy-on-write semantics. When the child exits and runs its destructor, it tears down state the parent still thinks it owns.&lt;/p&gt;

&lt;p&gt;There's no clean error message. You just get random &lt;code&gt;SQLITE_CORRUPT&lt;/code&gt; errors hours later with no obvious trigger.&lt;/p&gt;

&lt;p&gt;For Postgres the same rule applies, but the failure mode is more honest: you immediately get "broken pipe" errors because both processes try to read from the same TCP socket.&lt;/p&gt;
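
&lt;p&gt;The pattern, as a runnable sketch (needs &lt;code&gt;pcntl&lt;/code&gt;, so CLI on Linux; the path and table are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Close the parent's SQLite handle BEFORE pcntl_fork(), then open fresh
// handles on both sides, so neither process tears down state the other owns.
$path = sys_get_temp_dir() . '/nightowl-fork-demo.sqlite';
@unlink($path);

$pdo = new PDO('sqlite:' . $path);
$pdo-&gt;exec('PRAGMA journal_mode = WAL');
$pdo-&gt;exec('CREATE TABLE buffer (id INTEGER PRIMARY KEY, payload TEXT)');
$pdo = null; // drop file locks and per-connection state before the fork

$pid = pcntl_fork();
if ($pid === 0) {
    // Child: brand-new handle; its destructor now owns only its own state.
    $child = new PDO('sqlite:' . $path);
    $child-&gt;exec("INSERT INTO buffer (payload) VALUES ('from-child')");
    exit(0);
}

// Parent: wait for the child, then reopen its own fresh handle.
pcntl_waitpid($pid, $status);
$pdo = new PDO('sqlite:' . $path);
$count = (int) $pdo-&gt;query('SELECT COUNT(*) FROM buffer')-&gt;fetchColumn();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
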

&lt;h2&gt;Where the bottleneck actually is&lt;/h2&gt;

&lt;p&gt;After all this, ingest tops out around 13,400 payloads/s on a single instance. That's not the SQLite ceiling (the buffer can absorb much faster than that). It's not Postgres (with 4 drain workers and COPY, it sustains ~22,000 rows/s). It's the &lt;strong&gt;TCP accept loop&lt;/strong&gt; on a single PHP event loop.&lt;/p&gt;

&lt;p&gt;The fix is &lt;code&gt;SO_REUSEPORT&lt;/code&gt; and multiple agent processes listening on the same port. The Linux kernel distributes new connections across them. macOS doesn't (it just hands every connection to whichever process accepts first), so this is a Linux-only optimization.&lt;/p&gt;
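
&lt;p&gt;In PHP that's a stream-context option, and ReactPHP's &lt;code&gt;TcpServer&lt;/code&gt; accepts the same context array. A minimal Linux-only sketch (the port is arbitrary):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Two sockets (in real use: two processes) bind the same port via
// so_reuseport; the Linux kernel then spreads incoming connections across
// them. Without the option the second bind fails with EADDRINUSE.
$ctx = stream_context_create(['socket' =&gt; ['so_reuseport' =&gt; true]]);

$a = stream_socket_server('tcp://127.0.0.1:47391', $errno, $errstr,
    STREAM_SERVER_BIND | STREAM_SERVER_LISTEN, $ctx);
$b = stream_socket_server('tcp://127.0.0.1:47391', $errno, $errstr,
    STREAM_SERVER_BIND | STREAM_SERVER_LISTEN, $ctx);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
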

&lt;h2&gt;Running it alongside Nightwatch&lt;/h2&gt;

&lt;p&gt;You don't have to rip out the hosted plan to try this. Set &lt;code&gt;NIGHTOWL_PARALLEL_WITH_NIGHTWATCH=true&lt;/code&gt; and the agent's service provider wraps Nightwatch's &lt;code&gt;Core::ingest&lt;/code&gt; binding with a fan-out adapter. Every payload goes to both Laravel Cloud and your local TCP socket, so you can run the two side-by-side and compare what you actually use before committing either way.&lt;/p&gt;

&lt;p&gt;The fan-out runs after Nightwatch has accepted the payload, so it can't break the hosted path you're already paying for.&lt;/p&gt;
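
&lt;p&gt;The shape of such a fan-out, as a generic sketch (the &lt;code&gt;IngestContract&lt;/code&gt; interface and class names are hypothetical; Nightwatch's real binding differs):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;?php
// Hypothetical fan-out decorator: every payload goes to each sink, and a
// failing secondary sink can never break the primary (hosted) path.
interface IngestContract
{
    public function write(string $payload): void;
}

final class FanOutIngest implements IngestContract
{
    /** @param IngestContract[] $sinks */
    public function __construct(private array $sinks) {}

    public function write(string $payload): void
    {
        foreach ($this-&gt;sinks as $sink) {
            try {
                $sink-&gt;write($payload);
            } catch (Throwable $e) {
                // Swallow: one sink's outage must not take out the others.
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
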

&lt;h2&gt;What's open source&lt;/h2&gt;

&lt;p&gt;The whole thing is MIT, on Packagist as &lt;code&gt;nightowl/agent&lt;/code&gt;, and runs in any Laravel 11 or 12 app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require nightowl/agent
php artisan nightowl:install
php artisan nightowl:agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repo: &lt;a href="https://github.com/lemed99/nightowl-agent" rel="noopener noreferrer"&gt;github.com/lemed99/nightowl-agent&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There's a hosted dashboard at &lt;a href="https://usenightowl.com/" rel="noopener noreferrer"&gt;usenightowl.com&lt;/a&gt; if you don't want to build a UI on top of the Postgres tables yourself. The agent runs fine without it.&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>php</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
