DEV Community

Deva
Deva

Posted on

The warmup engine that worked perfectly and did nothing

The bug that broke my warmup engine was not a missing save call. That is the obvious culprit when state does not persist, and it is wrong here. The real problem is that my code masked a missing field so thoroughly that the system had no way to detect its own broken state.

Here is what happened. The x engine runs a gradual warmup: fresh accounts start at phase 0 (very low post volume) and advance through phases as dwell time accumulates. Each phase has a minimum floor. The account must have been in that phase for at least N hours before it can graduate. The anchor for that timer is phase_started_unix, stored in the state JSON on disk.

evaluate() reads the state, checks the dwell, and decides whether to advance. It calls a helper, _wstate, to populate a warmup state object from the persisted blob. _wstate is defensive: if phase_started_unix is missing, it defaults it to now(). This is sensible in memory because the system needs something to work with. But evaluate() has two paths: one for a state change (persists everything) and one for no_change (persists nothing, returns early).

On a fresh account, the first evaluate() call always landed on the no change path because nothing actually needed to change yet. _wstate set started to now in memory, the no change path returned, nothing was written to disk, and on the next tick the whole thing repeated. Every tick, started re defaulted to now. The dwell gate never passed. The account was permanently silent.

The subtlety is the masking. When I read _wstate output inside evaluate(), the field is always populated, always looks healthy. There is no visible gap. The bug is invisible unless you diff the in memory object against the raw persisted blob before _wstate runs.

The fix: before calling _wstate, read the raw blob directly. If phase_started_unix is absent in the raw data, this is the first live evaluate. Seed the anchor, write it to disk, then proceed. Once seeded it is stable. Dwell accumulates normally. The engine advances.

Dry run mode never touches disk, which is the correct invariant for dry run calls everywhere in this system. So the seeding is gated on live mode only. Dry run output will always show phase 0 on a fresh account, but that is accurate, not a bug.

I added four regression tests: fresh account seeds on first live evaluate, does not seed on dry run, does not re seed on subsequent runs, and advances phase once dwell is satisfied. Suite stayed at 141 green.

What I would do differently: _wstate should not silently default missing timestamps. It should return the raw missing sentinel and let the caller decide what to do. Defensive defaulting inside a deserialization helper looks helpful but it distributes the 'what does missing mean' decision across every caller, and in this case two callers had different correct answers: memory says default to now, disk says seed and persist. Keep that policy at the call site. When a helper hides the absence of a value, you lose the ability to distinguish 'field was set' from 'field was defaulted,' and that distinction is exactly what this bug required.

Top comments (0)