There is a class of AI-incident postmortem that the industry now produces about once a quarter, and on the night of April 25 it produced the cleane...
Strong teardown. The mechanism that stood out is capability leakage, not model intelligence: a token intended for domain operations could still call volumeDelete, and backup co-location made that irreversible. Railway docs note that wiping a volume also removes its backups in the same blast radius. Have you found a workable platform-layer guardrail yet (token scopes, blocking destructive GraphQL mutations, or separated backup storage), or is a proxy that strips dangerous mutations still the only reliable mitigation?
No clean single-layer fix exists yet. In practice it's a stack: scoped tokens where the vendor offers them (Cloudflare-style operation+resource scopes remain the reference), backups in a different blast radius (pg_dump to a separate account, not in-vendor snapshots), and when neither is available an egress proxy with a destructive-mutation deny-list. Railway's post-incident delayed-delete on volumeDelete is a patch on one endpoint; the token model is unchanged. Until scoped tokens ship, the proxy is the honest answer.
Your point that delayed-delete patches don’t change the token model is exactly the risk boundary I’m seeing. In teams that already proxy destructive mutations, where does ownership-to-chargeback mapping usually break first: scope metadata on the token, caller identity propagation across async hops, or join keys between action logs and billing exports?
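For readers who want something concrete, the egress-proxy deny-list discussed in this exchange could look roughly like the sketch below. It is a deliberately conservative illustration, not Railway's API or any vendor's tooling: `DENIED_MUTATIONS`, `blocked_mutations`, and `allow_request` are hypothetical names, and the check scans every identifier in the GraphQL document rather than parsing it, so it fails closed (it may block a harmless query that merely mentions a denied name).

```python
import re

# Hypothetical deny-list. volumeDelete is the mutation named in the thread;
# a real list would be maintained per vendor API.
DENIED_MUTATIONS = frozenset({"volumeDelete"})

# GraphQL names are ASCII identifiers: a letter or underscore, then word characters.
_IDENT = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def blocked_mutations(query: str, denied=DENIED_MUTATIONS) -> set[str]:
    """Return every deny-listed name that appears as a whole identifier.

    Scanning all identifiers (instead of parsing) also catches aliased
    fields such as `ok: volumeDelete(...)`, at the cost of false positives.
    """
    return {name for name in _IDENT.findall(query) if name in denied}


def allow_request(body: dict) -> bool:
    """Decide whether the proxy forwards a GraphQL request body."""
    query = body.get("query", "")
    if not isinstance(query, str):
        return False  # malformed body: fail closed
    return not blocked_mutations(query)
```

A production proxy would more likely parse the document with a real GraphQL parser and also handle persisted queries and batching; the identifier scan only shows the fail-closed shape of the check.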
Token scope drift surfaces in audits; log/billing join gaps surface in the report. Identity propagation fails silently: a retried job loses the originator tag and bills to the executor, and you only catch it on a disputed line item. Stamp identity at issuance, carry it through every queue hop and retry, and assert it at the destructive call site.
Take the one failure mode that's silent and engineer it to be loud, so all three failure classes have the same visibility profile and your chargeback report stops lying to you.
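The stamp, carry, assert pattern from this reply can be sketched in a few lines. Everything here is an illustrative assumption: `Envelope`, `Job`, `retry`, and `run_destructive` are hypothetical names, and a real system would carry the envelope in queue message headers rather than in an in-memory object.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Envelope:
    """Identity stamped once, at issuance. Frozen so no hop can mutate it."""
    originator: str
    tenant: str


@dataclass(frozen=True)
class Job:
    action: str
    envelope: Optional[Envelope]
    attempt: int = 0


class MissingOriginator(RuntimeError):
    """Raised when a destructive call arrives without originator identity."""


def retry(job: Job) -> Job:
    # dataclasses.replace copies every field not named here,
    # so the envelope rides along unchanged on each retry.
    return replace(job, attempt=job.attempt + 1)


def run_destructive(job: Job) -> str:
    # Assert at the call site: a lost envelope becomes a loud failure
    # instead of a line item silently billed to the executor.
    if job.envelope is None or not job.envelope.originator:
        raise MissingOriginator(f"{job.action}: no originator on attempt {job.attempt}")
    return f"{job.action} billed to {job.envelope.originator}"
```

The design point is that the retry path never rebuilds the job from the executor's own context; it copies the original, so the only way to lose identity is an explicit bug that the call-site assertion then surfaces.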
This is sharp and aligns with what keeps showing up in disputed chargeback traces. I’m treating retry-hop identity loss as a first-break class, not a cleanup detail: immutable tenant/originator/workflow envelope stamped at issuance, preserved across queue and retry hops, then asserted before metering writes. In practice I map that envelope to FOCUS ownership dimensions and use allocation outputs as reconciliation targets, not identity sources. I’ll fold this explicit check into the review pack triage order. If you have a preferred minimal envelope schema that survives async fan-out, I’d value it.
I'd push back on the preferred schema framing. Inventing a bespoke envelope is a disservice when the canonical specs cover it. W3C Trace Context handles causation and lineage, CloudEvents gives you source+id+subject, SPIFFE SVID if you need identity that's verifiable across trust boundaries. Minimum useful payload is originator + tenant + causation pointer + signing key id; everything else is workflow-specific and shouldn't live in the envelope. Surviving fan-out is less about the schema and more about the consumer contract. Every consumer either preserves the envelope verbatim or signed-attenuates it macaroon-style, never re-emits from its own identity. That contract is what breaks in practice, not the schema.
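As a rough illustration of the minimum payload named above (originator, tenant, causation pointer, signing key id), the sketch below stamps and verifies an envelope with an HMAC. The HMAC over a shared secret is a stand-in chosen to keep the example in the standard library; it is not SPIFFE, not a macaroon, and not attenuation. `KEYS`, `stamp`, and `verify` are hypothetical names, and the causation value is assumed to be a W3C `traceparent` string.

```python
import hashlib
import hmac
import json

# Hypothetical key store, keyed by signing key id. Demo secret only.
KEYS = {"k1": b"demo-secret-not-for-production"}

# The four fields the reply calls the minimum useful payload.
FIELDS = ("originator", "tenant", "causation", "key_id")


def _canonical(envelope: dict) -> bytes:
    """Serialize only the signed fields, in a stable order."""
    signed = {field: envelope[field] for field in FIELDS}
    return json.dumps(signed, sort_keys=True, separators=(",", ":")).encode()


def stamp(originator: str, tenant: str, causation: str, key_id: str = "k1") -> dict:
    """Create the envelope at issuance and attach its signature."""
    envelope = {
        "originator": originator,
        "tenant": tenant,
        "causation": causation,
        "key_id": key_id,
    }
    digest = hmac.new(KEYS[key_id], _canonical(envelope), hashlib.sha256)
    envelope["sig"] = digest.hexdigest()
    return envelope


def verify(envelope: dict) -> bool:
    """Consumer-side check: was the envelope preserved verbatim?"""
    try:
        key = KEYS[envelope["key_id"]]
        expected = hmac.new(key, _canonical(envelope), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, envelope["sig"])
    except (KeyError, TypeError):
        return False  # missing field or unknown key: treat as invalid
```

A consumer that re-emits from its own identity would change `originator`, and `verify` would then fail downstream, which is the contract violation the reply describes becoming visible.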
The teardown captures what I keep seeing in postmortems — the technical fix (scoped tokens, mutation deny-lists, separated backups) is well-understood, but the timing layer almost never is. Nine seconds means there was no human in the loop, fine, but there was also no agent-loop observability raising a flag on "tool=volumeDelete, called=1, retry=0, blast_radius=irreversible."
Disclosure: I help maintain ClawMetry (open-source, MIT, pip install clawmetry). Running ~8 OpenClaw agents continuously, I found the single highest-value signal to be "rate of unique tool verbs in last 60s": a healthy run trends low and stable, a runaway run spikes. Destructive verbs as outliers are detectable in under a second; the trick is having anywhere to send the signal.
void_stitch's question about ownership-to-chargeback is the version of this question I keep seeing in larger fleets: once you've got per-token attribution, the proxy deny-list stops being the only safety net, because you can also alarm on the budget shape of a runaway. The 9-second incident has both signatures: novel tool + cost cliff.
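The "unique tool verbs in the last 60s" signal can be approximated with a sliding window. This is a generic sketch, not ClawMetry's API: `VerbWindow`, its threshold of five unique verbs, and the alert labels are all assumptions made for illustration.

```python
from collections import deque


class VerbWindow:
    """Track tool verbs seen in a sliding time window and flag anomalies."""

    def __init__(self, window_s=60.0, max_unique=5,
                 destructive=frozenset({"volumeDelete"})):
        self.window_s = window_s
        self.max_unique = max_unique      # assumed threshold, tune per fleet
        self.destructive = destructive    # verbs that always raise a flag
        self._events = deque()            # (timestamp, verb), oldest first

    def observe(self, verb: str, now: float) -> list[str]:
        """Record one tool call and return any alerts it triggers."""
        self._events.append((now, verb))
        # Evict events that have aged out of the window.
        while self._events and now - self._events[0][0] > self.window_s:
            self._events.popleft()

        alerts = []
        unique = {v for _, v in self._events}
        if len(unique) > self.max_unique:
            alerts.append("unique_verb_spike")
        if verb in self.destructive:
            alerts.append("destructive_verb")
        return alerts
```

Timestamps are passed in explicitly so the window can be replayed against recorded traces; a live agent loop would pass `time.monotonic()` and route the returned alerts to whatever can actually halt the run.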
github.com/vivekchand/clawmetry — Arthur, the read-only-mode rollout you mentioned to void_stitch, is that something you'd expose as a one-flag toggle in the proxy, or do you think it has to be per-action?
Per-action, with a global panic flag as override. The unit is the mutation because the false-positive rate is per-mutation: volumeDelete and domainRemove shouldn't share a confidence state. Promote from shadow to enforce as data justifies. The global flag is for the "everything to shadow in five minutes" case, not the day-to-day knob.
Arthur, this post plus your follow-ups on token scope drift map closely to an OpenCost chargeback failure I am tracking in issue #3620, specifically inflated lines when empty instance_type and identity gaps coexist. Correction request from your operator perspective: if you had to pick one invariant to enforce first, would you fail billing joins whenever originator identity is missing, or block all destructive calls unless originator identity is stamped and preserved across retries? I am trying to validate which guard catches more false chargeback lines in practice.
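The per-mutation shadow/enforce rollout with a global override, as described in Arthur's reply, might be modeled like this. It is a sketch under assumptions: `MutationPolicy`, `Mode`, and the `flagged` input (the verdict of some upstream detector) are hypothetical, and no particular proxy exposes this interface.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    SHADOW = "shadow"    # log what would have been blocked, then allow
    ENFORCE = "enforce"  # actually block


class MutationPolicy:
    """Per-mutation guard state, with one global override for emergencies."""

    def __init__(self, modes: dict[str, Mode]):
        self.modes = dict(modes)                      # one state per mutation
        self.global_override: Optional[Mode] = None   # the panic flag
        self.shadow_log: list[str] = []

    def effective_mode(self, mutation: str) -> Optional[Mode]:
        if mutation not in self.modes:
            return None  # mutation is not guarded at all
        if self.global_override is not None:
            return self.global_override
        return self.modes[mutation]

    def decide(self, mutation: str, flagged: bool) -> str:
        """Return "allow" or "block" for a call the detector did or did not flag."""
        mode = self.effective_mode(mutation)
        if mode is None or not flagged:
            return "allow"
        if mode is Mode.ENFORCE:
            return "block"
        self.shadow_log.append(mutation)  # evidence for a later promotion
        return "allow"

    def promote(self, mutation: str) -> None:
        """Move one mutation from shadow to enforce once the data justifies it."""
        self.modes[mutation] = Mode.ENFORCE
```

Setting `global_override = Mode.SHADOW` corresponds to the "everything to shadow in five minutes" case; clearing it restores each mutation's own state, so the per-mutation confidence is never overwritten by the panic switch.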