It's easy to think "the agent deleted my database" is a one-tool problem. It isn't. In the last year, agents from at least six different AI coding tools have destroyed real data, in public, with the incident attributed and documented. The tools differ; the failure shape is the same. If you run any agentic coding tool, this is your risk too.
This post collects the real, sourced incidents across tools, names the one mechanism they share, and gives you the prevention that actually transfers — plus the tool-specific guards I personally run.
The incidents are real, and they're across tools
- Cursor: an agent (Claude Opus) deleted a production database and its three months of backups in about nine seconds, with no confirmation. (The Register) The cloud provider later moved to delayed (not immediate) deletion in response. (Tom's Hardware)
- OpenAI Codex: a clean-up permanently deleted ~328,000 files, bypassing the trash, and only disclosed three of the four targets it acted on. (codex#12277) A separate report: on a failed archive step it deleted the workspace and installed apps, again bypassing the trash. (codex#18509)
- Gemini CLI: during a folder reorganization it didn't detect that a `mkdir` had failed, then chained destructive operations against a filesystem that didn't exist, losing the user's files. Its own words: "I have failed you completely and catastrophically." (gemini-cli#4586)
- GitHub Copilot: reports of an auto-run that deleted an entire drive (ten years of photos and video), and a custom agent that deleted 76 files / ~94,813 lines. (community#166370)
Different vendors, different commands, same outcome: irreversible deletion that the user didn't intend and the tool didn't stop.
The one mechanism they share: failure wearing success's face
Look closely and these aren't "the AI went rogue." They're the same structural failure:
- The agent takes a destructive action (delete, overwrite, force-remove) before any confirmation gate.
- A precondition silently fails (a `mkdir` that didn't happen, a path that resolved wrong, an archive that errored), and the agent proceeds as if it succeeded.
- The damage is irreversible by default (trash bypassed, no backup, force flags), so by the time anyone notices, there's nothing to undo.
The throughline is that the tool has no "stop before the irreversible thing, and verify the step actually did what it claimed" layer. The agent's narration says success; the disk says otherwise; nothing reconciles the two until it's too late.
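Here is a minimal sketch of that missing layer, in Python. It is not taken from any of the tools above; `verified_move` and `StepFailed` are hypothetical names chosen for illustration. The idea is to check the precondition on disk, copy before removing anything, and take the irreversible step only once the disk confirms the copy.

```python
import shutil
from pathlib import Path


class StepFailed(RuntimeError):
    """Raised when the disk does not match what a step claimed to do."""


def verified_move(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir, checking the disk before and after.

    Hypothetical helper for illustration; not part of any agent's API.
    """
    # Precondition: the destination must really be a directory. If an
    # earlier mkdir failed silently, stop here instead of chaining on.
    if not dest_dir.is_dir():
        raise StepFailed(f"{dest_dir} is not a directory; nothing moved")
    if not src.is_file():
        raise StepFailed(f"{src} is not a file; nothing moved")

    target = dest_dir / src.name
    if target.exists():
        raise StepFailed(f"{target} already exists; refusing to overwrite")

    expected_size = src.stat().st_size
    shutil.copy2(src, target)  # copy first, so the source survives a failure

    # Postcondition: reconcile the claim with the disk before the
    # irreversible part (removing the source).
    if not target.is_file() or target.stat().st_size != expected_size:
        raise StepFailed(f"copy to {target} could not be verified; source kept")

    src.unlink()
    return target
```

If the destination directory was never created, the call raises and the source file is left untouched, which is the opposite of proceeding "as if it succeeded."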
Prevention that transfers to any tool
Because the mechanism is shared, the defense is too. These apply whatever agent you run:
- Put a confirmation gate in front of the irreversible class: not just `rm -rf`, but force-removes, recursive deletes, `git reset --hard`, force-push, dropping databases, deleting cloud resources. The gate should fire before execution, not after.
- Make deletion recoverable by default: prefer trash/soft-delete over hard delete; keep the destructive flags off the default path; on cloud, use providers/settings that delay deletion (the Cursor incident is exactly why one provider switched to delayed deletes).
- Back up before you let an agent loose: a recent `git commit` (or snapshot) turns "catastrophic" into "annoying." Agents that auto-commit before edits survive these stories better.
- Verify the step, don't trust the narration: when an agent says it created/moved/deleted something, the truth is on disk, in `git status`, file modification times, and an actual listing of the directory. A failed precondition is silent; only the disk reveals it.
- Keep the agent's blast radius small: scope it to a working copy, not your home directory or production; least-privilege the credentials it holds.
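The first two principles can be sketched in a few lines of Python. This is an illustration under assumptions, not any tool's real hook API: `IRREVERSIBLE_PATTERNS`, `is_irreversible`, `gate`, and `soft_delete` are hypothetical names, and the patterns are a deliberately incomplete sample (cloud-resource deletion, for one, is not covered).

```python
import re
import shutil
import time
from pathlib import Path

# Illustrative patterns for the irreversible class; not an exhaustive list.
IRREVERSIBLE_PATTERNS = [
    r"\brm\s+(.*\s)?(-\w*[rRf]\w*|--recursive|--force)\b",  # rm -r / -f / -rf
    r"\bgit\s+reset\s+--hard\b",
    r"\bgit\s+push\b.*\s(--force|-f)\b",
    r"\bdrop\s+(database|table)\b",
]


def is_irreversible(command: str) -> bool:
    """Return True if the command text matches a known destructive pattern."""
    return any(
        re.search(pattern, command, re.IGNORECASE)
        for pattern in IRREVERSIBLE_PATTERNS
    )


def gate(command: str, confirm) -> bool:
    """Decide, before execution, whether the command may run.

    `confirm` is any callable that takes the command and returns a bool,
    for example a function that prompts a human.
    """
    if not is_irreversible(command):
        return True
    return bool(confirm(command))


def soft_delete(path: Path, trash_dir: Path) -> Path:
    """Move path into trash_dir instead of deleting it; return the new path."""
    trash_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp prefix keeps same-named files from colliding in the trash.
    target = trash_dir / f"{time.time_ns()}-{path.name}"
    shutil.move(str(path), str(target))
    # Check the disk rather than trusting that the move happened.
    if path.exists() or not target.exists():
        raise RuntimeError(f"soft delete of {path} could not be verified")
    return target
```

In use, `gate` would sit in whatever pre-execution hook your tool exposes, with `confirm` asking a human; where that hook lives depends on the tool. Pattern matching on command text is a coarse filter and easy to evade, so treat it as one layer alongside backups and a small blast radius, not a replacement for them.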
The tool-specific part (honest about what I run)
I run Claude Code, and for it I maintain a free, MIT-licensed set of guards, cc-safe-setup, that puts a pre-execution gate in front of the irreversible class (`rm -rf`, force-push, destructive DB ops, cloud-resource deletion) and logs everything the agent does. `npx cc-safe-setup --shield` installs it.
For the other tools, I won't fake settings I don't run daily — but the five principles above are the checklist to apply: find where your tool's confirmation gate sits (and whether it covers force-deletes), turn on soft-delete / delayed-delete where you can, and make a pre-agent backup a habit.
If you want this watched across tools — the real incidents in Cursor, Copilot, Codex, and Gemini CLI each month, plus the exact guards and recovery steps, not just news of what broke — that's what the paid Safety Brief ($5/mo) is for. It's the cross-tool angle a single-tool guide can't give you. (Free sample issue here.)
The agents will keep getting more capable. The deletions won't stop on their own. The one habit that survives all of it: never let an agent take an irreversible step it can't be stopped before — and never trust "done" without checking the disk.