This is a cross-post. Read the full article with diagrams on mergeshield.dev
A developer set up a Cursor agent to clean a project directory. It had file system access - that felt fine at setup time. Forty minutes later, 37GB of data was gone.
The forensic report does not point to a single dramatic failure. It shows four ordinary decisions, each of which looked reasonable in isolation and was catastrophic in combination.
The Four Failure Points
See the failure chain diagram in the full article
The forensic report identifies four distinct places where this should have been caught.
Step 1: Scope was granted, not bounded. Permission and technical boundary are not the same thing. The agent was told it could access the directory - but nothing enforced that it had to stay within the intended subdirectory.
Step 2: No boundary enforcement layer. The agent traversed outside the working directory the team expected. Nothing prevented this. No path restriction, no chroot, no symlink guard.
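A boundary layer can be as small as a path check. Here is a minimal sketch (the helper name and the scope path are illustrative, not part of the incident tooling): resolve every path the agent proposes to touch and refuse anything that lands outside the allowed root. Because `realpath` canonicalizes both symlinks and `..` components, this also acts as a symlink guard.

```shell
# Hypothetical guard: resolve each candidate path and refuse anything
# outside AGENT_SCOPE. realpath -m canonicalizes symlinks and ".."
# components before the comparison, so escape tricks resolve first.
AGENT_SCOPE="/project/src/components"

is_in_scope() {
  local resolved
  resolved=$(realpath -m -- "$1") || return 1
  case "$resolved" in
    "$AGENT_SCOPE"|"$AGENT_SCOPE"/*) return 0 ;;
    *) return 1 ;;
  esac
}

is_in_scope "/project/src/components/Button.tsx" && echo "allowed"
is_in_scope "/project/src/components/../../../etc/passwd" || echo "blocked"
```

Every file operation the agent performs would call this check first; anything that fails is rejected before it reaches the filesystem.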
Step 3: OS security policies were not active. macOS TCC and AppArmor on Linux exist specifically to create hard ceilings for process file access, even for processes running with user credentials. Dev machines almost never have these configured.
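On Linux, that hard ceiling can be expressed as an AppArmor profile. The sketch below is an assumption-laden illustration (the binary path and project layout are made up); AppArmor is default-deny for anything a profile does not list, so writes outside the permitted subtree are blocked even though the process runs with the user's credentials.

```
# /etc/apparmor.d/usr.local.bin.cursor-agent (hypothetical paths)
#include <tunables/global>

/usr/local/bin/cursor-agent {
  #include <abstractions/base>

  # read anywhere in the project
  /project/** r,

  # write only inside the intended subtree; everything else is denied by default
  /project/src/components/** rw,
}
```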
Step 4: No review gate before irreversible action. The agent operated autonomously from start to finish. No confirmation prompt. No dry-run preview. No human approval before bulk deletion.
Each of these four failures is independently recoverable. The problem is that all four appear together in most default agent configurations.
Nobody Watching
The fourth failure point is the one most teams can fix today without infrastructure changes.
An agent that executes irreversible operations without human sign-off requires extraordinary justification. The review gap was not an oversight - it was a configuration choice made to reduce friction. Nobody sat down and accepted the risk of a bulk deletion with no confirmation; they just never asked the question.
Require confirmation before any bulk irreversible operation above a threshold. 10 files. 100MB. Pick a number. The specific threshold matters less than the existence of one.
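A sketch of that gate, with the helper name and both thresholds being arbitrary choices rather than Cursor features: count files and bytes first, and only fall through to the delete when the operation is below both limits or a human has typed approval.

```shell
# Hypothetical confirmation gate for bulk deletes. Thresholds are arbitrary;
# what matters is that a limit exists at all.
MAX_FILES=10
MAX_BYTES=$((100 * 1024 * 1024))  # 100MB

confirm_bulk_delete() {
  local dir="$1" files bytes ans
  files=$(find "$dir" -type f | wc -l)
  bytes=$(du -sb -- "$dir" | cut -f1)
  if [ "$files" -gt "$MAX_FILES" ] || [ "$bytes" -gt "$MAX_BYTES" ]; then
    # above threshold: require an explicit human "y" before proceeding
    read -r -p "Delete $files files ($bytes bytes) under $dir? [y/N] " ans
    [ "$ans" = "y" ] || { echo "aborted"; return 1; }
  fi
  rm -rf -- "$dir"
}
```

If stdin is not a terminal (the agent running unattended), the prompt fails closed and nothing above the threshold is ever deleted.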
What Should Have Stopped This
See the defense layers diagram in the full article
```shell
# Check scope BEFORE granting agent access
find "$WORKING_DIR" -type f | wc -l
# Output: 847,293 - scope is way too broad

# Correct: bind to the specific subdirectory
export AGENT_SCOPE="/project/src/components"

# OS-level: run agent as a restricted user
sudo -u cursor-agent \
  cursor-agent \
    --working-dir "$AGENT_SCOPE" \
    --max-files 500 \
    --dry-run-threshold 50
```
Four controls. Any one of them breaks the chain:
- Scoped path binding - write access to /project/src/temp specifically, not a parent directory
- OS-level process restrictions - dedicated agent user restricted with AppArmor or TCC
- Dry-run with confirmation threshold - any operation touching more than N files should pause
- Review gate for bulk irreversible actions - approval workflow before bulk deletions
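The dry-run control from the list above can be sketched in a few lines (the function name and threshold are illustrative): enumerate what would be removed and stop for approval instead of acting.

```shell
# Hypothetical dry-run wrapper around a bulk delete. THRESHOLD is arbitrary;
# the point is that a limit exists and the default action above it is to pause.
THRESHOLD=50

preview_delete() {
  local dir="$1" count
  count=$(find "$dir" -type f | wc -l)
  if [ "$count" -gt "$THRESHOLD" ]; then
    # show a sample of what would go, then stop - nothing is deleted
    find "$dir" -type f | head -20
    echo "DRY RUN: would delete $count files under $dir - approval required"
    return 1
  fi
  echo "$count files under $dir - below threshold"
}
```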
The Pattern That Keeps Repeating
This incident involved Cursor. The four failure points show up in nearly every AI agent incident with file system impact, regardless of tool.
The permission model developers use for their own tooling does not translate to autonomous agents. When you run a command yourself there is friction - you read it, you hesitate before large deletions. Agents do not have that friction. Every control that historically relied on human judgment at execution time has to be replaced with explicit technical enforcement.
For code changes specifically, agent trust scoring adds a behavioral layer on top of attribution: patterns in what changed, which files were touched, and how the scope compares to past PRs combine into a risk signal.
The 37GB wipe is a filesystem incident. The governance lesson applies anywhere an agent can make irreversible changes without a human in the loop. Build the review gate before you need it.