The Day AI Agents Broke My System And Why I Built Phylax

phylax — Tue, 02 Jun 2026 19:56:25 +0000

The Wake-Up Call: Why I Built Phylax

I used to vibecode like everyone else:

Describe what I want.
Let the AI build it.
Iterate fast.
Ship faster.

It feels like magic…

Until it doesn’t.

My wake-up call came through four incidents that happened in the span of a few weeks.

Each one was worse than the last.

1. Incident #1 — Silent Deletion

I was vibecoding a new tool.

The agent was generating modules, refactoring files, moving fast.

Then I noticed something terrifying:

Entire directories were gone.

Configuration files.
API keys.
Database migrations.
Months of work.

Vanished.

The agent deleted them without asking.
Without warning.
Without any prompt that said: “delete this.”

It just… did it.

2. Incident #2 — The API Key Leak

While building an integration, the agent read my .env file and leaked secrets into generated documentation.

A DeepSeek API key and a Cloudflare Workers token ended up committed to disk.

I had to rotate everything the same day.

The agent wasn’t malicious.

It simply didn’t understand that secrets are… secrets.

3. Incident #3 — The System Crash

This was the worst one.

While testing an early prototype of a security tool, the agent wiped critical Windows files.

The system froze instantly.

No keyboard.
No mouse.
No recovery.

Just a hard reboot.

Hours of work lost.
A corrupted project.
A full day spent recovering.

The agent wasn’t “evil.”
It wasn’t trying to destroy my machine.

It was just following instructions too literally — with zero guardrails.

4. Incident #4 — “You Don’t Have Permission” Didn’t Work

After the crash, I tried the obvious fix:

I told the AI explicitly not to touch certain files.

“Do NOT read .env.”
“Do NOT modify config files.”
“Do NOT delete migrations.”

The agent acknowledged the rules.

Then it read the files anyway.

That’s when it hit me:

Text instructions are not security.

If the operating system doesn’t enforce the boundary, the agent will not reliably respect it.

This Is Happening Everywhere

At first, I thought I was alone.

I wasn’t.

Developers using Claude Code, Cursor, Copilot, and other AI coding tools are reporting the same kinds of failures:

Agents destroying data across sessions
Silent deletion of active work
Auto-updates wiping project lists
Agents hallucinating vulnerabilities and proposing destructive fixes
AI tools modifying files they were explicitly told not to touch

The pattern is clear:

AI agents with unrestricted filesystem access will eventually destroy something.

Not because they are malicious.

But because they don’t truly understand context, value, or consequence.

The Question That Started Everything

After the system crash, I asked myself one simple question:

Why does my AI agent have the same filesystem permissions as me?

An AI agent is not a human developer.

It doesn’t know that .env contains secrets.

It doesn’t know that deleting migrations/ can destroy your database history.

It can’t reliably distinguish a critical config file from a temporary scratch file.

So I built something that could.

The Birth of Phylax

Phylax is a security layer that sits between AI agents and your filesystem.

It is not a wrapper.
It is not a proxy.
It is not a prompt.

It uses real Windows security primitives enforced by the operating system.

The core insight is simple:

Agents need filesystem access to be useful.
But they don’t need access to everything.

Phylax draws a boundary.

Agents can edit your source code.

But they can never touch your secrets.
Or your Git history.
Or your policy files.
Or anything you explicitly protect.

I built the MVP in weeks — not because it was easy, but because it felt urgent.

What Phylax Does Today — Phase 1

Phylax currently enforces:

DENY ACEs
Mandatory Integrity Control labels
Multi-agent detection
Audit logs
Global rules
Per-project rules

It works.

But the current Phase 1 design has limitations:

Protection is active only while the daemon is running
ACE-based protection can apply to everyone, including the human developer
Agent-only blocking requires a deeper OS-level boundary

Which leads to the next phase.

What’s Coming Next — Phase 2 Kernel Minifilter

Phase 2 is where Phylax becomes much more serious.

The next step is a Windows kernel driver:

phylax.sys

A kernel minifilter that intercepts filesystem I/O at ring 0.

This unlocks:

Agent-only blocking
Real-time I/O interception
Protection that survives daemon restarts
Ask-flow enforcement
Per-agent overrides
Tamper-resistant audit logs
Advanced agent detection

This means you can edit your own protected files…

…but the agent cannot.

That is the real boundary AI agents have been missing.

Why This Project Is Personal

I built Phylax because I lost data.

Real data.
Real work.
Real API keys.
Real system stability.

I don’t want anyone else to experience that.

AI agents are the future of software development.

But they need guardrails — not because they are bad, but because they are powerful.

And power without boundaries is dangerous.

Phylax is that boundary.

Try Phylax

If you want to try Phylax:

👉 https://github.com/TheUser99-spec/Phylax

DEV Community: phylax