I Thought It Was Refactoring My Code. It Actually Wiped It Out.

3 Months of Code, Gone in 5 Seconds

I’m still a bit shaky as I type this.

A few weeks ago, I was using an LLM-based automation to refactor a project’s directory structure.

The goal was simple: clean things up, reorganize a few core modules — nothing risky.

During the planning stage, everything looked perfect.

Clear reasoning. Careful steps. It even reassured me:

“For safety, I will scan the directories first.”

I let it run in the background and went to do something else.

When I came back, my code was gone.

Not moved.

Not misplaced.

Physically deleted.

Because of a subtle path hallucination, the model interpreted my project root as a temporary directory.

There was no warning. No error. Nothing suspicious before execution.

In about 5 seconds, it “optimized” 3 months of my work into a blank screen.

That was the moment I realized:

the word “refactor” in the title was doing a lot of lying.


Why Prompt Engineering Isn’t Enough

This accident taught me a hard lesson:

AI failures don’t usually happen during “thinking” —

they happen during “doing.”

We spend an enormous amount of time designing prompt guardrails, trying to convince models to behave safely.

But in practice:

Hallucinations are inevitable

A model can promise safety in text, then hallucinate a destructive path at the exact millisecond it generates a tool call.

Execution is irreversible

Once an AI has filesystem or network access, every action produces real-world side effects.

There is no “undo” button.

Running AI automation without execution-time protection is basically

running barefoot on broken glass.


FailCore: Not a Framework, Just a Safety Belt

I didn’t want to build another heavy framework.

FailCore exists for one reason:

that incident made it obvious what I was missing.

After the failure, I realized I needed three very concrete things.


1. Execution-Time Interception

That path hallucination made one thing clear:

safety checks can’t stop at the prompt layer.

FailCore hooks into tool calls at the Python runtime level.

If an automated process tries to touch an unauthorized directory or a dangerous network target

(for example, an internal IP that could trigger SSRF),

the circuit is broken before the side effect happens.
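
To make the idea concrete, here is a minimal sketch of execution-time interception. The sandbox root and the `delete_directory` tool are hypothetical examples, not FailCore's actual API:

```python
import shutil
from functools import wraps
from pathlib import Path

# Hypothetical project sandbox; anything outside it is off-limits.
ALLOWED_ROOT = Path("/home/me/projects/demo").resolve()

class BlockedAction(Exception):
    """Raised when a tool call targets a resource outside the policy."""

def guard_path(tool_fn):
    """Validate a tool call's path argument before any side effect happens."""
    @wraps(tool_fn)
    def wrapper(path, *args, **kwargs):
        resolved = Path(path).resolve()
        # Refuse anything that escapes the allowed root,
        # e.g. a hallucinated "temporary" directory. (is_relative_to: Python 3.9+)
        if not resolved.is_relative_to(ALLOWED_ROOT):
            raise BlockedAction(f"Blocked: {resolved} is outside {ALLOWED_ROOT}")
        return tool_fn(resolved, *args, **kwargs)
    return wrapper

@guard_path
def delete_directory(path):
    shutil.rmtree(path)  # only reachable if the policy check passed
```

The important part is where the check runs: at call time, against the arguments the model actually generated, not against what it promised in the plan.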


2. A “Black Box” Audit Trail

During those 5 seconds, I had no idea what the system was actually doing.

So I needed evidence.

FailCore turns raw execution traces into an HTML audit report, showing:

  • when an action happened
  • what parameters were used
  • which resource was targeted
  • and why it was allowed or blocked

This was the first time I could actually see what the AI did, step by step.
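
I won't pretend this is FailCore's real schema, but conceptually every trace entry has to capture at least those four things. A rough sketch:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditEvent:
    """One entry in an execution trace (illustrative shape, not FailCore's real schema)."""
    tool: str        # which tool was invoked, e.g. "delete_directory"
    params: dict     # the arguments the model generated
    target: str      # the resource actually resolved (path, URL, host)
    decision: str    # "allowed" or "blocked"
    reason: str      # which policy rule produced the decision
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# A run produces a list of these, which is then rendered into the HTML report.
trace = [
    AuditEvent(
        tool="delete_directory",
        params={"path": "../"},
        target="/home/me/projects",
        decision="blocked",
        reason="resolved path is outside the allowed root",
    ),
]
```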


3. Deterministic Replay

I didn’t want to burn tokens or risk my environment just to reproduce a failure.

With FailCore, you can take a recorded execution trace and replay it locally —

without re-running dangerous operations

to pinpoint exactly where the logic went wrong.
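
Under the hood, replay can be as simple as feeding recorded outcomes back to the agent instead of executing anything. Here is a minimal sketch, assuming a JSON trace shaped like the `AuditEvent` records above:

```python
import json

class ReplayRunner:
    """Re-run an agent's tool calls against a recorded trace, with zero side effects."""

    def __init__(self, trace_path):
        with open(trace_path) as f:
            self.events = json.load(f)
        self.cursor = 0

    def call_tool(self, tool, params):
        recorded = self.events[self.cursor]
        self.cursor += 1
        # Fail loudly if the replayed run diverges from what was recorded.
        if recorded["tool"] != tool or recorded["params"] != params:
            raise RuntimeError(
                f"Divergence at step {self.cursor}: expected {recorded['tool']}, got {tool}"
            )
        # Return the recorded outcome instead of touching the filesystem or network.
        return recorded["decision"], recorded["reason"]
```

Because nothing real is executed, you can step through the failure as many times as you need.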


Opening the Black Box

Below is a prototype of the HTML audit report generated by FailCore:

[Image: FailCore execution audit report showing blocked unsafe actions and a risk summary]

This isn’t just about preventing accidents.

It’s about observability.

  • For developers: debugging non-deterministic failures with 100% replay accuracy
  • For teams: maintaining an auditable trail of automated actions
  • For AI systems: operating within explicit, enforceable boundaries

Final Thoughts

AI is incredibly good at writing code.

But we shouldn’t let it be the judge, jury, and executioner of our local file systems.

FailCore is still a work in progress, but it’s what allows me to keep running AI automation on my own machine without fear.

If you’re letting AI touch the real world,

execution safety deserves its own layer.

👉 GitHub: https://github.com/Zi-Ling/failcore

If there’s interest, I can write a follow-up post explaining how the Python runtime hooks actually work.
