I built a system that stops AI actions before they execute

#security #ai #python #machinelearning

Many AI systems today don’t just generate text. They take actions. They send emails, trigger workflows, move data, and even change infrastructure. And almost all of them follow the same pattern: the model decides, the action executes, and only then do you deal with what happened.

That always felt backwards to me.

I kept running into situations where AI wasn’t just suggesting things. It was doing things. Calling APIs, kicking off jobs, modifying state. And the only safeguards in place were logs, retries, or alerts after the fact. Which means if something goes wrong, it has already happened.

So I started thinking about what it would look like to move control earlier in the process.

Instead of letting AI act directly, every action gets evaluated first. Before anything executes, the system checks whether the action is allowed, whether it matches policy, and whether it is within scope. It returns a simple result: allow or deny. If the action is denied, it never happens.
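To make those three checks concrete, here is a minimal sketch in Python. It is not the API I built. The `Action` and `Decision` types, the policy tables, and the `evaluate` function are all hypothetical names chosen for illustration, and the rules are hard-coded where a real system would load them from configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    actor: str     # who proposes the action, e.g. an agent id
    name: str      # what it wants to do, e.g. "send_email"
    resource: str  # what it wants to act on


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# Hypothetical policy data, invented for this example.
PERMITTED_ACTIONS = {"support-agent": {"send_email", "read_ticket", "delete_ticket"}}
POLICY_DENIED = {"delete_ticket"}  # destructive, denied for everyone
SCOPES = {"support-agent": ("tickets/", "email/")}


def evaluate(action: Action) -> Decision:
    """Return allow or deny for a proposed action. Nothing is executed here."""
    # 1. Allowed: may this actor perform this kind of action at all?
    if action.name not in PERMITTED_ACTIONS.get(action.actor, set()):
        return Decision(False, "actor is not permitted to perform this action")
    # 2. Policy: is the action itself ruled out, whoever asks?
    if action.name in POLICY_DENIED:
        return Decision(False, "action is denied by policy")
    # 3. Scope: does the target resource fall under the actor's prefixes?
    if not action.resource.startswith(SCOPES.get(action.actor, ())):
        return Decision(False, "resource is out of scope")
    return Decision(True, "all checks passed")
```

Unknown actors fall through to a deny, so the default is closed rather than open. Every decision also carries a reason, which is what makes a denial explainable later.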

That one shift changes a lot. You are no longer reacting to bad outcomes. You are preventing them. Decisions become consistent, explainable, and enforceable across systems instead of scattered across logs and edge cases.

This matters most in places where actions are hard to undo. Financial operations, infrastructure changes, and automated workflows all fall into this category. Once those happen, rollback is often messy or incomplete.

I could not find a clean way to do this, so I built a small API around the idea. Evaluate first, execute second.
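The "evaluate first, execute second" ordering can be sketched as a decorator that refuses to call the wrapped function when the evaluator says no. This is an illustration under my own assumptions, not the interface of the API mentioned above. `gated`, `ActionDenied`, `refund_limit`, and `issue_refund` are hypothetical names, and the refund rule is made up.

```python
import functools
from typing import Any, Callable


class ActionDenied(Exception):
    """Raised when the evaluator denies an action. The action body never runs."""


def gated(evaluator: Callable[[str, dict], bool]):
    """Wrap a function so it only runs if evaluator(name, params) returns True."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(**params: Any) -> Any:
            # Evaluate first: a deny raises before the side effect can happen.
            if not evaluator(func.__name__, params):
                raise ActionDenied(f"{func.__name__} denied with params {params}")
            # Execute second: only reached on allow.
            return func(**params)
        return wrapper
    return decorator


def refund_limit(action_name: str, params: dict) -> bool:
    # Hypothetical rule: refunds up to 100 are allowed, everything else is denied.
    if action_name == "issue_refund":
        return params.get("amount", 0) <= 100
    return False


executed = []  # stands in for the real side effect


@gated(refund_limit)
def issue_refund(*, customer_id: str, amount: float) -> str:
    executed.append((customer_id, amount))
    return f"refunded {amount} to {customer_id}"
```

Calling `issue_refund(customer_id="c1", amount=50)` runs normally, while `amount=500` raises `ActionDenied` and leaves `executed` untouched. The block is hard rather than advisory because the only path to the side effect goes through the evaluator.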

I'm curious how others are thinking about this. Are you relying on safeguards after execution, or putting something in place before actions happen?

https://www.primeformcalculus.com

Top comments (1)

Harjot Singh • May 31

This is the architecture I'm most bullish on - an interception layer that evaluates an action BEFORE it executes is the only thing that actually contains a confidently-wrong agent. Detect-after-the-fact is forensics; stop-before-execute is prevention. The agent proposes, your system disposes - and the disposing has to be deterministic code, not another model's opinion.

The design question that decides its power: is the policy declarative, and is the block hard (the action literally cannot proceed) or merely advisory? Pre-execution evaluation plus a hard block on irreversible ops is exactly the gate I build into Moonshift (prompt to a shipped SaaS on your own GitHub+Vercel): deploys and destructive steps can't happen on the model's say-so alone. Genuinely important build. What granularity do you intercept at: tool calls, or a higher action abstraction? (Moonshift's first run is free, if useful.)