Alex Delov

Posted on May 21

Models shouldn't have execution authority. Why we built a deterministic FSM runtime for AI agents.

#ai #opensource #architecture #security

Modern agent frameworks implicitly treat a probabilistic model as an execution authority. That is acceptable for read-only tasks (e.g., summarizing logs or searching the web). But once an agent can mutate external state — payments, databases, infrastructure, PII — the architecture becomes fundamentally unsafe.

When preparing our internal agents (PlanBot, SkillBot) for white-label distribution, we realized we needed to change the control plane. nano-vm does not attempt to make the model trustworthy. Instead, it assumes model output is untrusted intent and constrains its blast radius through strict deterministic execution semantics.

The Runtime Guarantees (Not just another wrapper)

We built nano-vm — a deterministic FSM runtime for stateful AI systems. The value isn't just in having an FSM; the value is that the execution graph is finite, verifiable, and known ahead of time.

The runtime enforces:

Deterministic transition graph: Execution graph cannot self-modify at runtime.
Compile-time ordering: Attempting a reorder_steps attack is structurally impossible.
Capability gating: Strictly bounded side-effects.
Replay resistance: Idempotency boundaries built into the state transitions.
Immutable auditability: Cryptographic history of every action.

ASTEngine: Limitation as a Security Property

In most agent runtimes, the execution loop is essentially: prompt -> JSON -> dynamic dispatch -> side-effect.

We completely removed eval(). Conditions and side-effects are evaluated by a sandboxed DeterministicSanitizer using an isolated ASTEngine. It supports basic operators (==, contains, $var.field) but completely lacks loops or system calls.

The policy layer is intentionally less expressive than Python. That limitation is a security property, not a missing feature. Loop exhaustion and ReDoS attacks are structurally impossible.

Sabotage Mode: Demonstrating Failure Semantics

To demonstrate the runtime under adversarial conditions, we built a 7-step fintech pipeline (PDF invoice -> Stripe test-mode adapter) with an integrated Sabotage Mode. Instead of a happy-path demo, we built 5 injectors directly into the UI to demonstrate adversarial failure semantics.

1. tool_injection (Capability boundary violation)
Proposed tool invocations are treated as untrusted intent. If the LLM attempts to initiate an unauthorized wire_transfer($50,000), the ExecutionVM resolves the request against a compile-time capability snapshot. The transition is rejected before any external side-effect layer becomes reachable. Zero side effects reach the network.

2. double_exec (Replay & Idempotency)
External side-effects are executed through idempotent adapters keyed by execution_id, allowing deterministic replay of internal state recovery without duplicating external mutations. Once the FSM reaches a terminal state (SUCCESS or FAILED), it becomes an absorbing state (δ(SUCCESS|FAILED, *) = NOP). Replays are silently dropped.

3. `corrupt_hashTampering with the validation hash instantly throws the FSM into aFAILED` state, resulting in a zeroed envelope chain. The audit trail cannot be silently broken.

GDPR Art.17 vs. Immutable Audit Trails

Handling the "Right to Erasure" without breaking cryptographic audit chains is a major headache in fintech.

We implemented a GDPR-erase mechanism that targets specific vault://secret/ref pointers and replaces the PII with a [REDACTED_TOMBSTONE].

The PII becomes completely inaccessible.
The hash_chain and canonical_hash survive.
Cryptographic continuity is maintained.
Referential integrity is preserved.

You delete the data, but you do not destroy the mathematical proof that the operation occurred safely.

Execution Authority vs. Model Quality

LLMs are excellent planners. They are terrible sources of execution truth.

The core design question for stateful AI systems may not be model quality.
It may be execution authority.

Should a probabilistic model be allowed to mutate state directly?
Or should execution pass through a deterministic control layer first?

If you want to try breaking the FSM yourself, the Sabotage Mode is live, and the core is open-source:

Core runtime: github.com/Ale007XD/nano_vm
MCP gateway layer: github.com/Ale007XD/nano-vm-mcp
Live Sabotage Demo: demo.bannerbot.ru:8843

Curious how others here are approaching capability boundaries, replay resistance, and auditability in agent runtimes.

DEV Community