Kowshik Jallipalli
AI is a Non-Deterministic Guest in a Deterministic House: Stop Building Chatbots, Start Building Sandboxes

The Signal: The Legally Binding Hallucination
In early 2024, Air Canada's customer support chatbot hallucinated a bereavement fare policy. A customer claimed the refund, the airline refused, and a tribunal ruled in favor of the customer: the company was held responsible for what its chatbot said, exactly as if a human agent had said it.

The failure wasn't that the LLM hallucinated—it’s that it was allowed to speak directly to the customer and the database without a chaperone. When you give a non-deterministic guest unregulated access to your deterministic house, you are legally and financially responsible for the fire.

We need to stop treating AI as an open-ended "chat" interface and start treating it as untrusted, highly volatile code execution.

Phase 1: The Architectural Bet
We are shifting from Open Dialogue to Hardened State-Machine Confinement.

The Vendor Trap is the "Chat Completion API." It encourages you to build open text boxes where users ask for anything, and the AI returns anything. It relies on "system prompts" to enforce behavior—which is like asking a burglar to please lock the door on their way out.

The Ownership Path is the Isolate Sandbox. We don't want a conversationalist; we want a function that takes inputs, runs in a memory-isolated, resource-limited environment, and outputs a strictly typed payload that we validate before it ever touches our main thread.
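That "strictly typed payload" contract can be sketched as a validation gate. This is a minimal, hand-rolled illustration — the `action`/`amount` field names mirror the refund example later in this post, not any real API:

```javascript
// Minimal sketch of the Ownership Path: the agent's reply is treated as an
// untrusted string, parsed, and checked against an explicit contract before
// it may touch the rest of the system. Field names are illustrative.
const ALLOWED_ACTIONS = new Set(['refund', 'escalate', 'reject']);

function validateAgentPayload(rawText) {
    let payload;
    try {
        payload = JSON.parse(rawText); // output crosses the boundary as a string
    } catch {
        return { ok: false, reason: 'not valid JSON' };
    }
    if (!ALLOWED_ACTIONS.has(payload.action)) {
        return { ok: false, reason: `unknown action: ${String(payload.action)}` };
    }
    if (typeof payload.amount !== 'number' || payload.amount < 0) {
        return { ok: false, reason: 'amount must be a non-negative number' };
    }
    // Rebuild the object instead of forwarding the parsed one, so extra
    // keys the model invented never reach the main thread.
    return { ok: true, payload: { action: payload.action, amount: payload.amount } };
}
```

The key design choice: the gate rebuilds the payload field by field rather than passing the parsed object through, so hallucinated extra keys are dropped by construction.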

Phase 2: The Security Audit (Why your current sandbox is a liability)
Last week, I proposed using the native Node.js vm module to sandbox agent outputs. Our Lead QA and Security Tester ripped the pull request to shreds. Here is the audit report that forced an architectural rewrite:

Senior Tester Audit Report:

CRITICAL VULNERABILITY (Sandbox Escape): The native Node.js vm module is not a security boundary. The official docs explicitly state: "The node:vm module is not a security mechanism. Do not use it to run untrusted code." An LLM can trivially emit code that walks the prototype chain out of the sandbox and achieves Remote Code Execution (RCE) on the host machine.

CRITICAL VULNERABILITY (Event Loop DOS): vm.runInContext runs on the main thread. If the LLM generates a simple while(true) {} loop, it will block the Node.js event loop entirely. Your server will instantly drop all active user connections.

HIGH VULNERABILITY (State Corruption): If you pass live objects (like a DB connection) into the vm context by reference, the agent can mutate them globally, poisoning state for every other request on the server.

The Verdict: We cannot use native Node.js tools. We must drop down to the C++ V8 engine level.

Phase 3: The Production Implementation (V8 Isolates)
To build a true "Boss Battle" arena, we use isolated-vm. This creates a completely separate instance of the V8 JavaScript engine with its own memory heap. If the AI triggers an infinite loop or tries to break out, we snipe the isolate without affecting the main Node.js server.

const ivm = require('isolated-vm');
const { trace } = require('@opentelemetry/api');

const tracer = trace.getTracer('ai.hardened_sandbox');

class FortressSandbox {
    constructor(memoryLimitMB = 64, timeoutMs = 1500) {
        this.memoryLimitMB = memoryLimitMB;
        this.timeoutMs = timeoutMs;
    }

    async executeUntrustedAgent(aiGeneratedLogic, safeInputPayload) {
        return tracer.startActiveSpan('v8_isolate_execution', async (span) => {
            // 1. The Hard Boundary: Create a separate V8 heap
            const isolate = new ivm.Isolate({ memoryLimit: this.memoryLimitMB });
            const context = await isolate.createContext();
            const jail = context.global;

            try {
                // 2. State Management: Pass data as deeply cloned strings, NEVER by reference
                jail.setSync('global', jail.derefInto());
                jail.setSync('_inputData', JSON.stringify(safeInputPayload));

                // 3. Compile the Agent's logic (async, so the host thread stays free)
                const script = await isolate.compileScript(`
                    // Agent must parse input, do its logic, and return a stringified result
                    const input = JSON.parse(_inputData);
                    let output = {};

                    ${aiGeneratedLogic}

                    JSON.stringify(output);
                `);

                // 4. The Dead Man's Switch: Run with strict timeouts.
                // run() executes the script off the main thread; if it loops
                // infinitely, the isolate is terminated at the deadline and the
                // main event loop never blocks.
                const resultStr = await script.run(context, { timeout: this.timeoutMs });

                span.setAttribute('sandbox.status', 'success');
                return JSON.parse(resultStr);

            } catch (error) {
                span.recordException(error);
                span.setAttribute('sandbox.status', 'terminated');
                // The guest tried to burn the house down. The house won.
                return { 
                    error: `GUARD INTERVENTION: Agent execution terminated. Reason: ${error.message}` 
                };
            } finally {
                // 5. Memory Cleanup: Destroy the arena
                isolate.dispose();
                span.end();
            }
        });
    }
}

// Example Usage:
// const fortress = new FortressSandbox();
// const output = await fortress.executeUntrustedAgent("output.action = 'refund'; output.amount = input.amount;", { amount: 500 });

Phase 4: Checklist (What to Build Next)
[ ] Implement Zod Egress Filtering: The output of FortressSandbox is secure from a code-execution standpoint, but the data is still untrusted. Pipe the output directly into a zod schema validator. If it fails, drop the request.

[ ] Tail-Based OTel Sampling: Sandboxes will fail often (by design). Configure your OpenTelemetry collector to only save the full trace spans for sandbox.status === 'terminated' to save on Datadog/Honeycomb costs.

[ ] Multi-Agent Firebreaks: If Agent A passes data to Agent B, it must pass through a schema check in between. Never let two agents share the same V8 isolate memory space.
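A hypothetical sketch of the firebreak pattern — the agents here are stand-in functions, and `validate` can be any checker (hand-rolled or a zod schema) that returns an ok/reason result:

```javascript
// Firebreak: data leaving Agent A must pass a schema gate before it may
// become Agent B's input. The gate rebuilds the payload, so anything
// Agent A hallucinated is dropped rather than propagated.
function firebreak(validate) {
    return (payload) => {
        const result = validate(payload);
        if (!result.ok) {
            throw new Error(`Firebreak tripped: ${result.reason}`);
        }
        return result.value; // only the validated, rebuilt value crosses over
    };
}

// Stand-in agents wired together with the gate in between:
const agentA = () => ({ amount: 500, junkKey: 'hallucinated' });
const agentB = (input) => `processing refund of ${input.amount}`;

const gate = firebreak((p) =>
    typeof p.amount === 'number' && p.amount >= 0
        ? { ok: true, value: { amount: p.amount } } // rebuilt: junkKey is dropped
        : { ok: false, reason: 'amount missing or negative' }
);

const handoff = gate(agentA());
const result = agentB(handoff);
```

Each agent still runs in its own isolate; the firebreak is the only path between them, and it throws (fails closed) instead of forwarding bad data.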

The Bottom Line: Treat LLM outputs like user input from the public internet in 1999. Sanitize it, isolate it, and expect it to be malicious by default. Build the house. Contain the guest.
