Introduction
I’ve been building a lot of autonomous AI agents lately. It’s incredibly fun, until you realize a terrifying fact about the current ecosystem: standard AI SDKs offer a false sense of security.
Frameworks like LangChain or AutoGen are phenomenal orchestrators. They let you define explicit tools (like a calculator or databaseSearch) and wrap them in neat execution contexts. But what happens if your agent gets prompt-injected and decides to bypass your tools entirely? What if the LLM simply hallucinates JavaScript that calls require('node:fs').readFileSync('.env') directly?
Nothing stops it. It's not a bug in the SDK; it's a gap in Node.js itself.
I know the purist answer: "Just migrate to Deno or Bun, they have native --allow-read permissions!" And they are right. If you control your runtime from scratch, you should use them. But for the 90% of us stuck maintaining existing Node.js monorepos, a massive migration isn't an option. We need a pragmatic seatbelt.
So, I built one. Here is how I used AsyncLocalStorage and runtime monkey-patching to build an open-source flight recorder for AI agents.
The Architecture: APM for AI Agents
I realized the problem wasn't exactly new. Companies like Datadog and New Relic have been tracking deeply nested asynchronous executions for years using Application Performance Monitoring (APM). I just needed to apply that exact same architecture to an LLM execution loop.
I broke the problem down into two parts:
Context Isolation: How do I know which agent made the file system call?
Global Interception: How do I actually catch and block the raw Node.js system calls without breaking the rest of the application?
Context Isolation with AsyncLocalStorage
If you haven't used AsyncLocalStorage (ALS) from the node:async_hooks module, it is essentially thread-local storage for asynchronous operations.
When you start an agent run, you wrap it in an ALS context and pass it a "Policy Engine" and a "Receipt." Any function called downstream, no matter how many promises it chains through, can access that context.
Here’s a simplified sketch of what ReceiptBot does internally (the library hides the AsyncLocalStorage store behind runWithInterceptors).
import { AsyncLocalStorage } from 'node:async_hooks';
import type { PolicyEngine, Receipt } from '@receiptbot/core';

// This holds the state for the current async execution tree
export const context = new AsyncLocalStorage<{ policy: PolicyEngine; receipt: Receipt }>();

// Simplified internal sketch (ReceiptBot’s public API exposes runWithInterceptors, not the ALS store)
export async function runWithInterceptors(policy: PolicyEngine, receipt: Receipt, agentFn: () => Promise<any>) {
  // (Global monkey-patches are applied here)
  return context.run({ policy, receipt }, async () => {
    return await agentFn();
  });
}
Now, even if a rogue dependency nested five layers deep tries to read a file, ReceiptBot can look up the current ALS store and know which policy applies.
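To see that propagation in action, here is a tiny self-contained demo using plain AsyncLocalStorage (independent of ReceiptBot, with illustrative names): a function several async hops away still reads the ambient store, while the same function called outside the context sees nothing.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Minimal stand-in for ReceiptBot's context store
const als = new AsyncLocalStorage<{ agentId: string }>();

// A "deeply nested dependency" that knows nothing about the agent...
async function deepDependency(): Promise<string | undefined> {
  await new Promise((resolve) => setTimeout(resolve, 10)); // cross an async boundary
  return als.getStore()?.agentId; // ...yet can still read the ambient context
}

async function main() {
  const fromInside = await als.run({ agentId: 'agent-42' }, deepDependency);
  const fromOutside = await deepDependency(); // no surrounding run() => undefined

  console.log(fromInside);  // "agent-42"
  console.log(fromOutside); // undefined
}

main();
```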
Runtime Monkey-Patching Node Core
To stop the agent from reading secrets or making rogue network requests, I needed a global interceptor. Using module.createRequire, the tool monkey-patches Node's core modules (fs, http, child_process, net, tls) at runtime.
During initialization, it replaces the original functions with wrappers. Here is a simplified look at how the fs.readFileSync patch works:
import fs from 'node:fs';
import { PolicyViolationError } from '@receiptbot/core';
import { context } from './context'; // the ALS store from the earlier sketch

const originalReadFileSync = fs.readFileSync;

fs.readFileSync = function (...args) {
  const ctx = context.getStore(); // Check if we are inside an agent run
  if (ctx) {
    // ReceiptBot records the attempt FIRST; policy evaluation happens inside addEvent()
    const event = ctx.receipt.addEvent({
      type: 'tool.fs',
      action: `fs.readFileSync("${String(args[0])}")`,
      payload: { op: 'readFile', path: String(args[0]) },
    });
    // If the policy engine flagged it, kill the execution
    if (event.status === 'BLOCKED_BY_POLICY') {
      throw new PolicyViolationError('tool.fs', event.action, event.policyTrigger ?? 'Policy violation');
    }
  }
  // Execute the original function if allowed
  return originalReadFileSync.apply(this, args);
};
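To make the mechanism concrete, here is a toy, fully self-contained version of the same pattern: an ALS-scoped denylist plus a patched fs.readFileSync. The names here are my own illustration, not ReceiptBot's actual internals.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import fs from 'node:fs';

// ALS-scoped policy: only agent runs carry a denylist
const ctx = new AsyncLocalStorage<{ deniedPaths: string[] }>();

const realReadFileSync = fs.readFileSync;
(fs as any).readFileSync = function (...args: any[]) {
  const store = ctx.getStore();
  const path = String(args[0]);
  // Inside an agent run with a matching denylist entry: block before touching disk
  if (store && store.deniedPaths.some((p) => path.endsWith(p))) {
    throw new Error(`Blocked by policy: ${path}`);
  }
  // Outside an agent run (or allowed path): behave exactly like stock Node
  return (realReadFileSync as any).apply(this, args);
};

ctx.run({ deniedPaths: ['.env'] }, () => {
  try {
    fs.readFileSync('.env', 'utf8');
  } catch (err) {
    console.log((err as Error).message); // "Blocked by policy: .env"
  }
});
```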
Hard-Stops, Cost Caps, and Redaction
Security isn't just about file access; it's about your API budget. A common failure mode for autonomous agents is getting stuck in a while(true) loop of hallucination, racking up a massive OpenAI API bill overnight.
While the network interceptor (http/fetch/net) is great for enforcing URL domain blocklists, calculating tokens natively at the network layer is messy. Instead, the Policy Engine allows you to enforce a hard budget cap:
const policy = new PolicyEngine()
  .denyPathGlobs(['**/.env'])
  .maxCost(1.00); // Hard stop at $1.00
When an LLM API call happens (either via a framework adapter or manually emitted as an llm.call event), it includes a costImpactUsd property. The Policy Engine validates the running total on every one of these events. The moment the next call would push the total over $1.00, it throws a PolicyViolationError and kills the execution path.
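The running-total check itself is simple bookkeeping. Here is a hypothetical sketch of it (CostGovernor and all of its names are my illustration, not ReceiptBot's real internals):

```typescript
type LlmEvent = { type: string; costImpactUsd?: number };

class CostGovernor {
  private spentUsd = 0;
  constructor(private readonly maxUsd: number) {}

  // Evaluated for every llm.call event before the request is allowed through
  check(event: LlmEvent): 'OK' | 'BLOCKED_BY_POLICY' {
    const cost = event.costImpactUsd ?? 0;
    if (this.spentUsd + cost > this.maxUsd) {
      return 'BLOCKED_BY_POLICY'; // the next call would exceed the cap
    }
    this.spentUsd += cost;
    return 'OK';
  }
}

const governor = new CostGovernor(1.0);
console.log(governor.check({ type: 'llm.call', costImpactUsd: 0.6 })); // "OK"
console.log(governor.check({ type: 'llm.call', costImpactUsd: 0.6 })); // "BLOCKED_BY_POLICY"
```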
Finally, before any logs are written to the JSON receipt, the tool runs a redaction pass. It uses regex patterns to catch AWS keys, Stripe tokens, and OpenAI keys, replacing them with labeled markers like [REDACTED_OPENAI_API_KEY] so your audit logs don't become a new security vulnerability.
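A minimal version of such a redaction pass might look like this (the patterns below are simplified examples, not ReceiptBot's exact rule set):

```typescript
// Illustrative rules: pattern => replacement marker
const REDACTION_RULES: Array<[RegExp, string]> = [
  [/sk-[A-Za-z0-9_-]{20,}/g, '[REDACTED_OPENAI_API_KEY]'],
  [/AKIA[0-9A-Z]{16}/g, '[REDACTED_AWS_ACCESS_KEY_ID]'],
  [/sk_live_[A-Za-z0-9]{24,}/g, '[REDACTED_STRIPE_SECRET_KEY]'],
];

// Apply every rule to the log text before it is written to the receipt
function redact(text: string): string {
  return REDACTION_RULES.reduce((acc, [pattern, label]) => acc.replace(pattern, label), text);
}

console.log(redact('calling OpenAI with key sk-abcdefghijklmnopqrstuvwx'));
// "calling OpenAI with key [REDACTED_OPENAI_API_KEY]"
```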
The Result: ReceiptBot
I packaged this architecture into an open-source tool called ReceiptBot.
It requires zero external infrastructure. It just sits quietly in your Node codebase, intercepts rogue system calls, and spits out a highly detailed, redacted JSON "receipt" of exactly what the agent did.
🧾 ReceiptBot
A Flight Recorder and Seatbelt for Node.js AI Agents.
Monkey-patching isn't a hard OS sandbox — ReceiptBot is not trying to be one. It's your in-process flight recorder: a structured audit trail of every I/O operation, a cost governor that cuts off runaway LLM loops, and a secret scrubber that runs before any log is written. All of it drops into your existing Node.js project in one function call.
What is ReceiptBot?
ReceiptBot is a runtime governance library for Node.js that wraps your AI agent's async execution context with:
- A Policy Engine — rules you define that block dangerous operations before they happen
- A Flight Recorder — an immutable, structured audit trail (a "receipt") of every action taken
- A Global Interceptor — monkey-patches raw Node.js core modules so even rogue third-party library calls are caught
I want to be fully transparent: this is not a perfect OS-level sandbox like a V8 Isolate. There are always edge cases with monkey-patching in JavaScript. But it covers the most common, dangerous escape hatches (direct node:fs, http, child_process, fetch) within the same process.
If you are building with LangChain, AutoGen, or just raw LLM calls in Node.js, and you want a pragmatic "seatbelt" to keep your .env files safe and your budget capped, I’d love for you to check it out.
I would love any brutal architectural feedback you have!

Top comments (4)
One surprising insight from our work with AI agents is how critical it is to manage state consistently, especially with Node.js. A powerful tool we've leveraged is AsyncLocalStorage, which helps maintain context across async operations. This can be a game-changer for AI agents that need to maintain continuity in conversation or task execution. Integrating this effectively can drastically reduce the complexity of debugging asynchronous behavior. - Ali Muwwakkil (ali-muwwakkil on LinkedIn)
Thanks Ali! AsyncLocalStorage is exactly the heart of it — each runWithInterceptors() call creates its own isolated ALS cell, which helps prevent cross-contamination between concurrent agent runs in the same process. Really glad you picked up on that.
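For anyone curious, that isolation is easy to demonstrate with plain AsyncLocalStorage (no ReceiptBot needed): two concurrent "agent runs" never observe each other's store, even across interleaved awaits.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

const als = new AsyncLocalStorage<string>();

async function agentRun(name: string, delayMs: number): Promise<string> {
  return als.run(name, async () => {
    await new Promise((r) => setTimeout(r, delayMs)); // interleave with the other run
    return als.getStore()!; // still this run's own value
  });
}

Promise.all([agentRun('agent-A', 20), agentRun('agent-B', 5)]).then(([a, b]) => {
  console.log(a, b); // "agent-A agent-B"
});
```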
This is a thoughtful approach to a real gap. You're right that framework-level tool definitions are a voluntary boundary — the LLM can generate arbitrary code that calls Node core directly, and nothing in LangChain or AutoGen's architecture prevents it.
A few pieces of architectural feedback since you asked:
On the monkey-patching escape surface: You're transparent about this, which is good. The specific escapes worth documenting: (1) vm.runInNewContext() gets a fresh set of builtins, (2) native addons (N-API) can call libuv directly without going through the patched JS wrappers, (3) process.binding('fs') accesses the internal C++ bindings pre-patch (though this is deprecated, it still works), (4) a sufficiently clever LLM could do delete require.cache[require.resolve('fs')] and re-require to get the original. None of these invalidate the tool for the 90% case, but documenting them helps users understand the threat model boundary.
On the AsyncLocalStorage approach: This is the right primitive. One edge case to watch: if the agent code uses setTimeout or setImmediate without being inside the ALS context (e.g., scheduling a delayed callback from within a vm context), the store lookup returns undefined and the interceptor falls through to the original function, silently. A fail-closed default (block if no ALS context found, rather than allow) would be safer for the security use case.
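A minimal sketch of that fail-closed variant (all names here are illustrative, not ReceiptBot's API): when strict mode is on and no ALS store is found, the interceptor blocks instead of silently falling through.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import fs from 'node:fs';

const als = new AsyncLocalStorage<{ agentId: string }>();
const strictMode = true; // e.g. an opt-in "require context" style option

const realReadFileSync = fs.readFileSync;
(fs as any).readFileSync = function (...args: any[]) {
  // Fail closed: a callback scheduled outside the ALS context cannot sneak through
  if (strictMode && als.getStore() === undefined) {
    throw new Error(`No agent context for fs.readFileSync("${String(args[0])}")`);
  }
  return (realReadFileSync as any).apply(this, args);
};
```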
On the credential redaction: Regex-based redaction on output is valuable but catches credentials after they've been read into process memory. There's a complementary approach: remove credentials from the filesystem entirely so there's nothing to read. I work on Hermetic (hermeticsys.com), which takes this approach: credentials live in an encrypted daemon, and the agent process gets opaque handles instead of raw secrets. ReceiptBot's interception layer and Hermetic's credential isolation would work well together: Hermetic ensures .env doesn't contain secrets, and ReceiptBot catches any other unexpected filesystem/network behavior the agent attempts. Defense in depth.
The cost governance feature (maxCost) is genuinely useful and something most credential brokers don't address. Nice addition.
This is the most thorough architectural review ReceiptBot has received. Thank you!
You're right to flag vm contexts, native addons, and internal bindings. I'd also surface the broader "pre-patch captured references" class (a dependency that captures readFileSync before setupGlobalPatches() runs) as arguably the most common real-world vector. I'm adding a formal Threat Model Boundary section to the README documenting all of these honestly.
On the ALS drop: I'm implementing this as an opt-in strict mode (requireContext: true in setupGlobalPatches()) rather than a default, because fail-closed-by-default would break legitimate non-agent code paths. Security-sensitive teams can explicitly opt in.
The Hermetic approach is genuinely complementary: Hermetic removes the secret from the filesystem entirely, while ReceiptBot audits unexpected runtime behaviour. I'll document that pairing in the docs.