Autonomous agents rarely fail because of a single bad decision. They fail because they continue acting after they should have stopped.
Whether it's an LLM stuck in an infinite loop, a runaway script burning through your OpenAI token budget, or a rogue command attempting to execute `rm -rf` inside a critical cluster, the fundamental problem remains the same: agents lack a deterministic, physiological sense of pain.
To solve this, we built Tripwired (v0.1.7), an Apache 2.0 open-core behavioral kill-switch for AI agents. This article details the engineering decisions, performance optimizations, and architectural cause-and-effect trade-offs that shaped the Tripwired kernel.
1. The Core Problem: The Absence of "Stop"
In modern AI agent frameworks (LangChain, AutoGen, CrewAI), the primary focus is on expanding the agent's capabilities. However, introducing tools and unbounded loop execution creates severe systemic risks:
- Token Runaway: An agent encounters an unexpected error and continuously retries the same action, burning thousands of tokens per minute.
- Tempo Compression: An agent makes looping decisions too fast, creating a denial-of-service effect on backend systems.
- Dangerous Executions: The agent hallucinates or gets prompt-injected to execute destructive system commands.
The Goal: We needed a discrete physiological layer that intercepts the `AgentEvent`, assesses the `ActivityState`, and makes an immediate `IntentDecision` (`CONTINUE`, `PAUSE`, `STOP`) before the action is executed.
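The decision layer described above can be sketched in TypeScript. The type names (`AgentEvent`, `ActivityState`, `IntentDecision`) come from the article; the specific fields, thresholds, and check order are illustrative assumptions, not Tripwired's actual API:

```typescript
// Sketch of the "physiological" decision layer. Field names and
// thresholds are hypothetical; only the three-way verdict is from the article.
export type IntentDecision = "CONTINUE" | "PAUSE" | "STOP";

export interface AgentEvent {
  action: string;     // e.g. a shell command or tool call the agent wants to run
  tokensUsed: number; // tokens this step would consume
}

export interface ActivityState {
  totalTokens: number;      // running token spend for the session
  tokenBudget: number;      // hard ceiling (runaway protection)
  actionsPerMinute: number; // decision tempo (loop detection)
}

export function assess(event: AgentEvent, state: ActivityState): IntentDecision {
  // Hard stop: the action matches a known-destructive pattern.
  if (/rm\s+-rf|DROP\s+TABLE/i.test(event.action)) return "STOP";
  // Hard stop: the step would blow the token budget (token runaway).
  if (state.totalTokens + event.tokensUsed > state.tokenBudget) return "STOP";
  // Soft brake: the agent is looping too fast (tempo compression).
  if (state.actionsPerMinute > 120) return "PAUSE";
  return "CONTINUE";
}
```

The key property is that the verdict is computed before the action executes, so a `STOP` never races against the side effect it is meant to prevent.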
2. Architectural Evolution: Why We Needed a Rust Kernel
Our initial prototype (v0.1.0) was written entirely in Node.js. It worked perfectly for basic state tracking and token budget enforcement using an ActivityEngine and SafetyGate.
However, we quickly hit a performance bottleneck when we introduced LLM-based safety analysis along with deep regex pre-filtering.
Cause and Effect: The Event Loop Trap
- The Cause: Node.js is single-threaded. When the safety gate had to perform complex pattern matching and network orchestration to validate an agent's intent, it blocked the event loop. Furthermore, spawning isolated Node processes for safety checks took ~540ms (warm or cold).
- The Effect: A half-second penalty on every single agent action degraded the real-time experience of our systems.
The Solution: We extracted the high-performance decision engine into an isolated sidecar binary written in Rust (`kernel/`).
The Rust IPC Implementation
To integrate the Rust kernel with the Node.js application, we implemented a Dual IPC mechanism: Named Pipes for Windows and Unix Sockets for Linux, with a TCP fallback.
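On the Node.js side, the transport selection can be sketched with the standard `net` module, which handles Windows named pipes and Unix sockets through the same `path` option. The pipe path, socket path, and TCP port below are illustrative assumptions, not Tripwired's real defaults:

```typescript
// Minimal sketch of the Dual IPC transport selection: named pipes on
// Windows, a Unix domain socket elsewhere, TCP as the explicit fallback.
import * as net from "node:net";

export interface Endpoint {
  kind: "pipe" | "unix" | "tcp";
  path?: string;
  port?: number;
}

export function pickEndpoint(platform: string, forceTcp = false): Endpoint {
  if (forceTcp) return { kind: "tcp", port: 7077 }; // hypothetical port
  if (platform === "win32") {
    return { kind: "pipe", path: "\\\\.\\pipe\\tripwired" }; // hypothetical pipe name
  }
  return { kind: "unix", path: "/tmp/tripwired.sock" }; // hypothetical socket path
}

export function connect(ep: Endpoint): net.Socket {
  // node:net treats named pipes and Unix sockets uniformly via `path`.
  return ep.kind === "tcp"
    ? net.connect({ host: "127.0.0.1", port: ep.port! })
    : net.connect({ path: ep.path! });
}
```

Because the endpoint choice is a pure function of the platform, the same client code runs unchanged on Windows, Linux, and in environments where only TCP is available.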
Here is the latency benchmark for a Llama 3.2 3B model safety check:
| Scenario | Technology | Latency | Improvement |
|---|---|---|---|
| Baseline | Node.js spawn | ~540ms | - |
| Cold Start | Rust + TCP/IPC | 467ms | 13% faster |
| Warm State | Rust + TCP/IPC | 164ms | 70% faster |
By utilizing an isolated sidecar process, we ensure that even if the Node.js event loop freezes due to heavy agent execution, the kill-switch remains active and monitoring.
3. The 3μs Fast Path: Deterministic Pattern Filtering
While LLMs are excellent at nuanced intent analysis, using an LLM to check if an agent is trying to run `docker stop` is computationally wasteful and non-deterministic. We needed mathematical certainty for critical infrastructure patterns.
We introduced a Regex Pre-filter in the Rust kernel. Before any log or action payload reaches the LLM validation tier, it passes through this filter.
- Cause: Evaluating every action through an LLM introduces ~164ms of overhead and a non-zero hallucination risk.
- Effect: The pre-filter intercepts known dangerous patterns (e.g., `rm -rf`, `DROP TABLE`, `kubectl delete`) in 0.003ms (3μs). If a pattern matches, the action is killed instantly, bypassing the LLM completely.
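The fast-path logic is simple enough to sketch in a few lines. The real filter lives in the Rust kernel; this TypeScript version only illustrates the control flow, and the verdict names are assumptions:

```typescript
// Sketch of the deterministic pre-filter tier: patterns are compiled once,
// then every action is tested before it can reach the LLM validation tier.
const DANGEROUS: RegExp[] = [
  /rm\s+-rf/i,
  /DROP\s+TABLE/i,
  /kubectl\s+delete/i,
];

export function prefilter(action: string): "KILL" | "ESCALATE_TO_LLM" {
  for (const pattern of DANGEROUS) {
    // Deterministic, microsecond-scale check: no network, no model.
    if (pattern.test(action)) return "KILL";
  }
  // Only nuanced cases pay the ~164ms LLM cost.
  return "ESCALATE_TO_LLM";
}
```

The design choice here is asymmetric cost: the common safe case pays only a handful of regex tests, while a known-dangerous match short-circuits immediately with a verdict that cannot hallucinate.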
Tiered FilterConfig System (v0.1.7)
Because what is "dangerous" changes based on the context, we implemented a tiered FilterConfig system powered by TOML.
- Essential Tier: Hardcoded, non-bypassable protections against system-wide destruction (e.g., formatting disks).
- Domain Tier: Context-specific rules (e.g., blocking `patient.*delete` in a healthcare setting, or `market.*sell*` in a trading setting).
- Exclude Rules: Whitelisting specific valid patterns to prevent false positives.
```toml
# Example tripwired.toml
domain = "devops"
patterns = [
  "(?i)kubectl.*delete.*namespace",
]
exclude = [
  "(?i)test.*namespace",
]
```
4. The Data Trail: Immutability and Auditability
When an agent is killed, the operator needs to know why. Debugging an autonomous agent post-mortem requires exact state parity.
To solve this, Tripwired implements an append-only JSONL Audit Trail. Every decision logs the input hash (SHA-256), the model fingerprint (name@config_hash), the prompt version, and the raw inference response.
- Cause: Traditional logging overwrites state and lacks cryptographically verifiable inputs.
- Effect: The JSONL trail ensures that every "kill" decision can be replayed and mathematically verified in a sandbox, proving exactly why the agent's behavior was classified as runaway or dangerous.
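An audit record of this shape can be produced with Node's standard `crypto` and `fs` modules. The field names below are assumptions about the schema; only the contents (SHA-256 input hash, `name@config_hash` fingerprint, prompt version, raw response) come from the article:

```typescript
// Sketch of an append-only JSONL audit record. One JSON object per line;
// the file is only ever appended to, never rewritten.
import { createHash } from "node:crypto";
import { appendFileSync } from "node:fs";

export interface AuditRecord {
  ts: string;
  inputSha256: string;   // hash of the exact payload that was judged
  model: string;         // name@config_hash fingerprint
  promptVersion: string;
  rawResponse: string;   // verbatim inference output
  decision: "CONTINUE" | "PAUSE" | "STOP";
}

export function makeRecord(
  input: string,
  model: string,
  promptVersion: string,
  rawResponse: string,
  decision: AuditRecord["decision"],
): AuditRecord {
  return {
    ts: new Date().toISOString(),
    inputSha256: createHash("sha256").update(input).digest("hex"),
    model,
    promptVersion,
    rawResponse,
    decision,
  };
}

export function appendAudit(path: string, rec: AuditRecord): void {
  appendFileSync(path, JSON.stringify(rec) + "\n");
}
```

Because the input hash is deterministic, replaying the same payload against the same model fingerprint and prompt version lets an operator verify in a sandbox that the recorded decision was the one the system actually made.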
5. Looking Forward: Zero-Config and Embedded Inference
Tripwired does not try to make agents smarter; it provides the structural boundaries to make them safer.
Our immediate roadmap (v0.2.0) focuses on Managed Sidecars—a lifecycle orchestrator in Node.js that automatically downloads, spawns, and connects to the correct platform-specific Rust binary without the user knowing it's there. Just npm install and go.
Later, in v0.2.1, we aim to eliminate the HTTP overhead entirely by embedding llama.cpp directly into the Rust kernel via FFI bindings, targeting a warm latency of 50-80ms.
Conclusion
Building safe autonomous agents requires acknowledging their capacity for rapid, unconstrained failure. By combining the developer experience of Node.js with the deterministic performance and process isolation of a Rust kernel, Tripwired provides the safety net required to field AI agents in production environments.
Tripwired is open-source. Check out the npm package and the repository to integrate the kill-switch into your own agent pipelines.