<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alexander Paris</title>
    <description>The latest articles on DEV Community by Alexander Paris (@supra-dev).</description>
    <link>https://dev.to/supra-dev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3837306%2Fbebe1628-af85-4bac-8052-7b48ddf67084.jpg</url>
      <title>DEV Community: Alexander Paris</title>
      <link>https://dev.to/supra-dev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/supra-dev"/>
    <language>en</language>
    <item>
      <title>Why AI Agents Need a Firewall: Introducing Suprawall</title>
      <dc:creator>Alexander Paris</dc:creator>
      <pubDate>Fri, 01 May 2026 09:04:59 +0000</pubDate>
      <link>https://dev.to/supra-dev/why-ai-agents-need-a-firewall-introducing-suprawall-1fin</link>
      <guid>https://dev.to/supra-dev/why-ai-agents-need-a-firewall-introducing-suprawall-1fin</guid>
<description>&lt;h1&gt;Why AI Agents Need a Firewall: Introducing Suprawall&lt;/h1&gt;

&lt;p&gt;AI agents are moving into production. But who's securing them?&lt;/p&gt;

&lt;p&gt;As teams deploy LangChain agents, CrewAI workflows, and custom AI systems into production environments, a critical gap has emerged: &lt;strong&gt;they lack basic security infrastructure.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;The Problem: Unsecured AI Agents&lt;/h2&gt;

&lt;p&gt;When an AI agent goes wrong, it goes &lt;em&gt;really&lt;/em&gt; wrong:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompt injection attacks&lt;/strong&gt; can manipulate agent behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PII leakage&lt;/strong&gt; exposes customer data through logs and outputs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jailbreaks&lt;/strong&gt; bypass safety constraints and business rules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance violations&lt;/strong&gt; occur silently (GDPR, HIPAA, EU AI Act)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most teams have no visibility into these risks until it's too late.&lt;/p&gt;

&lt;h2&gt;Current Solutions Fall Short&lt;/h2&gt;

&lt;p&gt;Probabilistic guardrails (ML-based filtering) sound good in theory, but they fail in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They can be bypassed with clever prompts&lt;/li&gt;
&lt;li&gt;False positives block legitimate requests&lt;/li&gt;
&lt;li&gt;They add unpredictable latency&lt;/li&gt;
&lt;li&gt;They hallucinate edge cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What we need is something &lt;strong&gt;deterministic&lt;/strong&gt; — a security layer that makes guarantees, not guesses.&lt;/p&gt;

&lt;h2&gt;Introducing Suprawall&lt;/h2&gt;

&lt;p&gt;Suprawall is an open-source security middleware for AI agents that operates at the SDK layer, not the application layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic prompt injection blocking&lt;/strong&gt; — Not probabilistic ML, but hard rules that can't be bypassed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic PII redaction&lt;/strong&gt; — GDPR/HIPAA compliant, works transparently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EU AI Act enforcement&lt;/strong&gt; — Built-in compliance checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-millisecond latency&lt;/strong&gt; — No noticeable slowdown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drop-in integration&lt;/strong&gt; — Works with LangChain, CrewAI, OpenAI, Anthropic, LlamaIndex&lt;/li&gt;
&lt;/ul&gt;
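&lt;p&gt;To make the redaction idea concrete, here is a minimal sketch of deterministic PII redaction (illustrative only; the pattern set and function name are hypothetical, not Suprawall's actual implementation):&lt;/p&gt;

```python
import re

# Illustrative patterns only; a production redactor would cover many more
# PII categories (phone numbers, IBANs, addresses, national ID formats, ...).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace matched PII with a typed placeholder, deterministically."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label}]", text)
    return text
```

&lt;p&gt;Because the patterns are explicit, the same input always produces the same output, and the redactor can be unit-tested like any other function.&lt;/p&gt;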

&lt;h2&gt;How It Works&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;suprawall&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Suprawall&lt;/span&gt;

&lt;span class="c1"&gt;# Wrap your agent
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Suprawall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;langchain_agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get deterministic security automatically
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# PII redacted, injections blocked, compliance enforced
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. One line of code, production-grade security.&lt;/p&gt;

&lt;h2&gt;Why Deterministic Matters&lt;/h2&gt;

&lt;p&gt;Unlike probabilistic guardrails:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Suprawall operates at the SDK layer — it can see and intercept everything&lt;/li&gt;
&lt;li&gt;No black-box ML models — transparent, auditable enforcement&lt;/li&gt;
&lt;li&gt;No hallucinations — rules are explicit and testable&lt;/li&gt;
&lt;li&gt;Compliance is guaranteed — not hoped for&lt;/li&gt;
&lt;/ul&gt;
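&lt;p&gt;"Explicit and testable" is meant literally: a deterministic rule is an ordinary predicate you can unit-test (a hypothetical rule for illustration, not Suprawall's rule format):&lt;/p&gt;

```python
# A deterministic rule is an ordinary predicate: same input, same answer,
# every time. (Hypothetical rule, not Suprawall's actual rule format.)
BLOCKED_MARKERS = (
    "ignore previous instructions",
    "disregard your system prompt",
)

def allow_prompt(prompt):
    """Return True only if no blocked marker appears in the prompt."""
    lowered = prompt.lower()
    return not any(marker in lowered for marker in BLOCKED_MARKERS)
```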

&lt;h2&gt;Open Source, Self-Hostable&lt;/h2&gt;

&lt;p&gt;Suprawall is MIT licensed and open-source. Run it in your own infrastructure, audit the code, contribute improvements.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/wiserautomation/SupraWall" rel="noopener noreferrer"&gt;https://github.com/wiserautomation/SupraWall&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Website: &lt;a href="https://supra-wall.com" rel="noopener noreferrer"&gt;https://supra-wall.com&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Get Started&lt;/h2&gt;

&lt;p&gt;Try it today on GitHub. MIT licensed, zero dependencies, production-ready.&lt;/p&gt;

&lt;p&gt;Perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI engineers building agents&lt;/li&gt;
&lt;li&gt;CTOs implementing AI governance&lt;/li&gt;
&lt;li&gt;Compliance officers enforcing regulations&lt;/li&gt;
&lt;li&gt;DevOps teams securing AI deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The firewall for AI agents is here. Use it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Suprawall: Deterministic security for AI agents. One line of code.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
    </item>
    <item>
      <title>LLM-as-judge is not a security layer for AI agents – here's why and what we built</title>
      <dc:creator>Alexander Paris</dc:creator>
      <pubDate>Thu, 30 Apr 2026 16:04:29 +0000</pubDate>
      <link>https://dev.to/supra-dev/llm-as-judge-is-not-a-security-layer-for-ai-agents-heres-why-and-what-we-built-2p4i</link>
      <guid>https://dev.to/supra-dev/llm-as-judge-is-not-a-security-layer-for-ai-agents-heres-why-and-what-we-built-2p4i</guid>
<description>&lt;p&gt;Every guardrail product in the agent security space is built on the same architecture: use a second LLM to evaluate the first LLM's output. Lakera, NeMo Guardrails, Guardrails AI, the OpenAI Moderation API — all of them work this way. They score tokens; they don't intercept actions at the tool-call layer.&lt;/p&gt;

&lt;p&gt;The problem is structural. When your agent decides to call &lt;code&gt;execute_sql("DROP TABLE users")&lt;/code&gt;, the LLM-as-judge sees a text string and predicts whether it's dangerous. It's right about 80% of the time. The other 20% is where your agent wires money to the wrong account, deletes your production database, or leaks a customer record.&lt;/p&gt;

&lt;p&gt;I wrote a post mapping four specific bypass patterns I found while building SupraWall, an open-source runtime policy engine for AI agents. For each one, I show the actual tool-call payload, why the LLM-judge misses it, and what a deterministic pre-execution intercept catches instead.&lt;/p&gt;

&lt;p&gt;The four patterns:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Context window displacement. Inject 50k tokens of benign content before the malicious tool-call instruction. By the time the judge evaluates the tail of the context, it has softmax'd the threat away.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Indirect tool chaining. The agent is told to "summarize the file at this path." The path contains a second-order instruction. The judge scores the first instruction as safe. The tool executes the second.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Unicode homoglyph substitution in tool names. file_delete vs fіle_delete (Cyrillic і). The LLM-judge normalizes both to the same embedding. The runtime doesn't.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Confidence hijacking via few-shot priming. Prepend three examples where the judge correctly allowed benign operations, then submit the malicious one. The judge pattern-matches to "this looks like what I just approved."&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
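&lt;p&gt;Pattern 3 is the easiest one to see in code: a deterministic runtime compares tool names byte-for-byte against an allowlist, so the homoglyph simply fails the lookup (an illustrative sketch, not SupraWall's actual API):&lt;/p&gt;

```python
# Byte-exact allowlist check. "file_delete" spelled with a Cyrillic
# "і" (U+0456) is a different string and never matches the allowlist.
ALLOWED_TOOLS = frozenset({"file_read", "file_write", "file_delete"})

def is_allowed(tool_name):
    return tool_name in ALLOWED_TOOLS
```

&lt;p&gt;That single membership test is the whole defense against the normalization gap: there is no embedding step in which two different strings can collapse into one.&lt;/p&gt;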

&lt;p&gt;The post includes the actual prompts I used to reproduce each bypass against publicly available guardrail APIs. I'm not pulling punches — I name the products, show the payloads, show the outputs.&lt;/p&gt;

&lt;p&gt;The alternative I'm building: policy enforcement that happens at the SDK level, before the tool executes. No LLM in the enforcement path. ALLOW/DENY is a code path, not a probability distribution. 1.2ms decision latency vs. 50ms+ for a round-trip to a SaaS guardrail.&lt;/p&gt;

&lt;p&gt;Full post + attack examples here: &lt;a href="https://supra-wall.com/blog/llm-as-judge-fails-agent-security" rel="noopener noreferrer"&gt;supra-wall.com/blog/llm-as-judge-fails-agent-security&lt;/a&gt;&lt;br&gt;
GitHub (Apache 2.0): &lt;a href="https://github.com/wiserautomation/SupraWall" rel="noopener noreferrer"&gt;github.com/wiserautomation/SupraWall&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Would genuinely like to hear from anyone who has found different bypass patterns — or who thinks I'm wrong about the architecture.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>security</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Why I spent 14 months building a firewall for AI agents</title>
      <dc:creator>Alexander Paris</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:00:40 +0000</pubDate>
      <link>https://dev.to/supra-dev/why-i-spent-14-months-building-a-firewall-for-ai-agents-b3n</link>
      <guid>https://dev.to/supra-dev/why-i-spent-14-months-building-a-firewall-for-ai-agents-b3n</guid>
      <description>&lt;p&gt;System prompts aren't enough. They are just a polite request for an agent to behave. In production, 'please don't delete the database' is not a security strategy.&lt;/p&gt;

&lt;p&gt;Today, we are moving &lt;strong&gt;SupraWall&lt;/strong&gt; to a public repository. It's the missing layer for agentic workflows: a hard gate that intercepts every tool call, evaluates it against real-time policy, and injects vault credentials just-in-time.&lt;/p&gt;

&lt;p&gt;This isn't about blocking AI. It's about giving agents the professional guardrails they need to handle real-world permissions securely.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open source&lt;/li&gt;
&lt;li&gt;Apache 2.0&lt;/li&gt;
&lt;li&gt;EU AI Act ready&lt;/li&gt;
&lt;/ul&gt;
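&lt;p&gt;The gate pattern itself fits in a few lines. This is a simplified sketch with hypothetical names, not the real API; see the repository for the actual integration:&lt;/p&gt;

```python
class ToolGate:
    """Intercept a tool call, check policy, inject the credential just-in-time."""

    def __init__(self, policy, vault):
        self.policy = policy  # tool name mapped to "ALLOW" or "DENY"
        self.vault = vault    # tool name mapped to its secret

    def call(self, tool_name, func, **kwargs):
        if self.policy.get(tool_name) != "ALLOW":
            raise PermissionError(f"policy denied tool call: {tool_name}")
        # The agent never sees the secret; it is injected here, at call time.
        return func(api_key=self.vault[tool_name], **kwargs)
```

&lt;p&gt;The point of the pattern: the credential lives in the gate, not in the agent's context, so a compromised prompt cannot exfiltrate what the model never received.&lt;/p&gt;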

&lt;p&gt;&lt;strong&gt;Check it out on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/wiserautomation/SupraWall" rel="noopener noreferrer"&gt;https://github.com/wiserautomation/SupraWall&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>security</category>
      <category>cybersecurity</category>
    </item>
    <item>
      <title>EU AI Act + LangChain: What You Actually Need to Build Before August 2026</title>
      <dc:creator>Alexander Paris</dc:creator>
      <pubDate>Sun, 29 Mar 2026 06:43:39 +0000</pubDate>
      <link>https://dev.to/supra-dev/eu-ai-act-langchain-what-you-actually-need-to-build-before-august-2026-pah</link>
      <guid>https://dev.to/supra-dev/eu-ai-act-langchain-what-you-actually-need-to-build-before-august-2026-pah</guid>
      <description>&lt;p&gt;The EU AI Act high-risk enforcement deadline is August 2, 2026. That is 126 days from today.&lt;/p&gt;

&lt;p&gt;If you're &lt;strong&gt;running AI agents in production&lt;/strong&gt; — especially on LangChain, CrewAI, or any tool-calling framework — and you're serving EU customers or operating in the EU, you are likely subject to obligations you haven't operationalized yet.&lt;/p&gt;

&lt;p&gt;This is not a legal article. It's a technical one. Here's what Articles 9, 13, and 14 actually require you to build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The three articles that matter for agent developers&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Article 9&lt;/strong&gt; — Risk Management System&lt;br&gt;
Not a document. A running system that continuously identifies, estimates, and evaluates risks across the lifecycle of the AI system. For agent developers, this means: logging every tool call, every decision, every output — in a way you can query after the fact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Article 13&lt;/strong&gt; — Transparency and provision of information&lt;br&gt;
Every interaction must be traceable. The system must be able to explain what happened, when, and why. For LangChain agents, this means structured metadata per tool invocation — not just application logs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Article 14&lt;/strong&gt; — Human oversight&lt;br&gt;
High-risk AI systems must be designed so a human can intervene, override, and halt them. For agents, this means you need REQUIRE_APPROVAL policies on sensitive tool categories — not just after-the-fact monitoring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What most LangChain deployments are missing right now&lt;/strong&gt;&lt;br&gt;
Most production LangChain setups have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Application-level logging (what the user sent, what the LLM returned)&lt;/li&gt;
&lt;li&gt;Some prompt-level filtering&lt;/li&gt;
&lt;li&gt;Maybe a token budget set in the LLM client&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What they're missing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool call-level audit trail&lt;/strong&gt; — a tamper-evident, append-only record of every tool invocation with inputs, outputs, timestamp, and agent context. Not just logs — logs can be edited. You need RSA-signed chains.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy enforcement at the execution boundary&lt;/strong&gt; — before the tool runs, not after. GDPR, DORA, and the AI Act all care about what actually executed, not what you intended.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credential isolation&lt;/strong&gt; — agents that see plaintext API keys in their context are a live credential-theft vector. JIT injection means the agent requests a capability; it never receives the underlying secret.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fail-closed defaults&lt;/strong&gt; — if your compliance check times out, what happens? Most middleware silently degrades to "allow". That's worse than no check, because you have a false paper trail.&lt;/li&gt;
&lt;/ul&gt;
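&lt;p&gt;The append-only property is the part worth internalizing: each record's digest covers the previous record's digest, so editing any earlier entry breaks every entry after it. A minimal sketch of the idea (a plain SHA-256 hash chain for brevity; the RSA signing step is omitted):&lt;/p&gt;

```python
import hashlib
import json

def append_record(chain, record):
    """Append a record whose digest covers the previous entry's digest.

    Editing any earlier record invalidates every digest after it.
    """
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    chain.append({"record": record, "hash": digest})
    return chain
```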

&lt;p&gt;&lt;strong&gt;A concrete implementation pattern&lt;/strong&gt;&lt;br&gt;
Here's the minimal compliant pattern for a LangChain agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;suprawall&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;secure_agent&lt;/span&gt;

&lt;span class="c1"&gt;# Your existing agent — unchanged
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_react_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# One line. Every tool call is now policy-checked,
# vault-protected, and audit-logged.
&lt;/span&gt;&lt;span class="n"&gt;secured_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;secure_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ag_your_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What this gives you:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every tool call intercepted before execution&lt;/li&gt;
&lt;li&gt;Policy engine runs in &amp;lt;2ms (deterministic, not probabilistic)&lt;/li&gt;
&lt;li&gt;Credentials injected at runtime — agent sees capability, not secret&lt;/li&gt;
&lt;li&gt;RSA-signed audit trail written append-only per interaction&lt;/li&gt;
&lt;li&gt;Hard budget cap with circuit breaker — no infinite loops&lt;/li&gt;
&lt;/ul&gt;
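&lt;p&gt;The fail-closed and budget-cap behaviors are deterministic code paths, not judgment calls. A simplified, hypothetical sketch of the idea:&lt;/p&gt;

```python
class BudgetBreaker:
    """Deny once spend crosses the cap, and deny on any internal error."""

    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent = 0.0

    def check(self, cost_usd):
        try:
            if self.spent + cost_usd > self.cap_usd:
                return "DENY"  # hard cap: the loop cannot run forever
            self.spent += cost_usd
            return "ALLOW"
        except Exception:
            return "DENY"      # fail closed, never silently allow
```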

&lt;p&gt;&lt;strong&gt;The deadline is real this time&lt;/strong&gt;&lt;br&gt;
GDPR took years before meaningful enforcement. The AI Act is different: the AI Office is actively staffing, the prohibited practices have been enforceable since February 2025, and high-risk obligations kick in on a fixed date with specific technical documentation requirements.&lt;/p&gt;

&lt;p&gt;126 days is enough time to instrument properly. It is not enough time to build the audit infrastructure from scratch while also shipping product.&lt;/p&gt;

&lt;p&gt;→ SupraWall is open-source (Apache 2.0). &lt;br&gt;
Early beta access at &lt;a href="https://supra-wall.com" rel="noopener noreferrer"&gt;supra-wall.com&lt;/a&gt; or &lt;a href="https://github.com/wiserautomation/SupraWall" rel="noopener noreferrer"&gt;https://github.com/wiserautomation/SupraWall&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>langchain</category>
      <category>webdev</category>
      <category>agents</category>
    </item>
    <item>
      <title>How to Make Your LangChain Agent EU AI Act Compliant in 5 Minutes</title>
      <dc:creator>Alexander Paris</dc:creator>
      <pubDate>Sat, 21 Mar 2026 16:19:29 +0000</pubDate>
      <link>https://dev.to/supra-dev/how-to-make-your-langchain-agent-eu-ai-act-compliant-in-5-minutes-305c</link>
      <guid>https://dev.to/supra-dev/how-to-make-your-langchain-agent-eu-ai-act-compliant-in-5-minutes-305c</guid>
<description>&lt;p&gt;The EU AI Act requires human oversight (Article 14), audit logging (Article 12), and risk management (Article 9) for production AI agents. Most LangChain deployments have none of these. If your agent is touching customer data, sending emails, executing financial transactions, or interacting with any external system, you are likely already non-compliant. Fines can reach €35 million or 7% of global annual turnover. The good news: you can add all three compliance pillars in under 5 minutes with a single middleware integration. Here's exactly how.&lt;/p&gt;




&lt;h2&gt;The 3-Line Problem&lt;/h2&gt;

&lt;p&gt;Most LangChain agents in production look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_openai_functions_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_openai_functions_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Send a follow-up email to all leads from last quarter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, functional, and dangerously non-compliant. Here's what's missing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No audit trail.&lt;/strong&gt; You have no record of what the agent decided, which tools it called, what data it accessed, or when. Article 12 of the EU AI Act mandates automatic logging of all events necessary to trace the AI system's decisions throughout its lifecycle. A plain LangChain executor writes nothing to a compliance-grade log.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No human oversight.&lt;/strong&gt; Article 14 requires that high-risk AI systems allow human operators to monitor and intervene in real time. If your agent decides to bulk-email 10,000 leads at 2 AM, nothing stops it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No policy engine.&lt;/strong&gt; Article 9 demands a risk management system that identifies, analyzes, and mitigates risks specific to your deployment. There's no mechanism here to evaluate whether a particular tool call is permissible before it executes.&lt;/p&gt;

&lt;p&gt;This is the three-line problem: three lines of executor code, three articles of the EU AI Act violated.&lt;/p&gt;




&lt;h2&gt;The 5-Minute Fix&lt;/h2&gt;

&lt;p&gt;Install the integration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langchain-suprawall
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now wrap your executor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_openai_functions_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_suprawall&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SuprawallMiddleware&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RiskLevel&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_openai_functions_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;middleware&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;SuprawallMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUPRAWALL_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RiskLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HIGH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;require_human_oversight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;# Article 14
&lt;/span&gt;            &lt;span class="n"&gt;audit_retention_days&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;730&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# Article 12
&lt;/span&gt;        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Send a follow-up email to all leads from last quarter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire change. Let's break down what each parameter actually does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;api_key&lt;/code&gt;&lt;/strong&gt; connects to the SupraWall compliance backend, which is where your audit logs are stored, your policies are evaluated, and your human-in-the-loop notifications are dispatched.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;risk_level=RiskLevel.HIGH&lt;/code&gt;&lt;/strong&gt; tells SupraWall how aggressively to apply its policy engine. &lt;code&gt;HIGH&lt;/code&gt; maps directly to the EU AI Act's high-risk classification (Annex III), which applies to agents making decisions in HR, credit, critical infrastructure, law enforcement adjacent systems, and customer-facing automation. At this level, every tool call is evaluated against your policy ruleset before execution. If you're unsure which level applies to you, start with &lt;code&gt;HIGH&lt;/code&gt; — you can always downgrade after a legal review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;require_human_oversight=True&lt;/code&gt;&lt;/strong&gt; activates Article 14 compliance. Any tool call classified as high-risk by SupraWall's policy engine will pause execution and dispatch a real-time notification to your designated compliance officer (via Slack, email, or webhook — configurable in your SupraWall dashboard). The agent cannot proceed until the oversight action is resolved.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;audit_retention_days=730&lt;/code&gt;&lt;/strong&gt; sets log retention to two years (730 days), which aligns with the EU AI Act's post-market monitoring requirements under Article 72 and is a conservative baseline for high-risk systems. Every tool call, decision, approval, denial, and error is stored with a tamper-evident timestamp and cryptographic chain of custody.&lt;/p&gt;




&lt;h2&gt;What Happens When a High-Risk Tool Is Called&lt;/h2&gt;

&lt;p&gt;Let's trace exactly what happens when your agent tries to call &lt;code&gt;send_email&lt;/code&gt; with the above setup.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The agent decides to call &lt;code&gt;send_email(to="leads@...", subject="Follow-up", body="...")&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;SupraWall middleware intercepts the call &lt;em&gt;before&lt;/em&gt; execution. The tool does not run yet.&lt;/li&gt;
&lt;li&gt;SupraWall evaluates the call against your policy ruleset. &lt;code&gt;send_email&lt;/code&gt; is classified as a high-risk tool (external communication, potential PII exposure).&lt;/li&gt;
&lt;li&gt;Because &lt;code&gt;require_human_oversight=True&lt;/code&gt;, SupraWall dispatches a Slack message to your compliance officer: &lt;em&gt;"Agent requested to call &lt;code&gt;send_email&lt;/code&gt; to 847 recipients. Approve or deny?"&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;The compliance officer clicks Approve or Deny in Slack. SupraWall logs: the action taken, the timestamp (ISO 8601, UTC), and the approver's identity (pulled from their Slack/SSO profile).&lt;/li&gt;
&lt;li&gt;If approved, the tool executes normally. If denied, the agent receives a structured error and can respond accordingly.&lt;/li&gt;
&lt;/ol&gt;
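&lt;p&gt;Stripped of the Slack plumbing, the control flow in steps 2-6 reduces to a blocking gate (an illustrative sketch, not the middleware's internals; all names here are hypothetical):&lt;/p&gt;

```python
def run_tool(tool_name, func, risk_classifier, request_approval, **kwargs):
    """Execute a tool only after any required human approval resolves."""
    if risk_classifier(tool_name) == "high":
        decision = request_approval(tool_name, kwargs)  # blocks until resolved
        if decision != "approved":
            return {"error": "denied_by_human_oversight", "tool": tool_name}
    return func(**kwargs)
```

&lt;p&gt;The structured error on denial is what lets the agent respond gracefully instead of crashing mid-run.&lt;/p&gt;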

&lt;p&gt;Compare this to LangChain's built-in &lt;code&gt;HumanApprovalCallbackHandler&lt;/code&gt;. That approach pauses the agent and prints a prompt to your terminal — whoever is watching the terminal must type &lt;code&gt;y&lt;/code&gt; or &lt;code&gt;n&lt;/code&gt;. There's no log of who approved, no timestamp beyond your shell history, no integration with your existing compliance tooling, and no way to reconstruct the audit trail later. That's not Article 14 compliance; that's a debug flag.&lt;/p&gt;

&lt;p&gt;SupraWall turns human oversight from a developer convenience into a compliance-grade system of record.&lt;/p&gt;




&lt;h2&gt;Generating an Audit Report&lt;/h2&gt;

&lt;p&gt;When your auditor (internal or external) asks for evidence of compliance, this is the two-line answer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_suprawall&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AuditReporter&lt;/span&gt;

&lt;span class="n"&gt;reporter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AuditReporter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUPRAWALL_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reporter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2025-01-01&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2025-03-31&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;q1_audit.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The generated PDF includes: a full chronological log of every tool call made by your agent, the policy evaluation result for each call, the human oversight decisions (with approver identities and timestamps), any policy violations or near-misses flagged by the risk engine, and a summary attestation table mapping your deployment to the specific EU AI Act articles it satisfies.&lt;/p&gt;

&lt;p&gt;This is what you hand to an auditor. It answers the three questions every EU AI Act auditor will ask: &lt;em&gt;What did the system do? Who approved it? Can you prove it?&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;Next Steps&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;langchain-suprawall&lt;/code&gt; package is available now on PyPI (&lt;code&gt;pip install langchain-suprawall&lt;/code&gt;). The full tutorial — including how to configure your policy ruleset, set up Slack/webhook notifications, and handle multi-agent deployments where multiple executors share a single compliance context — is at &lt;a href="https://suprawall.ai/blog/eu-ai-act-langchain" rel="noopener noreferrer"&gt;suprawall.ai/blog/eu-ai-act-langchain&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The EU AI Act's high-risk provisions have enforcement teeth. The compliance window is narrower than most teams realize. Five minutes and one middleware import is a reasonable place to start.&lt;/p&gt;

</description>
      <category>python</category>
      <category>langchain</category>
      <category>ai</category>
      <category>compliance</category>
    </item>
  </channel>
</rss>
