Multi-agent AI systems are becoming more common.
Instead of one model answering one prompt, we now have systems where multiple agents plan, call tools, critique each other, route tasks, write code, verify outputs, and pass messages across a workflow.
That creates a new kind of engineering problem.
In a single-agent system, a bad answer is usually local.
In a multi-agent system, a bad answer can propagate.
One agent may produce an uncertain or incomplete output. Another agent may treat it as reliable context. A third agent may make a decision based on that transformed context. By the time the system fails, the original uncertainty has already been amplified across the agent graph.
This is the problem that Agentic Firewall tries to explore.
GitHub: https://github.com/schchit/Agentic-Firewall
## The Problem: Agentic Systems Fail Through Propagation
Most LLM safety and reliability tools focus on individual inputs and outputs:
- Is this prompt safe?
- Is this model response toxic?
- Does this answer contain PII?
- Did the model hallucinate?
- Is this tool call allowed?
These checks are useful, but they are not enough for multi-agent systems.
In agentic workflows, the unit of risk is not just a single message.
The risk is the communication path.
For example:
```text
Planner Agent
  → Coder Agent
  → Reviewer Agent
  → Deployment Agent
```
If the planner makes an uncertain assumption, the coder may implement around it. The reviewer may only check code correctness, not whether the original assumption was valid. The deployment agent may then act on a decision that looks structured and confident, even though the initial observation was underdetermined.
The failure did not come from one bad message alone.
It came from uncertainty moving through the system.
## What Is an Agentic Firewall?
An Agentic Firewall is a communication control layer between agents.
Instead of allowing agents to pass raw messages directly to each other, messages can be routed through a firewall layer that checks whether the message is safe, sufficient, target-relevant, and stable enough to continue.
A simple version looks like this:
```text
Agent A
  → Agentic Firewall
  → checked / compressed / risk-scored message
  → Agent B
```
The goal is not to replace agents.
The goal is to govern what flows between them.
Agentic Firewall focuses on four checks:
- Target-relevant compression
- Observation-target refinement
- Uncertainty cascade detection
- Verifier placement and risk contraction
These checks are implemented in the current repository as a practical Python/FastAPI runtime component. The repo describes the project as a graph-theoretic communication firewall for multi-agent systems, with checks for target-relevant facts, observation sufficiency, uncertainty cascades, and verifier placement. ([GitHub][1])
## 1. Target-Relevant Compression
Agent messages are often too verbose.
A planner may send paragraphs of reasoning when the next agent only needs:
```json
{
  "target": "deploy_decision",
  "facts": ["tests passed", "rollout window is short"],
  "risk": "medium",
  "next_step": "deploy canary"
}
```
Compression is useful because it reduces token cost and improves routing clarity.
But compression can also be dangerous.
If the compressed message removes a fact that is necessary for the downstream target decision, the system may become cheaper but less reliable.
So the important question is not:
Did we compress the message?
The important question is:
Did we preserve the facts needed to determine the target?
That is target-relevant compression.
Agentic Firewall includes a semantic_loss.py component for checking whether compression loses target-relevant information and can produce semantic-loss certificates. ([GitHub][1])
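As a rough sketch of the underlying idea (the names below are illustrative, not the semantic_loss.py API), a compression check can compare the facts the target decision needs against the facts that actually survive compression:

```python
# Illustrative target-relevant compression check.
# Names (check_compression, required_facts) are hypothetical,
# not the semantic_loss.py API.

def check_compression(original_facts, compressed_facts, required_facts):
    """Return (ok, lost): lost = target-relevant facts dropped by compression."""
    lost = [f for f in required_facts
            if f in original_facts and f not in compressed_facts]
    return len(lost) == 0, lost

original = {"tests passed", "rollout window is short", "author: alice"}
compressed = {"tests passed"}  # over-aggressive compression
required = ["tests passed", "rollout window is short"]  # needed for deploy_decision

ok, lost = check_compression(original, compressed, required)
print(ok, lost)  # → False ['rollout window is short']
```

Dropping "author: alice" is fine here; dropping the rollout-window fact is not, because the target decision depends on it. That asymmetry is the whole point of making compression target-relative.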
## 2. Observation-Target Refinement
In many agentic systems, an agent observes only part of the world.
The system then asks that agent to make a target decision.
But sometimes the observation is not sufficient to determine the target.
For example:
```text
Observation: "tests failed"
Target decision: "rollback" or "retry"
```
That observation may be insufficient.
The correct decision may depend on:
- which tests failed,
- whether the failure is flaky,
- whether production is affected,
- whether the deployment is reversible,
- whether the failure came from infrastructure or application logic.
If two different world states produce the same observation but require different target decisions, then the target is not determinable from the observation.
Agentic Firewall adds a determinability layer through determinability.py, described in the README as a finite (F, Ω, D) checker that can output a decision table or residual conflict certificate. ([GitHub][1])
In simpler terms:
If the same observation can imply different correct actions, the firewall should not blindly allow the decision to continue.
It should require more information, a verifier, or a safer fallback.
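The core of such a check is small. The sketch below groups configurations by observation and flags any observation that maps to more than one target (illustrative only; the repo's determinability.py additionally produces decision tables and residual conflict certificates):

```python
# Minimal determinability sketch: the target is determinable from the
# observation only if no two configurations share an observation but
# demand different targets. Function name is hypothetical, not the
# determinability.py API.
from collections import defaultdict

def find_conflicts(configs, obs_fn, target_fn):
    seen = defaultdict(set)  # observation -> set of required targets
    for c in configs:
        seen[obs_fn(c)].add(target_fn(c))
    return {obs: targets for obs, targets in seen.items() if len(targets) > 1}

configs = [
    {"obs": "tests failed", "target": "rollback"},
    {"obs": "tests failed", "target": "retry"},  # same obs, different target
    {"obs": "tests passed", "target": "deploy"},
]
conflicts = find_conflicts(configs, lambda c: c["obs"], lambda c: c["target"])
print(conflicts)  # → {'tests failed': {'rollback', 'retry'}}
```

A non-empty conflict dict means the observation is not sufficient, and the firewall should demand more information rather than let the decision flow downstream.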
## 3. Uncertainty Cascade Detection
Multi-agent systems can be represented as graphs.
Agents are nodes.
Messages are edges.
Some edges amplify uncertainty more than others.
A workflow like this may be stable:
```text
Planner → Coder → Reviewer
```
But a workflow with loops, strong dependency edges, or repeated self-reinforcement may become unstable:
```text
Planner → Coder → Reviewer
   ↑                  ↓
   └──── Replanner ←──┘
```
If uncertainty keeps circulating, the system can converge to a bad decision with high confidence.
Agentic Firewall includes an uncertainty cascade engine in cascade.py. The README states that it estimates spectral radius and classifies the graph as convergent, critical, or unstable. ([GitHub][1])
The idea is simple:
Some agent topologies dampen uncertainty. Others amplify it.
A firewall should be able to detect when the communication graph itself is risky.
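To make the topology claim concrete: if edge weights are treated as uncertainty amplification factors, the spectral radius of the weighted adjacency matrix tells you whether circulating uncertainty dies out or grows. The sketch below shows the calculation (the thresholds and classification labels follow the README's convergent / critical / unstable split, but the numbers here are illustrative, not cascade.py's):

```python
# Cascade sketch: build the weighted adjacency matrix of the agent graph
# and check its spectral radius. Thresholds are illustrative.
import numpy as np

def spectral_radius(agents, edges):
    idx = {a: i for i, a in enumerate(agents)}
    A = np.zeros((len(agents), len(agents)))
    for src, dst, w in edges:
        A[idx[src], idx[dst]] = w
    return max(abs(np.linalg.eigvals(A)))

def classify(rho, critical=0.95):
    if rho >= 1.0:
        return "unstable"   # uncertainty grows each loop
    if rho >= critical:
        return "critical"
    return "convergent"     # uncertainty is damped

agents = ["planner", "coder", "reviewer", "replanner"]
edges = [
    ("planner", "coder", 0.8),
    ("coder", "reviewer", 0.9),
    ("reviewer", "replanner", 0.9),  # feedback loop back to the planner
    ("replanner", "planner", 0.9),
]
rho = spectral_radius(agents, edges)
print(classify(rho))  # → convergent
```

Here the product of weights around the loop is below 1, so the cycle damps uncertainty; push any loop's weights to 1 or above and the radius crosses into critical or unstable territory.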
## 4. Verifier Placement
If the system detects residual conflicts or cascade risk, the next question is:
Where should we add verification?
Not every agent edge needs a verifier.
Adding verification everywhere increases cost and latency.
The better approach is to place verifiers where they reduce the most risk.
For example:
```text
Planner → Coder → Reviewer → Deployment
```
If the highest-risk transition is from Reviewer to Deployment, then that may be the best place to add a verifier.
Agentic Firewall includes verifier placement logic as part of its governance decision. The README describes this as recommending verifier nodes when conflicts or critical cascade risk appear. ([GitHub][1])
This turns the firewall from a passive checker into an active governance layer.
It can recommend actions such as:
```text
ALLOW
REQUIRE_VERIFIER
TERMINATE
```
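A minimal version of this governance step can be sketched in a few lines: pick the riskiest edge as the verifier location and map cascade risk plus residual conflicts to an action. The function names and thresholds below are hypothetical, not firewall.py's actual policy:

```python
# Illustrative governance sketch. Names and thresholds are hypothetical,
# not the firewall.py API.

def place_verifier(edges):
    """edges: (src, dst, risk_weight) tuples; return the riskiest edge."""
    return max(edges, key=lambda e: e[2])

def governance_action(spectral_radius, residual_conflict_count):
    if spectral_radius >= 1.0:
        return "TERMINATE"         # unstable cascade: stop the flow
    if residual_conflict_count > 0 or spectral_radius >= 0.95:
        return "REQUIRE_VERIFIER"  # critical: insert verification
    return "ALLOW"

edges = [
    ("planner", "coder", 0.6),
    ("coder", "reviewer", 0.7),
    ("reviewer", "deployment", 0.95),
]
src, dst, _ = place_verifier(edges)
action = governance_action(0.87, residual_conflict_count=1)
print(action, f"between {src} and {dst}")
# → REQUIRE_VERIFIER between reviewer and deployment
```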
## Architecture
The current repository keeps the architecture intentionally simple.
Core files include:
```text
gateway.py          FastAPI runtime gateway
firewall.py         Integrated governance decision logic
cascade.py          Uncertainty cascade analysis
determinability.py  Observation-target determinability checker
semantic_loss.py    Target-relative compression checker
audit.py            Canonical JSON, SHA-256, optional HMAC audit packet
tests/              Unit tests
examples/           Basic usage examples
```
The README lists these components and describes the gateway as exposing /transform and /firewall/evaluate. ([GitHub][1])
A typical runtime path looks like this:
```text
Raw agent message
  → target-relevant compression check
  → observation determinability check
  → cascade risk analysis
  → verifier placement decision
  → governed output
```
## API Example
Run the API:
```bash
uvicorn gateway:app --host 0.0.0.0 --port 8080
```
Transform a message:
```bash
curl -X POST http://127.0.0.1:8080/transform \
  -H "Content-Type: application/json" \
  -d '{
    "target": "deploy_decision",
    "message": "Fact: tests passed. Risk: rollout window is short. Next step: deploy canary."
  }'
```
Evaluate firewall risk:
```bash
curl -X POST http://127.0.0.1:8080/firewall/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "agents": ["planner", "coder", "reviewer"],
    "edges": [
      ["planner", "coder", 0.8],
      ["coder", "reviewer", 0.9]
    ],
    "residual_conflict_count": 1
  }'
```
The repository README includes these two API examples directly. ([GitHub][1])
## Python Usage
You can also use the core logic directly:
```python
from determinability import check_determinability
from firewall import evaluate_firewall

configs = [
    {"obs": "same", "target": "allow"},
    {"obs": "same", "target": "block"},
]

report = check_determinability(
    configs,
    lambda c: c["obs"],
    lambda c: c["target"],
)
print(report.determinable)
print(report.conflicts[0])

agents = ["planner", "coder", "reviewer"]
edges = [
    ("planner", "coder", 0.8),
    ("coder", "reviewer", 0.9),
]

decision = evaluate_firewall(
    agents,
    edges,
    residual_conflict_count=report.residual_conflict_count,
)
print(decision.action)
```
The example above follows the usage pattern shown in the project README. ([GitHub][1])
## Why This Matters
As AI applications move from chatbots to agentic systems, the failure mode changes.
The main problem is no longer only:
Did the model answer correctly?
It becomes:
Did the system preserve the right information across agents?
And:
Did uncertainty get amplified before the final decision?
This is especially important in workflows like:
- AI coding agents
- research agents
- autonomous data analysis
- customer support routing
- deployment automation
- security triage
- multi-agent planning systems
- tool-using AI assistants
In these systems, the communication layer becomes critical infrastructure.
Agentic Firewall is an early attempt to make that layer explicit.
## What This Is Not
Agentic Firewall is not a complete enterprise security product.
It is not a replacement for:
- model evaluation,
- prompt injection defense,
- access control,
- sandboxing,
- human review,
- observability,
- policy engines,
- or traditional application security.
It is better understood as a research-driven engineering prototype for a missing layer:
risk-aware communication governance between AI agents.
The repository is MIT licensed and currently has no published releases. ([GitHub][1])
## Getting Started
```bash
git clone https://github.com/schchit/Agentic-Firewall
cd Agentic-Firewall

python -m pip install -r requirements.txt
python -m unittest discover -s tests -v
python examples/basic_usage.py
```
Run the API:
```bash
uvicorn gateway:app --host 0.0.0.0 --port 8080
```
## Closing Thought
Multi-agent systems need more than better prompts.
They need communication control.
When one agent sends information to another, the system should be able to ask:
- Is this message target-relevant?
- Did compression remove necessary facts?
- Is the observation sufficient for the target decision?
- Is uncertainty being amplified across the graph?
- Where should verification be inserted?
That is the basic idea behind Agentic Firewall.
It is a small step toward treating agent communication as something that can be checked, governed, audited, and made safer before it becomes downstream action.
GitHub: https://github.com/schchit/Agentic-Firewall

[1]: https://github.com/schchit/Agentic-Firewall