PolicyAware vs Guardrails vs AI Gateways vs Model Routers: The Comparison Every AI Engineer Needs to Read

#opensource #ai #security #python

I've been building AI-powered features for a while now, and the hardest conversations I have with my team are never about which model to use. They're always about the same thing: what is this system actually allowed to do, and how do we prove it?

That question pushed me to build PolicyAware - an open-source Python control plane that sits in front of your models, tools, and retrieval systems. Before I explain what it does, I want to walk through why the tools most teams reach for first - guardrails, AI gateways, and model routers - are genuinely useful but leave a critical gap wide open.

The landscape right now

If you search for "AI safety" or "LLM governance" you will find three categories of tools coming up again and again:

Guardrail libraries - validate prompts and outputs against safety rules
AI gateways - proxy your requests to model providers, centralize API keys
Model routers - pick the cheapest or fastest model for each request

All three are useful. None of them alone answers the governance question.

Here is the mental model I use: a guardrail checks what the model says. A gateway manages where the request goes. A router decides which model handles it. But none of them ask the most important question first: should this request be allowed to run at all, under this user's role, for this tenant, in this region, given this risk level?

That is the gap PolicyAware fills.

Side-by-side comparison

| Capability                                   | Guardrails | AI Gateway | Model Router | PolicyAware |
|----------------------------------------------|------------|------------|--------------|-------------|
| Block unsafe prompts before execution        | Sometimes  | Sometimes  | No           | Yes         |
| Redact PII / PHI / secrets pre-execution     | Sometimes  | Sometimes  | No           | Yes         |
| Decisions using role, tenant, region, risk   | Limited    | Limited    | Limited      | Yes         |
| Deny-by-default posture                      | Usually no | Usually no | No           | Yes         |
| Govern MCP / agent tool calls                | Usually no | Sometimes  | No           | Yes         |
| Require human approval for risky actions     | Usually no | Sometimes  | No           | Yes         |
| Route across providers after policy approval | No         | Yes        | Yes          | Yes         |
| Evaluate RAG citation, grounding, leakage    | Sometimes  | Limited    | No           | Yes         |
| Emit audit traces with reason codes          | Limited    | Sometimes  | Limited      | Yes         |
| Generate compliance evidence artifacts       | Usually no | Usually no | No           | Yes         |

The rightmost column is not a flex. It is a description of what enterprise AI systems actually need once they move beyond read-only chat and start touching real data, real tools, and real business workflows.

When each tool is the right call

Use a guardrails library when your only need is response formatting, toxicity filtering, or structured output validation. If you do not need RBAC, tenant rules, approval flows, or audit evidence, a guardrail is lighter and faster.

Use an AI gateway when your main problem is juggling provider keys, rate limits, and fallback routing. Gateways are great infrastructure. They are just not governance.

Use a model router when you are optimizing for cost, latency, or quality tradeoffs across providers. A router does not decide whether a request should run - only which model would run it.

Use PolicyAware when your AI system touches sensitive data, calls external tools, operates under regional compliance rules, or takes actions with financial or operational consequences. If you need to explain a decision to a security team six months from now, you need a control plane, not just a proxy.

How the architecture fits together

Here is the pattern I use in production. The key rule is: nothing reaches a model, retriever, or tool until the control plane has made an explicit decision.

+-----------------------------+
| Application Layer           |
| (web app / API / workflow)  |
+-------------+---------------+
              |
              v
+-----------------------------+
| PolicyAware Control Plane   |
|                             |
|  1. Identity + context      |
|  2. Deny-by-default check   |
|  3. PII / PHI detection     |
|  4. Risk classification     |
|  5. Approval gate (if high) |
|  6. Provider routing        |
+--------+----------+---------+
         |          |
         v          v
  +----------+  +----------+
  | RAG Layer|  | Tools /  |
  | retrieval|  | MCP      |
  | citation |  | payments |
  +----+-----+  +----+-----+
       |              |
       +------+-------+
              |
              v
       +-------------+
       | Model Layer |
       | (local/SaaS)|
       +------+------+
              |
              v
       +-------------+
       | Evaluations |
       | leakage     |
       | grounding   |
       | audit trace |
       +-------------+

Every arrow in that diagram has a policy decision attached to it. That is the entire point.
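One way to picture the control-plane box is as an ordered list of checks where the first verdict that is not a pass stops everything downstream. The sketch below is illustrative only: the `Verdict` type, the stage functions, and the hard-coded role and keyword checks are assumptions made for this example, not PolicyAware's actual internals, and it covers only three of the six stages in the diagram.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Verdict:
    outcome: str  # "allow", "deny", or "require_approval"
    reason: str

# A stage inspects the request and returns a Verdict to stop, or None to pass.
Stage = Callable[[Dict[str, str]], Optional[Verdict]]

KNOWN_ROLES = {"support_agent", "finance_manager"}  # hypothetical allowlist

def identity_stage(req: Dict[str, str]) -> Optional[Verdict]:
    # Stage 1: refuse anything that arrives without a user and a role.
    if not req.get("user_id") or not req.get("role"):
        return Verdict("deny", "missing_identity")
    return None

def deny_by_default_stage(req: Dict[str, str]) -> Optional[Verdict]:
    # Stage 2: no explicit allow rule for this role means no execution.
    if req["role"] not in KNOWN_ROLES:
        return Verdict("deny", "no_matching_allow_rule")
    return None

def risk_stage(req: Dict[str, str]) -> Optional[Verdict]:
    # Stages 4-5: a crude keyword check standing in for risk classification.
    if "refund" in req.get("prompt", "").lower():
        return Verdict("require_approval", "high_risk_financial_action")
    return None

PIPELINE: List[Stage] = [identity_stage, deny_by_default_stage, risk_stage]

def run_pipeline(req: Dict[str, str]) -> Verdict:
    # Short-circuit: later stages never see a request an earlier stage stopped.
    for stage in PIPELINE:
        verdict = stage(req)
        if verdict is not None:
            return verdict
    return Verdict("allow", "all_stages_passed")

print(run_pipeline({"user_id": "u-1", "role": "intern", "prompt": "hi"}))
# Verdict(outcome='deny', reason='no_matching_allow_rule')
```

Keeping the stages in a plain list makes the evaluation order explicit and easy to review, which matters when the order itself is part of the policy.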

A real example: the $500 refund prompt

Let us make this concrete. A customer-support copilot gets this message:

Email jane@example.com and refund the customer $500.

Here is what different tools do with it:

A guardrail might check whether the output looks safe
A gateway forwards the request to your provider of choice
A router picks GPT-4.1 because it is the best model for support tasks
PolicyAware stops and works through the full decision tree before any of that happens

Code: policy-first middleware

from dataclasses import dataclass, field
from enum import Enum
import re
from typing import List, Optional

class Decision(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"

@dataclass
class RequestContext:
    user_id: str
    role: str
    tenant: str
    region: str
    task_type: str
    prompt: str
    tools: List[str] = field(default_factory=list)

@dataclass
class PolicyResult:
    decision: Decision
    risk_tier: str
    redacted_prompt: str
    reason_codes: List[str]
    required_approver: Optional[str] = None

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

ALLOWED_TOOLS = {
    "support_agent": {"knowledge_search", "draft_email"},
    "finance_manager": {"knowledge_search", "draft_email", "issue_refund"},
}

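def evaluate_policy(ctx: RequestContext) -> PolicyResult:
    reason_codes: List[str] = []
    # Redact once, up front, so every outcome (including denials) carries a safe prompt.
    redacted = EMAIL_RE.sub("[REDACTED]", ctx.prompt)

    # Deny by default: a role with no entry in ALLOWED_TOOLS is rejected outright,
    # even when it requests no tools.
    if ctx.role not in ALLOWED_TOOLS:
        return PolicyResult(
            decision=Decision.DENY,
            risk_tier="high",
            redacted_prompt=redacted,
            reason_codes=[f"unknown_role:{ctx.role}"],
        )
    allowed = ALLOWED_TOOLS[ctx.role]

    # Reject the whole request if any requested tool is outside the role's allowlist.
    for tool in ctx.tools:
        if tool not in allowed:
            return PolicyResult(
                decision=Decision.DENY,
                risk_tier="high",
                redacted_prompt=redacted,
                reason_codes=[f"tool_not_permitted:{tool}"],
            )

    if EMAIL_RE.search(ctx.prompt):
        reason_codes.append("pii_detected")

    # Financial actions are never auto-approved; they wait for a named approver.
    is_high_risk = "refund" in ctx.prompt.lower() or "issue_refund" in ctx.tools
    if is_high_risk:
        reason_codes.append("high_risk_financial_action")
        return PolicyResult(
            decision=Decision.REQUIRE_APPROVAL,
            risk_tier="high",
            redacted_prompt=redacted,
            reason_codes=reason_codes,
            required_approver="finance_supervisor",
        )

    reason_codes.append("policy_allow")
    return PolicyResult(
        decision=Decision.ALLOW,
        risk_tier="low",
        redacted_prompt=redacted,
        reason_codes=reason_codes,
    )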

# Try the refund prompt
ctx = RequestContext(
    user_id="u-1001",
    role="support_agent",
    tenant="acme-corp",
    region="us-east",
    task_type="customer_support",
    prompt="Email jane@example.com and refund the customer $500.",
    tools=["draft_email", "issue_refund"],
)

result = evaluate_policy(ctx)
print(result.decision)      # Decision.DENY
print(result.reason_codes)  # ['tool_not_permitted:issue_refund']

The support agent is denied before the prompt ever reaches a model. The result carries the reason code and the redacted prompt, so both can be written to the audit trace shown later in this post. That is the audit trail your security team will ask for.
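The same prompt sent by a finance_manager would pass the tool check and land on REQUIRE_APPROVAL instead, which raises the question of where the request waits until someone signs off. Below is a minimal in-memory sketch of such a gate. The `ApprovalQueue` class, its method names, and the self-approval rule are assumptions for illustration, not part of PolicyAware's published API, and a real deployment would persist tickets rather than keep them in a dict.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class PendingApproval:
    ticket_id: str
    user_id: str
    redacted_prompt: str
    required_approver: str          # role that must sign off
    status: str = "pending"         # "pending", "approved", or "rejected"
    decided_by: Optional[str] = None

class ApprovalQueue:
    """Hypothetical holding area for requests that need a human decision."""

    def __init__(self) -> None:
        self._tickets: Dict[str, PendingApproval] = {}

    def submit(self, user_id: str, redacted_prompt: str,
               required_approver: str) -> PendingApproval:
        ticket = PendingApproval(uuid.uuid4().hex, user_id,
                                 redacted_prompt, required_approver)
        self._tickets[ticket.ticket_id] = ticket
        return ticket

    def decide(self, ticket_id: str, approver_id: str,
               approver_role: str, approve: bool) -> PendingApproval:
        ticket = self._tickets[ticket_id]
        if ticket.status != "pending":
            raise ValueError("ticket already decided")
        if approver_role != ticket.required_approver:
            raise PermissionError(f"requires role {ticket.required_approver}")
        if approver_id == ticket.user_id:
            raise PermissionError("requesters cannot approve their own requests")
        ticket.status = "approved" if approve else "rejected"
        ticket.decided_by = approver_id
        return ticket

queue = ApprovalQueue()
ticket = queue.submit("u-2002", "Email [REDACTED] and refund the customer $500.",
                      "finance_supervisor")
queue.decide(ticket.ticket_id, approver_id="u-9001",
             approver_role="finance_supervisor", approve=True)
print(ticket.status)  # approved
```

Only the redacted prompt is stored on the ticket, so the approver sees what was asked without the queue becoming another place where raw PII accumulates.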

Code: compliant model routing

Routing still matters - but it should only happen after policy approves the request.

@dataclass
class RouteDecision:
    provider: str
    model: str
    reason: str

COMPLIANT_MODELS = {
    "us-east": [("azure_openai", "gpt-4.1"), ("local_vllm", "llama-3.1-70b")],
    "eu-west": [("azure_openai_eu", "gpt-4.1"), ("local_vllm_eu", "llama-3.1-70b")],
}

def route_after_policy(result: PolicyResult, ctx: RequestContext) -> RouteDecision:
    if result.decision != Decision.ALLOW:
        raise PermissionError(f"Cannot route - decision is {result.decision}")

    options = COMPLIANT_MODELS.get(ctx.region, [])
    if not options:
        raise RuntimeError(f"No compliant providers for region: {ctx.region}")

    provider, model = options[0]
    return RouteDecision(provider, model, "policy-approved compliant route")

A traditional router asks: which model is fastest? This asks: which model is allowed? The order of those questions changes everything about your compliance posture.
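The two questions can coexist as long as they run in that order. The sketch below filters candidates down to the compliant, healthy set first and only then optimizes for latency. The `Candidate` type, the `fast_public_api` entry, and every latency figure are invented for illustration; only the other provider and model names echo the listing above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

@dataclass(frozen=True)
class Candidate:
    provider: str
    model: str
    regions: FrozenSet[str]   # regions where this deployment may serve traffic
    p50_latency_ms: float     # made-up figure, for illustration only
    healthy: bool = True

def pick_route(candidates: List[Candidate], region: str) -> Optional[Candidate]:
    # Step 1 (policy): keep only deployments allowed in this region and healthy.
    allowed = [c for c in candidates if region in c.regions and c.healthy]
    if not allowed:
        return None  # fail closed: no compliant option means no call
    # Step 2 (optimization): among the allowed set, take the lowest latency.
    return min(allowed, key=lambda c: c.p50_latency_ms)

CANDIDATES = [
    Candidate("fast_public_api", "some-fast-model", frozenset({"us-east"}), 120.0),
    Candidate("azure_openai_eu", "gpt-4.1", frozenset({"eu-west"}), 310.0),
    Candidate("local_vllm_eu", "llama-3.1-70b", frozenset({"eu-west"}), 450.0),
]

choice = pick_route(CANDIDATES, "eu-west")
print(choice.provider)  # azure_openai_eu
```

The fastest candidate overall is never considered for an eu-west request, because it is filtered out before latency is compared.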

Code: audit trace

This is the piece most teams skip - and regret during their first security review.

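from datetime import datetime, timezone
import json

def emit_audit_trace(ctx: RequestContext, result: PolicyResult, route=None):
    trace = {
        # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z"),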
        "user_id": ctx.user_id,
        "tenant": ctx.tenant,
        "region": ctx.region,
        "task_type": ctx.task_type,
        "decision": result.decision.value,
        "risk_tier": result.risk_tier,
        "reason_codes": result.reason_codes,
        "tools_requested": ctx.tools,
        "route": None if route is None else {
            "provider": route.provider,
            "model": route.model,
        },
        "prompt_preview": result.redacted_prompt[:200],
    }
    print(json.dumps(trace, indent=2))

Sample output for the denied refund request:

{
  "timestamp": "2026-05-23T15:00:00Z",
  "user_id": "u-1001",
  "tenant": "acme-corp",
  "region": "us-east",
  "task_type": "customer_support",
  "decision": "deny",
  "risk_tier": "high",
  "reason_codes": ["tool_not_permitted:issue_refund"],
  "tools_requested": ["draft_email", "issue_refund"],
  "route": null,
  "prompt_preview": "Email [REDACTED] and refund the customer $500."
}

Every denied request, every approval gate, every route choice - all replayable. That is the evidence layer.
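Printing to stdout is enough for a demo, but answering questions months later means the traces have to be stored and queryable. A minimal sketch, assuming an append-only JSON Lines file is acceptable storage: the helper names and the `audit.jsonl` filename are hypothetical, and a production system would more likely ship traces to a log pipeline or database.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Iterator, List

def append_trace(path: Path, trace: Dict) -> None:
    """Append one trace as a single JSON line; existing lines are never rewritten."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(trace, sort_keys=True) + "\n")

def read_traces(path: Path) -> Iterator[Dict]:
    """Yield traces in the order they were written, skipping blank lines."""
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)

def traces_with_reason(path: Path, prefix: str) -> List[Dict]:
    """Return every trace with at least one reason code starting with `prefix`."""
    return [
        t for t in read_traces(path)
        if any(code.startswith(prefix) for code in t.get("reason_codes", []))
    ]

# Demo with two hand-written traces in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "audit.jsonl"
    append_trace(log, {"user_id": "u-1001", "decision": "deny",
                       "reason_codes": ["tool_not_permitted:issue_refund"]})
    append_trace(log, {"user_id": "u-2002", "decision": "allow",
                       "reason_codes": ["policy_allow"]})
    denied = traces_with_reason(log, "tool_not_permitted")
    print([t["user_id"] for t in denied])  # ['u-1001']
```

Because reason codes are structured strings rather than free text, a prefix match is enough to pull every request that was blocked for the same class of reason.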

Start using PolicyAware today

PolicyAware is open source, MIT licensed, and published as a Python package. You do not need a SaaS contract. You do not need to rip out your existing stack. Drop it in as a middleware layer in front of your LLM calls.

pip install policyaware

Simplest integration pattern:

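from policyaware import evaluate_policy, RequestContext

# call_your_llm and request_human_approval are placeholders for your own code.
def handle_message(current_user, user_message, requested_tools):
    ctx = RequestContext(
        user_id=current_user.id,
        role=current_user.role,
        tenant=current_user.tenant,
        region=current_user.region,
        task_type="customer_support",
        prompt=user_message,
        tools=requested_tools,
    )

    result = evaluate_policy(ctx)

    if result.decision == "allow":
        # Only the redacted prompt is ever sent to the model.
        response = call_your_llm(result.redacted_prompt)
        return {"response": response}
    if result.decision == "require_approval":
        request_human_approval(ctx, result)
        return {"status": "pending_approval", "reason": result.reason_codes}
    return {"error": "Request denied", "reason": result.reason_codes}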

One function call between your application and your model. Policy first. Everything else second.

The bottom line

Guardrails make your outputs safer. Gateways make your infrastructure cleaner. Routers make your model spend smarter. But none of them govern the full execution path.

If your AI system is making decisions that touch real people, real money, or real compliance boundaries - you need a control plane that runs policy before execution and produces evidence after it.

That is exactly what PolicyAware is built for. Star the repo, install the package, and let me know what governance problems you are running into - I am actively building this out in the open.

GitHub: https://github.com/ktirupati/policyaware