
KRISHNA KISHOR TIRUPATI


Build a Policy-Aware AI Gateway in Python: Data Protection + Policy Enforcement with policyaware

Most AI apps ship without any real governance layer. Prompts flow raw to models, sensitive data ends up in logs, and nobody finds out until a compliance audit or a breach. I built policyaware to fix that: it is a Python-first package that puts data protection and policy enforcement in front of any AI system.

This article is a hands-on technical walkthrough. Every section has working code. By the end you will have a pattern you can wire into any AI gateway or agent pipeline today.

Quick Install

pip install policyaware

GitHub: https://github.com/ktirupati/policyaware
Wiki: https://github.com/ktirupati/policyaware/wiki


Part 1 — Data Protection

What the engine detects

The DataProtectionEngine scans any string and returns a structured DataFindings object. It classifies content into three buckets:

Bucket | What it catches
PII | email, phone, SSN, credit card
PHI | medical record, patient ID, diagnosis, medication
Secrets | API keys, bearer tokens, private keys
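To make the bucket idea concrete, here is a minimal sketch of regex-based detection using only the standard library. It is not how policyaware is implemented; the patterns, the `detect` helper, and the category-to-bucket mapping are all assumptions chosen for illustration, and real detectors need far stricter patterns.

```python
import re

# Hypothetical patterns for illustration only; not taken from policyaware.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}=*"),
}

# Assumed grouping of categories into buckets.
BUCKETS = {
    "email": "pii",
    "phone": "pii",
    "ssn": "pii",
    "bearer_token": "secrets",
}


def detect(text: str) -> dict:
    """Return the categories that matched and the buckets they belong to."""
    categories = [name for name, rx in PATTERNS.items() if rx.search(text)]
    buckets = sorted({BUCKETS[name] for name in categories})
    return {"categories": categories, "buckets": buckets}


result = detect("Hi, I'm Jane. Reach me at jane@example.com or 212-555-7890.")
print(result)  # {'categories': ['email', 'phone'], 'buckets': ['pii']}
```

Keeping categories separate from buckets lets a policy reason at the coarse level (any PII at all?) while audit logs keep the fine-grained detail.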

Inspecting a prompt

from policyaware import DataProtectionEngine

text = "Hi, I'm Jane. Reach me at jane@example.com or 212-555-7890."

engine = DataProtectionEngine()
findings = engine.inspect(text)

print(findings.contains_pii)        # True
print(findings.contains_phi)        # False
print(findings.contains_secrets)    # False
print(findings.contains_sensitive)  # True  (aggregate flag)
print(findings.categories)          # ['email', 'phone']
print(findings.redactions)          # 2

DataFindings field reference

Field | Type | Description
contains_pii | bool | email, phone, SSN, credit card detected
contains_phi | bool | medical record, diagnosis, medication detected
contains_secrets | bool | API key, bearer token, private key detected
contains_sensitive | bool | True if any of the above is True
categories | list | e.g. ['email', 'phone', 'ssn']
redactions | int | Total number of matches found
redacted_text | str | Sanitised text returned by .redact()
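The redactions and redacted_text fields are easiest to understand with a toy example. The sketch below is a stand-in, not the package's redactor: the patterns, the `[CATEGORY]` placeholder format, and the `redact` function signature are assumptions made for illustration.

```python
import re

# Hypothetical patterns and placeholder format; policyaware's own may differ.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text: str) -> tuple[str, int]:
    """Replace each match with a [CATEGORY] placeholder and count the replacements."""
    total = 0
    for name, rx in PATTERNS.items():
        # subn returns the rewritten text plus the number of substitutions made.
        text, count = rx.subn(f"[{name.upper()}]", text)
        total += count
    return text, total


redacted_text, redactions = redact("Reach me at jane@example.com or 212-555-7890.")
print(redacted_text)  # Reach me at [EMAIL] or [PHONE].
print(redactions)     # 2
```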

Part 2 — Policy Enforcement

Data protection tells you what is in the request. Policy enforcement tells you what to do about it. The PolicyEngine loads a YAML file and evaluates every request against your rules, returning a structured PolicyDecision.

The four decision outcomes

Decision | Meaning
allow | Request passes through, apply any transforms
deny | Request is blocked outright
conditional_allow | Passes but triggers follow-up checks
require_approval | Routes to a human-in-the-loop flow

The engine is deny-by-default. If no rule explicitly grants access, the request is blocked. No silent pass-throughs.
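Deny-by-default is simple to state but easy to get wrong, so here is a small sketch of the semantics. It is an illustration under assumptions, not policyaware's evaluator: the `Rule` class, the `evaluate` function, and the precedence (a matching deny wins immediately, transforms only count when some rule allows) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:  # hypothetical stand-in for one policy rule
    name: str
    effect: str                      # "deny", "allow", or "transform"
    matches: Callable[[dict], bool]  # predicate over the request context
    action: Optional[str] = None     # only meaningful for transform rules


def evaluate(rules: list[Rule], ctx: dict) -> dict:
    """A matching deny wins at once; otherwise an explicit allow is required."""
    matched, actions, allowed = [], [], False
    for rule in rules:
        if not rule.matches(ctx):
            continue
        matched.append(rule.name)
        if rule.effect == "deny":
            return {"decision": "deny", "matched": matched, "actions": []}
        if rule.effect == "transform" and rule.action:
            actions.append(rule.action)
        if rule.effect == "allow":
            allowed = True
    # No explicit allow matched, so the default applies: deny.
    if not allowed:
        return {"decision": "deny", "matched": matched, "actions": []}
    return {"decision": "allow", "matched": matched, "actions": actions}


rules = [
    Rule("deny_secret_leakage", "deny", lambda c: c["contains_secrets"]),
    Rule("redact_pii", "transform", lambda c: c["contains_pii"], action="redact"),
    Rule("allow_support_agents", "allow", lambda c: c["role"] == "support_agent"),
]

print(evaluate(rules, {"contains_secrets": False, "contains_pii": True, "role": "support_agent"}))
print(evaluate(rules, {"contains_secrets": False, "contains_pii": False, "role": "intern"}))
```

The second call matches no rule at all and still comes back as deny, which is exactly the "no silent pass-throughs" property.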

Writing your first policy YAML

Rules reference DataFindings fields directly via the data root:

# support_policy.yaml
id: support_policy
schema_version: "0.2"
default: deny

rules:

  # Rule 1: Block anything containing secrets (API keys, tokens)
  - name: deny_secret_leakage
    effect: deny
    when:
      data.contains_secrets: true

  # Rule 2: Redact PII for standard users, but not for privacy admins or compliance officers
  - name: redact_pii_standard_users
    effect: transform
    action: redact
    when:
      data.contains_pii: true
      user.role_not_in:
        - privacy_admin
        - compliance_officer

  # Rule 3: Allow support agents in US for low/medium risk requests
  - name: allow_support_agents
    effect: allow
    when:
      user.role_in:
        - support_agent
        - support_manager
      request.region: us
      risk.tier_in:
        - low
        - medium
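If you are wondering how keys such as user.role_in or user.role_not_in might be resolved, the sketch below shows one plausible reading. It is an assumption, not the package's matcher: the nested context layout, the helper names, and the rule that every condition in a when block must hold are all hypothetical.

```python
from typing import Any


def lookup(ctx: dict, dotted: str) -> Any:
    """Walk a dotted path such as 'user.role' through nested dicts; None if absent."""
    node: Any = ctx
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def condition_holds(ctx: dict, key: str, expected: Any) -> bool:
    """Evaluate one 'when' entry; the key suffix selects the comparison."""
    # Check _not_in first, because it also ends with _in.
    if key.endswith("_not_in"):
        return lookup(ctx, key[: -len("_not_in")]) not in expected
    if key.endswith("_in"):
        return lookup(ctx, key[: -len("_in")]) in expected
    return lookup(ctx, key) == expected


def when_matches(ctx: dict, when: dict) -> bool:
    """Assumed AND semantics: every condition in the block must hold."""
    return all(condition_holds(ctx, key, value) for key, value in when.items())


# Hypothetical evaluation context built from the request and the findings.
ctx = {
    "user": {"role": "support_agent"},
    "request": {"region": "us"},
    "risk": {"tier": "low"},
    "data": {"contains_pii": True, "contains_secrets": False},
}

allow_when = {
    "user.role_in": ["support_agent", "support_manager"],
    "request.region": "us",
    "risk.tier_in": ["low", "medium"],
}
redact_when = {
    "data.contains_pii": True,
    "user.role_not_in": ["privacy_admin", "compliance_officer"],
}

print(when_matches(ctx, allow_when))                        # True
print(when_matches(ctx, redact_when))                       # True
print(when_matches(ctx, {"data.contains_secrets": True}))   # False
```

One thing this sketch makes visible: with a _not_in condition, a missing attribute counts as "not in the list", so decide deliberately whether absent values should pass or fail.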

Enforcing the policy at runtime

Load the YAML, build a GatewayRequest, inspect the prompt, then call decide:

from policyaware import DataProtectionEngine, GatewayRequest, PolicyEngine

# Load policy from YAML file
policy = PolicyEngine.from_file("support_policy.yaml")

# Build the request context
request = GatewayRequest(
    tenant="acme-corp",
    app="support-copilot",
    user={"role": "support_agent", "id": "u_001"},
    context={"region": "us", "risk": "low"},
    messages=[{"role": "user", "content": "Email jane@example.com, urgent!"}],
)

# Step 1: inspect the prompt
findings = DataProtectionEngine().inspect(request.prompt_text)

# Step 2: evaluate policy
decision = policy.decide(request, findings)

# Step 3: act on the decision
print(decision.decision.value)   # 'allow' / 'deny' / 'conditional_allow' / 'require_approval'
print(decision.actions)          # ['redact']
print(decision.matched_rules)    # ['redact_pii_standard_users', 'allow_support_agents']
print(decision.violated_rules)   # []
print(decision.reason)           # Human-readable explanation
print(decision.reason_codes)     # Machine-readable codes for logging
print(decision.risk_score)       # Numeric risk score
print(decision.risk_tier)        # 'low' / 'medium' / 'high' / 'critical'
print(decision.remediation)      # Suggested fix if blocked
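Printing the decision is only half of step 3; a gateway also has to act on it. Here is a hedged sketch of that dispatch. `Outcome`, `Decision`, `Blocked`, and `prepare_prompt` are hypothetical stand-ins that mirror the documented outcomes and fields rather than the package's real classes, and the follow-up checks for conditional_allow are deliberately left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Outcome(Enum):  # stand-in mirroring the four documented outcomes
    ALLOW = "allow"
    DENY = "deny"
    CONDITIONAL_ALLOW = "conditional_allow"
    REQUIRE_APPROVAL = "require_approval"


@dataclass
class Decision:  # stand-in for PolicyDecision, reduced to the fields used here
    decision: Outcome
    actions: list = field(default_factory=list)
    reason: str = ""


class Blocked(Exception):
    """Raised when the prompt must not reach the model."""


def prepare_prompt(decision: Decision, prompt: str, redact: Callable[[str], str]) -> str:
    """Return the text that may be forwarded to the model, or raise Blocked."""
    if decision.decision is Outcome.DENY:
        raise Blocked(decision.reason)
    if decision.decision is Outcome.REQUIRE_APPROVAL:
        raise Blocked("pending human approval: " + decision.reason)
    # allow and conditional_allow both proceed; apply requested transforms first.
    if "redact" in decision.actions:
        prompt = redact(prompt)
    return prompt


# Usage with a trivial stand-in redactor.
safe = prepare_prompt(
    Decision(Outcome.ALLOW, actions=["redact"]),
    "Email jane@example.com, urgent!",
    lambda text: text.replace("jane@example.com", "[EMAIL]"),
)
print(safe)  # Email [EMAIL], urgent!
```

Raising on deny and require_approval keeps the calling code honest: the only way to get a string back is for the policy to have let the request through.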

PolicyDecision field reference

Field | Type | Description
decision | enum | allow, deny, conditional_allow, require_approval
actions | list | Transforms to apply e.g. ['redact']
matched_rules | list | Rules that matched the request
violated_rules | list | Rules that were violated (for audit logs)
reason | str | Human-readable explanation
reason_codes | list | Machine-readable codes for dashboards
risk_score | float | Numeric risk score
risk_tier | str | low, medium, high, critical
remediation | str | Suggested fix when request is blocked

Policy Context Roots

Inside every when clause you can reference these roots:

Root | Example usage | What it covers
tenant | tenant: acme | Customer or team identifier
app | app: support-copilot | Calling application or service
user | user.role_in: [support_agent] | Role, ID, department attributes
request | request.region: us | Region, task type, autonomy level
data | data.contains_pii: true | Output from DataProtectionEngine
risk | risk.tier_in: [low, medium] | Risk score and tier
ml | ml.prompt_injection.detected: true | Optional ML classifier signals

Validate Policies Before Production

Ship broken policies and you get silent misses or unintended blocks. policyaware ships a schema validator and CLI to catch issues early.

Python validator:

import yaml
from policyaware import PolicySchemaValidator

with open("support_policy.yaml", "r", encoding="utf-8") as f:
    policy = yaml.safe_load(f)

PolicySchemaValidator().validate(policy)  # raises on schema errors

CLI commands:

# Validate the YAML schema
policyaware policy validate support_policy.yaml

# Explain how a specific request flows through your rules
policyaware policy explain --request sample_request.json

The explain command is especially useful in CI/CD pipelines — you can run policy checks against a suite of sample requests before merging.
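One way to wire those two commands into CI is a small runner script. The sketch below only builds and runs the exact CLI invocations shown above; the helper names, the samples directory layout, and the idea that a non-zero exit code signals failure are assumptions you should verify against the CLI before relying on them.

```python
import subprocess
from pathlib import Path


def validate_cmd(policy_path: str) -> list[str]:
    """Command line for schema validation, as shown in the CLI section."""
    return ["policyaware", "policy", "validate", policy_path]


def explain_cmd(request_path: str) -> list[str]:
    """Command line for explaining one sample request."""
    return ["policyaware", "policy", "explain", "--request", request_path]


def build_commands(policy_path: str, samples_dir: str) -> list[list[str]]:
    """One validate call, then one explain call per sample request file."""
    samples = sorted(Path(samples_dir).glob("*.json"))
    return [validate_cmd(policy_path)] + [explain_cmd(str(p)) for p in samples]


def run_all(commands: list[list[str]]) -> bool:
    """Run each command in order; assumes a non-zero exit code means failure."""
    return all(subprocess.run(cmd).returncode == 0 for cmd in commands)


print(validate_cmd("support_policy.yaml"))
print(explain_cmd("sample_request.json"))
```

In a pipeline you would call run_all(build_commands("support_policy.yaml", "samples")) and fail the job when it returns False, where "samples" is a hypothetical folder of request JSON files.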


Optional: ML-Assisted PII Detection with Presidio

Regex-based rules miss things like names and addresses. For those, policyaware supports an optional Microsoft Presidio integration:

pip install "policyaware[presidio]"

Then run the classifier on a sample string:

from policyaware import PresidioPIIClassifier

classifier = PresidioPIIClassifier(score_threshold=0.5)

assessment = classifier.classify(
    "Jane Doe lives at 120 Main St and her phone is 212-555-7890."
)

print(assessment.model_dump())
# Returns detected entities with type, value, and confidence score

The Presidio findings feed back into the same data and ml roots in your YAML, giving you deterministic and ML-based detection in one framework.


TL;DR — What You Get in One Package

Capability | How
Detect PII, PHI, Secrets | DataProtectionEngine().inspect(text)
Redact sensitive content | DataProtectionEngine().redact(text)
Enforce access policies via YAML | PolicyEngine.from_file("policy.yaml")
Rich audit-ready decisions | PolicyDecision with reason, risk, remediation
ML-assisted detection | PresidioPIIClassifier (optional extra)
Validate policies before shipping | PolicySchemaValidator + CLI

Get Started Now

pip install policyaware

Here is the fastest path to seeing value:

  1. Install the package
  2. Run DataProtectionEngine().inspect() on one real prompt from your app
  3. Write a 3-rule YAML that reflects your actual governance needs
  4. Call policy.decide(request, findings) and log the full PolicyDecision

That four-step experiment is enough to understand whether policyaware fits your stack.
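For step 4, "log the full PolicyDecision" can be as simple as one JSON line per request. The sketch below uses a stand-in dataclass carrying the documented field names; `DecisionRecord`, `audit_line`, and the logger name are hypothetical, and how you convert the real PolicyDecision into such a record is left to you.

```python
import json
import logging
from dataclasses import asdict, dataclass, field


@dataclass
class DecisionRecord:  # stand-in carrying the documented PolicyDecision fields
    decision: str
    actions: list = field(default_factory=list)
    matched_rules: list = field(default_factory=list)
    violated_rules: list = field(default_factory=list)
    reason: str = ""
    reason_codes: list = field(default_factory=list)
    risk_score: float = 0.0
    risk_tier: str = "low"
    remediation: str = ""


def audit_line(record: DecisionRecord, tenant: str, app: str) -> str:
    """Serialize a decision as one JSON line; the prompt text is never included."""
    payload = {"tenant": tenant, "app": app, **asdict(record)}
    return json.dumps(payload, sort_keys=True)


logging.basicConfig(level=logging.INFO)
record = DecisionRecord(
    decision="allow",
    actions=["redact"],
    matched_rules=["redact_pii_standard_users", "allow_support_agents"],
)
logging.getLogger("policy.audit").info(audit_line(record, "acme-corp", "support-copilot"))
```

Logging the decision metadata rather than the prompt keeps the audit trail useful without putting the sensitive data you just detected back into your logs.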

I am the author and sole maintainer of this package. I built it because every AI project I worked on had the same gap — no structured layer between raw user input and the model. If you run into anything unexpected, have a governance pattern not covered yet, or want to contribute, I want to hear from you.

If this was useful, drop a like, share it with your team, and star the repo. Every bit of feedback helps make policyaware better for everyone building serious AI systems in Python.
