DEV Community

João André Gomes Marques

Stop AI Agents from Leaking PII

Your AI agent passes a context dict to every LLM call. That dict might contain credit card numbers, SSNs, API keys, or email addresses. If the agent signs that context without checking it first, you've just created a permanent, cryptographically verified record of leaked PII.

asqav's content scanning pipeline inspects the context before signing. If it finds sensitive data, the sign request is rejected.

How it works

When you call agent.sign(), asqav's API runs the context through pattern matchers for PII categories: credit cards, government IDs, API keys, emails, phone numbers. If anything matches, the request fails with a clear error.

import asqav

asqav.init(api_key="sk_live_...")
agent = asqav.Agent.create("data-pipeline")

# This context contains a credit card number
context = {
    "customer_name": "Jane Doe",
    "payment": "4111-1111-1111-1111",
    "action": "process_refund"
}

try:
    sig = agent.sign("payment:refund", context)
except asqav.APIError as e:
    print(e)  # Content policy violation: credit card number detected

The sign call never completes. No signature is created. No PII ends up in your audit trail.

What gets scanned

  • Credit card numbers (Luhn-validated)
  • Social Security numbers
  • API keys and secrets (common patterns)
  • Email addresses in unexpected fields

Scanning runs server-side. You don't need to add any client code beyond the normal agent.sign() call. It's on by default for Pro tier and above.

Why this matters

Audit trails are permanent. If PII lands in a signed record, you can't delete it without breaking the signature chain. Scanning before signing prevents the problem at the source.
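To see why you can't just delete a bad record later, here's a toy hash-chained audit trail (an illustration of the general technique, not asqav's actual record format): each entry commits to the previous entry's hash, so editing or removing any record invalidates everything after it.

```python
import hashlib
import json

GENESIS = "0" * 64

def chain(payloads):
    """Build a toy audit chain: each record hashes its payload plus the previous hash."""
    prev, records = GENESIS, []
    for payload in payloads:
        h = hashlib.sha256((prev + json.dumps(payload, sort_keys=True)).encode()).hexdigest()
        records.append({"payload": payload, "prev": prev, "hash": h})
        prev = h
    return records

def verify(records):
    """Recompute every hash; any tampered record breaks the chain from that point on."""
    prev = GENESIS
    for r in records:
        expected = hashlib.sha256((prev + json.dumps(r["payload"], sort_keys=True)).encode()).hexdigest()
        if r["hash"] != expected or r["prev"] != prev:
            return False
        prev = r["hash"]
    return True

trail = chain([{"action": "process_refund"}, {"action": "close_ticket"}])
assert verify(trail)
trail[0]["payload"]["action"] = "REDACTED"  # trying to scrub PII after the fact
assert not verify(trail)                    # the chain no longer verifies
```

Once PII is inside a record like this, the only options are breaking the chain or living with the leak, which is why rejecting it before signing is the cheap fix.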


pip install asqav

Docs | GitHub

Top comments (5)

Victor García

Interesting approach — we tackled the same problem from a different angle: instead of blocking on PII detection, we pseudonymize it (consistent SHA-256 tokens) so the LLM can still reason about relationships, and route critical data to a local model. Wrote about it here: micelclaw.com/blog/pii-routing/
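A minimal sketch of that pseudonymization idea: map each PII value to a stable keyed-hash token, so repeated mentions of the same value stay linkable without exposing the raw data. The secret key, token format, and field names here are illustrative, not taken from the linked post.

```python
import hashlib
import hmac

SECRET = b"rotate-me"  # illustrative key; in practice, load from a secret store

def pseudonymize(value: str) -> str:
    """Map a PII value to a stable token: same input, same token."""
    digest = hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()
    return f"<pii:{digest[:12]}>"

context = {
    "customer_email": "jane@example.com",
    "note": "refund requested by jane@example.com",
}
token = pseudonymize("jane@example.com")
masked = {k: v.replace("jane@example.com", token) for k, v in context.items()}
# Both occurrences collapse to the same token, so the LLM can still see that
# the email in "customer_email" and the one in "note" refer to the same person.
```

Using HMAC rather than a bare SHA-256 of the value guards against dictionary attacks on the tokens, at the cost of needing to manage the key.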

klement Gunndu

The point about audit trails being permanent is the key insight — most PII leaks in agent systems happen in logs and traces, not the primary output. Scanning before signing catches it at the right layer.

Victor García

Exactly — and it goes beyond logs. Embeddings, background jobs, and entity extraction pipelines all touch the same data. If you only scan at the output layer, PII can still leak through any of those side channels. The earlier you classify, the fewer places you need to guard.

Henri Sila

Really important topic. It’s easy to focus on what AI agents can do and overlook how easily PII can leak through prompts, logs, or integrations. “It works” isn’t the same as “it’s safe.”
