DEV Community

Dave Sng
PII Leakage in LLM Pipelines: Detect and Redact Sensitive Data Before It Escapes

A Cyberhaven study found that 11% of the data employees paste into AI tools is confidential — including PII, source code, and internal business data. If you're building LLM-powered applications in 2026, you have a PII problem. The question is whether you've caught it yet.

This guide walks through exactly where PII leaks in an LLM pipeline, how to detect it programmatically, and how to build a guard layer that actually works in production.

Where PII Actually Escapes

Most developers focus on output — making sure the LLM doesn't say something sensitive. But leakage happens in at least four places:

  1. User prompts — "Summarize this email from john.doe@acme.com about SSN 123-45-6789..."
  2. RAG retrieval — Your vector store pulls a document containing a customer's home address, which gets injected into context automatically
  3. Request logging — Your observability stack captures the full prompt — including the SSN the user pasted in
  4. Third-party LLM calls — The full context window is transmitted to an external provider (OpenAI, Anthropic, etc.)

Each of these is a distinct attack surface. A robust solution handles all four.

Approach 1: Regex Pattern Matching (Fast, Precise for Known Formats)

The quickest wins come from structured PII — things with predictable formats.

import re
from dataclasses import dataclass

@dataclass
class PIIMatch:
    entity_type: str
    text: str
    start: int
    end: int

# Common PII patterns
PATTERNS = {
    "EMAIL":       r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}",
    "US_SSN":      r"\b\d{3}-\d{2}-\d{4}\b",
    "PHONE":       r"\b(\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}\b",
    "CREDIT_CARD": r"\b(?:\d[ -]?){13,16}\b",
    "IP_ADDRESS":  r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
}

def detect_pii_regex(text: str) -> list[PIIMatch]:
    matches = []
    for entity_type, pattern in PATTERNS.items():
        for m in re.finditer(pattern, text):
            matches.append(PIIMatch(
                entity_type=entity_type,
                text=m.group(),
                start=m.start(),
                end=m.end()
            ))
    return sorted(matches, key=lambda x: x.start)

def redact(text: str, matches: list[PIIMatch]) -> str:
    result = text
    for match in reversed(matches):  # reverse to preserve offsets
        result = result[:match.start] + f"[{match.entity_type}]" + result[match.end:]
    return result

# Example
text = "Contact sarah.jones@example.com or call 555-867-5309. Her SSN is 123-45-6789."
matches = detect_pii_regex(text)
print(redact(text, matches))
# → "Contact [EMAIL] or call [PHONE]. Her SSN is [US_SSN]."

Regex runs in microseconds and, for well-formed structured data, has essentially zero false negatives. The limitations: loose patterns (like the credit-card one above) can produce false positives, and regex misses unstructured PII entirely — names, organizations, addresses.
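One subtlety before redacting: patterns can overlap (a loose credit-card pattern can swallow part of a phone number), and replacing overlapping spans in reverse order garbles the output. A small pass that keeps only the earliest match in each overlapping group — a sketch that would slot in between detection and redaction (`PIIMatch` is redefined here so the snippet runs standalone):

```python
from dataclasses import dataclass

@dataclass
class PIIMatch:
    entity_type: str
    text: str
    start: int
    end: int

def drop_overlaps(matches: list[PIIMatch]) -> list[PIIMatch]:
    """Keep the earliest-starting match in each overlapping group.

    Assumes the input is already sorted by start offset, as
    detect_pii_regex returns it.
    """
    kept: list[PIIMatch] = []
    last_end = -1
    for m in matches:
        if m.start >= last_end:  # no overlap with anything kept so far
            kept.append(m)
            last_end = m.end
    return kept
```

With this filter in place, the redaction loop never sees two spans that touch the same characters.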

Approach 2: NER with Presidio (For Names, Locations, Dates)

Microsoft Presidio (built on spaCy) handles the contextual cases regex can't catch:

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def detect_and_redact_nlp(text: str, language: str = "en") -> str:
    results = analyzer.analyze(
        text=text,
        entities=["PERSON", "LOCATION", "DATE_TIME", "EMAIL_ADDRESS", "PHONE_NUMBER"],
        language=language
    )
    anonymized = anonymizer.anonymize(text=text, analyzer_results=results)
    return anonymized.text

text = "John Smith, who lives in Austin TX, called about his account yesterday."
print(detect_and_redact_nlp(text))
# → "<PERSON>, who lives in <LOCATION>, called about his account <DATE_TIME>."

Presidio supports 15+ entity types and can be extended with custom recognizers for domain-specific identifiers (employee IDs, internal codes, medical record numbers).

Install with:

pip install presidio-analyzer presidio-anonymizer
python -m spacy download en_core_web_lg

Approach 3: API-Based Detection (For Production Scale and Compliance)

For production systems where accuracy and regulatory coverage matter, running NLP models yourself adds infrastructure overhead. An API-based approach gives you a maintained detection engine without the ops burden.

GlobalShield API provides AI-powered PII detection with multilingual support and regulatory classification (GDPR, HIPAA, CCPA) built in:

import requests

RAPIDAPI_KEY = "your-rapidapi-key"

def detect_pii_globalshield(text: str) -> dict:
    url = "https://globalshield-api.p.rapidapi.com/detect"
    headers = {
        "x-rapidapi-key": RAPIDAPI_KEY,
        "x-rapidapi-host": "globalshield-api.p.rapidapi.com",
        "Content-Type": "application/json"
    }
    payload = {"text": text, "mode": "redact"}
    response = requests.post(url, json=payload, headers=headers, timeout=10)
    response.raise_for_status()  # fail loudly rather than process a partial result
    return response.json()

result = detect_pii_globalshield(
    "Invoice from Dr. Maria Chen, License #CA-12345, billed to michael.torres@corp.io"
)
# Returns redacted text + entity map + regulatory flags (HIPAA, GDPR, CCPA)
print(result["redacted_text"])
print(result["entities_found"])

This is especially useful when you need compliance flags alongside redaction — something regex and basic NER can't provide on their own.

Building the Guard Layer: Pre and Post Model

The right architecture wraps your LLM client and intercepts at two points:

class LLMGuard:
    def __init__(self, llm_client):
        self.llm = llm_client
        self.entity_map = {}

    def _redact(self, text: str) -> str:
        matches = detect_pii_regex(text)
        for i, match in enumerate(matches):
            key = f"__PII_{i}__"
            self.entity_map[key] = match.text
            text = text.replace(match.text, key, 1)
        return text

    def _restore(self, text: str) -> str:
        for key, value in self.entity_map.items():
            text = text.replace(key, value)
        return text

    def complete(self, prompt: str) -> str:
        self.entity_map.clear()

        # Step 1: Redact before sending to external LLM
        safe_prompt = self._redact(prompt)

        # Step 2: Call LLM with sanitized input
        response = self.llm.complete(safe_prompt)

        # Step 3: Restore for user-facing display only (never persist restored text)
        return self._restore(response.text)

This pattern guarantees:

  • Third-party LLM providers never see raw PII
  • Your logs stay clean
  • PII can be re-inserted for the end user's display without persisting to disk
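A quick smoke test with a stub client makes those guarantees concrete. `EchoClient` and the `.complete()/.text` shape are stand-ins for whatever SDK you actually wrap, and `MiniGuard` is a condensed, email-only version of the guard above:

```python
import re
from types import SimpleNamespace

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

class EchoClient:
    """Stand-in for a real LLM SDK: records everything it was sent."""
    def __init__(self):
        self.seen = []
    def complete(self, prompt):
        self.seen.append(prompt)
        # Pretend the model quotes the prompt back
        return SimpleNamespace(text=f"You said: {prompt}")

class MiniGuard:
    """Condensed LLMGuard: redact on the way out, restore on the way back."""
    def __init__(self, llm):
        self.llm = llm
        self.entity_map = {}
    def complete(self, prompt: str) -> str:
        self.entity_map.clear()
        for i, value in enumerate(EMAIL_RE.findall(prompt)):
            key = f"__PII_{i}__"
            self.entity_map[key] = value
            prompt = prompt.replace(value, key, 1)
        text = self.llm.complete(prompt).text
        for key, value in self.entity_map.items():
            text = text.replace(key, value)
        return text

client = EchoClient()
guard = MiniGuard(client)
answer = guard.complete("Email sarah.jones@example.com about the invoice")
# The provider saw only the placeholder...
assert "sarah.jones@example.com" not in client.seen[0]
# ...but the user-facing answer has the real address restored
assert "sarah.jones@example.com" in answer
```

One caveat worth noting: `entity_map` makes the guard stateful, so in a concurrent service you'd want one guard instance per request rather than a shared singleton.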

Comparison: Which Approach Fits Your Use Case?

| Approach | Speed | Accuracy | Setup Effort | Best For |
| --- | --- | --- | --- | --- |
| Regex | ~0.1ms | High (structured only) | Zero | SSNs, emails, credit cards |
| Presidio/spaCy | ~50ms | High (contextual) | Medium | Names, locations, dates |
| API (GlobalShield) | ~200ms | Highest (AI + compliance) | Minimal | Production, multilingual, GDPR/HIPAA |
| Combined regex + NER | ~60ms | Very High | Medium | Most production systems |

For most teams, the pragmatic choice is: regex first (free, near-instant), then NLP for what gets through, then an API if you need compliance coverage.
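That layering can be a single function: regex first, then hand the already-redacted text to the NER layer. The `ner_redact` hook below is hypothetical — in practice you'd pass something like `detect_and_redact_nlp` from Approach 2; everything else is stdlib:

```python
import re
from typing import Callable, Optional

# Structured layer — a subset of the PATTERNS table from Approach 1
PATTERNS = {
    "EMAIL":  r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}",
    "US_SSN": r"\b\d{3}-\d{2}-\d{4}\b",
}

def redact_layered(
    text: str,
    ner_redact: Optional[Callable[[str], str]] = None,
) -> str:
    # Layer 1: regex — free, near-instant, catches the structured stuff
    for entity_type, pattern in PATTERNS.items():
        text = re.sub(pattern, f"[{entity_type}]", text)
    # Layer 2: NER for names/locations, only if a redactor was supplied
    if ner_redact is not None:
        text = ner_redact(text)
    return text
```

Running the regex layer first also means the NER model never sees the structured PII at all, which slightly reduces what a misbehaving model pipeline could leak.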

Don't Forget Your Logging Stack

Even if you redact prompts before sending to the LLM, naive structured logging will capture the raw prompt first:

import logging
import re

class PIIFilter(logging.Filter):
    PATTERNS = [
        (re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}"), "[EMAIL]"),
        (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
        (re.compile(r"\b(\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
    ]

    def filter(self, record):
        record.msg = self._sanitize(str(record.msg))
        # Note: %-style lazy args (logger.info("user %s", email)) are merged
        # into the message after filtering — sanitize record.args too if you
        # log that way, or pre-format your messages
        return True

    def _sanitize(self, text: str) -> str:
        for pattern, replacement in self.PATTERNS:
            text = pattern.sub(replacement, text)
        return text

# Attach to each handler, not the root logger: logger-level filters are
# skipped for records propagated from child loggers, but handler-level
# filters run for everything the handler emits — including library logs
for handler in logging.getLogger().handlers:
    handler.addFilter(PIIFilter())

This runs at log-write time and prevents PII from reaching your log aggregation service (Datadog, CloudWatch, Splunk, etc.).
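To verify the filter end to end, log through a handler you control and inspect what actually lands in the stream — a self-contained demo with an email-only version of the filter:

```python
import io
import logging
import re

class PIIFilter(logging.Filter):
    """Same idea as above, email-only to keep the demo short."""
    EMAIL = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

    def filter(self, record):
        record.msg = self.EMAIL.sub("[EMAIL]", str(record.msg))
        return True

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(PIIFilter())

logger = logging.getLogger("pii_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo isolated from other handlers

logger.info("prompt from sarah.jones@example.com received")
assert "sarah.jones@example.com" not in stream.getvalue()
assert "[EMAIL]" in stream.getvalue()
```

The same StringIO trick works as a unit test for whatever pattern set you ship.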

The Four-Layer Checklist

Before shipping an LLM feature, run through these:

  • [ ] Input guard — redact user prompts before they hit the model
  • [ ] Retrieval guard — scan RAG-retrieved chunks before context injection
  • [ ] Log filter — sanitize at the logging layer, not after
  • [ ] Output scan — check model responses for PII that leaked through context

Libraries like Presidio make the NER layer straightforward. For teams needing regulatory compliance classification and multilingual support without running NLP infrastructure, GlobalShield API handles that tier without the ops overhead.


What's your current approach to PII in LLM pipelines? Are you running in-process redaction, relying on your LLM provider's built-in safeguards, or using a dedicated service? Drop your setup in the comments — especially curious what's working in production at scale.
