How to detect and remove PII from any text payload in Python

#privacy #python #security #tutorial

PII leaking into logs, LLM prompts, and audit trails is one of the most common and costly compliance failures.
In this post I'll show you how to detect and strip PII (names, emails, SSNs, CPFs, credit card numbers) from any text payload in Python, using a production REST API built in Rust with sub-15ms latency.

The problem

Most teams realize PII is leaking too late — after a breach, after an audit, or after the data lands in an LLM training set.

The solution

One API call before your data touches anything sensitive:

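import requests

def anonymize(text: str, api_key: str) -> dict:
    # Send the raw text to the anonymization endpoint and return the parsed JSON body.
    response = requests.post(
        "https://vortex-dfs.onrender.com/v1/shield/anonymize",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        },
        json={"content": text},
        timeout=10,  # without a timeout, requests can wait indefinitely
    )
    # Surface 4xx/5xx responses here instead of failing later on a missing key.
    response.raise_for_status()
    return response.json()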

# Example
result = anonymize(
    "John Smith, SSN 123-45-6789, card 4111-1111-1111-1111",
    api_key="your_key_here"
)

print(result["sanitized"])
# → "[NAME] [SSN] [CARD]"

print(result["risk_score"])
# → 0.94

print(result["latency_ms"])
# → 12.3
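
If the API is unreachable or returns something unexpected, the safest default is usually to block the text rather than pass the raw version through. Here is a minimal sketch of that "fail closed" idea. The names `sanitize_or_block` and `REDACTED_PLACEHOLDER` are hypothetical and not part of the API; the only assumption about the response is the `sanitized` key shown above.

```python
from typing import Callable

# Hypothetical placeholder returned whenever anonymization cannot be confirmed.
REDACTED_PLACEHOLDER = "[REDACTED: anonymization unavailable]"

def sanitize_or_block(text: str, anonymize_fn: Callable[[str], dict]) -> str:
    """Return the sanitized text, or a placeholder if anonymization fails.

    anonymize_fn is any callable that takes the raw text and returns a dict
    with a "sanitized" key, for example a wrapper around anonymize().
    """
    try:
        result = anonymize_fn(text)
        return result["sanitized"]
    except Exception:
        # Deliberately broad: network errors, HTTP errors, and malformed
        # responses all end the same way. The raw text is never returned.
        return REDACTED_PLACEHOLDER
```

With the function from this post you could call it as `sanitize_or_block(text, lambda t: anonymize(t, api_key))`. In a real service you would probably also log the exception type so outages are visible.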

Integrate before your LLM pipeline

import openai  # assumes openai>=1.0, which reads OPENAI_API_KEY from the environment

def safe_llm_call(ticket_content: str, api_key: str) -> str:
    # Step 1: strip PII before the text leaves your system
    clean = anonymize(ticket_content, api_key)

    # Step 2: only the sanitized text goes into the prompt
    prompt = f"Summarize this support ticket: {clean['sanitized']}"
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
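
The same pattern works for logs, the other leak path mentioned at the top of this post. Below is a rough sketch using the standard library's `logging.Filter`. The class name `PiiFilter` and the `stand_in_sanitize` helper are hypothetical; the helper only exists so the example runs without network access, and in practice you would pass something like `lambda text: anonymize(text, api_key)["sanitized"]`.

```python
import logging
from typing import Callable

class PiiFilter(logging.Filter):
    """Passes each log record's message through a sanitizer before it is emitted."""

    def __init__(self, sanitize: Callable[[str], str]) -> None:
        super().__init__()
        self._sanitize = sanitize

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its args first, so PII passed as
        # formatting arguments is sanitized as well.
        record.msg = self._sanitize(record.getMessage())
        record.args = ()
        return True  # keep the record, now sanitized

def stand_in_sanitize(text: str) -> str:
    # Stand-in for a real sanitizer; replaces one hard-coded value only.
    return text.replace("123-45-6789", "[SSN]")

logger = logging.getLogger("support")
logger.addFilter(PiiFilter(stand_in_sanitize))
```

Two caveats: a filter attached to a logger only sees records logged directly on that logger, so attaching it to a handler may give broader coverage, and calling a remote API for every log line adds latency, so batching or sampling may be worth considering.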

What gets detected

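The entity types named above: names, emails, SSNs, CPFs, and credit card numbers.

Can't you just use regex? You can. But:

Regex misses context: 123-45-6789 matches the same pattern whether it stands alone or sits inside a sentence
No risk scoring: you don't know how sensitive the payload is
No token map: you can't reverse the anonymization if needed
Maintenance burden: every new pattern is a new regex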

The API handles all of this and returns an encrypted token map if you need to deanonymize later.

Get your API key

Starts at $9/week. Key delivered instantly after payment.
