Why 90% of Sanctions Screening Alerts Are False Positives (And How to Fix It with Python)
If you've ever built a KYC or AML compliance pipeline, you already know the pain: your sanctions screening system fires an alert for "Mohammed Al Hassan" and your compliance team spends two hours manually verifying it's not that Mohammed Al Hassan on the OFAC SDN list. Then it happens again. And again.
Industry data consistently shows that 90%+ of sanctions screening matches are false positives. This is not just an annoyance — it creates alert fatigue, slows down customer onboarding, and costs fintech teams real money in manual review time.
In this tutorial, we will look at why traditional approaches fail and how to build a smarter Python sanctions screening pipeline that dramatically reduces false positives.
Why Traditional Name Matching Fails
Most legacy sanctions screening tools rely on simple string-matching algorithms like Levenshtein distance or Jaro-Winkler similarity. Here is what that looks like:
```python
from fuzzywuzzy import fuzz

def naive_screen(name: str, sanctions_list: list[str]) -> list[dict]:
    results = []
    for entry in sanctions_list:
        score = fuzz.token_sort_ratio(name, entry)
        if score >= 80:
            results.append({"match": entry, "score": score})
    return results

# Test it
sanctions = ["Mohammed Al Hassan", "Mohammed Al Hussain", "Mohamed Alhasan"]
print(naive_screen("Mohammed Hassan", sanctions))
# Returns ALL THREE as matches — all false positives
```
This approach has three fundamental problems:
- Transliteration variants: "Gaddafi", "Qaddafi", and "Gadhafi" are all the same person, but simple string matching treats them as three unrelated strings
- Common names: "Mohammed", "Kim", "Wang" appear thousands of times in any sanctions list. A low threshold floods your queue; a high threshold misses real hits.
- No contextual awareness: Name alone is insufficient. Birth date, nationality, and aliases are all critical disambiguation factors.
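The transliteration problem is easy to see with any character-level similarity measure. Here is a quick illustration using the standard library's `difflib.SequenceMatcher` as a stand-in for a fuzzy matcher (the variant spellings are from the example above):

```python
from difflib import SequenceMatcher

# Three transliterations of the same surname
variants = ["gaddafi", "qaddafi", "gadhafi"]

for v in variants[1:]:
    ratio = SequenceMatcher(None, variants[0], v).ratio()
    print(f"gaddafi vs {v}: {ratio:.2f}")

# Each pair scores ~0.86: high enough that they are plausibly the same
# person, low enough that a strict threshold (e.g. 0.9) would miss them.
```

Wherever you set the threshold, some transliteration pairs fall just below it while unrelated common-name pairs land just above it, which is exactly why threshold tuning alone cannot fix this.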
A Better Approach: Multi-Factor Screening
The fix is not just a better fuzzy matcher — it is a fundamentally different data model. Here is what a proper screening pipeline should consider:
```python
from dataclasses import dataclass, field
from typing import Optional
import unicodedata
import re

@dataclass
class EntityProfile:
    name: str
    date_of_birth: Optional[str] = None
    nationality: Optional[str] = None
    aliases: list[str] = field(default_factory=list)
    id_numbers: list[str] = field(default_factory=list)

def normalize_name(name: str) -> str:
    """Normalize unicode, remove diacritics, lowercase."""
    normalized = unicodedata.normalize("NFD", name)
    ascii_name = normalized.encode("ascii", "ignore").decode("ascii")
    # \b before the optional period, so "Dr." loses its trailing dot too
    cleaned = re.sub(r"\b(mr|mrs|dr|prof|jr|sr)\b\.?", "", ascii_name, flags=re.IGNORECASE)
    return " ".join(cleaned.lower().split())
```
```python
def phonetic_key(name: str) -> str:
    """Simple phonetic normalization for common transliteration variants."""
    name = normalize_name(name)
    substitutions = [
        (r"\bq", "k"), (r"gh\b", "g"), (r"ph", "f"),
        (r"ck", "k"), (r"ae", "a"), (r"oe", "o"),
    ]
    for pattern, replacement in substitutions:
        name = re.sub(pattern, replacement, name)
    return name
```
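To sanity-check these two helpers, here is a standalone snippet. The functions are repeated inline (with the title regex anchored as `\b\.?` so trailing periods are stripped) so that it runs on its own; the example names are illustrative:

```python
import re
import unicodedata

def normalize_name(name: str) -> str:
    # Decompose accents, drop diacritics, strip titles, lowercase
    normalized = unicodedata.normalize("NFD", name)
    ascii_name = normalized.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"\b(mr|mrs|dr|prof|jr|sr)\b\.?", "", ascii_name, flags=re.IGNORECASE)
    return " ".join(cleaned.lower().split())

def phonetic_key(name: str) -> str:
    # Collapse common transliteration variants onto one spelling
    name = normalize_name(name)
    for pattern, replacement in [(r"\bq", "k"), (r"gh\b", "g"), (r"ph", "f"),
                                 (r"ck", "k"), (r"ae", "a"), (r"oe", "o")]:
        name = re.sub(pattern, replacement, name)
    return name

print(normalize_name("Dr. José María"))                   # jose maria
print(phonetic_key("Qasim") == phonetic_key("Kasim"))     # True
print(phonetic_key("Philips") == phonetic_key("Filips"))  # True
```

Note that a handful of regex substitutions will never cover every transliteration pair; for production, a dedicated phonetic library or the screening API's own matching is the safer bet.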
```python
def calculate_match_confidence(candidate: EntityProfile,
                               reference: EntityProfile) -> dict:
    """Multi-factor confidence scoring."""
    from fuzzywuzzy import fuzz

    scores = {}

    # Name similarity (weighted 50%)
    name_score = max(
        fuzz.token_sort_ratio(normalize_name(candidate.name), normalize_name(reference.name)),
        fuzz.token_sort_ratio(phonetic_key(candidate.name), phonetic_key(reference.name)),
    ) / 100
    scores["name"] = name_score

    # Date of birth match (weighted 30%)
    if candidate.date_of_birth and reference.date_of_birth:
        scores["dob"] = 1.0 if candidate.date_of_birth == reference.date_of_birth else 0.0

    # Nationality match (weighted 20%)
    if candidate.nationality and reference.nationality:
        scores["nationality"] = 1.0 if candidate.nationality.upper() == reference.nationality.upper() else 0.2

    weights = {"name": 0.5, "dob": 0.3, "nationality": 0.2}
    available_weight = sum(w for k, w in weights.items() if k in scores)
    if available_weight == 0:
        return {"confidence": 0.0, "factors": scores}
    weighted_score = sum(scores[k] * weights[k] for k in scores) / available_weight
    return {"confidence": weighted_score, "factors": scores}
```
Now when we compare "Mohammed Hassan (DOB: 1975-03-12, Malaysian)" against a sanctions entry for "Mohammed Al Hassan (DOB: 1968-07-22, Yemeni)", the DOB mismatch pulls the confidence score well below the match threshold — no false positive.
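The arithmetic behind that decision is easy to check by hand. Assuming the fuzzy name similarity comes out around 0.9 (the exact value depends on the matcher), the weighted score works out like this:

```python
# Factor scores for "Mohammed Hassan" vs the sanctions entry.
# The name score of 0.9 is an assumed fuzzy-match value for illustration;
# dob mismatch scores 0.0 and nationality mismatch scores 0.2, as above.
scores = {"name": 0.9, "dob": 0.0, "nationality": 0.2}
weights = {"name": 0.5, "dob": 0.3, "nationality": 0.2}

available = sum(weights[k] for k in scores)  # 1.0 — all three factors present
confidence = sum(scores[k] * weights[k] for k in scores) / available

print(round(confidence, 2))  # 0.49 — well below a 0.85 match threshold
```

Even a near-perfect name match cannot push the score past the threshold on its own once the other factors disagree, which is the whole point of multi-factor scoring.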
Using a Sanctions Screening API in Production
Building and maintaining your own sanctions screening logic has a hidden cost: you need to continuously update the underlying data. OFAC's SDN list changes almost daily. The UN Consolidated List, EU sanctions, HM Treasury — each has its own format and update schedule.
For production use, a purpose-built sanctions API handles all of this for you. Here is how to integrate SanctionShield AI into a FastAPI endpoint:
```python
import os

import httpx
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional

app = FastAPI(title="Compliance Screening Service")

# Load credentials from the environment rather than hardcoding them
SANCTIONSHIELD_KEY = os.environ.get("SANCTIONSHIELD_KEY", "your_rapidapi_key_here")
SANCTIONSHIELD_HOST = "sanctionshield-ai.p.rapidapi.com"

class ScreeningRequest(BaseModel):
    name: str
    date_of_birth: Optional[str] = None
    country: Optional[str] = None

class ScreeningResult(BaseModel):
    screened_name: str
    is_flagged: bool
    confidence_score: float
    matched_entities: list[dict]
    lists_checked: list[str]

@app.post("/screen", response_model=ScreeningResult)
async def screen_entity(request: ScreeningRequest):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"https://{SANCTIONSHIELD_HOST}/screen",
            headers={
                "x-rapidapi-key": SANCTIONSHIELD_KEY,
                "x-rapidapi-host": SANCTIONSHIELD_HOST,
            },
            json={
                "name": request.name,
                "dob": request.date_of_birth,
                "country": request.country,
                "threshold": 0.85,
            },
            timeout=10.0,
        )
    if response.status_code != 200:
        raise HTTPException(status_code=502, detail="Screening service unavailable")
    data = response.json()
    return ScreeningResult(
        screened_name=request.name,
        is_flagged=data.get("flagged", False),
        confidence_score=data.get("max_confidence", 0.0),
        matched_entities=data.get("matches", []),
        lists_checked=data.get("lists_checked", []),
    )
```
The key advantage of a dedicated sanctions API is coverage: a good screening service checks OFAC SDN, OFAC Non-SDN Consolidated, UN Security Council, EU Financial Sanctions, HM Treasury, and regional lists — all in a single request, with data refreshed automatically.
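Since each API request has a cost, a cheap local pre-filter can skip records that obviously cannot match before you call the service. This is a hypothetical sketch, with a tiny hardcoded token set standing in for tokens extracted from real list data:

```python
# Hypothetical pre-filter: skip the paid API call for names that share no
# tokens at all with any sanctioned name. In practice this set would be
# built from the actual list data and refreshed alongside it.
SANCTIONED_TOKENS = {"mohammed", "al", "hassan", "hussain"}

def needs_api_screening(name: str) -> bool:
    """Cheap local negative check before the paid API call."""
    tokens = set(name.lower().split())
    return bool(tokens & SANCTIONED_TOKENS)

print(needs_api_screening("Jane Smith"))       # False — skip the API call
print(needs_api_screening("Mohammed Hassan"))  # True — screen via the API
```

A pre-filter like this must only ever skip *obvious* non-matches; anything borderline still goes to the API, so a false negative here costs nothing but a request.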
Logging for Auditability
Modern regulators do not just want you to screen — they want you to prove that you screened, and explain why a record was cleared. Here is a minimal audit logging pattern:
```python
import json
import datetime
from pathlib import Path

AUDIT_LOG = Path("audit/screening_log.jsonl")

def log_screening_result(request: ScreeningRequest, result: ScreeningResult):
    """Append an immutable audit record for each screening decision."""
    record = {
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated in 3.12+)
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "entity": {
            "name": request.name,
            "dob": request.date_of_birth,
            "country": request.country,
        },
        "decision": "FLAGGED" if result.is_flagged else "CLEARED",
        "confidence": result.confidence_score,
        "lists_checked": result.lists_checked,
        "matched_entities": result.matched_entities,
    }
    AUDIT_LOG.parent.mkdir(parents=True, exist_ok=True)
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(record) + "\n")
```
Store these logs in append-only storage — S3 with Object Lock, or a write-once database table — so records cannot be modified after the fact.
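A side benefit of the JSONL format is that the same file doubles as a data source for compliance reporting. A self-contained sketch of reading the log back, using a temporary file and two illustrative records in the shape written above:

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

# Two illustrative audit records (hypothetical values)
sample_records = [
    {"decision": "CLEARED", "confidence": 0.31},
    {"decision": "FLAGGED", "confidence": 0.92},
]

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "screening_log.jsonl"
    with open(log_path, "w") as f:
        for record in sample_records:
            f.write(json.dumps(record) + "\n")

    # Tally decisions for a compliance summary: one record per line
    with open(log_path) as f:
        decisions = Counter(json.loads(line)["decision"] for line in f)

print(decisions)  # e.g. Counter({'CLEARED': 1, 'FLAGGED': 1})
```

Because each line is an independent JSON object, reports like "decisions per day" or "average confidence of cleared records" are a few lines of scripting rather than a database migration.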
Putting It All Together
A production-grade sanctions screening pipeline for a fintech onboarding flow looks like this:
- Normalize input — strip titles, normalize unicode, handle transliteration variants
- Pre-filter — skip obvious non-matches before calling the API (saves cost)
- Screen via API — with multi-list coverage and a configurable confidence threshold
- Log every result — flagged and cleared, for a regulatory audit trail
- Route flagged matches — to a human review queue, not automatic rejection
- Escalate on API failure — fail safe, not fail open (hold or reject, never silently pass through)
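The fail-safe rule in that last step is worth making concrete. In this sketch, a hypothetical `call_screening_api` stands in for the real HTTP call; the point is that any failure results in a HOLD decision rather than a silent pass-through:

```python
def call_screening_api(name: str) -> dict:
    # Hypothetical stand-in for the real HTTP call; here it always fails
    # to demonstrate the fail-safe path
    raise ConnectionError("screening service unreachable")

def screen_with_failsafe(name: str) -> str:
    """Return HOLD on any screening failure: fail safe, never fail open."""
    try:
        result = call_screening_api(name)
    except Exception:
        # The one decision you must never make on error is "CLEARED"
        return "HOLD"
    return "FLAGGED" if result.get("flagged") else "CLEARED"

print(screen_with_failsafe("Jane Smith"))  # HOLD — the API call failed
```

Held records then land in the same human review queue as flagged matches, which keeps the failure mode visible instead of buried.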
With proper multi-factor screening, false positive rates can drop below 5%, compared to 90%+ with naive string matching. That is the difference between a compliance team that is overwhelmed by noise and one that can focus on real risks.
Conclusion
Sanctions screening is a solved problem in theory but a painful one in practice. The gap between "we check names against a list" and "we have a defensible, auditable, low-false-positive compliance pipeline" is exactly where most engineering teams get stuck.
Combining proper name normalization, multi-factor confidence scoring, and a purpose-built API that maintains up-to-date coverage across all major sanctions lists gets you there — without the operational burden of building and refreshing the data yourself.
If you are building AML/KYC pipelines and want a drop-in sanctions screening service, check out SanctionShield AI on RapidAPI.
Dave Sng is an API builder based in Malaysia, specializing in compliance, financial data, and document automation APIs. Find his work on RapidAPI.