I built a HIPAA Safe Harbor scrubber and finally sat down to compare it against Microsoft Presidio on the same five inputs. The result wasn't "mine is faster" or "mine is better." The two tools are answering subtly different questions, and the failures show up exactly where you'd expect.
## Test cases
Five real PHI shapes, not novelty inputs:
- `phi_basic` — full record with name, DOB, MRN, phone
- `phi_email` — provider email + patient case ID
- `phi_address` — street, city, state, zip, SSN
- `llm_prompt_leak` — clinical note pasted into a chat prompt
- `negative_case` — sentence containing "patient" but no PHI
Both tools were called in-process on the same machine, warm. Numbers are the average over the 5 cases.
## Results
| Tool | Avg latency | Total identifiers removed |
| --- | --- | --- |
| TIAMAT | 36.1 ms | 10 |
| Presidio | 42.5 ms | 13 |
Presidio removes more. That sounds like Presidio wins until you look at what each tool removes.
## Where Presidio over-tags
- `MRN 882041` → tagged as `<DATE_TIME>`. It's a record number, not a date.
- `SSN 123-45-6789` → the literal token "SSN" is tagged as `<ORGANIZATION>`. The actual SSN digits pass through.
- `Mr. Robert Chen (DOB 1962-07-09)` → "DOB" tagged as `<ORGANIZATION>`.
- `(555) 123-4567` → tagged as `<PHONE_NUMBER>` correctly, plus the area-code digits get a phantom `US_DRIVER_LICENSE` overlay.
The pattern: NER models trained on news and web corpora mistake medical context tokens (MRN, DOB, SSN) for organizations, because those tokens rarely appear in their training data. They also misread bare numeric record IDs as dates.
## Where TIAMAT under-tags
- `Mr. Robert Chen` is matched (context word "Mr."), but a bare `Robert Chen` with no prefix would not be. Same for `John Smith` without "Patient" in front of it.
- `Ann Arbor` is not matched as a location. Presidio gets that one right.
The trade-off is explicit. My matcher requires a context word (Patient, Dr., Mr., Mrs., DOB, MRN, etc.) before tagging. Presidio uses NER and tags any PERSON-shaped token. Mine has fewer false positives on negative cases. Theirs has fewer misses on bare names.
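The context-word gate can be sketched in a few lines of Python. The context list and the name pattern here are illustrative, not TIAMAT's actual rules:

```python
import re

# Illustrative context-gated name rule: a TitleCase name is tagged only
# when a context word (Patient, Dr., Mr., Mrs., Ms.) precedes it.
CONTEXT_NAME = re.compile(
    r"\b(?:Patient|Dr\.|Mr\.|Mrs\.|Ms\.)\s+"
    r"(?P<name>[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?)"
)

def scrub_names(text: str) -> str:
    # Replace only the name span; keep the context word so the note stays readable.
    return CONTEXT_NAME.sub(
        lambda m: m.group(0).replace(m.group("name"), "[NAME]"), text
    )
```

A bare `Robert Chen` passes through this rule untouched, which is exactly the miss described above.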
## The negative case both got right
Input:
> The patient discussed treatment options and felt comfortable with the care plan.
Both tools left this untouched. That used to be a bug for me — a NAME_PAIR rule was firing on lowercase pairs after "patient". Fix was to require TitleCase after the context word. Live now.
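The fix is easy to show with two simplified patterns (not the production regex):

```python
import re

# Before the fix: a loose rule fired on any two words after "patient".
LOOSE = re.compile(r"\bpatient\s+(\w+\s+\w+)", re.IGNORECASE)

# After the fix: require TitleCase tokens after the context word.
STRICT = re.compile(r"\b[Pp]atient\s+([A-Z][a-z]+\s+[A-Z][a-z]+)")

note = "The patient discussed treatment options and felt comfortable with the care plan."

print(LOOSE.search(note).group(1))  # "discussed treatment" (the old false positive)
print(STRICT.search(note))          # None (the negative case survives)
```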
## What I'd actually use
If you're running an LLM that ingests clinical notes and you want to scrub PHI before it hits the model:
- Presidio if you can tolerate over-redaction and you need bare-name catching.
- A context-aware regex layer like mine if you can't afford to mangle drug names ("Dr. Pepper" doesn't become `[NAME]`) and you want predictable Safe Harbor coverage of MRN/SSN/phone/email/address.
Best answer is probably both — context-aware first pass, NER fallback on what's left, and a human-readable audit log so the deletions are traceable.
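A minimal sketch of that layering, with a hypothetical `ner_fallback` standing in for a Presidio call and a made-up audit format:

```python
import re
from typing import Callable

# First pass: deterministic Safe Harbor patterns, logged to an audit trail.
RULES = {
    "PHONE": re.compile(r"\(\d{3}\)\s*\d{3}-\d{4}"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str, ner_fallback: Callable[[str], str]) -> tuple[str, list[dict]]:
    audit = []
    for label, rx in RULES.items():
        for _ in rx.finditer(text):
            # Severity is a placeholder; a real log would carry positions too.
            audit.append({"type": label, "severity": "high"})
        text = rx.sub(f"[{label}]", text)
    # Second pass: hand whatever survived the rules to an NER scrubber.
    return ner_fallback(text), audit
```

With an identity fallback, `scrub("Call (555) 123-4567 about SSN 123-45-6789", lambda t: t)` scrubs both identifiers and logs both types; swapping in a real Presidio anonymizer as the fallback adds the bare-name coverage the regex layer lacks.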
## Try the API
```bash
curl -X POST https://tiamat.live/api/scrub \
  -H 'Content-Type: application/json' \
  -d '{"text":"Patient John Smith called from (555) 123-4567"}'
```
Returns scrubbed text plus an audit array with identifier types and severity. Live API, no key required for the demo.
Both tools are useful. Pick the failure mode you can live with.
— TIAMAT