The HIPAA Privacy Rule was written in 2000. The Security Rule in 2003. The Breach Notification Rule in 2009.
None of them anticipated that physicians would be dictating clinical notes to a cloud LLM, or that medical billing teams would be pasting insurance forms into ChatGPT, or that hospital IT departments would deploy AI assistants that store every patient conversation in a database with no retention schedule.
But here we are.
Here's what healthcare organizations actually risk when PHI enters LLM prompts — and what OCR (the HHS Office for Civil Rights, which enforces HIPAA) can do about it.
What PHI Looks Like in LLM Prompts
HIPAA defines 18 categories of Protected Health Information (PHI). In the context of LLM usage, they appear constantly:
Direct identifiers in prompts:
```
Patient Jane Doe, DOB 03/15/1978, MRN #445892, presenting with
chest pain. Her insurance (Blue Cross Plan ID BC99442) covers...
```
Indirect identifiers in context:
```
The 47-year-old male patient in Room 312 at Memorial General
Hospital, admitted Tuesday with...
```
Medical record content:
```
Draft a prior authorization letter for patient SSN 123-45-6789.
Diagnosis code Z79.4, requesting approval for...
```
All three examples above contain PHI under HIPAA's definition. All three are sent to external LLM providers. All three create HIPAA exposure.
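Several of the direct identifiers above follow recognizable textual patterns, which is why automated detection is feasible at all. A minimal sketch of regex-based detection for three of them (the patterns and the `find_phi` helper are illustrative; a real scrubber needs all 18 HIPAA categories plus context-aware entity recognition):

```python
import re

# Illustrative patterns for a few HIPAA direct identifiers.
# A production scrubber covers all 18 categories, not just these three.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN\s*#?\s*\d{5,10}\b", re.IGNORECASE),
    "DOB": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}

def find_phi(text):
    """Return (category, matched_text) pairs for PHI found in a prompt."""
    hits = []
    for category, pattern in PHI_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((category, match.group()))
    return hits

prompt = ("Patient Jane Doe, DOB 03/15/1978, MRN #445892, "
          "prior auth for SSN 123-45-6789.")
print(find_phi(prompt))
```

Note what regexes alone miss: the name "Jane Doe" and the indirect identifiers ("Room 312 at Memorial General Hospital") need named-entity recognition, not pattern matching.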
The Three Ways PHI in LLM Prompts Triggers HIPAA Violations
1. Disclosure to an Unauthorized Business Associate
Under HIPAA, any third party that creates, receives, maintains, or transmits PHI on behalf of a covered entity must be a Business Associate with a signed Business Associate Agreement (BAA).
LLM API providers are third parties that receive PHI when you send it in prompts. If you don't have a signed BAA with OpenAI, Anthropic, Google, or whichever provider you're using, every API call containing PHI is an unauthorized disclosure.
- OpenAI: BAA available, but for enterprise customers only; requires opting out of model improvement using your data. Does not cover the free tier or standard API.
- Anthropic: BAA available for Claude for Enterprise. Not the standard API.
- Groq, Mistral, Perplexity, most newer providers: no BAA available as of early 2026.
Most healthcare AI deployments use the standard API tier of whatever provider the developer found most convenient. No BAA. Unauthorized disclosure with every call.
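One defensive pattern is to gate every outbound call on a BAA registry, so no PHI-bearing request can reach a provider tier without a signed agreement on file. A hypothetical sketch (the registry contents and provider names are illustrative, not legal determinations):

```python
# Hypothetical BAA registry: provider tier -> signed BAA on file?
# Populate this from your compliance team's records, not from assumptions.
BAA_REGISTRY = {
    "openai-enterprise": True,   # enterprise agreement, BAA signed
    "openai-standard": False,    # standard API tier: no BAA
    "groq": False,               # no BAA offered
}

class NoBAAError(Exception):
    """Raised when a request would reach a provider without a BAA."""

def require_baa(provider):
    """Call this before any PHI-bearing request leaves your perimeter."""
    if not BAA_REGISTRY.get(provider, False):
        raise NoBAAError(f"No signed BAA on file for provider '{provider}'")

require_baa("openai-enterprise")  # passes silently
```

Defaulting unknown providers to `False` means a developer adding a new API cannot accidentally bypass the check.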
2. Breach of the Security Rule
HIPAA's Security Rule requires covered entities to implement:
- Encryption of PHI in transit and at rest
- Access controls limiting who can view PHI
- Audit controls — recording who accessed PHI and when
- Transmission security — protecting against unauthorized access
When PHI is sent to an LLM provider:
- Encrypted in transit? Usually (TLS). ✅
- Encrypted at rest? Depends on the provider's infrastructure. You don't know. ❓
- Access controls? The provider's employees potentially can access it. ❌
- Audit trail? You have no visibility into who at the provider accessed your data. ❌
The Security Rule puts the obligation on you — the covered entity — to ensure these controls exist. Without a BAA, you have no contractual right to audit your LLM provider.
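You cannot audit the provider's side without a BAA, but you can satisfy the audit-controls requirement for outbound traffic by logging every LLM request that may carry PHI: who sent it, to which provider, and when. A minimal sketch using only the standard library (the field names are an assumption, not a prescribed schema; note it logs PHI *categories*, never the PHI itself):

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("phi_audit")

def log_llm_request(user_id, provider, phi_categories):
    """Append an outbound LLM call to the audit trail.

    Records which categories of PHI were detected in the prompt;
    the actual PHI values never enter the log.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "provider": provider,
        "phi_categories": sorted(phi_categories),
    }
    audit_logger.info(json.dumps(entry))
    return entry

entry = log_llm_request("dr_smith", "openai-enterprise", {"MRN", "DOB"})
```

When OCR asks "which PHI went to which provider" (Step 2 of the enforcement pattern below), this log is the document you produce.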
3. Breach Notification Failure
HIPAA's Breach Notification Rule requires covered entities to notify affected individuals within 60 days of discovering a breach. A breach affecting 500 or more individuals must also be reported to HHS; if 500 or more residents of a single state or jurisdiction are affected, prominent media outlets serving that area must be notified as well.
The problem: if PHI sent to an LLM provider is exposed in a provider-side breach, how would you know? LLM providers are not obligated under HIPAA to notify you (they would be if you had a BAA). You'd likely find out from press coverage, not a timely notification.
At that point, your 60-day notification clock has already been running — against you.
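The 60-day arithmetic is simple, which is exactly why late discovery is so dangerous: the deadline is fixed from discovery (or from when the breach reasonably should have been discovered), not from when you finish investigating. A trivial illustration with a hypothetical discovery date:

```python
from datetime import date, timedelta

def breach_notification_deadline(discovery_date):
    """HIPAA requires individual notification within 60 days of discovery."""
    return discovery_date + timedelta(days=60)

# Hypothetical: breach discovered January 10, 2026
print(breach_notification_deadline(date(2026, 1, 10)))  # 2026-03-11
```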
The OCR Enforcement Pattern
HHS OCR has been increasingly active in AI-related enforcement. The pattern:
Step 1: Complaint or breach report triggers investigation.
Step 2: OCR requests documentation:
- Risk analysis (required under Security Rule)
- BAA list for all business associates
- Audit logs for PHI access
- Policies on acceptable use of AI tools
Step 3: Gaps surface:
- No BAA with LLM providers ❌
- No risk analysis covering AI tool usage ❌
- No audit logs for which PHI went to which provider ❌
- No policy prohibiting unauthorized AI tool use by staff ❌
Step 4: Resolution agreements have ranged from $100K to $5.1M. The largest fines are reserved for willful neglect — when the covered entity knew there was a problem and didn't fix it.
Using an LLM provider without a BAA, while knowing that HIPAA requires BAAs for business associates, is a strong argument for willful neglect.
Clinical Workflow-Specific Risks
Medical Documentation (Clinical Notes)
Physicians using AI for clinical note drafting are the highest-risk use case. Every note contains:
- Patient name, DOB, MRN
- Diagnosis codes
- Medication lists
- Lab values
- Treatment history
- Prognosis
All HIPAA-covered. All potentially stored by the provider. Risk multiplies when physicians use personal AI subscriptions (ChatGPT Plus, Claude.ai personal): no BAA is available, and consumer-tier conversations may be used for model improvement unless the user opts out.
Prior Authorization
Insurance prior authorization letters require clinical justification. Staff writing these letters copy-paste:
- Patient identifiers
- Diagnosis codes
- Clinical history excerpts
- Prescription details
If this data transits an LLM API without a BAA, the exposure is real and documentable.
Radiology Report Generation
AI-assisted radiology is one of the highest-growth segments in healthcare AI. Report generation tools that take raw radiologist dictation are processing some of the most sensitive clinical data that exists — imaging findings, differential diagnoses, clinical correlations.
If this data transits an LLM API without a BAA and is exposed in a provider-side breach, an institution seeing 500+ patients per year (most do) can easily cross the threshold that triggers HHS and media notification on top of individual notices.
Patient Communication Drafting
AI tools that help draft patient communications — appointment reminders, care coordination, discharge instructions — are almost never reviewed for HIPAA compliance. The patient name, diagnosis, and care team information in those drafts is PHI.
The Technical Fix: PII Scrubbing Before LLM Calls
The architectural solution is straightforward: strip PHI before any data leaves your control.
```python
import requests

def hipaa_safe_llm_call(provider, model, prompt):
    # Step 1: Scrub PHI from the prompt
    scrub_response = requests.post(
        'https://tiamat.live/api/scrub',
        json={'text': prompt}
    )
    scrub_response.raise_for_status()
    scrubbed = scrub_response.json()
    scrubbed_prompt = scrubbed['scrubbed']
    entity_map = scrubbed['entities']

    # Step 2: Send the scrubbed prompt through the privacy proxy.
    # The provider sees [NAME_1], [MRN_1], [DOB_1], not the actual PHI.
    llm_response = requests.post(
        'https://tiamat.live/api/proxy',
        json={
            'provider': provider,
            'model': model,
            'scrub': True,
            'messages': [{'role': 'user', 'content': scrubbed_prompt}]
        }
    )
    llm_response.raise_for_status()
    response_text = llm_response.json()['content']

    # Step 3: Re-substitute original entities into the response if needed
    for placeholder, original in entity_map.items():
        response_text = response_text.replace(placeholder, original)

    return response_text
```
With this pattern:
- The LLM provider sees [NAME_1], [DOB_1], [MRN_1], not actual PHI
- The provider cannot build a profile linking the request to a specific patient
- No PHI is exposed in transit beyond your perimeter
- BAA status becomes less critical (you're not sending PHI to the provider)
The scrubber detects and replaces all 18 HIPAA direct identifiers, indirect identifiers, clinical patterns (ICD-10 codes, drug names in prescription context), and sensitive data combinations.
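The re-substitution step in the function above works because the entity map round-trips placeholders back to the original values, which never left your perimeter. A self-contained illustration of that round trip (the placeholder format mirrors the example above; the map shape is an assumption about the scrub API's response):

```python
def resubstitute(text, entity_map):
    """Replace placeholders like [NAME_1] with the original values.

    The originals stay on your side of the boundary the whole time;
    only the placeholder-bearing text ever reached the LLM provider.
    """
    for placeholder, original in entity_map.items():
        text = text.replace(placeholder, original)
    return text

entity_map = {"[NAME_1]": "Jane Doe", "[MRN_1]": "445892"}
draft = "Prior authorization letter for [NAME_1], record [MRN_1]."
print(resubstitute(draft, entity_map))
# Prior authorization letter for Jane Doe, record 445892.
```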
The Organizational Fix: AI Acceptable Use Policy
Technical controls without organizational policy are incomplete. Minimum viable AI AUP for a HIPAA-covered entity:
- Approved AI tools list — with BAA status for each tool
- PHI prohibition — explicit ban on entering PHI into any non-approved tool
- Scrubbing requirement — PHI must be scrubbed before submission to approved tools
- Audit logging — log which AI tools are used for which workflows
- Workforce training — annual AI-specific HIPAA risk training, documented
- Incident response — clear procedure when PHI may have reached an unauthorized tool
What to Do Right Now
This week:
- Audit every AI tool your organization uses — check BAA status for each
- Confirm whether your LLM provider's BAA covers your use case (standard API ≠ enterprise BAA)
- Deploy PII scrubbing before any LLM API calls that might include PHI
This month:
- Draft an AI Acceptable Use Policy covering PHI
- Update your HIPAA risk analysis to include AI tool risks
- Train workforce on AI-specific PHI risks
The PHI-in-LLM-prompts problem is not theoretical. It's happening in clinical settings today, at scale, mostly without organizational awareness. OCR enforcement will catch up — it always does, usually 18-36 months after a new risk vector emerges.
The window to get ahead of it is now.
TIAMAT builds privacy infrastructure for the AI age. HIPAA-compliant PII scrubbing before LLM calls: POST https://tiamat.live/api/scrub. Zero-log proxy routing — your IP never reaches the provider: POST https://tiamat.live/api/proxy. Full API docs: tiamat.live/docs.