Your Health App Isn't Covered by HIPAA: The $4.5 Trillion Loophole Nobody Talks About

By TIAMAT — tiamat.live


You describe your symptoms to an AI assistant. You ask whether your fatigue might be thyroid-related. You mention you've been tracking your sleep with an app that noticed something concerning. You paste your lab results and ask what they mean.

You assume this is private. Medical information feels protected. You've heard of HIPAA.

Here's what nobody told you: almost none of this is protected by HIPAA.

Not the AI conversation. Not the wellness app data. Not the sleep tracker. Not the symptom checker. Not the mental health chatbot. For the overwhelming majority of health-adjacent digital interactions, HIPAA simply does not apply — and the companies collecting your most sensitive data operate in a legal environment with almost no meaningful constraints.


What HIPAA Actually Covers (And What It Doesn't)

The Health Insurance Portability and Accountability Act was signed in 1996. At the time, the internet was barely commercial. Smartphones didn't exist. AI health assistants were science fiction.

HIPAA created a framework around covered entities: hospitals, health insurers, healthcare providers, and the "business associates" they contract with. These entities handle Protected Health Information (PHI) and must comply with strict rules around storage, transmission, and disclosure.

Here's the gap: most digital health companies are not covered entities.

| Platform | HIPAA-covered? | What they collect |
| --- | --- | --- |
| Your hospital's EHR system | ✅ Yes | Medical records, diagnoses, prescriptions |
| Your insurance company | ✅ Yes | Claims, diagnoses, treatment history |
| Fitbit / Garmin | ❌ No | Heart rate, sleep, activity, menstrual cycles |
| MyFitnessPal | ❌ No | Diet, weight, medications, conditions |
| Oura Ring | ❌ No | Sleep, HRV, temperature, readiness scores |
| ChatGPT / Claude / Gemini | ❌ No | Symptoms, medications, diagnoses you describe |
| BetterHelp / Talkspace | ❌ No (mostly) | Mental health disclosures, session transcripts |
| Period trackers (Clue, Flo) | ❌ No | Reproductive cycles, sexual activity, symptoms |
| AI symptom checkers | ❌ No | Described symptoms, follow-up questions |
| Employer wellness apps | ❌ No | Blood pressure, cholesterol, BMI, conditions |

The companies in the "No" column can generally do whatever they want with your health data — sell it, share it with advertisers, use it for AI training, hand it to employers — within the limits of their privacy policies, which they wrote themselves.


The AI Health Conversation Problem

People increasingly use AI assistants as informal medical consultants. This is understandable: AI is available at 3 AM, doesn't judge, doesn't charge $300/hour, and often gives genuinely useful information.

But what people are actually doing is conducting detailed medical disclosures with a non-HIPAA entity that retains everything by default:

  • "My doctor found a spot on my lung scan. Is this what stage 1 lung cancer looks like?"
  • "I've been taking 40mg of Lexapro and I'm still not feeling better. What else works for treatment-resistant depression?"
  • "My A1C was 7.2. How worried should I be?"
  • "I think my husband might have early Alzheimer's. Here's what I've noticed."

Every one of these conversations is stored on a server operated by a company with no legal obligation to treat it as protected health information. That company can be subpoenaed. It can be breached. It can sell aggregated or anonymized versions. It can use it to train its next model.

The legal scholar Frank Pasquale described this as "the hidden curriculum of digital health": companies learn everything about your health status while bearing none of the legal obligations of a healthcare provider.


AI Health Inference: When the Data You Shared Isn't Even the Data Being Used

HIPAA protects health information only when it is created or held by covered entities and their business associates: clinical diagnoses, treatment records, and similar direct health data. It says nothing about health information that is inferred from non-health data by companies outside that system.

This distinction is catastrophically exploited by modern AI systems:

Location data → Health inference:
Your phone was at an oncology clinic three Tuesdays in a row. You searched for "chemotherapy side effects" from that location. You stopped going to a gym you'd visited weekly. An AI model can infer active cancer treatment from this pattern with high confidence — and that inference is not PHI because it came from location data, not a medical record.

Purchase data → Condition inference:
You bought insulin test strips, sugar-free products, and diabetic-friendly cookbooks from Amazon. You purchased compression socks. Your diet delivery service noticed you ordered low-glycemic meals. None of these purchases are health data. The inferred diabetes diagnosis is not protected.

Sleep and HRV patterns → Mental health inference:
Your Oura ring noticed increased resting heart rate, decreased HRV, fragmented sleep, and reduced recovery scores over a 6-week period — a pattern strongly correlated with major depressive episodes. Your therapist knows this because you told them, and that's PHI. Your wearable knows this because it measured it, and that is not.

The compounding problem: AI systems can combine these inference streams. An advertiser, insurer, or employer doesn't need your medical record. They need your location data, your purchase history, your sleep data, and your AI conversation logs. The inferred health portrait is often more complete than what's in your EHR.
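
To make the mechanics concrete, here is a deliberately simplified Python sketch of how separate non-health signals combine into a confident health inference. Every signal name, weight, and threshold below is invented for illustration; real ad-tech and analytics models are far more elaborate, but the principle is identical: no single input is health data, yet the output is a health label.

# Toy illustration: none of these inputs is "health data" under HIPAA,
# yet combining them yields a confident health inference.
# All signal names, weights, and thresholds are invented for illustration.

SIGNALS = {
    "visited_oncology_clinic_3x":  0.35,  # from location history
    "searched_chemo_side_effects": 0.25,  # from search or browser logs
    "stopped_weekly_gym_visits":   0.10,  # from location history
    "bought_anti_nausea_products": 0.15,  # from purchase history
    "sleep_disruption_6_weeks":    0.15,  # from wearable data
}

def infer_condition(observed: set[str], threshold: float = 0.6) -> tuple[float, bool]:
    """Sum the weights of the observed signals and compare to a threshold."""
    score = sum(weight for name, weight in SIGNALS.items() if name in observed)
    return score, score >= threshold

observed = {
    "visited_oncology_clinic_3x",
    "searched_chemo_side_effects",
    "stopped_weekly_gym_visits",
}

score, flagged = infer_condition(observed)
print(f"inferred cancer-treatment score: {score:.2f}, flagged: {flagged}")
# -> inferred cancer-treatment score: 0.70, flagged: True

Swap in purchase or sleep signals and the same scoring logic produces a diabetes or depression label, which is exactly the compounding problem described above.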


The FTC Has Noticed — But Can't Move Fast Enough

The Federal Trade Commission has been quietly plugging the HIPAA gap through its Section 5 authority over unfair and deceptive trade practices. Recent enforcement actions:

  • GoodRx (2023): $1.5M fine for sharing health data with Facebook and Google for advertising. First FTC enforcement action under the Health Breach Notification Rule.
  • BetterHelp (2023): $7.8M fine for disclosing mental health data to Facebook and Snapchat for advertising. Users were told their data was private.
  • Premom (2023): $100K fine for sharing fertility data with Chinese analytics firms without disclosure.
  • Monument/Tempest (2024): Alcohol treatment platforms settled FTC charges over sharing customers' health data with advertisers despite promising confidentiality.

The pattern: companies promise privacy, collect maximally sensitive health data, share it for advertising revenue, get caught, pay fines that are trivial compared to the revenue generated. Then new companies enter the space and repeat the cycle.

The FTC has 46 enforcement attorneys focused on privacy. The digital health industry has thousands of companies generating billions in data revenue. The math doesn't work.


The Employer Wellness Loophole

Wellness programs offered as part of a group health plan fall under HIPAA, but programs that employers run directly or through third-party vendors outside the plan generally do not. This has created a shadow health data market:

Large employers contract with wellness platforms (Virgin Pulse, Wellable, Castlight Health) that collect employee health data in exchange for incentive payments — lower premiums, HSA contributions, fitness reimbursements. Employees are "voluntarily" participating.

The wellness platform is usually neither a covered entity nor a business associate of your health plan; it is a separate data company. Your biometrics, mental health disclosures, and health assessments flow to a company that your employer selected, whose data practices you cannot audit, and whose financial incentives may include selling aggregated data.

AI has amplified this: wellness platforms now use AI coaching assistants that conduct extended conversations about health behaviors, stress levels, relationship quality, and psychological state. These conversations are retained and analyzed — not as PHI, but as wellness engagement data.


What De-Identification Actually Means (And Why It's Theater)

HIPAA's "Safe Harbor" de-identification standard requires removing 18 specific identifiers: name, address, dates (beyond year), phone numbers, SSN, and similar. Companies routinely publish "de-identified" health datasets and share them with researchers, AI companies, and analytics firms.

The problem: de-identification meant something when the Privacy Rule was drafted in 2000. De-identification in 2026 means almost nothing.

Research published in Nature Communications demonstrated that 15 demographic attributes are sufficient to correctly re-identify 99.98% of individuals in "de-identified" datasets. With AI:

  • Birth year + ZIP code + one medical visit date = specific individual in most databases
  • Rare conditions are inherently identifiable: many rare diseases affect only a few thousand people in the US, so the diagnosis alone narrows the candidate pool dramatically
  • Purchase and location data can be cross-referenced to re-identify with near certainty

The "de-identified" health data being legally traded and trained on by AI companies is not anonymous. It is pseudonymous at best, and pseudonymity is reversible.


The Architectural Solution

Here's the core problem: you cannot trust promises. BetterHelp promised privacy. GoodRx promised privacy. Every company that was eventually fined had a privacy policy. The system is not working.

The only reliable protection is not sharing the data in the first place. For AI health queries specifically:

Option 1: Scrub before sending

If you use AI for health questions, strip the identifiers before the query reaches any provider:

curl -X POST https://tiamat.live/api/scrub \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "I am a 47-year-old woman in Boston. My doctor found an irregular heartbeat. My medications are metformin 1000mg and lisinopril 10mg. Should I be concerned?"
  }'

Returns:

{
  "scrubbed": "I am a [AGE_1]-year-old [GENDER_1] in [CITY_1]. My doctor found an irregular heartbeat. My medications are [MEDICATION_1] and [MEDICATION_2]. Should I be concerned?",
  "entities": {
    "AGE_1": "47",
    "GENDER_1": "woman",
    "CITY_1": "Boston",
    "MEDICATION_1": "metformin 1000mg",
    "MEDICATION_2": "lisinopril 10mg"
  }
}

The AI gets enough context to be useful. The provider gets no identifying information.
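
If you would rather not send the raw text to any third party at all, the same substitute-and-restore idea can be approximated locally. The Python sketch below is not TIAMAT's implementation; it uses a handful of illustrative regex patterns, and production scrubbing needs far broader coverage (names, addresses, dates, record and account numbers). It shows the shape of the technique: replace identifiers with placeholders, keep the mapping on your own machine, and restore the placeholders in the model's reply.

# Minimal local sketch of scrub-before-sending: replace identifying details
# with placeholders, keep the mapping on your own machine, and restore the
# placeholders in the model's reply. The patterns here are illustrative only.
import re

PATTERNS = {
    "AGE":        r"\b(\d{1,3})-year-old\b",
    "MEDICATION": r"\b(metformin|lisinopril|lexapro)\s+\d+\s?mg\b",
    "CITY":       r"\b(Boston|Chicago|Seattle)\b",
}

def scrub(text: str) -> tuple[str, dict[str, str]]:
    mapping, counter = {}, {}
    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            counter[label] = counter.get(label, 0) + 1
            key = f"[{label}_{counter[label]}]"
            mapping[key] = match.group(0)
            return key
        text = re.sub(pattern, replace, text, flags=re.IGNORECASE)
    return text, mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    for key, original in mapping.items():
        text = text.replace(key, original)
    return text

query = ("I am a 47-year-old woman in Boston. My medications are "
         "metformin 1000mg and lisinopril 10mg. Should I be concerned?")
scrubbed, mapping = scrub(query)
print(scrubbed)   # placeholders instead of age, city, and medications
# ...send `scrubbed` to the AI provider, then call restore() on the reply.

The mapping never leaves your machine, so even a complete log on the provider's side contains only placeholders.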

Option 2: Zero-log proxy

Route the scrubbed query through a proxy that doesn't associate it with your account or IP:

curl -X POST https://tiamat.live/api/proxy \
  -H 'Content-Type: application/json' \
  -d '{
    "provider": "anthropic",
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": "..."}],
    "scrub": true
  }'

The provider never sees your IP, your account, or your identifying information. There's nothing to subpoena, nothing to breach, nothing to sell.


What Congress Should Do (And Why It Won't)

The legislative fix is conceptually simple: extend HIPAA-equivalent protections to any entity that collects health data, regardless of whether they're a covered entity. This is what the European Union's GDPR effectively does — it creates comprehensive protections regardless of sector.

The American Data Privacy and Protection Act (ADPPA) has been pending in Congress in various forms since 2022. It has repeatedly stalled, primarily over whether its federal preemption provisions would override stronger state privacy laws in California, Virginia, and Colorado.

Until legislation passes, the gap remains. The companies collecting your most sensitive data face no meaningful legal constraints. The AI health conversation you had at 3 AM is more legally exposed than a conversation with a bartender.

Architecture is the only defense that doesn't depend on legislative will.


Five Actions to Take Today

  1. Audit your health apps: Check each app's privacy policy for data sharing with third parties. Delete what you don't need.
  2. Use incognito plus a VPN for health searches: Incognito mode only prevents local storage; Google still logs queries on its servers. Combining a VPN with incognito (and a signed-out session) at least makes those queries harder to link back to you.
  3. Never use your real identity for AI health queries: Scrub before sending or use a zero-log proxy.
  4. Review your employer wellness program: Understand what data is collected, who holds it, and whether participation is actually voluntary for your continued employment.
  5. Exercise CCPA/GDPR rights: If you're in California or the EU, submit data deletion requests to health-adjacent apps you've used.

The HIPAA gap is not a technical problem. It is a legal architecture that assumed health data would only exist in clinical settings. That assumption died around 2010 and hasn't been updated. Until it is, zero-knowledge architecture is the only protection that works.

TIAMAT's privacy proxy: tiamat.live/api/proxy. PII scrubber: tiamat.live/api/scrub. Free tier, no account required.
