You're three months into building a FHIR integration. You've got OAuth working, you can pull Patient resources, your UI renders demographics cleanly. Time to show it to a potential customer.
You open the sandbox patient and see this:
```json
{
  "resourceType": "Patient",
  "id": "erXuFYUfucBZaryVksYEcMg3",
  "name": [
    {
      "use": "usual",
      "text": "Test Cancer",
      "family": "Cancer",
      "given": ["Test"]
    }
  ],
  "birthDate": "1971-08-07",
  "gender": "female",
  "address": [
    {
      "use": "home",
      "line": ["123 Main St."],
      "city": "Madison",
      "state": "WI",
      "postalCode": "53703"
    }
  ]
}
```
The patient's name is "Test Cancer." The address is 123 Main St. There are no emergency contacts, no marital status, no communication preferences. You scroll through the rest of the sandbox patients. "Derrick Lin." "Jason Argonaut." A handful of others, all equally sparse.
This is Open Epic's sandbox. And it's the starting point for nearly every FHIR startup in the US.
What Open Epic Actually Gives You
Open Epic provides a shared FHIR R4 sandbox with eight named test patients (Camila Lopez, Derrick Lin, Desiree Powell, Elijah Davis, Linda Ross, Olivia Roberts, Warren McGinnis, and Jason Argonaut). These patients exist to validate API connectivity — to confirm that your OAuth flow works, that you can parse a Bundle, that your FHIR client handles pagination. They were never designed to represent what real patient data looks like.
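The connectivity checks the sandbox does support are worth getting right. One that trips people up is search pagination: a FHIR search returns a Bundle whose `link` array carries a `rel="next"` URL until the last page. A minimal sketch, with `fetch_json` standing in for your own HTTP client (an assumption, not part of any SDK):

```python
# Sketch: walking a paginated FHIR search. The Bundle's "link" array
# carries a rel="next" URL until the last page. `fetch_json` is a
# stand-in for your HTTP layer (hypothetical, supplied by you).

def iter_bundle_entries(first_url, fetch_json):
    """Yield every entry.resource across all pages of a FHIR search."""
    url = first_url
    while url:
        bundle = fetch_json(url)
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        # Follow the rel="next" link, if any; stop when absent.
        url = next(
            (l["url"] for l in bundle.get("link", []) if l.get("relation") == "next"),
            None,
        )

# Offline demo with two fake pages keyed by URL:
pages = {
    "page1": {
        "resourceType": "Bundle",
        "entry": [{"resource": {"resourceType": "Patient", "id": "a"}}],
        "link": [{"relation": "next", "url": "page2"}],
    },
    "page2": {
        "resourceType": "Bundle",
        "entry": [{"resource": {"resourceType": "Patient", "id": "b"}}],
        "link": [],
    },
}
resources = list(iter_bundle_entries("page1", pages.__getitem__))
print([r["id"] for r in resources])  # ['a', 'b']
```

The sandbox's small result sets rarely paginate, which is exactly why this path often ships untested.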
Here's what the sandbox gives you versus what you need to demo a real clinical application:
| Capability | Open Epic Sandbox | Production Demo Needs |
|---|---|---|
| Patient demographics | Name, DOB, gender, address | Race, ethnicity, language, multiple addresses, contacts |
| Conditions | 1-3 active conditions | 5-15 conditions with onset dates, comorbidity patterns |
| Medications | Sparse, often missing | Active + historical, with dosage and frequency |
| Lab observations | Few, no trends | Longitudinal panels (HbA1c over 3 years, lipid trends) |
| Imaging studies | None | ImagingStudy + DiagnosticReport with narrative text |
| Clinical notes | None or minimal | Discharge summaries, progress notes, radiology reports |
| Encounters | 1-3 encounters | Multi-year history across ambulatory, inpatient, ED |
| Vital signs | Sparse | Trending data (BP, weight, heart rate over time) |
The sandbox is optimized for certification. It proves your app can authenticate and read FHIR resources. It says nothing about whether your app can handle a real patient record.
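Before a demo, it's worth auditing which of these resource types your sandbox actually populates. A minimal sketch, with `search` standing in for a GET against `{base}/{type}?patient={id}` that returns a parsed Bundle (an assumption about your own HTTP layer):

```python
# Sketch: auditing which resource types a sandbox populates for one
# patient. `search` is a stand-in for your HTTP layer (hypothetical).

TYPES = ["Condition", "MedicationRequest", "Observation",
         "ImagingStudy", "DocumentReference", "Encounter"]

def coverage_report(search):
    """Map each resource type to the number of entries returned."""
    return {t: len(search(t).get("entry", [])) for t in TYPES}

# Offline demo shaped like a typically sparse sandbox response:
fake = {
    "Condition": {"entry": [{}, {}]},
    "Observation": {"entry": [{}]},
}
report = coverage_report(lambda t: fake.get(t, {}))
empty = [t for t, n in report.items() if n == 0]
print(empty)  # resource types with no data at all
```

Run this against your sandbox patient and the right-hand column of the table above turns into a concrete gap list.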
The Alternative Landscape Is Shrinking
Open Epic isn't the only sandbox, but the options are thinner than you'd think.
Logica Health (formerly HSPC) ran a well-regarded public sandbox for years. It was retired on October 31, 2024. If your integration documentation still points there, those links are dead.
SMART Health IT maintains a sandbox with about 100 Synthea-generated patients. The data is structurally valid but clinically shallow — default Synthea output without enrichment. (If you've used it, you know the feeling: the FHIR parses fine, but the patient has three conditions and no meds.)
Individual health systems sometimes provide sandbox access, but getting it requires a signed BAA, a security review, and months of procurement. That's not a sandbox. That's a sales cycle.
The Gap Between Sandbox and Production
Missing fields are the obvious problem. The deeper problem is missing clinical coherence — real patient data tells a story, and sandbox data doesn't have one.
No Comorbidities
In the real world, Type 2 Diabetes doesn't travel alone. It shows up with hypertension, hyperlipidemia, chronic kidney disease, and obesity. A 62-year-old diabetic in a production EHR typically has 8-15 active conditions, most of them clinically related.
Open Epic's test patients have conditions, but they're isolated. You won't find a patient whose diabetes diagnosis is followed by quarterly HbA1c labs trending from 7.2 to 8.1 over 18 months, with a metformin prescription added at diagnosis and a GLP-1 agonist added when the HbA1c crossed 8.0. That's what your app needs to demo against.
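That diabetes story is checkable in code. A minimal sketch of the correlation — did a medication start follow the HbA1c crossing 8.0? The sample dates and drug names are invented for illustration:

```python
# Sketch: does a therapy escalation follow an HbA1c crossing 8.0?
# Sample values are invented; field extraction from real FHIR
# Observation/MedicationRequest resources is left to your client.
from datetime import date

a1c = [(date(2022, 10, 18), 7.9), (date(2023, 1, 12), 8.1)]
med_starts = {"metformin": date(2021, 3, 1), "semaglutide": date(2023, 2, 1)}

# First date the HbA1c crosses the escalation threshold.
crossed = next((d for d, v in a1c if v >= 8.0), None)

# Medications started after that crossing suggest a coherent story.
escalations = [m for m, start in med_starts.items()
               if crossed and start > crossed]
print(escalations)  # ['semaglutide']
```

Against Open Epic's isolated conditions, `escalations` comes back empty every time — there's no story to find.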
No Imaging, No Clinical Narrative
Search for ImagingStudy resources in the Open Epic sandbox. Empty bundles. No DiagnosticReports with radiology narrative. No discharge summaries, no progress notes. Production EHRs have presentedForm.data fields with base64-encoded report text — the full narrative a radiologist dictated. Sandboxes have "FINDINGS: Normal. IMPRESSION: Normal." or nothing at all.
If you're building anything that touches imaging, clinical NLP, or document summarization — you have zero test data to work with.
No Longitudinal History
Real patients have years of history. A production Patient sits at the center of hundreds of linked resources — encounters spanning a decade, medication changes over time, lab trends that tell a clinical story.
Sandbox patients have one or two encounters. You can't demonstrate a timeline view, trend analysis, or care gap detection when the patient's entire history fits in a single API response.
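A timeline view is mostly a sort over heterogeneous resources, since each FHIR type keeps its date in a different field. A minimal sketch, using invented resource fragments:

```python
# Sketch: assembling a patient timeline from mixed resource types.
# Each FHIR R4 type carries its date in a different element; the
# fragments below are invented minimal examples.

resources = [
    {"resourceType": "Encounter", "period": {"start": "2022-01-15"}},
    {"resourceType": "Observation", "effectiveDateTime": "2023-01-12"},
    {"resourceType": "MedicationRequest", "authoredOn": "2023-02-01"},
]

def event_date(r):
    """Pull the clinically relevant date from each resource type."""
    return (r.get("effectiveDateTime")
            or r.get("authoredOn")
            or r.get("period", {}).get("start"))

timeline = sorted(resources, key=event_date)
print([(event_date(r), r["resourceType"]) for r in timeline])
```

The code is trivial; the problem is that with one sandbox encounter, the rendered timeline is a single dot.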
What You Actually Get: Epic Sandbox vs. Production
To make this concrete, here's a real query against the Epic on FHIR sandbox for Camila Lopez — the "Test Cancer" patient from the opening of this post — filtering for laboratory observations:
```bash
curl -s "https://fhir.epic.com/interconnect-fhir-oauth/api/FHIR/R4/Observation?patient=erXuFYUfucBZaryVksYEcMg3&category=laboratory" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/fhir+json" \
  | jq '.total, [.entry[].resource
        | {code: .code.text, date: .effectiveDateTime,
           value: (.valueQuantity.value | tostring) + " " + .valueQuantity.unit}]'
```
A representative sandbox response:
```json
2
[
  {
    "code": "Hemoglobin A1c/Hemoglobin.total in Blood",
    "date": "2019-06-10T15:38:00Z",
    "value": "7.6 %"
  },
  {
    "code": "Glucose [Mass/volume] in Blood",
    "date": "2019-06-10T15:38:00Z",
    "value": "138 mg/dL"
  }
]
```
Two observations. Both from the same date. No prior values, no subsequent values. If you're building a diabetes management dashboard, a trend chart, or a care gap detector — this tells you nothing.
Now compare that to what a production EHR returns — or what mock.health generates — for a 62-year-old with Type 2 Diabetes:
```json
15
[
  { "code": "Hemoglobin A1c", "date": "2022-01-15", "value": "6.8 %" },
  { "code": "Hemoglobin A1c", "date": "2022-04-22", "value": "7.1 %" },
  { "code": "Hemoglobin A1c", "date": "2022-07-30", "value": "7.4 %" },
  { "code": "Hemoglobin A1c", "date": "2022-10-18", "value": "7.9 %" },
  { "code": "Hemoglobin A1c", "date": "2023-01-12", "value": "8.1 %" },
  { "code": "Hemoglobin A1c", "date": "2023-04-05", "value": "7.6 %" },
  { "code": "Hemoglobin A1c", "date": "2023-07-20", "value": "7.3 %" },
  { "code": "Hemoglobin A1c", "date": "2023-10-11", "value": "7.1 %" },
  { "code": "Hemoglobin A1c", "date": "2024-01-08", "value": "7.0 %" },
  { "code": "Hemoglobin A1c", "date": "2024-04-15", "value": "6.9 %" },
  { "code": "Glucose", "date": "2022-01-15", "value": "128 mg/dL" },
  { "code": "Glucose", "date": "2022-07-30", "value": "156 mg/dL" },
  { "code": "Glucose", "date": "2023-01-12", "value": "172 mg/dL" },
  { "code": "Glucose", "date": "2023-07-20", "value": "145 mg/dL" },
  { "code": "Glucose", "date": "2024-04-15", "value": "132 mg/dL" }
]
```
Fifteen observations spanning two and a half years. You can see the HbA1c climbing from 6.8 to 8.1 — the point at which a clinician would escalate medication — then trending back down after treatment adjustment. The glucose values correlate. There's a clinical story in this data.
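That rise-then-fall shape is exactly what trend features need to detect. A minimal sketch over the HbA1c values from the response above — find the peak and confirm the series rises into it and declines after it:

```python
# Sketch: extracting the trend story from the HbA1c series above —
# locate the peak and verify a monotonic rise before it and decline
# after it (i.e., the treatment adjustment worked).

a1c = [6.8, 7.1, 7.4, 7.9, 8.1, 7.6, 7.3, 7.1, 7.0, 6.9]

peak = max(a1c)
peak_i = a1c.index(peak)
rising = all(a <= b for a, b in zip(a1c[:peak_i], a1c[1:peak_i + 1]))
falling = all(a >= b for a, b in zip(a1c[peak_i:], a1c[peak_i + 1:]))
print(peak, rising and falling)  # 8.1 True
```

Run the same logic on the sandbox's two same-day observations and there's no trend to compute at all.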
The sandbox has the right resource types. What's inside them is nearly empty.
What "Realistic" Means for a FHIR Integration
"Realistic" has a specific definition. It's in the spec.
US Core Profiles
US Core is the implementation guide that specifies how FHIR resources should be structured in the US healthcare system. Every EHR certified under ONC's Health IT Certification Program must conform to US Core profiles.
US Core doesn't just say "include a Patient resource." It specifies which fields are required (must be present), which are must-support (must be handled if present), and which extensions are standard. A US Core Patient requires name, gender, and identifier, and marks race, ethnicity, birthDate, and communication as must-support.
If your test data doesn't include must-support fields, you can't verify that your app handles them correctly. And if your demo doesn't show them, your prospect — who looks at US Core-conformant data all day — will notice.
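You can turn that into a pre-demo check. A minimal sketch — the required and must-support lists below paraphrase the US Core Patient profile, so verify them against the current IG before relying on them:

```python
# Sketch: flagging missing must-support data in a Patient resource.
# The REQUIRED / MUST_SUPPORT lists paraphrase US Core Patient; check
# the published IG for the authoritative set.

REQUIRED = ["identifier", "name", "gender"]
MUST_SUPPORT = ["birthDate", "address", "telecom", "communication"]
RACE_EXT = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"

def audit_patient(p):
    missing = [f for f in REQUIRED if f not in p]
    absent_ms = [f for f in MUST_SUPPORT if f not in p]
    ext_urls = {e.get("url") for e in p.get("extension", [])}
    if RACE_EXT not in ext_urls:
        absent_ms.append("us-core-race extension")
    return missing, absent_ms

# The sparse sandbox patient from the top of this post, abbreviated:
patient = {"resourceType": "Patient", "name": [{"family": "Cancer"}],
           "gender": "female", "birthDate": "1971-08-07",
           "address": [{"city": "Madison"}]}
missing, absent = audit_patient(patient)
print(missing)  # ['identifier']
print(absent)
```

Even the opening sandbox patient fails this audit: no identifier shown, no telecom, no communication, no race extension.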
USCDI Data Classes
The United States Core Data for Interoperability (USCDI) defines the minimum data classes that certified health IT must support. USCDI v3, mandatory as of January 2026, includes:
- Patient demographics (including sexual orientation and gender identity)
- Allergies and intolerances
- Medications (active and historical)
- Problems (conditions)
- Procedures
- Laboratory results
- Vital signs
- Clinical notes (discharge summaries, progress notes, consultation notes)
- Diagnostic imaging (reports and studies)
- Health insurance information
- Clinical tests
Your test data should cover all of these. The sandbox covers maybe four.
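A quick way to quantify that coverage gap is to map each USCDI class to the FHIR resource type that typically carries it and diff against what your dataset exposes. The mapping below is a working approximation, not the normative USCDI-to-FHIR crosswalk:

```python
# Sketch: a coverage checklist mapping USCDI data classes to the FHIR
# resource types that typically carry them. Approximate mapping, not
# the normative crosswalk.

USCDI_TO_FHIR = {
    "Allergies": "AllergyIntolerance",
    "Medications": "MedicationRequest",
    "Problems": "Condition",
    "Procedures": "Procedure",
    "Laboratory results": "Observation",
    "Clinical notes": "DocumentReference",
    "Diagnostic imaging": "ImagingStudy",
    "Insurance": "Coverage",
}

def uncovered_classes(present_types):
    """USCDI classes with no corresponding resource in the dataset."""
    return [c for c, t in USCDI_TO_FHIR.items() if t not in present_types]

# What a sparse sandbox typically exposes:
sandbox_types = {"Condition", "Observation", "MedicationRequest", "Procedure"}
gaps = uncovered_classes(sandbox_types)
print(gaps)
```

Feed in the resource types your sandbox actually returns and the "covers maybe four" claim becomes a printable list.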
Correlated Clinical Data
The hardest gap to fill is missing correlations. In real clinical data, conditions, medications, labs, and encounters are causally connected. A diabetes diagnosis triggers HbA1c monitoring every 3-6 months. An elevated creatinine prompts a nephrology referral. An abnormal chest X-ray leads to a CT follow-up.
Standard synthetic data generators don't model these relationships. A 2019 validation study of Synthea in BMC Medical Informatics and Decision Making found that Synthea-generated populations showed 0% blood pressure control rates compared to 69.7% in real-world data. The resources parse. The population statistics are wrong.
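Correlation checks like the ones above are mechanical to write once you name them. One concrete example — every diabetes condition should have an HbA1c observation within roughly six months of onset. The resource fragments are invented; a real version would extract dates from Condition and Observation resources:

```python
# Sketch: one correlation check — every diabetes condition should have
# an HbA1c observation within ~6 months (183 days) of onset. Sample
# data is invented.
from datetime import date, timedelta

conditions = [{"code": "E11.9", "onset": date(2022, 1, 10)}]
a1c_dates = [date(2022, 3, 1), date(2022, 9, 15)]

def monitored(cond, labs, window=timedelta(days=183)):
    """True if any lab falls within the window around onset."""
    return any(abs(d - cond["onset"]) <= window for d in labs)

violations = [c for c in conditions if not monitored(c, a1c_dates)]
print(len(violations))  # 0
```

Run this class of check against default Synthea output and the violation count is what tells you the population statistics are off.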
Try It
You can query mock.health's FHIR API directly:
```bash
curl -s "https://api.mock.health/fhir/Patient?_count=1" \
  -H "Accept: application/fhir+json" | jq '.entry[0].resource | {
    name: .name[0].text,
    birthDate,
    gender,
    race: (.extension[]? | select(.url | contains("race")) | .extension[0].valueCoding.display),
    ethnicity: (.extension[]? | select(.url | contains("ethnicity")) | .extension[0].valueCoding.display)
  }'
```
And a radiology report with actual clinical narrative:
```bash
curl -s "https://api.mock.health/fhir/DiagnosticReport?category=RAD&_count=1" \
  -H "Accept: application/fhir+json" \
  | jq -r '.entry[0].resource.presentedForm[0].data' | base64 -d
```
That base64 -d decodes the report text. You'll see findings, impressions, and clinical language — not "Normal. Normal. Normal."
If you've already built a SMART on FHIR app, you can point it at https://api.mock.health/fhir and get realistic clinical data through the same OAuth flow you'll use in production.
Sign up at mock.health — free tier, no sales call.
