How to Parse HL7 Messages and Convert to FHIR R4 Programmatically
HL7 v2 is the most widely used healthcare messaging standard in the world. Despite being decades old, it still carries the majority of clinical data between hospitals, labs, pharmacies, and EHR systems. If you have ever opened a raw HL7 message, you know the problem immediately: it is a pipe-delimited wall of text with no labels, no hierarchy, and no obvious way to extract the data you need.
This article explains what HL7 v2 messages look like, why parsing them is harder than it appears, and how to parse and convert them programmatically using a free tool and MCP server.
What makes HL7 v2 hard to parse
An HL7 v2 message looks something like this:
MSH|^~\&|EPIC|HOSPITAL|LAB|LAB|20260316120000||ADT^A01|MSG00001|P|2.5
EVN|A01|20260316120000
PID|1||MRN12345^^^HOSPITAL^MR||DOE^JOHN^A||19800115|M|||123 MAIN ST^^DALLAS^TX^75201
PV1|1|I|ICU^101^A|E|||1234567890^SMITH^ROBERT^J^^^MD|5678901234^JONES^MARY^L^^^MD
NK1|1|DOE^JANE|SPO|555-123-4567
DG1|1||E11.65^Type 2 diabetes mellitus with hyperglycemia^I10
This is an ADT^A01 message, which signals a patient admission. Every segment (MSH, PID, PV1, etc.) has a different layout. Every field within a segment has a specific position, and many fields contain components separated by ^ and sub-components separated by &; a field can also repeat, with repetitions separated by ~. The delimiters themselves are declared at the start of the message: MSH.1 is the field separator and MSH.2 lists the component, repetition, escape, and sub-component characters, so they can technically vary between messages.
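To make the delimiter rules concrete, here is a minimal Python sketch that reads the separators from an MSH segment and uses them to break a field apart. The function names `read_delimiters` and `split_field` are illustrative and are not part of hl7-tools. The sketch assumes the usual four encoding characters in MSH.2 and does not decode escape sequences.

```python
def read_delimiters(msh_segment: str) -> dict:
    """Read the separators declared at the start of an MSH segment."""
    if not msh_segment.startswith("MSH") or len(msh_segment) < 8:
        raise ValueError("not an MSH segment")
    field_sep = msh_segment[3]    # MSH.1: the character right after "MSH"
    encoding = msh_segment[4:8]   # MSH.2: assumed to be four characters, normally ^~\&
    component, repetition, escape, subcomponent = encoding
    return {
        "field": field_sep,
        "component": component,
        "repetition": repetition,
        "escape": escape,
        "subcomponent": subcomponent,
    }


def split_field(value: str, delims: dict) -> list:
    """Split one field into repetitions, then components, then sub-components."""
    return [
        [comp.split(delims["subcomponent"]) for comp in rep.split(delims["component"])]
        for rep in value.split(delims["repetition"])
    ]


msh = r"MSH|^~\&|EPIC|HOSPITAL|LAB|LAB|20260316120000||ADT^A01|MSG00001|P|2.5"
delims = read_delimiters(msh)
name = split_field("DOE^JOHN^A", delims)
```

With the sample message, `name` is a list holding one repetition with three components: `[[["DOE"], ["JOHN"], ["A"]]]`.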
The real difficulty is that field positions have no labels in the message itself. PID.5 is the patient name, but you only know that if you have memorized the HL7 specification or have a reference like the CARISTIX HL7 dictionary. PID.5.1 is the family name, PID.5.2 is the given name, PID.5.3 is the middle name. PV1.7 is the attending physician. DG1.3 is the diagnosis code. None of this is self-documenting.
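The position-counting that this implies can be sketched in a few lines of Python. The helper `get_value` below is hypothetical, not an hl7-tools function. It assumes the default | and ^ delimiters and returns the first matching segment only. Note the MSH quirk: because MSH.1 is the field separator itself, MSH fields are offset by one compared with every other segment.

```python
SAMPLE = "\r".join([
    r"MSH|^~\&|EPIC|HOSPITAL|LAB|LAB|20260316120000||ADT^A01|MSG00001|P|2.5",
    "PID|1||MRN12345^^^HOSPITAL^MR||DOE^JOHN^A||19800115|M|||123 MAIN ST^^DALLAS^TX^75201",
    "PV1|1|I|ICU^101^A|E|||1234567890^SMITH^ROBERT^J^^^MD|5678901234^JONES^MARY^L^^^MD",
    "DG1|1||E11.65^Type 2 diabetes mellitus with hyperglycemia^I10",
])


def _at(items: list, index: int) -> str:
    """Return items[index], or an empty string when the position is absent."""
    return items[index] if 0 <= index < len(items) else ""


def get_value(message: str, path: str) -> str:
    """Look up a path such as 'PID.5' or 'PID.5.1' in the first matching segment."""
    parts = path.split(".")
    segment_id, field_no = parts[0], int(parts[1])
    component_no = int(parts[2]) if len(parts) > 2 else None
    for line in message.splitlines():
        fields = line.split("|")
        if fields[0] != segment_id:
            continue
        if segment_id == "MSH":
            # MSH.1 is the separator itself, so split index i holds MSH.(i + 1).
            value = "|" if field_no == 1 else _at(fields, field_no - 1)
        else:
            value = _at(fields, field_no)
        if component_no is None:
            return value
        return _at(value.split("^"), component_no - 1)
    return ""


family_name = get_value(SAMPLE, "PID.5.1")     # "DOE"
diagnosis_code = get_value(SAMPLE, "DG1.3.1")  # "E11.65"
```

Even with a helper like this, the caller still has to know that PID.5.1 means "family name", which is the gap a labeled parser closes.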
Parsing an ADT^A01 message
The HL7 tools MCP server (@easysolutions906/hl7-tools) parses any HL7 v2 message and labels every field with its CARISTIX name. Here is what parsing the message above produces:
{
  "messageType": "ADT^A01",
  "version": "2.5",
  "controlId": "MSG00001",
  "segments": [
    {
      "name": "MSH",
      "fields": {
        "MSH.9 - Message Type": "ADT^A01",
        "MSH.10 - Message Control ID": "MSG00001",
        "MSH.11 - Processing ID": "P",
        "MSH.12 - Version ID": "2.5"
      }
    },
    {
      "name": "PID",
      "fields": {
        "PID.3 - Patient Identifier List": "MRN12345^^^HOSPITAL^MR",
        "PID.5 - Patient Name": "DOE^JOHN^A",
        "PID.7 - Date/Time of Birth": "19800115",
        "PID.8 - Administrative Sex": "M",
        "PID.11 - Patient Address": "123 MAIN ST^^DALLAS^TX^75201"
      }
    },
    {
      "name": "PV1",
      "fields": {
        "PV1.2 - Patient Class": "I",
        "PV1.3 - Assigned Patient Location": "ICU^101^A",
        "PV1.4 - Admission Type": "E",
        "PV1.7 - Attending Doctor": "1234567890^SMITH^ROBERT^J^^^MD"
      }
    }
  ]
}
Every field now has a human-readable name. You do not need to count pipes or consult the specification. The parser handles all standard HL7 v2 message types including ADT (admissions/discharges/transfers), ORM (orders), ORU (results), SIU (scheduling), and MDM (documents).
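The labeling idea itself is simple to sketch. The Python below is not the hl7-tools implementation; it is a minimal illustration that assumes a small hand-written lookup table (`LABELS`, covering only the PID fields shown above) and the default | separator. A real parser would load a full dictionary for every segment and handle the MSH offset.

```python
# Hypothetical, deliberately tiny label table; names copied from the output above.
LABELS = {
    "PID.3": "Patient Identifier List",
    "PID.5": "Patient Name",
    "PID.7": "Date/Time of Birth",
    "PID.8": "Administrative Sex",
    "PID.11": "Patient Address",
}


def label_segment(segment: str) -> dict:
    """Return the populated, known fields of one non-MSH segment keyed by label."""
    fields = segment.split("|")
    segment_id = fields[0]
    labeled = {}
    for position, value in enumerate(fields[1:], start=1):
        key = f"{segment_id}.{position}"
        if value and key in LABELS:
            labeled[f"{key} - {LABELS[key]}"] = value
    return labeled


pid = "PID|1||MRN12345^^^HOSPITAL^MR||DOE^JOHN^A||19800115|M|||123 MAIN ST^^DALLAS^TX^75201"
labeled_pid = label_segment(pid)
```

Here `labeled_pid["PID.5 - Patient Name"]` is `"DOE^JOHN^A"`, matching the shape of the parsed output above.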
Converting to FHIR R4
FHIR R4 is the modern healthcare data standard. Converting HL7 v2 messages to FHIR bundles is one of the most common integration tasks in health IT. The conversion maps HL7 segments to FHIR resources:
- PID becomes a Patient resource
- PV1 becomes an Encounter resource
- DG1 becomes a Condition resource
- OBX becomes an Observation resource
- NK1 becomes a RelatedPerson resource
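The mapping in this list amounts to a lookup table keyed by segment ID. As a rough Python sketch (the names `SEGMENT_TO_RESOURCE` and `plan_resources` are illustrative, not taken from hl7-tools), a converter could walk the message and decide which resource each segment should become:

```python
SEGMENT_TO_RESOURCE = {
    "PID": "Patient",
    "PV1": "Encounter",
    "DG1": "Condition",
    "OBX": "Observation",
    "NK1": "RelatedPerson",
}


def plan_resources(message: str) -> list[str]:
    """List the FHIR resource type for each mapped segment, in message order."""
    planned = []
    for line in message.splitlines():
        segment_id = line.split("|", 1)[0]
        if segment_id in SEGMENT_TO_RESOURCE:
            planned.append(SEGMENT_TO_RESOURCE[segment_id])
    return planned


message = "\r".join([
    r"MSH|^~\&|EPIC|HOSPITAL|LAB|LAB|20260316120000||ADT^A01|MSG00001|P|2.5",
    "EVN|A01|20260316120000",
    "PID|1||MRN12345^^^HOSPITAL^MR||DOE^JOHN^A||19800115|M",
    "PV1|1|I|ICU^101^A|E",
    "NK1|1|DOE^JANE|SPO|555-123-4567",
    "DG1|1||E11.65^Type 2 diabetes mellitus with hyperglycemia^I10",
])
planned = plan_resources(message)
```

For the sample admission, `planned` is `["Patient", "Encounter", "RelatedPerson", "Condition"]`. Segments without an entry in the table, such as MSH and EVN here, are skipped in this sketch.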
Here is what the FHIR R4 output looks like for the ADT^A01 above:
{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {
      "resource": {
        "resourceType": "Patient",
        "identifier": [{ "value": "MRN12345", "system": "HOSPITAL" }],
        "name": [{ "family": "DOE", "given": ["JOHN", "A"] }],
        "gender": "male",
        "birthDate": "1980-01-15",
        "address": [{
          "line": ["123 MAIN ST"],
          "city": "DALLAS",
          "state": "TX",
          "postalCode": "75201"
        }]
      }
    },
    {
      "resource": {
        "resourceType": "Encounter",
        "class": { "code": "IMP", "display": "inpatient encounter" },
        "location": [{ "location": { "display": "ICU^101^A" } }]
      }
    },
    {
      "resource": {
        "resourceType": "Condition",
        "code": {
          "coding": [{ "system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.65" }],
          "text": "Type 2 diabetes mellitus with hyperglycemia"
        }
      }
    }
  ]
}
The conversion handles date formatting (HL7 uses YYYYMMDD, FHIR uses YYYY-MM-DD), gender code mapping (M to male), and patient class mapping (I to IMP).
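These three conversions are small enough to sketch directly. The Python below is an illustration under stated assumptions, not the converter's actual code. The M and I mappings come from the example above; the other table entries (F, O, U, and the AMB and EMER class codes) are commonly used mappings that this article does not confirm for hl7-tools. `pid_to_patient` is a hypothetical helper that assumes the default | and ^ delimiters and a PID segment populated at least through PID.8.

```python
GENDER_MAP = {"M": "male", "F": "female", "O": "other", "U": "unknown"}

# Only "I" -> "IMP" is shown in the article; "O" and "E" are assumed here.
PATIENT_CLASS_MAP = {"I": "IMP", "O": "AMB", "E": "EMER"}


def hl7_date_to_fhir(value: str) -> str:
    """Convert YYYY, YYYYMM, or YYYYMMDD (time part ignored) to a FHIR date."""
    digits = value[:8]
    if not digits.isdigit() or len(digits) not in (4, 6, 8):
        raise ValueError(f"unsupported HL7 date: {value!r}")
    parts = [digits[0:4], digits[4:6], digits[6:8]]
    return "-".join(part for part in parts if part)


def pid_to_patient(pid_segment: str) -> dict:
    """Build a partial FHIR Patient from PID.5, PID.7, and PID.8."""
    fields = pid_segment.split("|")
    family, given, middle = (fields[5].split("^") + ["", "", ""])[:3]
    return {
        "resourceType": "Patient",
        "name": [{"family": family, "given": [g for g in (given, middle) if g]}],
        "gender": GENDER_MAP.get(fields[8], "unknown"),
        "birthDate": hl7_date_to_fhir(fields[7]),
    }


pid = "PID|1||MRN12345^^^HOSPITAL^MR||DOE^JOHN^A||19800115|M|||123 MAIN ST^^DALLAS^TX^75201"
patient = pid_to_patient(pid)
```

For the sample PID segment this yields `"gender": "male"` and `"birthDate": "1980-01-15"`, in line with the Patient resource shown above. Unknown sex codes fall back to `"unknown"` in this sketch, which is a design choice rather than a documented behavior of the tool.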
Using the tools
There are three ways to use the HL7 parser and FHIR converter.
Web tool
For quick one-off parsing, the web tool at easysolutions906.github.io/hl7.html lets you paste a message and see parsed output and FHIR conversion instantly in your browser. No data leaves your machine -- parsing runs entirely client-side.
MCP server for AI assistants
Add the MCP server to Claude Desktop or Cursor:
{
  "mcpServers": {
    "hl7-tools": {
      "command": "npx",
      "args": ["-y", "@easysolutions906/hl7-tools"]
    }
  }
}
Then ask Claude: "Parse this HL7 message" and paste the raw text. Claude calls the hl7_parse tool and returns labeled fields. Ask "Convert it to FHIR" and Claude calls hl7_to_fhir and returns a FHIR R4 bundle.
Programmatic use
Run the npm package directly with npx, which downloads it on first use and starts the server:
npx @easysolutions906/hl7-tools
The server exposes tools via the Model Context Protocol, making it usable from any MCP-compatible client.
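For readers writing their own client, it may help to see what a tool invocation looks like on the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The Python sketch below only builds the request line; it does not start the server or perform the required `initialize` handshake, which an MCP client library would normally handle. The tool name `hl7_parse` comes from this article, but the argument key `message` is an assumption: check the input schema the server reports via `tools/list` before relying on it.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a newline-terminated JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request) + "\n"


raw_hl7 = "\r".join([
    r"MSH|^~\&|EPIC|HOSPITAL|LAB|LAB|20260316120000||ADT^A01|MSG00001|P|2.5",
    "PID|1||MRN12345^^^HOSPITAL^MR||DOE^JOHN^A||19800115|M",
])

# "message" is a hypothetical argument name, used here for illustration only.
request_line = build_tool_call(1, "hl7_parse", {"message": raw_hl7})
```

Because `json.dumps` escapes the carriage returns inside the HL7 text, the whole request stays on a single line, which suits line-delimited stdio transports.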
Why this matters
Healthcare integration teams spend significant time decoding HL7 messages during interface builds. Having a tool that instantly labels every field by its CARISTIX name and converts to FHIR R4 eliminates the manual lookup step. Whether you are building an EHR integration, debugging an interface engine, or converting a legacy system to FHIR, parsing is the first step -- and it should not require memorizing field positions.