Healthcare data engineering is hard. Healthcare risk adjustment data engineering is a category of hard that deserves its own word.
If you've ever worked in health IT, you know the stack: EHR extracts that arrive in twelve different formats, ICD-10 codes that map to HCC categories under rules that change every model year, RAF scores that determine how much a health plan gets paid for every enrolled member, and CMS audits that can claw back millions if the documentation doesn't hold up.
At VBC Risk Analytics, we've spent years building API-first tools in this space. This introductory post covers what risk adjustment actually is, why it's technically interesting, and what we're building — with the hope of connecting with developers, data engineers, and health IT professionals who work in this domain.
What Is Risk Adjustment and Why Does It Matter?
Risk adjustment is the process Medicare uses to ensure that health plans are paid fairly based on the health status of their enrolled members. Sicker members cost more to care for, so a plan that enrolls a predominantly sick population gets higher payments to offset those costs. A plan that enrolls predominantly healthy members gets lower payments.
The mechanism that drives this is the Hierarchical Condition Category (HCC) model — specifically, the CMS-HCC model maintained by the Centers for Medicare & Medicaid Services (CMS). Every Medicare Advantage member gets a Risk Adjustment Factor (RAF) score that reflects their predicted cost relative to the average Medicare beneficiary.
A RAF score of 1.0 means the member is expected to cost exactly as much as the average. A score of 1.5 means 50% more than average. A score of 0.7 means 30% less.
The score is built by:
- Taking the member's demographic data (age, sex, Medicaid eligibility, etc.)
- Mapping their ICD-10 diagnosis codes to HCC categories
- Applying interaction factors for certain combinations of conditions
- Summing the coefficients from the CMS-HCC model
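The four steps above can be sketched in a few lines of Python. Every coefficient and code mapping below is a made-up placeholder, not a real CMS-HCC figure, and hierarchy suppression is omitted to keep the sketch short:

```python
# Minimal sketch of the RAF calculation steps. All values are
# illustrative placeholders, NOT real CMS-HCC coefficients.

DEMOGRAPHIC_COEFFS = {("F", "70-74", False): 0.395}  # (sex, age band, Medicaid)
ICD10_TO_HCC = {"E1122": {"HCC18"}, "E119": {"HCC19"}, "I5022": {"HCC85"}}
HCC_COEFFS = {"HCC18": 0.302, "HCC19": 0.105, "HCC85": 0.331}
INTERACTION_COEFFS = {frozenset({"HCC18", "HCC85"}): 0.121}  # diabetes x CHF

def raf_score(sex, age_band, medicaid, icd10_codes):
    # Step 1: demographic base rate
    score = DEMOGRAPHIC_COEFFS[(sex, age_band, medicaid)]
    # Step 2: map diagnoses to HCCs (unmapped codes contribute nothing)
    hccs = set()
    for code in icd10_codes:
        hccs |= ICD10_TO_HCC.get(code, set())
    # Step 3: interaction terms for qualifying condition pairs
    for pair, coeff in INTERACTION_COEFFS.items():
        if pair <= hccs:
            score += coeff
    # Step 4: sum the HCC coefficients
    score += sum(HCC_COEFFS[h] for h in hccs)
    return round(score, 3)
```

A production implementation also has to apply hierarchy suppression before step 4 and pick the right coefficient table for the model version — both of which come up below.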
For a health plan with 100,000 Medicare Advantage members, even small errors in this calculation — missed diagnoses, mapping mistakes, documentation gaps — compound into significant over- or underpayment.
The Technical Complexity
This sounds straightforward until you get into the actual implementation.
The ICD-10-to-HCC mapping is not a simple lookup table. There are roughly 74,000 ICD-10-CM codes and 115 payment HCCs in the CMS-HCC V28 model (up from 86 in V24). Not all codes map to HCCs, and some codes map to more than one. The "hierarchical" part of HCC means that more severe conditions in a disease hierarchy suppress less severe ones: if a patient has both HCC 18 (diabetes with chronic complications) and HCC 19 (diabetes without complication) under V24, only HCC 18 counts.
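Suppression itself is mechanically simple once you have the hierarchy table. A minimal sketch, using an illustrative (V24-numbered) fragment of the diabetes hierarchy — the real table ships with each model version:

```python
# Maps an HCC to the less severe HCCs it suppresses.
# Illustrative fragment only, not the full CMS hierarchy table.
HIERARCHY = {
    "HCC17": {"HCC18", "HCC19"},  # acute complications trump both others
    "HCC18": {"HCC19"},           # chronic complications trump uncomplicated
}

def apply_hierarchies(hccs: set) -> set:
    suppressed = set()
    for hcc in hccs:
        suppressed |= HIERARCHY.get(hcc, set())
    return hccs - suppressed
```

Note that suppression must run over the full set of a member's HCCs before any coefficients are summed; suppressing as you map each code gives order-dependent results.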
The model coefficients change with each model version. CMS transitioned from V24 to V28 over 2024-2026, blending the two models at different percentages each year. Code that was correct for V24 produces wrong answers for V28 if you don't update the coefficient tables.
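The blend itself is the easy part; the hard part is maintaining two full coefficient tables and mapping sets. Assuming CMS's published transition weights (67% V24 / 33% V28 for payment year 2024, 33% / 67% for 2025, full V28 in 2026), the blended score looks like:

```python
# Payment-year blend weights for the V24 -> V28 transition,
# per CMS's phase-in schedule: (v24_weight, v28_weight).
BLEND = {2024: (0.67, 0.33), 2025: (0.33, 0.67), 2026: (0.0, 1.0)}

def blended_raf(payment_year: int, raf_v24: float, raf_v28: float) -> float:
    w24, w28 = BLEND[payment_year]
    return w24 * raf_v24 + w28 * raf_v28
```

Because the two models map diagnoses differently, you must score the member under each model separately and blend the resulting scores; blending coefficient tables does not work.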
The demographic adjusters depend on whether a member is community-dwelling or institutionalized, whether they have Medicaid, and whether they're in their initial enrollment period. Getting these wrong affects every member, not just the complex ones.
RADV audits add another layer. In a Risk Adjustment Data Validation audit, CMS selects a sample of medical records and verifies that each HCC feeding the RAF score is supported by adequate documentation. Plans that fail repay the overpayment, plus potential extrapolation penalties.
What VBC Risk Analytics Builds
Our platform at vbcriskanalytics.com addresses the risk adjustment workflow end-to-end. A few things we've built:
RAF Score API: A REST endpoint that takes a member's demographics and ICD-10 codes and returns a fully calculated RAF score with HCC mapping details, model version, and coefficient breakdown. Handles V24, V28, and the blended transition percentages.
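To make the shape of that concrete, here is a hypothetical request for such an endpoint. The host, path, and field names below are invented for illustration and are not the documented API contract:

```python
# Hypothetical RAF scoring request. URL and field names are
# placeholders, not the real API contract.
import json
import urllib.request

payload = {
    "model_version": "v28",
    "payment_year": 2025,
    "demographics": {"age": 72, "sex": "F", "medicaid": False},
    "diagnoses": ["E1122", "I5022"],
}
req = urllib.request.Request(
    "https://api.example.com/raf-score",  # placeholder host
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer <token>"},
)
# The response would carry the calculated score plus the HCC mapping
# and coefficient breakdown described above.
```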
ICD-10 Data Lookup API: Fast lookup for ICD-10-CM codes — descriptions, HCC mappings, validity flags, hierarchy relationships. Useful for coding workflow tools, eligibility systems, and CDI applications.
RADV Audit Scrubber: Before a RADV audit happens, this tool reviews the documentation supporting each HCC against CMS audit criteria. It flags potential documentation deficiencies so they can be addressed before CMS asks for the medical records.
NPI Lookup API: Provider verification via the National Plan and Provider Enumeration System (NPPES), useful for linking clinical data to provider records.
Why API-First?
Most risk adjustment software is built as monolithic platforms — you buy the whole system or you buy nothing. We took an API-first approach because:
- Most health plans and provider groups already have analytics infrastructure. They need specific capabilities, not replacement systems.
- Health IT developers building EHR integrations, population health tools, and care management platforms need programmatic access to risk adjustment data.
- APIs are testable, versionable, and composable in ways that dashboard-only tools aren't.
If you're building something in the health IT space and you need RAF scoring, ICD-10 lookup, or provider verification, check out the healthcare APIs at VBC Risk Analytics. We have documentation and sandbox access available.
What's Next
In future posts on Dev.to, I'll go deeper into specific technical topics:
- The ICD-10-to-HCC mapping algorithm — how it actually works under the hood
- CMS-HCC V28 changes and what broke in existing implementations
- RADV audit data modeling — structuring your documentation review pipeline
- Building a risk stratification system from claims data
- NPI verification edge cases and the mess that is provider data quality
If you work in health IT or healthcare data engineering, I'd love to connect. Drop a comment below or reach out through vbcriskanalytics.com.