VBC Risk Analytics

Risk Stratification as a Data Pipeline: Turning RAF Into a Worklist

Risk stratification sounds like a clinical strategy, but for the engineers who build it, it's a ranking pipeline. You take a population, score each member's expected need, sort, and hand a care team a prioritized list. This post is about how that pipeline is actually wired in a Medicare Advantage context.


The goal, stated as a function

stratify(population) -> ranked_list_of_members_by_expected_need


The hard parts are choosing the score, making it explainable, and refreshing it on a cadence that's useful to humans.


The inputs

A useful stratification model blends several signals. In Medicare Advantage, the natural backbone is the same data that drives funding accuracy:

- RAF (Risk Adjustment Factor): the normalized expected-cost score built from demographics and HCCs (CMS-HCC V28).
- HCC composition: which conditions, not just the total. Two members with the same RAF can need very different interventions.
- Utilization signals: recent admissions, ED visits, polypharmacy.
- Gaps: suspected-but-undocumented conditions. A stratification model that ignores gaps under-ranks the sickest, least-documented members.

This is exactly the approach walked through in this Medicare Advantage risk adjustment case study, where risk data became a concrete outreach list.


A minimal scoring sketch

# W_RAF, W_ADMIT, W_GAP and W_RX are the policy weights, loaded from config.
def member_score(m):
    """Synthetic, illustrative weighting only."""
    score = m.raf * W_RAF
    score += m.recent_admissions * W_ADMIT
    score += len(m.open_gaps) * W_GAP
    score += m.polypharmacy_flag * W_RX
    return score

ranked = sorted(population, key=member_score, reverse=True)

The weights are policy decisions, not engineering ones — surface them in config, don't bury them in code. Your clinical leadership will want to tune them, and your auditors will want to see them.
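
One way to keep the weights in config rather than in code is sketched below. The JSON layout, the key names, the `Weights` class, the `load_weights` helper, and the non-negativity check are all assumptions made for illustration, not something the pipeline above prescribes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Weights:
    """Stratification weights; field names are illustrative assumptions."""
    raf: float
    admit: float
    gap: float
    rx: float


def load_weights(config_text: str) -> Weights:
    """Parse weights from a JSON config string and reject negative values."""
    raw = json.loads(config_text)
    # Missing or unexpected keys raise TypeError here, which fails fast.
    weights = Weights(**{k: float(v) for k, v in raw["weights"].items()})
    for name, value in asdict(weights).items():
        if value < 0:
            raise ValueError(f"weight {name!r} must be non-negative, got {value}")
    return weights


# Synthetic example config; a real one would live in a reviewed, versioned file.
EXAMPLE_CONFIG = """
{
  "model_version": "synthetic-0.1",
  "weights": {"raf": 1.0, "admit": 0.5, "gap": 0.25, "rx": 0.2}
}
"""

weights = load_weights(EXAMPLE_CONFIG)
```

Because the config is plain data, clinical leadership can change a weight through a reviewed change to that file, and the same file doubles as the record auditors ask for.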


Explainability is not optional

A stratification score that a care manager can't interrogate is a score they won't trust. For every ranked member, emit the contribution breakdown:


{
  "member_id": "SYNTH-10293",
  "score": 3.8,
  "drivers": {
    "raf": 2.1,
    "recent_admissions": 1.0,
    "open_gaps": 0.5,
    "polypharmacy": 0.2
  }
}


Now "why is this member #1?" has a one-glance answer.
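
A breakdown like the one above can fall out of the scoring step directly if the score is a plain weighted sum. The sketch below assumes exactly that; the `Member` class, the `explain` helper, and the weight values are hypothetical and were picked so the synthetic member reproduces the example record.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Member:
    """Synthetic member record; field names mirror the scoring sketch."""
    member_id: str
    raf: float
    recent_admissions: int
    open_gaps: list = field(default_factory=list)
    polypharmacy_flag: bool = False


# Illustrative weights only (an assumption, not recommended values).
WEIGHTS = {"raf": 1.0, "recent_admissions": 0.5, "open_gaps": 0.25, "polypharmacy": 0.2}


def explain(m: Member, w: dict) -> dict:
    """Return the score together with each signal's additive contribution."""
    drivers = {
        "raf": m.raf * w["raf"],
        "recent_admissions": m.recent_admissions * w["recent_admissions"],
        "open_gaps": len(m.open_gaps) * w["open_gaps"],
        "polypharmacy": int(m.polypharmacy_flag) * w["polypharmacy"],
    }
    drivers = {name: round(value, 2) for name, value in drivers.items()}
    # The score is the sum of the rounded drivers, so the parts always add up.
    return {
        "member_id": m.member_id,
        "score": round(sum(drivers.values()), 2),
        "drivers": drivers,
    }


member = Member("SYNTH-10293", raf=2.1, recent_admissions=2,
                open_gaps=["gap-a", "gap-b"], polypharmacy_flag=True)
print(json.dumps(explain(member, WEIGHTS), indent=2))
```

Computing the score from the drivers, rather than computing both separately, means the breakdown cannot drift out of sync with the number the care manager sees.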


Refresh cadence and stability

- Recompute on a schedule (often monthly) and snapshot each run.
- Watch for churn. If members thrash in and out of the top tier from one run to the next, your weights are too sensitive. Care teams need stability to actually act.
- Track outcomes back to the score. The point isn't the ranking; it's whether outreach to high-ranked members changed anything.
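
Churn can be measured by comparing the top tier across two snapshots. This is a minimal sketch, assuming each snapshot is a mapping of member ID to score and that "top tier" means a fixed number of members. The `top_tier` and `tier_churn` helpers and the tier size are hypothetical, and the definition of churn used here (share of the previous tier that dropped out) is one choice among several.

```python
def top_tier(scores: dict, tier_size: int) -> set:
    """Member IDs of the tier_size highest scores (ties broken by member ID)."""
    ranked = sorted(scores, key=lambda mid: (-scores[mid], mid))
    return set(ranked[:tier_size])


def tier_churn(previous: dict, current: dict, tier_size: int) -> float:
    """Fraction of the previous top tier that dropped out in the current run."""
    before = top_tier(previous, tier_size)
    after = top_tier(current, tier_size)
    if not before:
        return 0.0
    return len(before - after) / len(before)


# Two synthetic snapshots: member_id -> score.
march = {"SYNTH-1": 3.8, "SYNTH-2": 3.1, "SYNTH-3": 2.9, "SYNTH-4": 1.2}
april = {"SYNTH-1": 3.7, "SYNTH-2": 2.0, "SYNTH-3": 3.0, "SYNTH-4": 2.5}

churn = tier_churn(march, april, tier_size=2)
print(f"top-tier churn: {churn:.0%}")
```

Tracking this number per run gives a concrete signal to review alongside any weight change, since what counts as too much churn is a judgment for the care team rather than a constant in code.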

Engineering guardrails

- Synthetic data only in dev and test. Generate illustrative populations.
- Version the model. A stratification run should be reproducible later.
- Separate scoring from action. The pipeline ranks; humans decide. Don't auto-trigger interventions off a raw score.
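
The first two guardrails can be sketched together: a seeded generator for synthetic members, and a small manifest that records what a run used. Everything here is an assumption for illustration, including the `synthetic_population` and `run_manifest` helpers, the field names, and the value ranges, which are arbitrary and not calibrated to any real population.

```python
import hashlib
import json
import random


def synthetic_population(n: int, seed: int) -> list:
    """Generate n fake members from a seeded RNG so runs are repeatable."""
    rng = random.Random(seed)
    return [
        {
            "member_id": f"SYNTH-{i:05d}",
            "raf": round(rng.uniform(0.3, 4.0), 2),   # arbitrary range
            "recent_admissions": rng.randint(0, 3),
            "open_gaps": rng.randint(0, 4),
            "polypharmacy_flag": rng.random() < 0.3,
        }
        for i in range(n)
    ]


def run_manifest(model_version: str, weights: dict, population: list) -> dict:
    """Record what is needed to reproduce a run: version, weights, input hash."""
    payload = json.dumps(population, sort_keys=True).encode("utf-8")
    return {
        "model_version": model_version,
        "weights": weights,
        "input_sha256": hashlib.sha256(payload).hexdigest(),
    }


population = synthetic_population(n=5, seed=42)
manifest = run_manifest("synthetic-0.1", {"raf": 1.0, "admit": 0.5}, population)
print(json.dumps(manifest, indent=2))
```

Storing the manifest next to each snapshot lets a later reader confirm which model version and weights produced a given ranking, and whether the input they are holding is the same input.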

The payoff

Done well, stratification connects the backward-looking RAF to forward-looking care: instead of reconciling last year's risk, you're deciding who to reach this month. For the full data-side walkthrough that complements this pipeline view, see the Medicare Advantage risk stratification guide.


VBC Risk Analytics. Educational only — not coding, billing, or clinical advice; verify against the current CMS Rate Announcement. Synthetic data only.
