Priya Nair

Predetermined change-control plans for AI/ML devices — practical steps that actually survive an audit

I’ve spent the last three years defending algorithm updates to notified bodies and answering product-team questions that start “but it’s just a model tweak”. AI/ML devices are different in one important way: the thing you put on the market can — by design — change behaviour over time. That breaks the old mental model of “frozen” medical software unless you have a predetermined change control plan and a validation posture that regulators can actually accept.

Here is what I do, what auditors ask for, and a checklist you can start using today.

Why a predetermined plan matters (and what it buys you)

To the regulator, change control is governance, not paperwork. ISO 13485 clause 7.3.9 and FDA 21 CFR 820.30(i) are explicit about documenting design and development changes and their verification/validation. For adaptive AI/ML you need to show you foresaw change, assessed risk, and designed controls up front — not retrofitted them after the first model drift.

Practically this means:

  • You decide in advance which changes are acceptable without a full new submission, and which ones trigger re-validation, clinical evaluation updates, or notified body involvement.
  • You define how you will measure ongoing performance and when you will intervene.
  • You set boundaries around data provenance, labelling, and retraining.

To be fair, the FDA has been explicit about this with its SaMD Pre-Specifications and Algorithm Change Protocol concepts; the EU MDR doesn’t use that language, but Annex I and Article 10 require the same outcome: documented, risk-based control of changes and ongoing performance monitoring. So we build to both sets of expectations.

What notified bodies will actually ask for

From experience across three audits, the common threads are:

  • A clear artefact called a “Change Control Plan” or “Predetermined Change Protocol” in the Technical Documentation.
  • Risk assessments tied to specific change types (data drift, retraining dataset changes, architecture tuning).
  • Objective performance metrics and acceptance criteria defined up front.
  • Traceability from change request → risk assessment → verification/validation evidence → clinical impact assessment.
  • Post-market monitoring plans showing how you will detect performance degradation in the field.

Admittedly, different notified bodies phrase things differently, but they’re all looking for the same evidence trail: did you think it through before the change, and can you prove it afterwards?

Practical checklist for a predetermined change-control plan

Treat this as the working document you will show auditors and engineers. Minimum contents:

  • Scope: which models/features are covered (product version, clinical intended use).
  • Change taxonomy: defined classes such as
    • Minor parameter tuning (no architecture change, same dataset)
    • Retraining with additional labelled data
    • Architecture change (new network, new pre-processing)
    • Data pipeline changes (different sensors, upstream labelling process)
  • For each class, define:
    • Risk impact (safety, intended use, data privacy)
    • Required documentation (code, training logs, datasets, model cards)
    • Verification and validation activities (unit tests, hold-out validation, prospective performance test)
    • Release criteria (metric thresholds, clinical performance limits)
    • Post-release monitoring (which metrics, frequency, alert thresholds)
  • Roles and responsibilities (who approves, who runs validation, who signs clinical updates).
  • Traceability requirements and artefacts (link change tickets to risk assessment and evidence).

If you use an eQMS, map each change class to a controlled workflow so the traceability is automatic. I’m a fan of systems that force the change owner to attach model training logs and hold-out set results before a change can move to approval — simple, but you’d be surprised how often it’s missing.
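
To make that concrete, here is a minimal sketch of the kind of pre-approval gate I mean, in Python. The change classes, artifact names, and the ChangeTicket structure are illustrative, not taken from any particular eQMS — adapt them to whatever fields your system actually exposes.

```python
from dataclasses import dataclass, field

# Illustrative mapping from change class to the evidence that must be
# attached before a ticket can move to "ready for approval".
REQUIRED_ARTIFACTS = {
    "parameter_tuning":     {"training_log", "holdout_results"},
    "retraining_new_data":  {"training_log", "holdout_results",
                             "dataset_description", "risk_assessment_update"},
    "architecture_change":  {"training_log", "holdout_results",
                             "dataset_description", "risk_assessment_update",
                             "clinical_impact_assessment"},
    "data_pipeline_change": {"pipeline_validation_report",
                             "risk_assessment_update"},
}

@dataclass
class ChangeTicket:
    ticket_id: str
    change_class: str
    attached_artifacts: set = field(default_factory=set)

def ready_for_approval(ticket: ChangeTicket) -> tuple[bool, set]:
    """Return (ok, missing): ok is True only if every required artifact
    for this change class is attached to the ticket."""
    required = REQUIRED_ARTIFACTS.get(ticket.change_class)
    if required is None:
        # Unknown change class: block and force a human decision.
        return False, {"unclassified change - route to QA/RA"}
    missing = required - ticket.attached_artifacts
    return not missing, missing

# Example: a retraining change missing its risk assessment update is blocked.
ticket = ChangeTicket("CHG-142", "retraining_new_data",
                      {"training_log", "holdout_results", "dataset_description"})
ok, missing = ready_for_approval(ticket)
print(ok, missing)  # False {'risk_assessment_update'}
```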

Validation strategy for algorithmic changes

Validation needs to be both technical and clinical:

  • Technical validation:
    • Reproducibility: can you rerun training and get comparable results? Capture seed, environment, and container images (see the provenance sketch after this list).
    • Performance on hold-out datasets that mirror clinical populations.
    • Robustness tests (noise, adversarial, worst-case inputs relevant to the device).
  • Clinical validation:
    • Demonstrate that clinical performance remains non-inferior to baseline where applicable.
    • If changes are substantial, consider a targeted clinical study or an enhanced PMCF cohort.
  • Verification evidence:
    • Unit/integration tests for preprocessing and inference pipelines (IEC 62304 expectations).
    • Regression test suite that runs automatically and records results.
  • Risk control evidence:
    • Updated ISO 14971 risk assessments linked to the change ticket.
    • New hazard controls (warnings, limiters, fallback behaviour) implemented and verified.
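
For the reproducibility point above, the simplest thing that survives an audit is a provenance record written at training time. A minimal sketch, assuming you train in Python and can hash your training data and pin your environment — the field names and the provenance.json filename are mine, not a standard.

```python
import hashlib
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone

def file_sha256(path: str) -> str:
    """Hash a dataset file so the exact training data can be re-identified later."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def capture_provenance(seed: int, dataset_path: str, container_image: str) -> dict:
    """Write a provenance record next to the trained model. The training
    entry point is responsible for actually applying `seed` (and the
    numpy/torch equivalents); this function only records it."""
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "random_seed": seed,
        "python_version": sys.version,
        "platform": platform.platform(),
        "installed_packages": subprocess.run(
            [sys.executable, "-m", "pip", "freeze"],
            capture_output=True, text=True).stdout.splitlines(),
        "dataset_sha256": file_sha256(dataset_path),
        "container_image": container_image,  # e.g. a pinned image digest
    }
    with open("provenance.json", "w") as f:
        json.dump(record, f, indent=2)
    return record
```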

Validation is not a one-time activity. Define cadence for automated checks in production and triggers for manual review.
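
As a concrete example of "automated checks in production with triggers for manual review", the sketch below compares rolling production metrics against the alert thresholds defined in the change-control plan. The metric names, threshold values, and alerting hook are placeholders, not recommendations for any specific device.

```python
# Placeholder thresholds - in practice these come from the predetermined
# change-control plan, not from the monitoring code itself.
ALERT_THRESHOLDS = {
    "sensitivity": 0.92,            # alert if the rolling value drops below this
    "specificity": 0.88,
    "mean_inference_time_s": 2.0,   # alert if the rolling value rises above this
}

def check_production_metrics(rolling_metrics: dict) -> list[str]:
    """Return the list of breached metrics; an empty list means no alert."""
    breaches = []
    for metric, threshold in ALERT_THRESHOLDS.items():
        value = rolling_metrics.get(metric)
        if value is None:
            breaches.append(f"{metric}: not reported")  # missing data is itself a finding
        elif metric == "mean_inference_time_s" and value > threshold:
            breaches.append(f"{metric}: {value} > {threshold}")
        elif metric != "mean_inference_time_s" and value < threshold:
            breaches.append(f"{metric}: {value} < {threshold}")
    return breaches

# Run this on a fixed cadence (e.g. a weekly scheduled job); any breach
# opens a manual review ticket rather than silently triggering a retrain.
breaches = check_production_metrics({"sensitivity": 0.90, "specificity": 0.91,
                                     "mean_inference_time_s": 1.4})
if breaches:
    print("Manual review required:", breaches)
```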

What I push back on during audits

Two things come up repeatedly:

  • “We’ll validate in production.” Granted, continuous monitoring is essential, but you still need predefined acceptance criteria before release. “Validate as you go” without pre-specified gates is not persuasive.
  • Black-box assertions. If an update improves an aggregate metric but increases error in a clinical subgroup, that’s a regression. I require subgroup analysis and documented mitigation.
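
Here is a minimal sketch of the subgroup check I ask for. The subgroup names, the metric, and the non-inferiority margin are illustrative; the point is that the comparison runs per subgroup, not only on the aggregate.

```python
# Per-subgroup sensitivity for the baseline (released) model and the candidate.
# Values are made up for illustration.
baseline  = {"overall": 0.94, "age_over_75": 0.91, "implanted_device": 0.89}
candidate = {"overall": 0.95, "age_over_75": 0.86, "implanted_device": 0.90}

NON_INFERIORITY_MARGIN = 0.02  # maximum tolerated drop per subgroup

def subgroup_regressions(baseline: dict, candidate: dict, margin: float) -> dict:
    """Flag any subgroup where the candidate drops more than the margin,
    even if the aggregate metric improved."""
    return {
        group: (baseline[group], candidate[group])
        for group in baseline
        if candidate.get(group, 0.0) < baseline[group] - margin
    }

regressions = subgroup_regressions(baseline, candidate, NON_INFERIORITY_MARGIN)
print(regressions)  # {'age_over_75': (0.91, 0.86)} -> this update is a regression
```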

I also insist on demonstrable traceability — not a Word doc summary, but links from the change ticket to model artefacts, test results, risk assessments, and clinical-evaluation outcomes. That’s where automated CAPAs and traceability pay off; you can generate an audit trail without chasing emails.

A short template to start with (one page)

  • Product:
  • Covered model/version:
  • Change classes (list):
  • Release gates (metric thresholds and tests):
  • Post-release metrics (list + frequency):
  • Approvals required (roles):

Put it in your Technical File under software lifecycle and link it from your change-control SOP per ISO 13485 7.3.9.

Final thought

AI-assisted tools can help automate parts of this — automated CAPAs that flag regressions, scripts to capture training provenance, connected workflow for change approval — but they must feed reviewable, auditable records. The question I still bring to internal teams and auditors is: did we proactively set the boundary of acceptable change, and can we prove we met it before and after every update?

How have you handled a notified-body challenge to an algorithm update — did your predetermined plan survive, or did you rebuild it under audit pressure?
