Many medical AI systems today are still designed as prediction systems.
A typical pipeline looks like this:
def predict_risk():
    data = collect_patient_data()
    features = extract_features(data)
    risk = model.predict(features)
    return risk
This can be useful.
A prediction model can answer questions such as:
- What is the estimated risk of a disease?
- Does an image contain an abnormal finding?
- Which risk group does this person belong to?
- What is the probability of a future clinical event?
But a world model asks a different question.
A prediction model asks:
future = predict(state)
A world model asks:
next_state = simulate_transition(state, action)
In other words:
Prediction asks: what may happen next?
A world model asks: what may happen if we take a specific action?
That difference matters a lot in medicine.
Medicine is not only about recognizing risk. It is also about deciding what to do under uncertainty.
The Minimal Structure of a World Model
A simplified world model can be described with five objects:
- State: What is the system like now?
- Action: What can be done?
- Transition: How may the system change after the action?
- Objective: What direction are we trying to move toward?
- Feedback: What did we observe after the action?
A very small abstraction could look like this:
class WorldModel:
    def observe_state(self, system):
        pass

    def define_action(self, action_input):
        pass

    def simulate_transition(self, state, action):
        pass

    def collect_feedback(self, system, action):
        pass

    def update(self, state, action, feedback):
        pass
This is already different from a simple predictive model.
A predictive model can work without an explicit action object.
A world model cannot.
If there is no action, there is no action-conditioned transition.
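To make the contrast concrete, here is a minimal toy sketch. The names `ToyState`, `ToyAction`, and the scoring rule inside `predict` are hypothetical and exist only for illustration; the point is that `predict` takes a state alone, while `simulate_transition` cannot be called without an action.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyState:
    weekly_exercise_minutes: int


@dataclass(frozen=True)
class ToyAction:
    target: str   # name of the state variable the action changes
    delta: int    # assumed additive change, for illustration only


def predict(state: ToyState) -> float:
    """Prediction: maps a state to a score. No action is involved."""
    # Made-up scoring rule, not a clinical formula.
    return max(0.0, 1.0 - state.weekly_exercise_minutes / 300)


def simulate_transition(state: ToyState, action: ToyAction) -> ToyState:
    """World-model step: the next state depends on the state and the action."""
    current = getattr(state, action.target)
    return replace(state, **{action.target: current + action.delta})


state = ToyState(weekly_exercise_minutes=90)
action = ToyAction(target="weekly_exercise_minutes", delta=60)
next_state = simulate_transition(state, action)
```

The original state is left untouched, so the same state can be paired with different candidate actions and the resulting hypothetical states compared.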
Why Medical AI Needs More Than Risk Prediction
Many medical AI systems are good at recognition and prediction:
- image classification;
- risk scoring;
- disease detection;
- prognosis estimation;
- anomaly detection;
- population stratification.
But many real medical and health-management problems are not just classification problems.
They are action problems.
For example:
- Should an intervention be considered?
- Which variable should be monitored first?
- What evidence supports the expected change?
- What feedback should be collected?
- If the result is different from expected, what should be updated?
These are not just prediction questions. They require a system to reason about state, action, transition, evidence, and feedback.
That is where the idea of a medical world model becomes useful.
A Prediction Model vs. a Medical World Model
A prediction model may look like this:
class PredictionModel:
    def predict_risk(self, patient_data):
        features = self.extract_features(patient_data)
        risk_score = self.model.predict(features)
        return risk_score
It maps data to a risk score.
A medical world model needs a different structure:
class MedicalWorldModel:
    def simulate_transition(self, state, action, evidence):
        if not self.safety_gate(state, action):
            return {
                "status": "blocked",
                "reason": "Safety gate failed. Human review required."
            }
        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )
        return {
            "status": "hypothesis_generated",
            "state": state,
            "action": action,
            "expected_transition": transition,
            "evidence": evidence,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }
The output is not a treatment recommendation.
It is a transition hypothesis.
That distinction is critical.
State: Do Not Reduce a Person to a Risk Score
A risk score may be useful:
{
  "cardiovascular_risk": 0.23
}
But it is not enough for a world model.
A world model needs a richer representation of state.
Example schema:
{
  "subject_id": "anonymous_001",
  "timestamp": "2026-05-20",
  "metabolic_state": {
    "fasting_glucose": 5.6,
    "hba1c": 5.7,
    "fasting_insulin": 12.4,
    "triglycerides": 1.8
  },
  "inflammation_state": {
    "hs_crp": 2.1
  },
  "lifestyle_state": {
    "sleep_duration": 6.2,
    "weekly_exercise_minutes": 90,
    "diet_pattern": "high_refined_carbohydrate"
  },
  "risk_context": {
    "family_history": ["type_2_diabetes"],
    "medications": [],
    "known_conditions": []
  }
}
This is only an illustrative schema, not a clinical standard.
In a real system, every state variable should have:
- source;
- unit;
- timestamp;
- measurement context;
- missing-value handling;
- data-quality metadata;
- uncertainty annotation.
A simple Python representation might look like this:
from dataclasses import dataclass
from typing import Dict, Any, List


@dataclass
class HealthState:
    subject_id: str
    timestamp: str
    biomarkers: Dict[str, Any]
    lifestyle: Dict[str, Any]
    symptoms: Dict[str, Any]
    medications: List[str]
    context: Dict[str, Any]
    data_quality: Dict[str, Any]
For a medical world model, state representation is not a preprocessing detail.
It is the foundation.
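One way to carry the per-variable metadata listed above (source, unit, timestamp, measurement context, missing-value handling, data quality, uncertainty) is to wrap each value in its own record. The sketch below is an assumption, not part of any standard: the `Measurement` class, its field names, the `flagged` quality key, and the units shown (`mmol/L`, `mg/L`) are all illustrative choices.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Measurement:
    name: str
    value: Optional[float]            # None marks a missing value explicitly
    unit: str
    source: str
    timestamp: str                    # ISO 8601 date or datetime string
    context: str = "unspecified"      # e.g. "fasting"
    quality: Dict[str, Any] = field(default_factory=dict)
    uncertainty: Optional[float] = None   # +/- range in the same unit, if known

    def is_usable(self) -> bool:
        """Usable only if a value exists and it has not been flagged."""
        return self.value is not None and not self.quality.get("flagged", False)


# Units here are assumptions; the schema above does not specify them.
glucose = Measurement(
    name="fasting_glucose", value=5.6, unit="mmol/L",
    source="lab_report", timestamp="2026-05-20",
    context="fasting", uncertainty=0.2,
)
missing_crp = Measurement(
    name="hs_crp", value=None, unit="mg/L",
    source="not_collected", timestamp="2026-05-20",
)
```

Keeping the missing value as an explicit `None` with its own record avoids silently imputing a number that later steps would treat as measured.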
Action: Interventions Must Become Computable Objects
In a chatbot, an intervention might appear as natural language:
Improve sleep, exercise more, and eat better.
That is not enough for a world model.
A world model needs computable action objects.
Example:
{
  "action_id": "increase_zone2_exercise",
  "type": "lifestyle",
  "target": "weekly_exercise_minutes",
  "change": {
    "from": 90,
    "to": 150
  },
  "duration": "12_weeks",
  "monitoring": [
    "resting_heart_rate",
    "sleep_quality",
    "fasting_glucose"
  ],
  "safety_notes": [
    "requires clinician review if known cardiovascular disease exists"
  ]
}
Python representation:
@dataclass
class MedicalAction:
    action_id: str
    action_type: str
    target: str
    parameters: Dict[str, Any]
    duration: str
    monitoring: List[str]
    safety_notes: List[str]
In medicine, an action could be:
- a medication change;
- a lifestyle intervention;
- a monitoring plan;
- a nutrition strategy;
- a sleep intervention;
- an exercise protocol;
- a follow-up test;
- a referral;
- a decision to wait and observe.
The key is that actions should be explicit, parameterized, time-bounded, and auditable.
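A small validator can make "explicit, parameterized, time-bounded" checkable before an action ever reaches the transition model. This is a sketch under assumptions: `validate_action` and `REQUIRED_FIELDS` are hypothetical names, and the field list simply mirrors the example action object above rather than any clinical standard.

```python
from typing import Any, Dict, List

# Field names taken from the example action object; treating them all as
# required is an assumption made for this sketch.
REQUIRED_FIELDS = (
    "action_id", "type", "target", "change",
    "duration", "monitoring", "safety_notes",
)


def validate_action(action: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in action
    ]
    change = action.get("change", {})
    if not isinstance(change, dict) or "from" not in change or "to" not in change:
        problems.append("change must state both 'from' and 'to'")
    if not action.get("duration"):
        problems.append("action is not time-bounded")
    if not action.get("monitoring"):
        problems.append("no monitoring variables listed")
    return problems


explicit_action = {
    "action_id": "increase_zone2_exercise",
    "type": "lifestyle",
    "target": "weekly_exercise_minutes",
    "change": {"from": 90, "to": 150},
    "duration": "12_weeks",
    "monitoring": ["resting_heart_rate", "sleep_quality", "fasting_glucose"],
    "safety_notes": ["requires clinician review if known cardiovascular disease exists"],
}
vague_action = {"action_id": "exercise_more"}

explicit_problems = validate_action(explicit_action)
vague_problems = validate_action(vague_action)
```

Returning the full list of problems, rather than a single boolean, keeps the rejection reason available for review.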
Evidence: Transitions Should Be Evidence-Bound
A medical world model should not freely hallucinate transitions.
Every transition hypothesis should be bound to evidence.
A minimal evidence object:
{
  "evidence_id": "evidence_001",
  "claim": "Increasing weekly aerobic exercise may improve insulin sensitivity in selected metabolic-risk populations.",
  "evidence_type": [
    "clinical_guideline",
    "peer_reviewed_study",
    "mechanistic_rationale"
  ],
  "strength": "moderate",
  "applicability": {
    "population_match": "partial",
    "condition_match": "partial",
    "uncertainty": "individual response may vary"
  },
  "limitations": [
    "not a personalized treatment prediction",
    "requires safety screening",
    "effect size depends on baseline state and adherence"
  ]
}
A simple evidence builder:
@dataclass
class EvidenceItem:
    evidence_id: str
    claim: str
    evidence_type: List[str]
    strength: str
    applicability: Dict[str, Any]
    limitations: List[str]


class EvidenceBuilder:
    def build(self, state: HealthState, action: MedicalAction):
        evidence_items = self.retrieve_relevant_evidence(state, action)
        filtered = self.filter_by_applicability(evidence_items, state)
        return filtered

    def retrieve_relevant_evidence(self, state, action):
        # In production, this should query curated knowledge bases,
        # clinical guidelines, systematic reviews, or trusted literature indexes.
        return []

    def filter_by_applicability(self, evidence_items, state):
        # Filter by population, context, condition, safety boundary,
        # measurement quality, and uncertainty.
        return evidence_items
A useful rule:
Do not generate advice. Generate evidence-bound transition hypotheses.
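As one possible reading of "filter by applicability", the sketch below keeps an evidence item only when both its population match and condition match reach a minimum level. It works on plain dictionaries shaped like the evidence object above. The `MATCH_RANK` ordering and the values `"none"` and `"full"` are assumptions; the example object only shows `"partial"`.

```python
from typing import Any, Dict, List

# Assumed ordering of match levels; only "partial" appears in the example.
MATCH_RANK = {"none": 0, "partial": 1, "full": 2}


def filter_by_applicability(
    evidence_items: List[Dict[str, Any]], minimum: str = "partial"
) -> List[Dict[str, Any]]:
    """Keep items whose weakest applicability match reaches the minimum level."""
    threshold = MATCH_RANK[minimum]
    kept = []
    for item in evidence_items:
        applicability = item.get("applicability", {})
        # Unknown or absent match levels are treated as "none".
        population = MATCH_RANK.get(applicability.get("population_match", "none"), 0)
        condition = MATCH_RANK.get(applicability.get("condition_match", "none"), 0)
        if min(population, condition) >= threshold:
            kept.append(item)
    return kept


items = [
    {"evidence_id": "evidence_001",
     "applicability": {"population_match": "partial", "condition_match": "partial"}},
    {"evidence_id": "evidence_002",
     "applicability": {"population_match": "full", "condition_match": "none"}},
    {"evidence_id": "evidence_003"},  # no applicability information at all
]
kept_ids = [item["evidence_id"] for item in filter_by_applicability(items)]
strict_ids = [item["evidence_id"] for item in filter_by_applicability(items, "full")]
```

Using the weakest of the two matches is a deliberately conservative choice: evidence that fits the population but not the condition is dropped rather than averaged in.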
Transition: Hypothesis, Not Promise
In a medical world model, a transition should not be framed as a promise.
Bad framing:
This intervention will improve the outcome.
Better framing:
Given the current state and evidence constraints, this action may produce the following state changes, with uncertainty.
A transition object:
@dataclass
class TransitionHypothesis:
    from_state: HealthState
    action: MedicalAction
    expected_changes: Dict[str, Any]
    time_window: str
    evidence: List[EvidenceItem]
    uncertainty: Dict[str, Any]
    safety_flags: List[str]
Example output:
{
  "expected_changes": {
    "weekly_exercise_minutes": {
      "direction": "increase",
      "expected_from": 90,
      "expected_to": 150
    },
    "insulin_sensitivity": {
      "direction": "potential_improvement",
      "confidence": "low_to_moderate"
    },
    "fasting_glucose": {
      "direction": "possible_decrease",
      "confidence": "uncertain"
    }
  },
  "time_window": "8_to_12_weeks",
  "uncertainty": {
    "adherence": "unknown",
    "baseline_variability": "high",
    "measurement_noise": "moderate"
  },
  "safety_flags": [
    "screen cardiovascular risk before increasing exercise intensity"
  ]
}
Implementation sketch:
class TransitionModel:
    def estimate(self, state, action, evidence):
        expected_changes = self.estimate_expected_changes(
            state=state,
            action=action,
            evidence=evidence
        )
        uncertainty = self.estimate_uncertainty(
            state=state,
            action=action,
            evidence=evidence
        )
        safety_flags = self.check_safety_flags(state, action)
        return TransitionHypothesis(
            from_state=state,
            action=action,
            expected_changes=expected_changes,
            # Fall back to the action's own duration when no explicit
            # time window is given in its parameters.
            time_window=action.parameters.get("time_window", action.duration),
            evidence=evidence,
            uncertainty=uncertainty,
            safety_flags=safety_flags
        )
The name matters.
Use TransitionHypothesis, not TreatmentPlan.
Safety Gate: Medical World Models Must Be Safety-First
A medical world model should not simulate all actions freely.
Before transition simulation, there should be a safety gate.
class SafetyGate:
    def check(self, state: HealthState, action: MedicalAction):
        checks = [
            self.check_contraindications(state, action),
            self.check_required_human_review(state, action),
            self.check_action_intensity(state, action),
            self.check_data_quality(state),
        ]
        return all(checks)

    def check_contraindications(self, state, action):
        # Placeholder only.
        # Production systems require curated medical rules and human review.
        return True

    def check_required_human_review(self, state, action):
        # Some actions should never be autonomous.
        return True

    def check_action_intensity(self, state, action):
        return True

    def check_data_quality(self, state):
        return True
Then:
def simulate_medical_transition(state, action):
    if not safety_gate.check(state, action):
        return {
            "status": "blocked",
            "reason": "Safety gate failed. Human review required."
        }
    evidence = evidence_builder.build(state, action)
    transition = transition_model.estimate(state, action, evidence)
    return transition
A useful design principle:
medical_world_model = safety_first + evidence_bound + feedback_calibrated
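The placeholder checks above all return `True`. As a hedged illustration of what non-trivial checks could look like, the sketch below reports which check failed rather than a bare boolean. Both rules are assumptions invented for this example (a baseline value must exist for every monitored variable; medication actions need a prior human review flag), and `run_safety_gate`, `CHECKS`, and the `human_reviewed` key are hypothetical names. They are not medical rules.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Action = Dict[str, Any]
Check = Callable[[State, Action], bool]


def check_data_quality(state: State, action: Action) -> bool:
    # Assumed rule: every monitored variable needs a baseline value.
    biomarkers = state.get("biomarkers", {})
    return all(
        biomarkers.get(name) is not None for name in action.get("monitoring", [])
    )


def check_required_human_review(state: State, action: Action) -> bool:
    # Assumed rule: medication actions are blocked until a human has reviewed them.
    return action.get("type") != "medication" or bool(action.get("human_reviewed"))


CHECKS: List[Tuple[str, Check]] = [
    ("data_quality", check_data_quality),
    ("human_review", check_required_human_review),
]


def run_safety_gate(
    state: State, action: Action, checks: List[Tuple[str, Check]] = CHECKS
) -> Dict[str, Any]:
    """Run every check and report the names of the ones that failed."""
    failed = [name for name, check in checks if not check(state, action)]
    return {"passed": not failed, "failed_checks": failed}


state = {"biomarkers": {"fasting_glucose": 5.6}}
lifestyle = {"type": "lifestyle", "monitoring": ["fasting_glucose"]}
medication = {"type": "medication", "monitoring": ["fasting_glucose"]}
unmeasured = {"type": "lifestyle", "monitoring": ["resting_heart_rate"]}

lifestyle_result = run_safety_gate(state, lifestyle)
medication_result = run_safety_gate(state, medication)
unmeasured_result = run_safety_gate(state, unmeasured)
```

Naming the failed checks gives the human reviewer a starting point instead of an unexplained block.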
Feedback: Without Feedback, It Is Not a World Model
A world model should not be a one-shot answer generator.
It should support a loop:
Observe state → Define action → Simulate transition → Collect feedback → Update model
Pseudo-code:
def world_model_loop(subject, action):
    state_t0 = observe_state(subject)
    evidence = build_evidence_chain(state_t0, action)
    transition_hypothesis = simulate_transition(
        state=state_t0,
        action=action,
        evidence=evidence
    )
    feedback = collect_feedback(
        subject=subject,
        action=action,
        time_window=transition_hypothesis.time_window
    )
    updated_state = update_model(
        previous_state=state_t0,
        action=action,
        hypothesis=transition_hypothesis,
        feedback=feedback
    )
    return updated_state
Feedback can include:
- repeated biomarkers;
- symptoms;
- wearable trends;
- adherence records;
- adverse events;
- clinician review;
- patient-reported outcomes;
- environmental or lifestyle changes.
Feedback object:
{
  "feedback_id": "feedback_001",
  "action_id": "increase_zone2_exercise",
  "time_window": "12_weeks",
  "observations": {
    "weekly_exercise_minutes": 145,
    "fasting_glucose": 5.4,
    "sleep_duration": 6.5,
    "subjective_energy": "improved"
  },
  "adherence": "partial",
  "adverse_events": [],
  "notes": "Interpret carefully; multiple concurrent changes existed."
}
Update logic:
class FeedbackUpdater:
    def update(self, previous_state, action, hypothesis, feedback):
        comparison = self.compare_expected_vs_observed(
            expected=hypothesis.expected_changes,
            observed=feedback["observations"]
        )
        return {
            "previous_state": previous_state,
            "action": action,
            "hypothesis": hypothesis,
            "feedback": feedback,
            "comparison": comparison,
            "update_reason": self.infer_update_reason(comparison)
        }
Without feedback, the system cannot calibrate.
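The comparison step is left undefined above. A minimal sketch, assuming a baseline value is available for each variable and that the `direction` strings contain the words "increase" or "decrease", could label each expected change as consistent or inconsistent with what was observed. The function signature (with an extra `baseline` argument) and the label names are assumptions for illustration.

```python
from typing import Any, Dict


def compare_expected_vs_observed(
    baseline: Dict[str, float],
    expected: Dict[str, Dict[str, Any]],
    observed: Dict[str, Any],
) -> Dict[str, str]:
    """Label each expected change by comparing its direction with the observed delta."""
    comparison = {}
    for variable, expectation in expected.items():
        if variable not in observed or variable not in baseline:
            comparison[variable] = "not_observed"
            continue
        direction = expectation.get("direction", "")
        if "increase" in direction:
            expected_sign = 1
        elif "decrease" in direction:
            expected_sign = -1
        else:
            # e.g. "potential_improvement": no numeric direction to test.
            comparison[variable] = "not_comparable"
            continue
        delta = observed[variable] - baseline[variable]
        if delta == 0:
            comparison[variable] = "no_change"
        elif (delta > 0) == (expected_sign > 0):
            comparison[variable] = "consistent"
        else:
            comparison[variable] = "inconsistent"
    return comparison


baseline = {"weekly_exercise_minutes": 90, "fasting_glucose": 5.6}
expected = {
    "weekly_exercise_minutes": {"direction": "increase"},
    "insulin_sensitivity": {"direction": "potential_improvement"},
    "fasting_glucose": {"direction": "possible_decrease"},
}
observed = {"weekly_exercise_minutes": 145, "fasting_glucose": 5.4}

comparison = compare_expected_vs_observed(baseline, expected, observed)
```

A "consistent" label only says the observed direction matched the hypothesis. As the feedback note above warns, concurrent changes mean it should not be read as evidence that the action caused the change.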
Audit Log: Every Transition Should Be Traceable
For medical AI, auditability is not optional.
A transition should have an audit trail:
{
  "audit_id": "audit_001",
  "timestamp": "2026-05-20T23:00:00+08:00",
  "state_version": "state_v1",
  "action_version": "action_v1",
  "evidence_version": "evidence_v1",
  "transition_version": "transition_v1",
  "model_version": "world_model_v0.1",
  "human_review": {
    "required": true,
    "status": "pending"
  },
  "disclaimer": "Hypothesis-generating only. Not a treatment recommendation."
}
Audit logger:
class AuditLogger:
    def log_transition(self, state, action, evidence, transition):
        return {
            "state": state,
            "action": action,
            "evidence": evidence,
            "transition": transition,
            "model_version": self.get_model_version(),
            "human_review_required": True,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }
For medical world models, audit logs are part of the core architecture.
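One possible way to make the version fields in an audit record verifiable is to store a content hash of each object alongside it. The sketch below uses only the Python standard library. The `fingerprint` and `build_audit_record` helpers and the `*_hash` field names are hypothetical, and hashing is an added technique for illustration, not something the audit schema above prescribes. It assumes the inputs are JSON-serializable dictionaries or lists.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict


def fingerprint(obj: Any) -> str:
    """SHA-256 of a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_record(
    state: Dict[str, Any],
    action: Dict[str, Any],
    evidence: Any,
    transition: Dict[str, Any],
    model_version: str,
) -> Dict[str, Any]:
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "state_hash": fingerprint(state),
        "action_hash": fingerprint(action),
        "evidence_hash": fingerprint(evidence),
        "transition_hash": fingerprint(transition),
        "model_version": model_version,
        "human_review": {"required": True, "status": "pending"},
        "disclaimer": "Hypothesis-generating only. Not a treatment recommendation.",
    }


record = build_audit_record(
    state={"subject_id": "anonymous_001", "fasting_glucose": 5.6},
    action={"action_id": "increase_zone2_exercise", "duration": "12_weeks"},
    evidence=[{"evidence_id": "evidence_001", "strength": "moderate"}],
    transition={"time_window": "8_to_12_weeks"},
    model_version="world_model_v0.1",
)
```

With hashes stored, a later reviewer can recompute the fingerprint of an archived state or action and confirm it is the same object the transition was generated from.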
A Minimal Medical World Model
Putting the pieces together:
class MinimalMedicalWorldModel:
    def __init__(
        self,
        safety_gate,
        evidence_builder,
        transition_model,
        feedback_updater,
        audit_logger
    ):
        self.safety_gate = safety_gate
        self.evidence_builder = evidence_builder
        self.transition_model = transition_model
        self.feedback_updater = feedback_updater
        self.audit_logger = audit_logger

    def run(self, state, action):
        if not self.safety_gate.check(state, action):
            return {
                "status": "blocked",
                "reason": "Safety gate failed. Human review required."
            }
        evidence = self.evidence_builder.build(state, action)
        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )
        audit_log = self.audit_logger.log_transition(
            state=state,
            action=action,
            evidence=evidence,
            transition=transition
        )
        return {
            "status": "hypothesis_generated",
            "transition": transition,
            "audit_log": audit_log
        }

    def update_with_feedback(self, previous_state, action, transition, feedback):
        return self.feedback_updater.update(
            previous_state=previous_state,
            action=action,
            hypothesis=transition,
            feedback=feedback
        )
Usage:
state = observe_state(subject)
action = define_action(intervention)
result = medical_world_model.run(state, action)
if result["status"] == "hypothesis_generated":
    feedback = collect_feedback(subject, action)
    updated = medical_world_model.update_with_feedback(
        previous_state=state,
        action=action,
        transition=result["transition"],
        feedback=feedback
    )
The loop is:
State → Action → Evidence → Transition → Audit → Feedback → Update
Where SteeraMed Fits
SteeraMed can be understood as a steerable biomedical world model framework.
In developer terms, it is closer to:
state-action-transition-evidence-feedback architecture
than to:
a chatbot that gives medical advice
The goal is not to automate treatment decisions.
The goal is to make biomedical AI systems more:
- state-aware;
- action-explicit;
- evidence-bound;
- feedback-calibrated;
- safety-gated;
- auditable;
- human-reviewable.
This is especially relevant for long-term health and longevity medicine, where the problem is not a single prediction task but longitudinal state management under uncertainty.
Developer Takeaways
1. Do not start with a chatbot
A chatbot interface may be useful later, but it should not be the core architecture.
Start with state representation.
state = observe_state(subject)
2. Do not stop at risk prediction
Risk prediction is useful, but it is not a world model.
risk = predict(state)
is not enough.
3. Make actions explicit
Without explicit actions, there is no action-conditioned transition.
next_state = simulate(state, action)
4. Treat transitions as hypotheses
A medical transition is not a promise.
transition = generate_hypothesis(state, action, evidence)
5. Build evidence chains
A transition without evidence is not acceptable in medical AI.
evidence = build_evidence_chain(state, action)
6. Add safety gates before simulation
Do not simulate unsafe or unsupported actions.
if not safety_gate.check(state, action):
    require_human_review()
7. Close the loop with feedback
Without feedback, the model cannot calibrate.
model.update(feedback)
8. Log everything
Every state, action, evidence item, transition, feedback signal, and model version should be traceable.
audit_logger.log(state, action, evidence, transition, feedback)
Final Thought
A medical world model is not a larger prediction model.
It is a system architecture for reasoning about:
state + action + evidence + transition + feedback
under uncertainty.
For developers, the key shift is this:
Do not just build systems that predict what may happen.
Build systems that can represent state, encode actions, simulate evidence-bound transitions, collect feedback, and remain auditable.
That is the difference between prediction and a world model.
References and Project Links
- Ha, D., & Schmidhuber, J. Recurrent World Models Facilitate Policy Evolution. Advances in Neural Information Processing Systems 31, 2018. https://papers.nips.cc/paper/7512-recurrent-world-models-facilitate-policy-evolution; arXiv version: https://arxiv.org/abs/1803.10122
- LeCun, Y. A Path Towards Autonomous Machine Intelligence. OpenReview, 2022. https://openreview.net/forum?id=BZ5a1r-kVsf
- Yang, Y., Wang, Z.-Y., Liu, Q., Sun, S., Wang, K., Chellappa, R., Zhou, Z., Yuille, A., Zhu, L., Zhang, Y.-D., & Chen, J. Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv:2506.02327, 2025. https://arxiv.org/abs/2506.02327; project page: https://yijun-yang.github.io/MeWM/
- Qazi, M. A., Nadeem, M., & Yaqub, M. Beyond Generative AI: World Models for Clinical Prediction, Counterfactuals, and Planning. arXiv:2511.16333, 2025. https://arxiv.org/abs/2511.16333
- Katsoulakis, E., Wang, Q., Wu, H., et al. Digital twins for health: a scoping review. npj Digital Medicine, 7, 77, 2024. https://doi.org/10.1038/s41746-024-01073-0
- Emmert-Streib, F., Parkkila, S., Laubenbacher, R., et al. The role of digital twins in P4 medicine: A paradigm for modern healthcare. npj Digital Medicine, 8, 735, 2025. https://doi.org/10.1038/s41746-025-02115-x
- Xiong, J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. https://doi.org/10.20944/preprints202605.0366.v1
- SteeraMed project: https://SteeraMed.com
- Steerable World project: https://steerable.world