If you have worked with computational biology, multi-omics analysis, pathway modeling, biomedical knowledge graphs, or systems biology, the phrase "medical world model" may sound suspicious at first.
You might ask:
Isn't this just systems biology with a new name?
That is a fair question.
The life sciences have modeled gene regulatory networks, signaling pathways, metabolic systems, disease mechanisms, and biological perturbations for decades. If a medical world model were only a rebranding of systems biology, digital twins, or large language models, it would not add much technical value.
But from a system-design perspective, the difference is not mainly about terminology.
It is about object boundaries.
A systems biology model often focuses on:
component -> relation -> pathway -> network -> mechanism
A medical world model needs to additionally represent:
state + action + evidence -> transition hypothesis -> feedback update
In other words:
- systems biology helps us understand how biological systems work;
- medical world models aim to reason about how an individual state may change under a defined action;
- steerable medical world models further add objectives, constraints, safety gates, human review, feedback, and auditability.
This article explains the distinction from a developer's perspective.
1. Three model types, three engineering boundaries
A useful way to avoid confusion is to compare prediction models, systems biology models, and medical world models.
| Model type | Core question | Main objects | Typical output | Engineering keywords |
|---|---|---|---|---|
| Prediction model | How high is the future risk? | features, labels | risk score, class, probability | classification / regression |
| Systems biology model | How does the biological system work? | genes, proteins, pathways, networks | mechanism, network dynamics | graph / ODE / network model |
| Medical world model | How may state change after an action? | state, action, transition, evidence, feedback | transition hypothesis, audit trail | state-action-feedback loop |
A standard medical risk model may look like:
risk = predict_risk(patient_features)
A systems biology model may look like:
network_state = simulate_pathway_dynamics(
    pathway_graph,
    initial_conditions,
    perturbation
)
A medical world model is closer to:
transition = estimate_transition_hypothesis(
    state=current_patient_state,
    action=candidate_intervention,
    evidence=evidence_chain
)

feedback = collect_feedback(
    patient_id=patient_id,
    action=candidate_intervention,
    time_window_weeks=8
)

updated_state = update_state(
    previous_state=current_patient_state,
    action=candidate_intervention,
    transition=transition,
    feedback=feedback
)
The important difference is not that one model is more complex than another.
The difference is the modeling target:
feature prediction
-> mechanism modeling
-> action-conditioned transition reasoning
2. Systems biology can model perturbations, but action is not always a decision object
Systems biology is not action-free.
It can model:
- gene knockout;
- drug perturbation;
- pathway activation or inhibition;
- environmental change;
- ODE-based dynamics;
- network control;
- multi-omics perturbation response.
So this statement would be wrong:
systems biology has no action
medical world models have action
A better distinction is:
Systems biology can model perturbations and responses. A medical world model needs to turn action into a structured decision object and place it inside an evidence, transition, feedback, safety, and audit loop.
In systems biology, a perturbation may be an input parameter:
result = simulate_network(
    graph=pathway_graph,
    perturbation={"gene_x": "knockout"}
)
In a medical world model, an action is not just a perturbation parameter. It must be executable, recordable, auditable, monitorable, and feedback-compatible.
For example:
from dataclasses import dataclass
from typing import List


@dataclass
class InterventionAction:
    action_id: str
    category: str
    description: str
    target_mechanisms: List[str]
    intensity: str
    duration_weeks: int
    monitoring_markers: List[str]
    safety_constraints: List[str]

Example:

action = InterventionAction(
    action_id="nutrition_low_glycemic_8w",
    category="nutrition",
    description="8-week low-glycemic dietary adjustment",
    target_mechanisms=[
        "postprandial_glucose_variability",
        "insulin_resistance",
        "weight_management"
    ],
    intensity="moderate",
    duration_weeks=8,
    monitoring_markers=[
        "fasting_glucose",
        "hba1c",
        "weight",
        "waist_circumference"
    ],
    safety_constraints=[
        "not a treatment prescription",
        "clinical review required if medication is involved",
        "stop or refer if red flags appear"
    ]
)
The engineering distinction is:
perturbation parameter != intervention action object
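One way to make "executable, recordable, auditable, monitorable, and feedback-compatible" concrete is a small completeness check on the action object. The sketch below is illustrative only: `validate_action` and its four rules are assumptions introduced here, and the dataclass is a trimmed copy of `InterventionAction` so that the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InterventionAction:
    # Trimmed copy of the article's action object, repeated so the sketch runs alone.
    action_id: str
    duration_weeks: int
    monitoring_markers: List[str] = field(default_factory=list)
    safety_constraints: List[str] = field(default_factory=list)


def validate_action(action: InterventionAction) -> List[str]:
    """Return a list of problems; an empty list means the action is usable."""
    problems = []
    if not action.action_id:
        problems.append("missing action_id: action cannot be recorded or audited")
    if action.duration_weeks <= 0:
        problems.append("non-positive duration: no feedback window can be defined")
    if not action.monitoring_markers:
        problems.append("no monitoring markers: action cannot be monitored")
    if not action.safety_constraints:
        problems.append("no safety constraints: action cannot pass a safety gate")
    return problems


# A bare perturbation carries an identifier and little else.
bare_perturbation = InterventionAction(action_id="gene_x_knockout", duration_weeks=0)

# A structured action carries what the loop needs downstream.
structured_action = InterventionAction(
    action_id="nutrition_low_glycemic_8w",
    duration_weeks=8,
    monitoring_markers=["fasting_glucose", "hba1c"],
    safety_constraints=["clinical review required if medication is involved"],
)

for problem in validate_action(bare_perturbation):
    print(problem)
```

The structured action returns no problems, while the bare perturbation fails three of the four checks. A real system would likely need richer rules than these.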
3. Systems biology often starts from a mechanism graph
A simplified systems biology model can be represented as a graph.
from dataclasses import dataclass
from typing import List


@dataclass
class BiologicalNode:
    node_id: str
    node_type: str  # gene, protein, metabolite, pathway, phenotype
    name: str


@dataclass
class BiologicalEdge:
    source: str
    target: str
    relation: str  # activates, inhibits, regulates, correlates_with, associated_with
    evidence_strength: str


@dataclass
class MechanismGraph:
    nodes: List[BiologicalNode]
    edges: List[BiologicalEdge]

Example:

mechanism_graph = MechanismGraph(
    nodes=[
        BiologicalNode("n1", "pathway", "insulin_signaling"),
        BiologicalNode("n2", "phenotype", "glucose_variability"),
        BiologicalNode("n3", "phenotype", "fatigue")
    ],
    edges=[
        BiologicalEdge(
            source="n1",
            target="n2",
            relation="regulates",
            evidence_strength="moderate"
        ),
        BiologicalEdge(
            source="n2",
            target="n3",
            relation="associated_with",
            evidence_strength="low"
        )
    ]
)
This structure is valuable.
It helps represent:
- mechanisms;
- pathways;
- regulatory relationships;
- phenotype associations;
- biological modules;
- possible system-level interactions.
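A typical query against such a graph is "what lies downstream of this node?". The following is a minimal sketch using a breadth-first walk over plain tuples instead of the dataclasses above; `downstream_nodes` and the `EDGES` list are names introduced here for illustration. Treating an `associated_with` edge as directed is a simplification.

```python
from collections import deque
from typing import Dict, List, Tuple

# Edges as (source, target, relation), mirroring the example graph in this section.
EDGES: List[Tuple[str, str, str]] = [
    ("insulin_signaling", "glucose_variability", "regulates"),
    ("glucose_variability", "fatigue", "associated_with"),
]


def downstream_nodes(edges: List[Tuple[str, str, str]], start: str) -> List[str]:
    """Breadth-first walk over directed edges; returns nodes reachable from start."""
    adjacency: Dict[str, List[str]] = {}
    for source, target, _relation in edges:
        adjacency.setdefault(source, []).append(target)

    reachable: List[str] = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                reachable.append(neighbor)
                queue.append(neighbor)
    return reachable


print(downstream_nodes(EDGES, "insulin_signaling"))
```

The query describes how the mechanism is wired. It says nothing about a specific individual or a specific action.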
But it is not yet a full medical world model.
It still does not explicitly answer:
What is the current individual state?
What action may be taken?
How might state change after the action?
What evidence supports that transition?
What feedback window should be used?
How should the next cycle be updated if feedback differs from expectation?
That is where the world-model framing becomes useful.
4. A medical world model needs a State object
A medical world model starts with an individual state representation.
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PatientState:
    patient_id: str
    demographics: Dict
    clinical_markers: Dict
    lifestyle: Dict
    symptoms: List[str]
    medications: List[str]
    omics: Optional[Dict] = None
    wearable: Optional[Dict] = None
    mechanism_context: Optional[Dict] = None

Example:

state = PatientState(
    patient_id="P001",
    demographics={
        "age": 52,
        "sex": "unspecified"
    },
    clinical_markers={
        "bmi": 29.1,
        "fasting_glucose": 6.2,
        "hba1c": 6.0,
        "triglycerides": 2.1,
        "hdl_c": 0.95
    },
    lifestyle={
        "sleep_hours": 5.8,
        "exercise_frequency_per_week": 1,
        "diet_pattern": "high_refined_carbohydrate"
    },
    symptoms=[
        "fatigue",
        "post_meal_sleepiness"
    ],
    medications=[],
    mechanism_context={
        "possible_insulin_resistance": True,
        "possible_glucose_variability": True,
        "data_quality": "partial"
    }
)
The key principle is:
State should not be a data dump. It should be referenceable by actions, transitions, evidence, and feedback.
If a field cannot influence action selection, transition reasoning, safety filtering, or feedback updates, it may be noise rather than useful state.
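The "referenceable" principle can be checked mechanically. As a rough sketch, the function below lists clinical markers that nothing downstream reads. The `referenced` list is hypothetical: in a real system it would be collected from action monitoring markers, safety rules, and feedback plans.

```python
from typing import Dict, Iterable, Set


def unreferenced_markers(
    clinical_markers: Dict[str, float],
    referenced: Iterable[str],
) -> Set[str]:
    """Markers stored in state that no action, safety rule, or feedback plan reads."""
    return set(clinical_markers) - set(referenced)


# Marker values copied from the example state in this section.
clinical_markers = {
    "bmi": 29.1,
    "fasting_glucose": 6.2,
    "hba1c": 6.0,
    "triglycerides": 2.1,
    "hdl_c": 0.95,
}

# Hypothetical: markers read by the monitoring list and the safety gate.
referenced = ["fasting_glucose", "hba1c", "bmi"]

candidates_for_review = unreferenced_markers(clinical_markers, referenced)
print(sorted(candidates_for_review))
```

A flagged field is only a candidate for review. It may still be worth keeping once an action or rule that uses it is added.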
5. Transition is not treatment-effect prediction
Developers may be tempted to write:
next_state = predict_next_state(state, action)
In medical settings, this is risky language.
It can sound like the model predicts individual treatment effects.
A safer and more precise abstraction is:
transition = estimate_transition_hypothesis(
    state=state,
    action=action,
    evidence=evidence_chain
)
That is: a transition hypothesis.
@dataclass
class TransitionHypothesis:
    expected_direction: Dict
    mechanism_rationale: List[str]
    uncertainty_level: str
    time_window_weeks: int
    assumptions: List[str]
    limitations: List[str]

Example:

transition = TransitionHypothesis(
    expected_direction={
        "fasting_glucose": "decrease_possible",
        "postprandial_glucose": "decrease_possible",
        "weight": "slight_decrease_possible",
        "energy_level": "may_improve"
    },
    mechanism_rationale=[
        "lower refined carbohydrate intake may reduce postprandial glucose excursions",
        "weight reduction may improve insulin sensitivity",
        "improved sleep may reduce metabolic stress"
    ],
    uncertainty_level="moderate",
    time_window_weeks=8,
    assumptions=[
        "adequate adherence",
        "no major medication change",
        "baseline data quality is acceptable"
    ],
    limitations=[
        "individual response may vary",
        "not a treatment effect prediction",
        "not a substitute for clinical judgment"
    ]
)
Notice the wording:
decrease_possible
may_improve
hypothesis
uncertainty
limitations
Not:
will decrease
will reverse
will cure
This distinction matters in medical AI.
A medical world model should not make deterministic treatment promises. It should produce mechanism-informed, evidence-bounded, uncertainty-aware transition hypotheses.
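The wording rule can be enforced in code rather than left to convention. The sketch below assumes a closed vocabulary of hedged direction labels: the first three values come from the example above, and the remaining ones are illustrative additions. `lint_expected_direction` is a name introduced here.

```python
from typing import Dict, List

# Assumed vocabulary. The first three labels appear in the example above;
# the others are illustrative additions, not part of the original design.
ALLOWED_DIRECTIONS = {
    "decrease_possible",
    "slight_decrease_possible",
    "may_improve",
    "increase_possible",
    "no_change_expected",
    "uncertain",
}


def lint_expected_direction(expected_direction: Dict[str, str]) -> List[str]:
    """Return one message per entry whose label is outside the hedged vocabulary."""
    issues = []
    for marker, direction in expected_direction.items():
        if direction not in ALLOWED_DIRECTIONS:
            issues.append(f"{marker}: '{direction}' is not in the hedged vocabulary")
    return issues


hedged = {"fasting_glucose": "decrease_possible", "energy_level": "may_improve"}
deterministic = {"hba1c": "will_decrease"}

print(lint_expected_direction(hedged))
print(lint_expected_direction(deterministic))
```

An allow-list is stricter than scanning for banned words such as "will" or "cure", because any label that has not been reviewed is rejected by default.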
6. EvidenceChain: recommendations are not enough
If a system only outputs:
recommendation = "reduce refined carbohydrates and increase exercise"
it is not yet a medical world model.
A medical world model should explain:
- why this action is proposed;
- which mechanism it targets;
- what evidence supports it;
- what uncertainty remains;
- where the safety boundary is.
A simple evidence chain can look like:
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvidenceItem:
    source_type: str  # guideline, trial, mechanism, omics, individual_context
    description: str
    strength: str
    reference: Optional[str] = None


@dataclass
class EvidenceChain:
    items: List[EvidenceItem]
    overall_strength: str
    uncertainty: str
    limitations: List[str]

Example:

evidence_chain = EvidenceChain(
    items=[
        EvidenceItem(
            source_type="mechanism",
            description="Reduced refined carbohydrate intake may reduce postprandial glucose excursions.",
            strength="moderate"
        ),
        EvidenceItem(
            source_type="individual_context",
            description="Current state includes high refined carbohydrate pattern and low exercise frequency.",
            strength="contextual"
        ),
        EvidenceItem(
            source_type="guideline",
            description="Lifestyle intervention is commonly recommended for metabolic risk management.",
            strength="high"
        )
    ],
    overall_strength="moderate",
    uncertainty="moderate",
    limitations=[
        "adherence is uncertain",
        "individual response may vary",
        "clinical review required when disease or medication is involved"
    ]
)
Engineering rule:
recommendation without evidence object = weak output
action + transition + evidence + feedback plan = stronger world-model output
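The example chain sets `overall_strength` by hand. One possible way to derive it is a weakest-link rule over the graded items, sketched below. The rule, the `STRENGTH_RANK` ordering, and the `"insufficient"` label are assumptions made for illustration. They are not how the original framework necessarily aggregates evidence.

```python
from typing import List

# Assumed ordering. "contextual" items describe the individual and are not graded.
STRENGTH_RANK = {"low": 0, "moderate": 1, "high": 2}


def overall_strength(strengths: List[str]) -> str:
    """Weakest-link rule: the chain is only as strong as its weakest graded item."""
    graded = [s for s in strengths if s in STRENGTH_RANK]
    if not graded:
        # Nothing gradable was supplied, so no overall grade is claimed.
        return "insufficient"
    return min(graded, key=lambda s: STRENGTH_RANK[s])


# Strengths of the three items in the example chain above.
print(overall_strength(["moderate", "contextual", "high"]))
```

Applied to the three example items, this rule gives the same "moderate" grade that the example sets manually. A conservative rule like this avoids letting one strong guideline item mask a weak mechanistic link.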
7. Feedback: the system must update
A world model should not be a one-shot answer generator.
It should support feedback updates.
@dataclass
class FollowUpFeedback:
    patient_id: str
    action_id: str
    timepoint_weeks: int
    observed_markers: Dict
    adherence: Dict
    symptom_changes: Dict
    adverse_events: List[str]

Example:

feedback = FollowUpFeedback(
    patient_id="P001",
    action_id="nutrition_low_glycemic_8w",
    timepoint_weeks=8,
    observed_markers={
        "fasting_glucose": 5.8,
        "hba1c": 5.8,
        "weight_change_kg": -2.1,
        "waist_change_cm": -3.0
    },
    adherence={
        "nutrition": "medium",
        "exercise": "low",
        "sleep": "unchanged"
    },
    symptom_changes={
        "fatigue": "slightly_improved",
        "post_meal_sleepiness": "improved"
    },
    adverse_events=[]
)
Feedback then updates the record:
def update_world_model_state(
    previous_state: PatientState,
    action: InterventionAction,
    transition: TransitionHypothesis,
    feedback: FollowUpFeedback
):
    # Keep expectation and observation side by side so they can be compared later.
    update_record = {
        "previous_state": previous_state,
        "action": action,
        "expected_transition": transition,
        "observed_feedback": feedback,
        "interpretation": None,
        "next_step": None
    }

    # Simplified rule for illustration: branch on nutrition adherence only.
    if feedback.adherence.get("nutrition") == "medium":
        update_record["interpretation"] = (
            "Partial improvement observed; adherence may limit effect size."
        )
        update_record["next_step"] = (
            "Review adherence barriers and consider adjusting action intensity."
        )
    else:
        update_record["interpretation"] = (
            "Observed feedback should be interpreted with caution."
        )
        update_record["next_step"] = (
            "Collect more context before updating the transition hypothesis."
        )

    return update_record
If the system cannot update from feedback, it is closer to a recommendation engine than a world model.
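The update function above branches only on adherence. A fuller update step would also compare the expected direction with what was observed. The sketch below shows one simple way to do that, assuming numeric baseline and follow-up values keyed by the same marker names. `compare_direction` and its labels are introduced here for illustration.

```python
from typing import Dict


def compare_direction(
    expected: Dict[str, str],
    baseline: Dict[str, float],
    observed: Dict[str, float],
) -> Dict[str, str]:
    """Label each expected marker as consistent, inconsistent,
    not_observed, or not_comparable."""
    result = {}
    for marker, direction in expected.items():
        if marker not in baseline or marker not in observed:
            # No before/after pair exists, so nothing can be compared.
            result[marker] = "not_observed"
            continue
        delta = observed[marker] - baseline[marker]
        if "decrease" in direction:
            result[marker] = "consistent" if delta < 0 else "inconsistent"
        elif "increase" in direction:
            result[marker] = "consistent" if delta > 0 else "inconsistent"
        else:
            # Labels such as "may_improve" have no numeric direction.
            result[marker] = "not_comparable"
    return result


# Values taken from the state, transition, and feedback examples in this article.
expected = {
    "fasting_glucose": "decrease_possible",
    "postprandial_glucose": "decrease_possible",
    "energy_level": "may_improve",
}
baseline = {"fasting_glucose": 6.2, "hba1c": 6.0}
observed = {"fasting_glucose": 5.8, "hba1c": 5.8}

comparison = compare_direction(expected, baseline, observed)
print(comparison)
```

A "consistent" label means only that the observed change points the same way as the hypothesis. It does not show that the action caused the change.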
8. Causal boundaries: action-conditioned reasoning is not correlation
Once a system asks:
if action A, then what may happen?
it enters causal territory.
The transition should not be only correlation:
transition = correlate(state_features, future_outcomes)
The system should explicitly track causal assumptions and uncertainty:
@dataclass
class CausalAssumption:
    assumption_id: str
    description: str
    possible_confounders: List[str]
    applicable_population: str
    evidence_level: str
    uncertainty: str

Example:

causal_assumption = CausalAssumption(
    assumption_id="CA001",
    description=(
        "Reducing refined carbohydrate intake may reduce postprandial glucose "
        "excursions in individuals with diet-related glucose variability."
    ),
    possible_confounders=[
        "medication_change",
        "physical_activity_change",
        "sleep_change",
        "stress_change",
        "baseline_disease_status"
    ],
    applicable_population="health-management context with mild metabolic risk",
    evidence_level="moderate",
    uncertainty="individual_response_varies"
)
This does not mean every implementation must ship a full causal inference engine.
It means:
If the system outputs action-conditioned transitions, it must record causal assumptions, applicability, uncertainty, and limitations.
Otherwise, transition hypotheses can easily become correlation-based extrapolations.
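Recording confounders is only useful if they are checked at follow-up. As a minimal sketch, the function below separates listed confounders into those reported as changed and those nobody reported on. `review_confounders` and the `reported_changes` input are assumptions introduced for illustration.

```python
from typing import Dict, List


def review_confounders(
    possible_confounders: List[str],
    reported_changes: Dict[str, bool],
) -> Dict[str, List[str]]:
    """Split listed confounders into 'changed' and 'unreported' at follow-up."""
    changed = [c for c in possible_confounders if reported_changes.get(c) is True]
    # A missing report is not the same as "no change", so track it separately.
    unreported = [c for c in possible_confounders if c not in reported_changes]
    return {"changed": changed, "unreported": unreported}


# Confounder names copied from the causal assumption example above.
possible_confounders = [
    "medication_change",
    "physical_activity_change",
    "sleep_change",
    "stress_change",
    "baseline_disease_status",
]

# Hypothetical follow-up report: True means the factor changed during the window.
reported_changes = {
    "medication_change": False,
    "physical_activity_change": True,
    "sleep_change": False,
}

review = review_confounders(possible_confounders, reported_changes)
print(review)
```

If either list is non-empty, attributing the observed change to the action becomes harder to defend, and the update record should say so.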
9. SafetyGate: medical systems need boundaries before optimization
A medical world model should not simply optimize for the most promising action.
It must first apply safety boundaries.
@dataclass
class SafetyGateResult:
    passed: bool
    red_flags: List[str]
    contraindications: List[str]
    required_review: List[str]
    notes: List[str]

Example:

def run_safety_gate(
    state: PatientState,
    action: InterventionAction
) -> SafetyGateResult:
    red_flags = []
    contraindications = []
    required_review = []

    # Illustrative threshold; assumes glucose in mmol/L, as in the example state.
    if state.clinical_markers.get("fasting_glucose", 0) > 13.9:
        red_flags.append("very_high_glucose_requires_clinical_evaluation")

    if "chest_pain" in state.symptoms:
        red_flags.append("chest_pain_requires_urgent_evaluation")

    # Any medication triggers clinician review but does not fail the gate by itself.
    if state.medications:
        required_review.append("medication_context_requires_clinician_review")

    passed = len(red_flags) == 0 and len(contraindications) == 0

    return SafetyGateResult(
        passed=passed,
        red_flags=red_flags,
        contraindications=contraindications,
        required_review=required_review,
        notes=[
            "not medical advice",
            "not a validated treatment planning system",
            "human review required in clinical context"
        ]
    )
Principle:
No safety gate, no medical world-model deployment.
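"Boundaries before optimization" implies an ordering: filter first, rank second. The sketch below shows that ordering with plain dictionaries. The `evidence_score` field and the two extra action identifiers are hypothetical and exist only to make the example runnable.

```python
from typing import Dict, List


def rank_safe_candidates(candidates: List[Dict]) -> List[Dict]:
    """Drop candidates with red flags or contraindications, then rank the rest."""
    safe = [
        c for c in candidates
        if not c["red_flags"] and not c["contraindications"]
    ]
    # Hypothetical score: higher means stronger supporting evidence.
    return sorted(safe, key=lambda c: c["evidence_score"], reverse=True)


candidates = [
    {
        "action_id": "intensive_fasting_4w",  # hypothetical action
        "evidence_score": 0.9,
        "red_flags": ["very_high_glucose_requires_clinical_evaluation"],
        "contraindications": [],
    },
    {
        "action_id": "nutrition_low_glycemic_8w",
        "evidence_score": 0.6,
        "red_flags": [],
        "contraindications": [],
    },
    {
        "action_id": "walking_after_meals_8w",  # hypothetical action
        "evidence_score": 0.7,
        "red_flags": [],
        "contraindications": [],
    },
]

ranked = rank_safe_candidates(candidates)
print([c["action_id"] for c in ranked])
```

The highest-scoring candidate never reaches the ranking step because it fails the gate. Folding safety into the score as a penalty would allow a large enough benefit to outweigh it, which is the design this ordering avoids.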
10. AuditLog: every transition should leave a trace
A medical world model should be able to answer:
- What was the state at the time?
- Why was this action proposed?
- What was the transition hypothesis?
- What evidence supported it?
- Who reviewed it?
- Did feedback match the expectation?
- If not, how was the next cycle updated?
A minimal audit log:
@dataclass
class AuditLog:
    record_id: str
    patient_id: str
    state_snapshot_id: str
    action_id: str
    transition_id: str
    evidence_chain_id: str
    safety_gate_id: str
    reviewer: str
    decision: str
    timestamp: str

Example:

audit_log = AuditLog(
    record_id="AUDIT_20260521_001",
    patient_id="P001",
    state_snapshot_id="STATE_20260521",
    action_id="nutrition_low_glycemic_8w",
    transition_id="TRANSITION_20260521_001",
    evidence_chain_id="EVIDENCE_20260521_001",
    safety_gate_id="SAFETY_20260521_001",
    reviewer="human_expert",
    decision="approved_for_health_management_context",
    timestamp="2026-05-21T20:00:00+08:00"
)
The goal is not to generate a better-sounding answer.
The goal is to make the reasoning process traceable, auditable, feedback-driven, and reviewable.
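Traceability is stronger when later edits to a record can be detected. One common approach, sketched here with the standard library only, is to store a digest of a canonical serialization next to each record. `AuditEntry` is a trimmed stand-in for the `AuditLog` class above, and `audit_digest` is a name introduced for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditEntry:
    # Trimmed stand-in for the article's AuditLog, so the sketch runs alone.
    record_id: str
    patient_id: str
    action_id: str
    reviewer: str
    decision: str
    timestamp: str


def audit_digest(entry: AuditEntry) -> str:
    """SHA-256 over a canonical JSON form of the entry."""
    # sort_keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(asdict(entry), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


entry = AuditEntry(
    record_id="AUDIT_20260521_001",
    patient_id="P001",
    action_id="nutrition_low_glycemic_8w",
    reviewer="human_expert",
    decision="approved_for_health_management_context",
    timestamp="2026-05-21T20:00:00+08:00",
)

digest = audit_digest(entry)
print(digest)
```

The digest only helps if it is stored somewhere the record's editor cannot also rewrite, for example an append-only log. That storage choice is outside this sketch.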
11. A minimal medical world-model loop
Putting the objects together:
def medical_world_model_loop(patient_id: str):
    # 1. Observe current state
    state = observe_patient_state(patient_id)

    # 2. Retrieve mechanism context
    mechanism_context = retrieve_mechanism_context(state)

    # 3. Generate candidate actions
    candidate_actions = generate_candidate_actions(
        state=state,
        mechanism_context=mechanism_context
    )

    transition_candidates = []
    for action in candidate_actions:
        # 4. Safety gate first
        safety = run_safety_gate(state, action)
        if not safety.passed:
            continue

        # 5. Build evidence chain
        evidence = build_evidence_chain(
            state=state,
            action=action,
            mechanism_context=mechanism_context
        )

        # 6. Estimate transition hypothesis
        transition = estimate_transition_hypothesis(
            state=state,
            action=action,
            evidence=evidence
        )

        transition_candidates.append({
            "action": action,
            "transition": transition,
            "evidence": evidence,
            "safety": safety
        })

    # If every candidate was blocked by the safety gate, there is nothing to review.
    if not transition_candidates:
        return None

    # 7. Human-in-the-loop review
    selected = human_expert_review(transition_candidates)

    # 8. Collect follow-up feedback
    feedback = collect_follow_up_feedback(
        patient_id=patient_id,
        action_id=selected["action"].action_id,
        time_window_weeks=selected["transition"].time_window_weeks
    )

    # 9. Update model state
    updated_record = update_world_model_state(
        previous_state=state,
        action=selected["action"],
        transition=selected["transition"],
        feedback=feedback
    )

    # 10. Write audit log
    audit_log = write_audit_log(
        state=state,
        selected=selected,
        feedback=feedback,
        updated_record=updated_record
    )

    return {
        "updated_record": updated_record,
        "audit_log": audit_log
    }
The workflow order matters:
state
-> mechanism context
-> candidate action
-> safety gate
-> evidence chain
-> transition hypothesis
-> human review
-> feedback
-> update
-> audit log
That is the engineering difference between a mechanism graph and a medical world-model loop.
12. JSON example: one transition record
A simplified transition record may look like:
{
  "state": {
    "patient_id": "P001",
    "state_snapshot_id": "STATE_20260521",
    "clinical_markers": {
      "bmi": 29.1,
      "fasting_glucose": 6.2,
      "hba1c": 6.0,
      "triglycerides": 2.1
    },
    "lifestyle": {
      "sleep_hours": 5.8,
      "exercise_frequency_per_week": 1,
      "diet_pattern": "high_refined_carbohydrate"
    },
    "mechanism_context": {
      "possible_insulin_resistance": true,
      "possible_glucose_variability": true,
      "data_quality": "partial"
    }
  },
  "action": {
    "action_id": "nutrition_low_glycemic_8w",
    "category": "nutrition",
    "duration_weeks": 8,
    "target_mechanisms": [
      "postprandial_glucose_variability",
      "insulin_resistance"
    ],
    "monitoring_markers": [
      "fasting_glucose",
      "hba1c",
      "weight",
      "waist_circumference"
    ]
  },
  "transition_hypothesis": {
    "expected_direction": {
      "fasting_glucose": "decrease_possible",
      "postprandial_glucose": "decrease_possible",
      "weight": "slight_decrease_possible"
    },
    "uncertainty_level": "moderate",
    "time_window_weeks": 8,
    "limitations": [
      "individual_response_varies",
      "not_a_treatment_effect_prediction"
    ]
  },
  "evidence_chain": {
    "overall_strength": "moderate",
    "items": [
      {
        "source_type": "mechanism",
        "description": "Reduced refined carbohydrate intake may reduce postprandial glucose excursions."
      },
      {
        "source_type": "individual_context",
        "description": "Current lifestyle pattern includes high refined carbohydrate intake."
      }
    ]
  },
  "safety_gate": {
    "passed": true,
    "red_flags": [],
    "notes": [
      "not_medical_advice",
      "human_review_required_in_clinical_context"
    ]
  },
  "feedback_plan": {
    "timepoint_weeks": 8,
    "metrics": [
      "fasting_glucose",
      "hba1c",
      "weight",
      "waist_circumference",
      "symptom_score"
    ]
  }
}
This schema is not the point.
The point is that the reasoning becomes structured.
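Structure can be checked before a record is stored or shown to a reviewer. The sketch below verifies only that the six top-level sections used in the record above are present. `missing_sections` is a name introduced here, and a production system would more likely use a full schema validator.

```python
import json
from typing import List

# Top-level sections of the transition record shown above.
REQUIRED_SECTIONS = (
    "state",
    "action",
    "transition_hypothesis",
    "evidence_chain",
    "safety_gate",
    "feedback_plan",
)


def missing_sections(record_json: str) -> List[str]:
    """Return the required top-level sections absent from a JSON record."""
    record = json.loads(record_json)
    return [name for name in REQUIRED_SECTIONS if name not in record]


# A recommendation-style record that lacks evidence, safety, and feedback sections.
partial_record = json.dumps({
    "state": {"patient_id": "P001"},
    "action": {"action_id": "nutrition_low_glycemic_8w"},
    "transition_hypothesis": {"uncertainty_level": "moderate"},
})

print(missing_sections(partial_record))
```

A record that fails this check is closer to a bare recommendation than to a world-model output.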
13. Developer principles
Principle 1: Do not start with a chatbot
Avoid starting with:
answer = llm.chat(user_question)
Start with objects:
state_schema = define_state_schema()
action_schema = define_action_schema()
transition_schema = define_transition_schema()
evidence_schema = define_evidence_schema()
feedback_schema = define_feedback_schema()
Principle 2: Do not frame transition as treatment-effect prediction
Avoid:
effect = predict_treatment_effect(state, action)
Prefer:
transition = estimate_transition_hypothesis(state, action, evidence)
Principle 3: A mechanism graph is not the full world model
mechanism_graph = build_mechanism_graph(omics_data)
This is useful, but not enough.
You still need:
action = define_intervention_action()
transition = estimate_transition_hypothesis(state, action, evidence)
feedback = collect_follow_up_feedback()
Principle 4: Evidence must be a first-class object
Avoid:
recommendation = generate_recommendation(state)
Prefer:
output = {
    "state": state,
    "action": action,
    "transition_hypothesis": transition,
    "evidence_chain": evidence_chain,
    "safety_gate": safety_gate,
    "feedback_plan": feedback_plan
}
Principle 5: Human-in-the-loop is core
A medical world model should not be designed as an automatic treatment system.
decision = human_expert_review(model_output)
This should be part of the architecture, not an afterthought.
Principle 6: No feedback, no strong world model
If the system cannot update:
updated_state = update_state(previous_state, action, transition, feedback)
it is closer to a one-shot recommendation system than a medical world model.
14. A steerable medical world model
In this context, SteeraMed can be understood as a steerable biomedical world-model framework.
Its engineering focus is not automatic control of the human body.
It is the organization of these objects:
State
Action
Transition Hypothesis
Evidence Chain
Safety Gate
Human Review
Feedback
Audit Log
A simplified interface:
class SteerableMedicalWorldModel:
    def observe_state(self, patient_id):
        pass

    def generate_actions(self, state):
        pass

    def run_safety_gate(self, state, action):
        pass

    def build_evidence_chain(self, state, action):
        pass

    def estimate_transition(self, state, action, evidence):
        pass

    def request_human_review(self, candidates):
        pass

    def collect_feedback(self, selected_action):
        pass

    def update_model(self, state, action, feedback):
        pass

    def write_audit_log(self, record):
        pass
This is very different from a medical chatbot.
A chatbot can generate a plausible answer.
A medical world model should preserve the state, action, evidence, transition, feedback, and audit trail behind the answer.
15. Summary: mechanism layer vs action-simulation layer
Systems biology is essential.
It helps us understand biological networks, pathways, mechanisms, dynamics, and system-level regulation.
But from an engineering perspective, a systems biology model is usually not yet a complete medical world model.
A medical world model connects:
individual state
intervention action
mechanism-informed evidence
transition hypothesis
safety gate
human review
longitudinal feedback
audit log
The relationship can be summarized as:
Systems biology:
mechanism understanding
Medical world model:
mechanism-informed action simulation
Steerable medical world model:
goal-directed, evidence-bounded, feedback-calibrated intervention reasoning
So the distinction is not about replacing systems biology.
It is about extending mechanism understanding into action-conditioned, evidence-bounded, feedback-calibrated reasoning.
That is why medical AI still needs medical world models, even in a field where systems biology is already powerful.
References
- Kitano, H. Systems Biology: A Brief Overview. Science, 2002. https://doi.org/10.1126/science.1069492
- Kitano, H. Computational systems biology. Nature, 2002. https://doi.org/10.1038/nature01254
- Ideker, T., Galitski, T., & Hood, L. A new approach to decoding life: systems biology. Annual Review of Genomics and Human Genetics, 2001. https://doi.org/10.1146/annurev.genom.2.1.343
- Barabási, A.-L., Gulbahce, N., & Loscalzo, J. Network medicine: a network-based approach to human disease. Nature Reviews Genetics, 2011. https://doi.org/10.1038/nrg2918
- Noble, D. The Music of Life: Biology Beyond Genes. Oxford University Press, 2006.
- Ha, D., & Schmidhuber, J. Recurrent World Models Facilitate Policy Evolution. NeurIPS, 2018. https://arxiv.org/abs/1803.10122
- LeCun, Y. A Path Towards Autonomous Machine Intelligence. OpenReview, 2022. https://openreview.net/forum?id=BZ5a1r-kVsf
- Pearl, J., & Mackenzie, D. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
- Katsoulakis, E., Wang, Q., Wu, H., et al. Digital twins for health: a scoping review. npj Digital Medicine, 2024. https://doi.org/10.1038/s41746-024-01073-0
- Yang, Y., Wang, Z.-Y., Liu, Q., Sun, S., Wang, K., Chellappa, R., Zhou, Z., Yuille, A., Zhu, L., Zhang, Y.-D., & Chen, J. Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv:2506.02327, 2025. https://arxiv.org/abs/2506.02327
- Xiong, J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. https://doi.org/10.20944/preprints202605.0366.v1
- SteeraMed project: https://SteeraMed.com
- Steerable World project: https://steerable.world