DEV Community

JXIONG

What Is a World Model, and Why Is It More Than Prediction?

Most medical AI systems today are still designed as prediction systems.

A typical pipeline looks like this:

data = collect_patient_data()
features = extract_features(data)
risk = model.predict(features)
return risk

This can be useful.

A prediction model can answer questions such as:

  • What is the estimated risk of a disease?
  • Does an image contain an abnormal finding?
  • Which risk group does this person belong to?
  • What is the probability of a future clinical event?

But a world model asks a different question.

A prediction model asks:

future = predict(state)

A world model asks:

next_state = simulate_transition(state, action)

In other words:

Prediction asks: what may happen next?

A world model asks: what may happen if we take a specific action?

That difference matters a lot in medicine.

Medicine is not only about recognizing risk. It is also about deciding what to do under uncertainty.


The Minimal Structure of a World Model

A simplified world model can be described with five objects:

State       What is the system like now?
Action      What can be done?
Transition  How may the system change after the action?
Objective   What direction are we trying to move toward?
Feedback    What did we observe after the action?

A very small abstraction could look like this:

class WorldModel:
    def observe_state(self, system):
        pass

    def define_action(self, action_input):
        pass

    def simulate_transition(self, state, action):
        pass

    def collect_feedback(self, system, action):
        pass

    def update(self, state, action, feedback):
        pass

This is already different from a simple predictive model.

A predictive model can work without an explicit action object.

A world model cannot.

If there is no action, there is no action-conditioned transition.
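
To make the action dependence concrete, here is a minimal toy sketch. It is not a medical model: the single numeric state, the action names, the effect table, and the learning rate are all illustrative assumptions. It only shows that the same state yields different simulated next states under different actions, and that feedback can adjust the estimate.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyWorldModel:
    """Toy action-conditioned model over a single numeric state variable."""

    # Hypothetical per-action effect estimates (expected change in the state).
    effects: Dict[str, float] = field(default_factory=dict)
    learning_rate: float = 0.5

    def simulate_transition(self, state: float, action: str) -> float:
        # Unknown actions are assumed to leave the state unchanged.
        return state + self.effects.get(action, 0.0)

    def update(self, state: float, action: str, observed_next: float) -> None:
        # Move the effect estimate toward the change that was actually observed.
        observed_effect = observed_next - state
        current = self.effects.get(action, 0.0)
        self.effects[action] = current + self.learning_rate * (observed_effect - current)


model = ToyWorldModel(effects={"walk_daily": -0.2})
print(model.simulate_transition(5.6, "walk_daily"))  # action-conditioned estimate
print(model.simulate_transition(5.6, "no_change"))   # same state, different action
model.update(5.6, "walk_daily", observed_next=5.5)   # feedback adjusts the estimate
print(model.effects)
```

A plain predictor would have no `action` argument at all, so it could not express the difference between the two simulated calls.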


Why Medical AI Needs More Than Risk Prediction

Many medical AI systems are good at recognition and prediction:

  • image classification;
  • risk scoring;
  • disease detection;
  • prognosis estimation;
  • anomaly detection;
  • population stratification.

But many real medical and health-management problems are not just classification problems.

They are action problems.

For example:

  • Should an intervention be considered?
  • Which variable should be monitored first?
  • What evidence supports the expected change?
  • What feedback should be collected?
  • If the result is different from expected, what should be updated?

These are not just prediction questions. They require a system to reason about state, action, transition, evidence, and feedback.

That is where the idea of a medical world model becomes useful.


A Prediction Model vs. a Medical World Model

A prediction model may look like this:

class PredictionModel:
    def predict_risk(self, patient_data):
        features = self.extract_features(patient_data)
        risk_score = self.model.predict(features)
        return risk_score

It maps data to a risk score.

A medical world model needs a different structure:

class MedicalWorldModel:
    def simulate_transition(self, state, action, evidence):
        if not self.safety_gate.check(state, action):
            return {
                "status": "blocked",
                "reason": "Safety gate failed. Human review required."
            }

        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )

        return {
            "status": "hypothesis_generated",
            "state": state,
            "action": action,
            "expected_transition": transition,
            "evidence": evidence,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }

The output is not a treatment recommendation.

It is a transition hypothesis.

That distinction is critical.


State: Do Not Reduce a Person to a Risk Score

A risk score may be useful:

{
  "cardiovascular_risk": 0.23
}

But it is not enough for a world model.

A world model needs a richer representation of state.

Example schema:

{
  "subject_id": "anonymous_001",
  "timestamp": "2026-05-20",
  "metabolic_state": {
    "fasting_glucose": 5.6,
    "hba1c": 5.7,
    "fasting_insulin": 12.4,
    "triglycerides": 1.8
  },
  "inflammation_state": {
    "hs_crp": 2.1
  },
  "lifestyle_state": {
    "sleep_duration": 6.2,
    "weekly_exercise_minutes": 90,
    "diet_pattern": "high_refined_carbohydrate"
  },
  "risk_context": {
    "family_history": ["type_2_diabetes"],
    "medications": [],
    "known_conditions": []
  }
}

This is only an illustrative schema, not a clinical standard.

In a real system, every state variable should have:

  • source;
  • unit;
  • timestamp;
  • measurement context;
  • missing-value handling;
  • data-quality metadata;
  • uncertainty annotation.
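
One possible way to carry that metadata is to wrap each value in a small record. The sketch below is an assumption, not a clinical standard: the `Measurement` class, its field names, the quality labels, and the units shown (`mmol/L`, `mg/L`) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """One state variable with provenance; all field names are illustrative."""

    name: str
    value: Optional[float]               # None marks a missing value explicitly
    unit: str
    source: str                          # e.g. "lab_report", "wearable", "self_report"
    timestamp: str                       # ISO 8601 string
    context: str = "unspecified"         # e.g. "fasting", "post_exercise"
    quality: str = "unverified"          # data-quality label
    uncertainty: Optional[float] = None  # plus/minus range in the same unit

    def is_usable(self) -> bool:
        # Treat missing or unverified values as unusable for simulation.
        return self.value is not None and self.quality != "unverified"


glucose = Measurement(
    name="fasting_glucose", value=5.6, unit="mmol/L",  # unit is an assumption
    source="lab_report", timestamp="2026-05-20T08:00:00+08:00",
    context="fasting", quality="verified", uncertainty=0.2,
)
missing_crp = Measurement(
    name="hs_crp", value=None, unit="mg/L",  # unit is an assumption
    source="lab_report", timestamp="2026-05-20T08:00:00+08:00",
)
print(glucose.is_usable(), missing_crp.is_usable())
```

Keeping the missing value as an explicit `None` rather than a default number avoids silently simulating on data that was never measured.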

A simple Python representation might look like this:

from dataclasses import dataclass
from typing import Dict, Any, List

@dataclass
class HealthState:
    subject_id: str
    timestamp: str
    biomarkers: Dict[str, Any]
    lifestyle: Dict[str, Any]
    symptoms: Dict[str, Any]
    medications: List[str]
    context: Dict[str, Any]
    data_quality: Dict[str, Any]

For a medical world model, state representation is not a preprocessing detail.

It is the foundation.


Action: Interventions Must Become Computable Objects

In a chatbot, an intervention might appear as natural language:

Improve sleep, exercise more, and eat better.

That is not enough for a world model.

A world model needs computable action objects.

Example:

{
  "action_id": "increase_zone2_exercise",
  "type": "lifestyle",
  "target": "weekly_exercise_minutes",
  "change": {
    "from": 90,
    "to": 150
  },
  "duration": "12_weeks",
  "monitoring": [
    "resting_heart_rate",
    "sleep_quality",
    "fasting_glucose"
  ],
  "safety_notes": [
    "requires clinician review if known cardiovascular disease exists"
  ]
}

Python representation:

@dataclass
class MedicalAction:
    action_id: str
    action_type: str
    target: str
    parameters: Dict[str, Any]
    duration: str
    monitoring: List[str]
    safety_notes: List[str]

In medicine, an action could be:

  • a medication change;
  • a lifestyle intervention;
  • a monitoring plan;
  • a nutrition strategy;
  • a sleep intervention;
  • an exercise protocol;
  • a follow-up test;
  • a referral;
  • a decision to wait and observe.

The key is that actions should be explicit, parameterized, time-bounded, and auditable.
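
As a rough illustration of what "explicit, parameterized, time-bounded" could mean in code, the sketch below validates an action dictionary shaped like the JSON example above. The `validate_action` function, the required-key list, and the accepted duration format are assumptions introduced here.

```python
import re
from typing import Any, Dict, List

REQUIRED_KEYS = ("action_id", "type", "target", "change", "duration")


def validate_action(action: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the action looks well-formed."""
    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in action]

    # Parameterized: the change must state where it starts and where it ends.
    change = action.get("change")
    if not isinstance(change, dict) or not {"from", "to"} <= change.keys():
        problems.append("change must define both 'from' and 'to'")

    # Time-bounded: accept durations such as "12_weeks" (assumed format).
    duration = str(action.get("duration", ""))
    if not re.fullmatch(r"\d+_(days|weeks|months)", duration):
        problems.append("duration must look like '<number>_<days|weeks|months>'")

    return problems


explicit_action = {
    "action_id": "increase_zone2_exercise",
    "type": "lifestyle",
    "target": "weekly_exercise_minutes",
    "change": {"from": 90, "to": 150},
    "duration": "12_weeks",
}
vague_action = {"action_id": "eat_better"}

print(validate_action(explicit_action))
print(validate_action(vague_action))
```

Returning the list of problems, rather than a bare boolean, keeps the rejection reason available for the audit trail.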


Evidence: Transitions Should Be Evidence-Bound

A medical world model should not freely hallucinate transitions.

Every transition hypothesis should be bound to evidence.

A minimal evidence object:

{
  "evidence_id": "evidence_001",
  "claim": "Increasing weekly aerobic exercise may improve insulin sensitivity in selected metabolic-risk populations.",
  "evidence_type": [
    "clinical_guideline",
    "peer_reviewed_study",
    "mechanistic_rationale"
  ],
  "strength": "moderate",
  "applicability": {
    "population_match": "partial",
    "condition_match": "partial",
    "uncertainty": "individual response may vary"
  },
  "limitations": [
    "not a personalized treatment prediction",
    "requires safety screening",
    "effect size depends on baseline state and adherence"
  ]
}

A simple evidence builder:

@dataclass
class EvidenceItem:
    evidence_id: str
    claim: str
    evidence_type: List[str]
    strength: str
    applicability: Dict[str, Any]
    limitations: List[str]

class EvidenceBuilder:
    def build(self, state: HealthState, action: MedicalAction):
        evidence_items = self.retrieve_relevant_evidence(state, action)
        filtered = self.filter_by_applicability(evidence_items, state)
        return filtered

    def retrieve_relevant_evidence(self, state, action):
        # In production, this should query curated knowledge bases,
        # clinical guidelines, systematic reviews, or trusted literature indexes.
        return []

    def filter_by_applicability(self, evidence_items, state):
        # Filter by population, context, condition, safety boundary,
        # measurement quality, and uncertainty.
        return evidence_items

A useful rule:

Do not generate advice. Generate evidence-bound transition hypotheses.
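
A minimal sketch of how that rule might be enforced on evidence dictionaries shaped like the example above. The label sets are assumptions: the source only shows `"partial"` and `"moderate"`, so `"full"`, `"none"`, `"high"`, and `"low"` are hypothetical values added for illustration.

```python
from typing import Any, Dict, List

# Assumed label vocabularies; only "partial" and "moderate" appear in the example.
STRENGTH_RANK = {"high": 3, "moderate": 2, "low": 1}
ACCEPTED_MATCHES = {"full", "partial"}


def filter_by_applicability(evidence_items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep evidence whose population and condition match, strongest first."""
    kept = [
        item for item in evidence_items
        if item.get("applicability", {}).get("population_match") in ACCEPTED_MATCHES
        and item.get("applicability", {}).get("condition_match") in ACCEPTED_MATCHES
    ]
    return sorted(
        kept,
        key=lambda item: STRENGTH_RANK.get(item.get("strength"), 0),
        reverse=True,
    )


def can_generate_hypothesis(evidence_items: List[Dict[str, Any]]) -> bool:
    # Evidence-bound rule: no applicable evidence, no transition hypothesis.
    return len(filter_by_applicability(evidence_items)) > 0


items = [
    {"evidence_id": "evidence_001", "strength": "moderate",
     "applicability": {"population_match": "partial", "condition_match": "partial"}},
    {"evidence_id": "evidence_002", "strength": "high",
     "applicability": {"population_match": "none", "condition_match": "full"}},
]
print([item["evidence_id"] for item in filter_by_applicability(items)])
print(can_generate_hypothesis([]))
```

Here the stronger item is dropped because its population does not match, which reflects the idea that applicability should be checked before strength.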


Transition: Hypothesis, Not Promise

In a medical world model, a transition should not be framed as a promise.

Bad framing:

This intervention will improve the outcome.

Better framing:

Given the current state and evidence constraints, this action may produce the following state changes, with uncertainty.

A transition object:

@dataclass
class TransitionHypothesis:
    from_state: HealthState
    action: MedicalAction
    expected_changes: Dict[str, Any]
    time_window: str
    evidence: List[EvidenceItem]
    uncertainty: Dict[str, Any]
    safety_flags: List[str]

Example output:

{
  "expected_changes": {
    "weekly_exercise_minutes": {
      "direction": "increase",
      "expected_from": 90,
      "expected_to": 150
    },
    "insulin_sensitivity": {
      "direction": "potential_improvement",
      "confidence": "low_to_moderate"
    },
    "fasting_glucose": {
      "direction": "possible_decrease",
      "confidence": "uncertain"
    }
  },
  "time_window": "8_to_12_weeks",
  "uncertainty": {
    "adherence": "unknown",
    "baseline_variability": "high",
    "measurement_noise": "moderate"
  },
  "safety_flags": [
    "screen cardiovascular risk before increasing exercise intensity"
  ]
}

Implementation sketch:

class TransitionModel:
    def estimate(self, state, action, evidence):
        expected_changes = self.estimate_expected_changes(
            state=state,
            action=action,
            evidence=evidence
        )

        uncertainty = self.estimate_uncertainty(
            state=state,
            action=action,
            evidence=evidence
        )

        safety_flags = self.check_safety_flags(state, action)

        return TransitionHypothesis(
            from_state=state,
            action=action,
            expected_changes=expected_changes,
            time_window=action.parameters.get("time_window", action.duration),
            evidence=evidence,
            uncertainty=uncertainty,
            safety_flags=safety_flags
        )

The name matters.

Use TransitionHypothesis, not TreatmentPlan.


Safety Gate: Medical World Models Must Be Safety-First

A medical world model should not simulate all actions freely.

Before transition simulation, there should be a safety gate.

class SafetyGate:
    def check(self, state: HealthState, action: MedicalAction):
        checks = [
            self.check_contraindications(state, action),
            self.check_required_human_review(state, action),
            self.check_action_intensity(state, action),
            self.check_data_quality(state),
        ]

        return all(checks)

    def check_contraindications(self, state, action):
        # Placeholder only.
        # Production systems require curated medical rules and human review.
        return True

    def check_required_human_review(self, state, action):
        # Some actions should never be autonomous.
        return True

    def check_action_intensity(self, state, action):
        return True

    def check_data_quality(self, state):
        return True

Then:

# Assumes module-level safety_gate, evidence_builder, and transition_model instances.
def simulate_medical_transition(state, action):
    if not safety_gate.check(state, action):
        return {
            "status": "blocked",
            "reason": "Safety gate failed. Human review required."
        }

    evidence = evidence_builder.build(state, action)
    transition = transition_model.estimate(state, action, evidence)

    return transition

A useful design principle:

medical_world_model = safety_first + evidence_bound + feedback_calibrated
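
A safety gate can also report why it blocked an action, not only that it did. The sketch below is one hypothetical shape for that: the `GateResult` type, the `check_safety` function, and all three rules are placeholders invented for illustration. They are not medical rules, and a real system would need curated rules and human review.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GateResult:
    passed: bool
    reasons: List[str] = field(default_factory=list)


def check_safety(state: Dict[str, Any], action: Dict[str, Any]) -> GateResult:
    """Fail closed: any triggered rule blocks simulation and records why."""
    reasons: List[str] = []

    # Hypothetical rule: medication changes always need human review.
    if action.get("type") == "medication":
        reasons.append("medication changes require clinician review")

    # Hypothetical rule: known cardiovascular disease blocks exercise increases.
    conditions = state.get("known_conditions", [])
    if (action.get("target") == "weekly_exercise_minutes"
            and "cardiovascular_disease" in conditions):
        reasons.append("known cardiovascular disease: clinician review required")

    # Hypothetical rule: do not simulate on top of unverified data.
    if state.get("data_quality") != "verified":
        reasons.append("state data quality is not verified")

    return GateResult(passed=not reasons, reasons=reasons)


exercise = {"type": "lifestyle", "target": "weekly_exercise_minutes"}
clear = check_safety({"known_conditions": [], "data_quality": "verified"}, exercise)
blocked = check_safety(
    {"known_conditions": ["cardiovascular_disease"], "data_quality": "unverified"},
    exercise,
)
print(clear)
print(blocked)
```

Note that a state with no `data_quality` entry is blocked by default, so missing information never passes the gate silently.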

Feedback: Without Feedback, It Is Not a World Model

A world model should not be a one-shot answer generator.

It should support a loop:

Observe state → Define action → Simulate transition → Collect feedback → Update model

Pseudo-code:

def world_model_loop(subject, action):
    state_t0 = observe_state(subject)

    evidence = build_evidence_chain(state_t0, action)

    transition_hypothesis = simulate_transition(
        state=state_t0,
        action=action,
        evidence=evidence
    )

    feedback = collect_feedback(
        subject=subject,
        action=action,
        time_window=transition_hypothesis.time_window
    )

    updated_state = update_model(
        previous_state=state_t0,
        action=action,
        hypothesis=transition_hypothesis,
        feedback=feedback
    )

    return updated_state

Feedback can include:

  • repeated biomarkers;
  • symptoms;
  • wearable trends;
  • adherence records;
  • adverse events;
  • clinician review;
  • patient-reported outcomes;
  • environmental or lifestyle changes.

Feedback object:

{
  "feedback_id": "feedback_001",
  "action_id": "increase_zone2_exercise",
  "time_window": "12_weeks",
  "observations": {
    "weekly_exercise_minutes": 145,
    "fasting_glucose": 5.4,
    "sleep_duration": 6.5,
    "subjective_energy": "improved"
  },
  "adherence": "partial",
  "adverse_events": [],
  "notes": "Interpret carefully; multiple concurrent changes existed."
}

Update logic:

class FeedbackUpdater:
    def update(self, previous_state, action, hypothesis, feedback):
        comparison = self.compare_expected_vs_observed(
            expected=hypothesis.expected_changes,
            observed=feedback["observations"]
        )

        return {
            "previous_state": previous_state,
            "action": action,
            "hypothesis": hypothesis,
            "feedback": feedback,
            "comparison": comparison,
            "update_reason": self.infer_update_reason(comparison)
        }

Without feedback, the system cannot calibrate.
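
As a rough sketch of the comparison step, the function below checks whether each observed change moved in the hypothesized direction. The function signature, the `baseline` argument, the direction-to-sign mapping, and the result labels are assumptions. Labels that are not mapped, such as `potential_improvement`, are treated as not comparable rather than guessed.

```python
from typing import Any, Dict

# Assumed mapping from direction labels to the sign of the expected change.
DIRECTION_SIGN = {"increase": 1, "decrease": -1, "possible_decrease": -1}


def compare_expected_vs_observed(
    expected: Dict[str, Dict[str, Any]],
    baseline: Dict[str, float],
    observed: Dict[str, Any],
) -> Dict[str, str]:
    """Label each expected change as consistent, inconsistent, or not_comparable."""
    comparison = {}
    for variable, change in expected.items():
        sign = DIRECTION_SIGN.get(change.get("direction"))
        if sign is None or variable not in observed or variable not in baseline:
            comparison[variable] = "not_comparable"
            continue
        delta = observed[variable] - baseline[variable]
        comparison[variable] = "consistent" if delta * sign > 0 else "inconsistent"
    return comparison


expected = {
    "weekly_exercise_minutes": {"direction": "increase"},
    "insulin_sensitivity": {"direction": "potential_improvement"},
    "fasting_glucose": {"direction": "possible_decrease"},
}
baseline = {"weekly_exercise_minutes": 90, "fasting_glucose": 5.6}
observed = {"weekly_exercise_minutes": 145, "fasting_glucose": 5.4}

print(compare_expected_vs_observed(expected, baseline, observed))
```

A "consistent" label only means the direction matched. It does not show that the action caused the change, especially when several things changed at once, as the feedback note above warns.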


Audit Log: Every Transition Should Be Traceable

For medical AI, auditability is not optional.

A transition should have an audit trail:

{
  "audit_id": "audit_001",
  "timestamp": "2026-05-20T23:00:00+08:00",
  "state_version": "state_v1",
  "action_version": "action_v1",
  "evidence_version": "evidence_v1",
  "transition_version": "transition_v1",
  "model_version": "world_model_v0.1",
  "human_review": {
    "required": true,
    "status": "pending"
  },
  "disclaimer": "Hypothesis-generating only. Not a treatment recommendation."
}

Audit logger:

class AuditLogger:
    def log_transition(self, state, action, evidence, transition):
        return {
            "state": state,
            "action": action,
            "evidence": evidence,
            "transition": transition,
            "model_version": self.get_model_version(),
            "human_review_required": True,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }

For medical world models, audit logs are part of the core architecture.
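
One way to make a log entry checkable after the fact is to store a content hash next to the payload. The sketch below assumes the payload is JSON-serializable. The function names and record fields are hypothetical, and a production system would also need access control and append-only storage, which are out of scope here.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict


def _digest(payload: Dict[str, Any]) -> str:
    # Canonical JSON (sorted keys, fixed separators) gives a stable hash input.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_record(payload: Dict[str, Any], model_version: str) -> Dict[str, Any]:
    """Wrap a JSON-serializable payload with a UTC timestamp and a content hash."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "payload": payload,
        "payload_sha256": _digest(payload),
    }


def verify_audit_record(record: Dict[str, Any]) -> bool:
    # Recompute the hash; a mismatch means the payload changed after logging.
    return _digest(record["payload"]) == record["payload_sha256"]


record = build_audit_record(
    {"action_id": "increase_zone2_exercise", "status": "hypothesis_generated"},
    model_version="world_model_v0.1",
)
print(verify_audit_record(record))
```

The hash does not prevent tampering by itself, but it lets a reviewer detect that a stored transition no longer matches what was originally logged.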


A Minimal Medical World Model

Putting the pieces together:

class MinimalMedicalWorldModel:
    def __init__(
        self,
        safety_gate,
        evidence_builder,
        transition_model,
        feedback_updater,
        audit_logger
    ):
        self.safety_gate = safety_gate
        self.evidence_builder = evidence_builder
        self.transition_model = transition_model
        self.feedback_updater = feedback_updater
        self.audit_logger = audit_logger

    def run(self, state, action):
        if not self.safety_gate.check(state, action):
            return {
                "status": "blocked",
                "reason": "Safety gate failed. Human review required."
            }

        evidence = self.evidence_builder.build(state, action)

        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )

        audit_log = self.audit_logger.log_transition(
            state=state,
            action=action,
            evidence=evidence,
            transition=transition
        )

        return {
            "status": "hypothesis_generated",
            "transition": transition,
            "audit_log": audit_log
        }

    def update_with_feedback(self, previous_state, action, transition, feedback):
        return self.feedback_updater.update(
            previous_state=previous_state,
            action=action,
            hypothesis=transition,
            feedback=feedback
        )

Usage:

state = observe_state(subject)
action = define_action(intervention)

result = medical_world_model.run(state, action)

if result["status"] == "hypothesis_generated":
    feedback = collect_feedback(subject, action)

    updated = medical_world_model.update_with_feedback(
        previous_state=state,
        action=action,
        transition=result["transition"],
        feedback=feedback
    )

The loop is:

State → Action → Evidence → Transition → Audit → Feedback → Update

Where SteeraMed Fits

SteeraMed can be understood as a steerable biomedical world model framework.

In developer terms, it is closer to:

state-action-transition-evidence-feedback architecture

than to:

a chatbot that gives medical advice

The goal is not to automate treatment decisions.

The goal is to make biomedical AI systems more:

  • state-aware;
  • action-explicit;
  • evidence-bound;
  • feedback-calibrated;
  • safety-gated;
  • auditable;
  • human-reviewable.

This is especially relevant for long-term health and longevity medicine, where the problem is not a single prediction task but longitudinal state management under uncertainty.


Developer Takeaways

1. Do not start with a chatbot

A chatbot interface may be useful later, but it should not be the core architecture.

Start with state representation.

state = observe_state(subject)

2. Do not stop at risk prediction

Risk prediction is useful, but it is not a world model.

risk = predict(state)

is not enough.

3. Make actions explicit

Without explicit actions, there is no action-conditioned transition.

next_state = simulate_transition(state, action)

4. Treat transitions as hypotheses

A medical transition is not a promise.

transition = generate_hypothesis(state, action, evidence)

5. Build evidence chains

A transition without evidence is not acceptable in medical AI.

evidence = build_evidence_chain(state, action)

6. Add safety gates before simulation

Do not simulate unsafe or unsupported actions.

if not safety_gate.check(state, action):
    require_human_review()

7. Close the loop with feedback

Without feedback, the model cannot calibrate.

model.update(state, action, feedback)

8. Log everything

Every state, action, evidence item, transition, feedback signal, and model version should be traceable.

audit_logger.log(state, action, evidence, transition, feedback)

Final Thought

A medical world model is not a larger prediction model.

It is a system architecture for reasoning about:

state + action + evidence + transition + feedback

under uncertainty.

For developers, the key shift is this:

Do not just build systems that predict what may happen.

Build systems that can represent state, encode actions, simulate evidence-bound transitions, collect feedback, and remain auditable.

That is the difference between prediction and a world model.

References and Project Links

  1. Ha, D., & Schmidhuber, J. Recurrent World Models Facilitate Policy Evolution. Advances in Neural Information Processing Systems 31, 2018. https://papers.nips.cc/paper/7512-recurrent-world-models-facilitate-policy-evolution; arXiv version: https://arxiv.org/abs/1803.10122
  2. LeCun, Y. A Path Towards Autonomous Machine Intelligence. OpenReview, 2022. https://openreview.net/forum?id=BZ5a1r-kVsf
  3. Yang, Y., Wang, Z.-Y., Liu, Q., Sun, S., Wang, K., Chellappa, R., Zhou, Z., Yuille, A., Zhu, L., Zhang, Y.-D., & Chen, J. Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv:2506.02327, 2025. https://arxiv.org/abs/2506.02327; project page: https://yijun-yang.github.io/MeWM/
  4. Qazi, M. A., Nadeem, M., & Yaqub, M. Beyond Generative AI: World Models for Clinical Prediction, Counterfactuals, and Planning. arXiv:2511.16333, 2025. https://arxiv.org/abs/2511.16333
  5. Katsoulakis, E., Wang, Q., Wu, H., et al. Digital twins for health: a scoping review. npj Digital Medicine, 7, 77, 2024. https://doi.org/10.1038/s41746-024-01073-0
  6. Emmert-Streib, F., Parkkila, S., Laubenbacher, R., et al. The role of digital twins in P4 medicine: A paradigm for modern healthcare. npj Digital Medicine, 8, 735, 2025. https://doi.org/10.1038/s41746-025-02115-x
  7. Xiong, J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. https://doi.org/10.20944/preprints202605.0366.v1
  8. SteeraMed project: https://SteeraMed.com
  9. Steerable World project: https://steerable.world
