JXIONG

Posted on May 21

What Is a World Model, and Why Is It More Than Prediction?

#ai #worldmodel #steeramed #deepome

Many medical AI systems today are still designed as prediction systems.

A typical pipeline looks like this:

def run_prediction_pipeline(model):
    data = collect_patient_data()
    features = extract_features(data)
    risk = model.predict(features)
    return risk

This can be useful.

A prediction model can answer questions such as:

What is the estimated risk of a disease?
Does an image contain an abnormal finding?
Which risk group does this person belong to?
What is the probability of a future clinical event?

But a world model asks a different question.

A prediction model asks:

future = predict(state)

A world model asks:

next_state = simulate_transition(state, action)

In other words:

Prediction asks: what may happen next?

A world model asks: what may happen if we take a specific action?
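The two signatures can be contrasted with a toy sketch. ToyState, the extra_minutes parameter, and the linear update rule are invented for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyState:
    weekly_exercise_minutes: int
    risk_score: float


def predict(state: ToyState) -> float:
    """Prediction: maps the current state to an estimate. No action is involved."""
    return state.risk_score


def simulate_transition(state: ToyState, extra_minutes: int) -> ToyState:
    """World-model style: the next state depends on the state AND the action."""
    new_minutes = max(0, state.weekly_exercise_minutes + extra_minutes)
    # Hypothetical assumption: each extra 60 minutes lowers the toy score by 0.01.
    new_risk = max(0.0, state.risk_score - 0.01 * (extra_minutes / 60))
    return ToyState(new_minutes, round(new_risk, 4))


state = ToyState(weekly_exercise_minutes=90, risk_score=0.23)
unchanged = simulate_transition(state, 0)
more_exercise = simulate_transition(state, 60)
print(predict(state), unchanged, more_exercise)
```

The same state yields different next states for different actions. A function of the state alone cannot express that.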

That difference matters a lot in medicine.

Medicine is not only about recognizing risk. It is also about deciding what to do under uncertainty.

The Minimal Structure of a World Model

A simplified world model can be described with five objects:

State       What is the system like now?
Action      What can be done?
Transition  How may the system change after the action?
Objective   What direction are we trying to move toward?
Feedback    What did we observe after the action?

A very small abstraction could look like this:

class WorldModel:
    def observe_state(self, system):
        pass

    def define_action(self, action_input):
        pass

    def simulate_transition(self, state, action):
        pass

    def collect_feedback(self, system, action):
        pass

    def update(self, state, action, feedback):
        pass

This is already different from a simple predictive model.

A predictive model can work without an explicit action object.

A world model cannot.

If there is no action, there is no action-conditioned transition.
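The class above has no method for the Objective object. A generic, non-medical toy shows how an objective could score action-conditioned transitions. The single "level" variable, the target of 10.0, and all function names are assumptions made for this sketch.

```python
from typing import Dict, List

State = Dict[str, float]


def simulate_transition(state: State, action: float) -> State:
    # Toy dynamics (assumption): the action shifts one variable additively.
    return {"level": state["level"] + action}


def objective(state: State, target: float = 10.0) -> float:
    # Lower is better: distance from a hypothetical target level.
    return abs(state["level"] - target)


def rank_actions(state: State, actions: List[float]) -> List[float]:
    """Order candidate actions by the objective score of their simulated outcome."""
    return sorted(actions, key=lambda a: objective(simulate_transition(state, a)))


ranked = rank_actions({"level": 7.0}, [0.0, 1.0, 3.0, 5.0])
print(ranked)
```

Ranking is only possible because each candidate action produces its own simulated state. In a medical setting the ranked output would still be a set of hypotheses for human review, not a decision.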

Why Medical AI Needs More Than Risk Prediction

Many medical AI systems are good at recognition and prediction:

image classification;
risk scoring;
disease detection;
prognosis estimation;
anomaly detection;
population stratification.

But many real medical and health-management problems are not just classification problems.

They are action problems.

For example:

Should an intervention be considered?
Which variable should be monitored first?
What evidence supports the expected change?
What feedback should be collected?
If the result is different from expected, what should be updated?

These are not just prediction questions. They require a system to reason about state, action, transition, evidence, and feedback.

That is where the idea of a medical world model becomes useful.

A Prediction Model vs. a Medical World Model

A prediction model may look like this:

class PredictionModel:
    def predict_risk(self, patient_data):
        features = self.extract_features(patient_data)
        risk_score = self.model.predict(features)
        return risk_score

It maps data to a risk score.

A medical world model needs a different structure:

class MedicalWorldModel:
    def simulate_transition(self, state, action, evidence):
        if not self.safety_gate.check(state, action):
            return {
                "status": "blocked",
                "reason": "Safety gate failed. Human review required."
            }

        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )

        return {
            "status": "hypothesis_generated",
            "state": state,
            "action": action,
            "expected_transition": transition,
            "evidence": evidence,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }

The output is not a treatment recommendation.

It is a transition hypothesis.

That distinction is critical.

State: Do Not Reduce a Person to a Risk Score

A risk score may be useful:

{
  "cardiovascular_risk": 0.23
}

But it is not enough for a world model.

A world model needs a richer representation of state.

Example schema:

{
  "subject_id": "anonymous_001",
  "timestamp": "2026-05-20",
  "metabolic_state": {
    "fasting_glucose": 5.6,
    "hba1c": 5.7,
    "fasting_insulin": 12.4,
    "triglycerides": 1.8
  },
  "inflammation_state": {
    "hs_crp": 2.1
  },
  "lifestyle_state": {
    "sleep_duration": 6.2,
    "weekly_exercise_minutes": 90,
    "diet_pattern": "high_refined_carbohydrate"
  },
  "risk_context": {
    "family_history": ["type_2_diabetes"],
    "medications": [],
    "known_conditions": []
  }
}

This is only an illustrative schema, not a clinical standard.

In a real system, every state variable should have:

source;
unit;
timestamp;
measurement context;
missing-value handling;
data-quality metadata;
uncertainty annotation.
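One way to carry this metadata is to wrap each value in its own record. This is a minimal sketch: the Measurement class, its field names, the quality labels, and the units in the example are assumptions, not part of any clinical standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    name: str
    value: Optional[float]               # None marks a missing value explicitly
    unit: str
    source: str                          # e.g. "lab_report", "wearable", "self_report"
    timestamp: str                       # ISO 8601 string
    context: str = "unspecified"         # e.g. "fasting", "post_exercise"
    quality: str = "unverified"          # hypothetical data-quality label
    uncertainty: Optional[float] = None  # +/- in the same unit, if known

    def is_usable(self) -> bool:
        """Usable only if a value is present and it was not rejected."""
        return self.value is not None and self.quality != "rejected"


glucose = Measurement(
    name="fasting_glucose", value=5.6, unit="mmol/L",  # unit is an assumption
    source="lab_report", timestamp="2026-05-20T08:00:00+08:00",
    context="fasting", quality="verified", uncertainty=0.2,
)
missing_insulin = Measurement(
    name="fasting_insulin", value=None, unit="unknown",
    source="lab_report", timestamp="2026-05-20T08:00:00+08:00",
)
print(glucose.is_usable(), missing_insulin.is_usable())
```

Keeping the missing value as None, rather than dropping the field, lets later steps see that the measurement was expected but absent.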

A simple Python representation might look like this:

from dataclasses import dataclass
from typing import Dict, Any, List

@dataclass
class HealthState:
    subject_id: str
    timestamp: str
    biomarkers: Dict[str, Any]
    lifestyle: Dict[str, Any]
    symptoms: Dict[str, Any]
    medications: List[str]
    context: Dict[str, Any]
    data_quality: Dict[str, Any]

For a medical world model, state representation is not a preprocessing detail.

It is the foundation.

Action: Interventions Must Become Computable Objects

In a chatbot, an intervention might appear as natural language:

Improve sleep, exercise more, and eat better.

That is not enough for a world model.

A world model needs computable action objects.

Example:

{
  "action_id": "increase_zone2_exercise",
  "type": "lifestyle",
  "target": "weekly_exercise_minutes",
  "change": {
    "from": 90,
    "to": 150
  },
  "duration": "12_weeks",
  "monitoring": [
    "resting_heart_rate",
    "sleep_quality",
    "fasting_glucose"
  ],
  "safety_notes": [
    "requires clinician review if known cardiovascular disease exists"
  ]
}

Python representation:

@dataclass
class MedicalAction:
    action_id: str
    action_type: str
    target: str
    parameters: Dict[str, Any]
    duration: str
    monitoring: List[str]
    safety_notes: List[str]

In medicine, an action could be:

a medication change;
a lifestyle intervention;
a monitoring plan;
a nutrition strategy;
a sleep intervention;
an exercise protocol;
a follow-up test;
a referral;
a decision to wait and observe.

The key is that actions should be explicit, parameterized, time-bounded, and auditable.
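A small loader can enforce part of this by rejecting actions that are not fully specified. The sketch below maps the JSON example onto the MedicalAction dataclass. The REQUIRED_KEYS tuple and the action_from_dict helper are assumptions for illustration, including the choice to store "change" in parameters.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class MedicalAction:
    action_id: str
    action_type: str
    target: str
    parameters: Dict[str, Any]
    duration: str
    monitoring: List[str]
    safety_notes: List[str]


# Hypothetical minimum for an action to count as explicit and time-bounded.
REQUIRED_KEYS = ("action_id", "type", "target", "change", "duration")


def action_from_dict(raw: Dict[str, Any]) -> MedicalAction:
    """Build a MedicalAction, refusing underspecified input."""
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise ValueError(f"Action is missing required fields: {missing}")
    return MedicalAction(
        action_id=raw["action_id"],
        action_type=raw["type"],
        target=raw["target"],
        parameters=dict(raw["change"]),
        duration=raw["duration"],
        monitoring=list(raw.get("monitoring", [])),
        safety_notes=list(raw.get("safety_notes", [])),
    )


action = action_from_dict({
    "action_id": "increase_zone2_exercise",
    "type": "lifestyle",
    "target": "weekly_exercise_minutes",
    "change": {"from": 90, "to": 150},
    "duration": "12_weeks",
    "monitoring": ["resting_heart_rate", "sleep_quality", "fasting_glucose"],
})
print(action)
```

A vague instruction such as "exercise more" has no target, change, or duration, so it fails at this boundary instead of reaching the transition model.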

Evidence: Transitions Should Be Evidence-Bound

A medical world model should not freely hallucinate transitions.

Every transition hypothesis should be bound to evidence.

A minimal evidence object:

{
  "evidence_id": "evidence_001",
  "claim": "Increasing weekly aerobic exercise may improve insulin sensitivity in selected metabolic-risk populations.",
  "evidence_type": [
    "clinical_guideline",
    "peer_reviewed_study",
    "mechanistic_rationale"
  ],
  "strength": "moderate",
  "applicability": {
    "population_match": "partial",
    "condition_match": "partial",
    "uncertainty": "individual response may vary"
  },
  "limitations": [
    "not a personalized treatment prediction",
    "requires safety screening",
    "effect size depends on baseline state and adherence"
  ]
}

A simple evidence builder:

@dataclass
class EvidenceItem:
    evidence_id: str
    claim: str
    evidence_type: List[str]
    strength: str
    applicability: Dict[str, Any]
    limitations: List[str]

class EvidenceBuilder:
    def build(self, state: HealthState, action: MedicalAction):
        evidence_items = self.retrieve_relevant_evidence(state, action)
        filtered = self.filter_by_applicability(evidence_items, state)
        return filtered

    def retrieve_relevant_evidence(self, state, action):
        # In production, this should query curated knowledge bases,
        # clinical guidelines, systematic reviews, or trusted literature indexes.
        return []

    def filter_by_applicability(self, evidence_items, state):
        # Filter by population, context, condition, safety boundary,
        # measurement quality, and uncertainty.
        return evidence_items

A useful rule:

Do not generate advice. Generate evidence-bound transition hypotheses.
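The rule can be made mechanical: if no evidence survives filtering, no hypothesis is produced. In this sketch the strength scale ("weak", "moderate", "strong"), the accepted population_match values, and the generate_or_refuse helper are assumptions. A real system would need a curated grading scheme.

```python
from typing import Any, Dict, List

# Hypothetical ordering of evidence strength labels.
STRENGTH_ORDER = {"weak": 0, "moderate": 1, "strong": 2}
ACCEPTED_POPULATION_MATCH = {"partial", "full"}


def filter_by_applicability(
    evidence_items: List[Dict[str, Any]], min_strength: str = "moderate"
) -> List[Dict[str, Any]]:
    """Keep items that are strong enough and match the population at least partially."""
    threshold = STRENGTH_ORDER[min_strength]
    kept = []
    for item in evidence_items:
        strength = STRENGTH_ORDER.get(item.get("strength", ""), -1)
        match = item.get("applicability", {}).get("population_match", "none")
        if strength >= threshold and match in ACCEPTED_POPULATION_MATCH:
            kept.append(item)
    return kept


def generate_or_refuse(evidence_items: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Allow a hypothesis only when applicable evidence exists."""
    usable = filter_by_applicability(evidence_items)
    if not usable:
        return {"status": "no_hypothesis", "reason": "No applicable evidence found."}
    return {
        "status": "hypothesis_allowed",
        "evidence_ids": [item["evidence_id"] for item in usable],
    }


items = [
    {"evidence_id": "evidence_001", "strength": "moderate",
     "applicability": {"population_match": "partial"}},
    {"evidence_id": "evidence_002", "strength": "weak",
     "applicability": {"population_match": "full"}},
    {"evidence_id": "evidence_003", "strength": "strong",
     "applicability": {"population_match": "none"}},
]
print(generate_or_refuse(items))
print(generate_or_refuse([]))
```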

Transition: Hypothesis, Not Promise

In a medical world model, a transition should not be framed as a promise.

Bad framing:

This intervention will improve the outcome.

Better framing:

Given the current state and evidence constraints, this action may produce the following state changes, with uncertainty.

A transition object:

@dataclass
class TransitionHypothesis:
    from_state: HealthState
    action: MedicalAction
    expected_changes: Dict[str, Any]
    time_window: str
    evidence: List[EvidenceItem]
    uncertainty: Dict[str, Any]
    safety_flags: List[str]

Example output:

{
  "expected_changes": {
    "weekly_exercise_minutes": {
      "direction": "increase",
      "expected_from": 90,
      "expected_to": 150
    },
    "insulin_sensitivity": {
      "direction": "potential_improvement",
      "confidence": "low_to_moderate"
    },
    "fasting_glucose": {
      "direction": "possible_decrease",
      "confidence": "uncertain"
    }
  },
  "time_window": "8_to_12_weeks",
  "uncertainty": {
    "adherence": "unknown",
    "baseline_variability": "high",
    "measurement_noise": "moderate"
  },
  "safety_flags": [
    "screen cardiovascular risk before increasing exercise intensity"
  ]
}
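When such an object is turned into text, the hedged framing should survive. The describe_changes helper below is a hypothetical sketch that renders each expected change with its stated confidence and never uses the word "will".

```python
from typing import Any, Dict, List


def describe_changes(
    expected_changes: Dict[str, Dict[str, Any]], time_window: str
) -> List[str]:
    """Render each expected change as a hedged, one-line hypothesis."""
    lines = []
    for variable, change in expected_changes.items():
        direction = str(change.get("direction", "unknown")).replace("_", " ")
        confidence = str(change.get("confidence", "not stated")).replace("_", " ")
        lines.append(
            f"{variable}: hypothesized direction '{direction}' "
            f"over {time_window} (confidence: {confidence})"
        )
    return lines


lines = describe_changes(
    {
        "insulin_sensitivity": {
            "direction": "potential_improvement",
            "confidence": "low_to_moderate",
        },
        "fasting_glucose": {
            "direction": "possible_decrease",
            "confidence": "uncertain",
        },
    },
    time_window="8_to_12_weeks",
)
for line in lines:
    print(line)
```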

Implementation sketch:

class TransitionModel:
    def estimate(self, state, action, evidence):
        expected_changes = self.estimate_expected_changes(
            state=state,
            action=action,
            evidence=evidence
        )

        uncertainty = self.estimate_uncertainty(
            state=state,
            action=action,
            evidence=evidence
        )

        safety_flags = self.check_safety_flags(state, action)

        return TransitionHypothesis(
            from_state=state,
            action=action,
            expected_changes=expected_changes,
            time_window=action.parameters.get("time_window", action.duration),
            evidence=evidence,
            uncertainty=uncertainty,
            safety_flags=safety_flags
        )

The name matters.

Use TransitionHypothesis, not TreatmentPlan.

Safety Gate: Medical World Models Must Be Safety-First

A medical world model should not simulate all actions freely.

Before transition simulation, there should be a safety gate.

class SafetyGate:
    def check(self, state: HealthState, action: MedicalAction):
        checks = [
            self.check_contraindications(state, action),
            self.check_required_human_review(state, action),
            self.check_action_intensity(state, action),
            self.check_data_quality(state),
        ]

        return all(checks)

    def check_contraindications(self, state, action):
        # Placeholder only.
        # Production systems require curated medical rules and human review.
        return True

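    def check_required_human_review(self, state, action):
        # Return False when the action must go to a human reviewer first.
        # Some actions should never be autonomous.
        # Placeholder only: always passes.
        return True

    def check_action_intensity(self, state, action):
        # Placeholder only: always passes.
        return True

    def check_data_quality(self, state):
        # Placeholder only: always passes.
        return True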

Then:

def simulate_medical_transition(state, action):
    if not safety_gate.check(state, action):
        return {
            "status": "blocked",
            "reason": "Safety gate failed. Human review required."
        }

    evidence = evidence_builder.build(state, action)
    transition = transition_model.estimate(state, action, evidence)

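    # Return a dict in both branches so callers can always check "status".
    return {
        "status": "hypothesis_generated",
        "transition": transition
    }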
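A boolean gate says that something failed, not what. A variant that returns reasons gives the human reviewer something to act on. The sketch below uses plain dicts, and both rules (required biomarkers present, medication changes and known conditions always reviewed) are hypothetical placeholders, not curated medical rules.

```python
from typing import Any, Dict, List


def check_data_quality(state: Dict[str, Any], required: List[str]) -> List[str]:
    """Return a list of problems. An empty list means the check passed."""
    biomarkers = state.get("biomarkers", {})
    return [
        f"missing required biomarker: {name}"
        for name in required
        if biomarkers.get(name) is None
    ]


def check_required_human_review(state: Dict[str, Any], action: Dict[str, Any]) -> List[str]:
    problems = []
    # Hypothetical rule: medication changes are never autonomous.
    if action.get("type") == "medication":
        problems.append("medication changes require human review")
    # Hypothetical rule: any known condition routes the action to a clinician.
    if state.get("known_conditions"):
        problems.append("known conditions present; clinician review required")
    return problems


def safety_gate(
    state: Dict[str, Any], action: Dict[str, Any], required_biomarkers: List[str]
) -> Dict[str, Any]:
    reasons = check_data_quality(state, required_biomarkers)
    reasons += check_required_human_review(state, action)
    return {"passed": not reasons, "reasons": reasons}


state = {"biomarkers": {"fasting_glucose": 5.6, "hba1c": None}, "known_conditions": []}
action = {"type": "lifestyle"}
blocked = safety_gate(state, action, ["fasting_glucose", "hba1c"])
allowed = safety_gate(state, action, ["fasting_glucose"])
print(blocked, allowed)
```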

A useful design principle:

medical_world_model = safety_first + evidence_bound + feedback_calibrated

Feedback: Without Feedback, It Is Not a World Model

A world model should not be a one-shot answer generator.

It should support a loop:

Observe state → Define action → Simulate transition → Collect feedback → Update model

Pseudo-code:

def world_model_loop(subject, action):
    state_t0 = observe_state(subject)

    evidence = build_evidence_chain(state_t0, action)

    transition_hypothesis = simulate_transition(
        state=state_t0,
        action=action,
        evidence=evidence
    )

    feedback = collect_feedback(
        subject=subject,
        action=action,
        time_window=transition_hypothesis.time_window
    )

    updated_state = update_model(
        previous_state=state_t0,
        action=action,
        hypothesis=transition_hypothesis,
        feedback=feedback
    )

    return updated_state

Feedback can include:

repeated biomarkers;
symptoms;
wearable trends;
adherence records;
adverse events;
clinician review;
patient-reported outcomes;
environmental or lifestyle changes.

Feedback object:

{
  "feedback_id": "feedback_001",
  "action_id": "increase_zone2_exercise",
  "time_window": "12_weeks",
  "observations": {
    "weekly_exercise_minutes": 145,
    "fasting_glucose": 5.4,
    "sleep_duration": 6.5,
    "subjective_energy": "improved"
  },
  "adherence": "partial",
  "adverse_events": [],
  "notes": "Interpret carefully; multiple concurrent changes existed."
}

Update logic:

class FeedbackUpdater:
    def update(self, previous_state, action, hypothesis, feedback):
        comparison = self.compare_expected_vs_observed(
            expected=hypothesis.expected_changes,
            observed=feedback["observations"]
        )

        return {
            "previous_state": previous_state,
            "action": action,
            "hypothesis": hypothesis,
            "feedback": feedback,
            "comparison": comparison,
            "update_reason": self.infer_update_reason(comparison)
        }

Without feedback, the system cannot calibrate.
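The comparison step can start very simply: check whether each observed change moved in the hypothesized direction. This sketch adds a baseline argument that the FeedbackUpdater above does not take, and the keyword matching on "increase" and "decrease" is an assumption about how directions are labeled.

```python
from typing import Any, Dict


def expected_sign(direction: str) -> int:
    """Map a direction label to +1, -1, or 0 when it cannot be compared numerically."""
    if "increase" in direction:
        return 1
    if "decrease" in direction:
        return -1
    return 0


def compare_expected_vs_observed(
    baseline: Dict[str, float],
    expected: Dict[str, Dict[str, Any]],
    observed: Dict[str, Any],
) -> Dict[str, str]:
    comparison = {}
    for variable, change in expected.items():
        if variable not in baseline or variable not in observed:
            comparison[variable] = "not_observed"
            continue
        sign = expected_sign(change.get("direction", ""))
        delta = observed[variable] - baseline[variable]
        if sign == 0:
            comparison[variable] = "direction_not_comparable"
        elif delta == 0:
            comparison[variable] = "no_change"
        elif (delta > 0) == (sign > 0):
            comparison[variable] = "consistent"
        else:
            comparison[variable] = "inconsistent"
    return comparison


comparison = compare_expected_vs_observed(
    baseline={"weekly_exercise_minutes": 90, "fasting_glucose": 5.6},
    expected={
        "weekly_exercise_minutes": {"direction": "increase"},
        "insulin_sensitivity": {"direction": "potential_improvement"},
        "fasting_glucose": {"direction": "possible_decrease"},
    },
    observed={"weekly_exercise_minutes": 145, "fasting_glucose": 5.4},
)
print(comparison)
```

A "consistent" label only says the direction matched. It does not show that the action caused the change, especially when several things changed at once.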

Audit Log: Every Transition Should Be Traceable

For medical AI, auditability is not optional.

A transition should have an audit trail:

{
  "audit_id": "audit_001",
  "timestamp": "2026-05-20T23:00:00+08:00",
  "state_version": "state_v1",
  "action_version": "action_v1",
  "evidence_version": "evidence_v1",
  "transition_version": "transition_v1",
  "model_version": "world_model_v0.1",
  "human_review": {
    "required": true,
    "status": "pending"
  },
  "disclaimer": "Hypothesis-generating only. Not a treatment recommendation."
}

Audit logger:

class AuditLogger:
    def log_transition(self, state, action, evidence, transition):
        return {
            "state": state,
            "action": action,
            "evidence": evidence,
            "transition": transition,
            "model_version": self.get_model_version(),
            "human_review_required": True,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }

For medical world models, audit logs are part of the core architecture.
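One possible way to make a record tamper-evident is to store a hash of its canonical JSON form next to a UTC timestamp. The build_audit_record helper and its field names are assumptions for this sketch. It uses only the standard library.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict


def build_audit_record(
    state: Any, action: Any, evidence: Any, transition: Any, model_version: str
) -> Dict[str, Any]:
    payload = {
        "state": state,
        "action": action,
        "evidence": evidence,
        "transition": transition,
        "model_version": model_version,
    }
    # sort_keys gives a stable serialization; default=str covers non-JSON types.
    canonical = json.dumps(payload, sort_keys=True, default=str)
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "content_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "human_review": {"required": True, "status": "pending"},
        "payload": payload,
    }


record_a = build_audit_record(
    {"subject_id": "anonymous_001"}, {"action_id": "increase_zone2_exercise"},
    [], {"time_window": "8_to_12_weeks"}, "world_model_v0.1",
)
record_b = build_audit_record(
    {"subject_id": "anonymous_001"}, {"action_id": "increase_zone2_exercise"},
    [], {"time_window": "8_to_12_weeks"}, "world_model_v0.1",
)
record_c = build_audit_record(
    {"subject_id": "anonymous_001"}, {"action_id": "other_action"},
    [], {"time_window": "8_to_12_weeks"}, "world_model_v0.1",
)
print(record_a["content_sha256"])
```

Identical inputs hash to the same value, so a later reviewer can check that the logged payload was not altered after the fact.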

A Minimal Medical World Model

Putting the pieces together:

class MinimalMedicalWorldModel:
    def __init__(
        self,
        safety_gate,
        evidence_builder,
        transition_model,
        feedback_updater,
        audit_logger
    ):
        self.safety_gate = safety_gate
        self.evidence_builder = evidence_builder
        self.transition_model = transition_model
        self.feedback_updater = feedback_updater
        self.audit_logger = audit_logger

    def run(self, state, action):
        if not self.safety_gate.check(state, action):
            return {
                "status": "blocked",
                "message": "Safety gate failed. Human review required."
            }

        evidence = self.evidence_builder.build(state, action)

        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )

        audit_log = self.audit_logger.log_transition(
            state=state,
            action=action,
            evidence=evidence,
            transition=transition
        )

        return {
            "status": "hypothesis_generated",
            "transition": transition,
            "audit_log": audit_log
        }

    def update_with_feedback(self, previous_state, action, transition, feedback):
        return self.feedback_updater.update(
            previous_state=previous_state,
            action=action,
            hypothesis=transition,
            feedback=feedback
        )

Usage:

state = observe_state(subject)
action = define_action(intervention)

result = medical_world_model.run(state, action)

if result["status"] == "hypothesis_generated":
    feedback = collect_feedback(subject, action)

    updated = medical_world_model.update_with_feedback(
        previous_state=state,
        action=action,
        transition=result["transition"],
        feedback=feedback
    )

The loop is:

State → Action → Evidence → Transition → Audit → Feedback → Update

Where SteeraMed Fits

SteeraMed can be understood as a steerable biomedical world model framework.

In developer terms, it is closer to:

state-action-transition-evidence-feedback architecture

than to:

a chatbot that gives medical advice

The goal is not to automate treatment decisions.

The goal is to make biomedical AI systems more:

state-aware;
action-explicit;
evidence-bound;
feedback-calibrated;
safety-gated;
auditable;
human-reviewable.

This is especially relevant for long-term health and longevity medicine, where the problem is not a single prediction task but longitudinal state management under uncertainty.

Developer Takeaways

1. Do not start with a chatbot

A chatbot interface may be useful later, but it should not be the core architecture.

Start with state representation.

state = observe_state(subject)

2. Do not stop at risk prediction

Risk prediction is useful, but it is not a world model.

risk = predict(state)

is not enough.

3. Make actions explicit

Without explicit actions, there is no action-conditioned transition.

next_state = simulate(state, action)

4. Treat transitions as hypotheses

A medical transition is not a promise.

transition = generate_hypothesis(state, action, evidence)

5. Build evidence chains

A transition without evidence is not acceptable in medical AI.

evidence = build_evidence_chain(state, action)

6. Add safety gates before simulation

Do not simulate unsafe or unsupported actions.

if not safety_gate.check(state, action):
    require_human_review()

7. Close the loop with feedback

Without feedback, the model cannot calibrate.

model.update(state, action, feedback)

8. Log everything

Every state, action, evidence item, transition, feedback signal, and model version should be traceable.

audit_logger.log(state, action, evidence, transition, feedback)

Final Thought

A medical world model is not a larger prediction model.

It is a system architecture for reasoning about:

state + action + evidence + transition + feedback

under uncertainty.

For developers, the key shift is this:

Do not just build systems that predict what may happen.

Build systems that can represent state, encode actions, simulate evidence-bound transitions, collect feedback, and remain auditable.

That is the difference between prediction and a world model.

References and Project Links

Ha, D., & Schmidhuber, J. Recurrent World Models Facilitate Policy Evolution. Advances in Neural Information Processing Systems 31, 2018. https://papers.nips.cc/paper/7512-recurrent-world-models-facilitate-policy-evolution; arXiv version: https://arxiv.org/abs/1803.10122
LeCun, Y. A Path Towards Autonomous Machine Intelligence. OpenReview, 2022. https://openreview.net/forum?id=BZ5a1r-kVsf
Yang, Y., Wang, Z.-Y., Liu, Q., Sun, S., Wang, K., Chellappa, R., Zhou, Z., Yuille, A., Zhu, L., Zhang, Y.-D., & Chen, J. Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv:2506.02327, 2025. https://arxiv.org/abs/2506.02327; project page: https://yijun-yang.github.io/MeWM/
Qazi, M. A., Nadeem, M., & Yaqub, M. Beyond Generative AI: World Models for Clinical Prediction, Counterfactuals, and Planning. arXiv:2511.16333, 2025. https://arxiv.org/abs/2511.16333
Katsoulakis, E., Wang, Q., Wu, H., et al. Digital twins for health: a scoping review. npj Digital Medicine, 7, 77, 2024. https://doi.org/10.1038/s41746-024-01073-0
Emmert-Streib, F., Parkkila, S., Laubenbacher, R., et al. The role of digital twins in P4 medicine: A paradigm for modern healthcare. npj Digital Medicine, 8, 735, 2025. https://doi.org/10.1038/s41746-025-02115-x
Xiong, J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. https://doi.org/10.20944/preprints202605.0366.v1
SteeraMed project: https://SteeraMed.com
Steerable World project: https://steerable.world

DEV Community