Whoever builds the “state–intervention–transition” dataset for biomedicine may define the next generation of medical AI infrastructure.
Author: Jianghui Xiong
Medical AI is moving beyond classification, risk prediction, and question answering.
The next frontier is not just:
sample → label
or:
question → answer
It is:
state + action → next state
In other words:
current biological state + intervention → future biological state
To build real biomedical world models, we need more than bigger models. We need something analogous to ImageNet, built not for images but for biological state transitions.
I will call this idea, for now:
Biomedical TransitionNet
A shared infrastructure for recording, standardizing, and evaluating:
- baseline biological state
- intervention
- follow-up biological state
- mechanism evidence
- uncertainty
This article explains why such an infrastructure is needed, why it matters, and why it is scientifically difficult.
It does not claim that a complete biomedical world model already exists.
1. ImageNet was not just a dataset. It was infrastructure.
When people talk about the deep learning revolution in computer vision, they often mention AlexNet, VGG, ResNet, and other neural network architectures.
That is correct, but incomplete.
One of the most important enabling factors was ImageNet.
ImageNet was not merely a large collection of images. Its deeper value was that it gave computer vision a shared coordinate system:
- a common task,
- a common label hierarchy,
- common training and test data,
- common benchmarks,
- and a way to compare progress across models and institutions.
Before ImageNet, many computer vision systems were difficult to compare because they were trained and evaluated on different datasets. ImageNet helped the field converge around shared evaluation.
That is why ImageNet became much more than a database. It became research infrastructure.
Medical AI may now need something similar.
But not an image dataset.
Medicine needs an ImageNet for state transitions.
2. Medical AI has many models, but not enough transition data
Today, we already have many types of medical AI systems:
- medical large language models,
- medical question-answering systems,
- radiology models,
- pathology models,
- omics foundation models,
- virtual cell models,
- digital twin systems,
- clinical decision support tools,
- AI drug discovery platforms.
These are important.
But if we think the future of medical AI is only “a bigger medical chatbot”, we may miss the real challenge.
Medicine is not only about answering questions.
Medicine is about understanding and changing biological trajectories.
A clinician does not only ask:
What disease does this patient have?
They also ask:
Why is this biological state happening?
What is driving deterioration?
Which mechanisms are actionable?
Which intervention may shift the trajectory?
How should the response be measured?
What if the expected response does not happen?
What if an adverse response appears?
These are not just language problems.
They are state transition problems.
Most medical AI today is still closer to:
sample → label
or:
question → answer
But biomedical world models require something closer to:
state + action → next state
That is the key shift.
3. What is a biomedical world model?
In AI, a world model is usually understood as an internal model that helps an agent simulate how the environment changes after an action.
A simple abstraction is:
current state + action → future state
In robotics, this may mean:
robot pose + motor command → next scene state
In autonomous driving, it may mean:
traffic scene + driving action → future traffic scene
In biomedicine, the analogous formulation would be:
biological state + intervention → future biological state
This could apply at multiple scales:
cell state + perturbation → cellular response
tissue state + treatment → tissue response
patient state + intervention → follow-up state
A biomedical world model should therefore not be understood as a medical chatbot.
It is not merely:
medical text in → medical text out
A more meaningful biomedical world model would combine:
- state representation
- intervention representation
- transition modeling
- mechanism evidence
- uncertainty estimation
- feedback correction
That is much harder than ordinary medical QA.
And it requires a different kind of data.
4. Why medicine needs its own ImageNet
In computer vision, a basic supervised learning unit can often be simplified as:
image + label
For biomedical world models, the basic unit should look more like:
baseline state + action + follow-up state
Or mathematically:
S(t) + A → S(t + Δt)
Where:
S(t) = biological state before intervention
A = action or intervention
S(t + Δt) = biological state after intervention
Δt = time interval
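One way to read this notation, offered as a sketch rather than a settled formalism, is as a learned transition function. The symbols $f_\theta$ (a parameterized model), $\varepsilon$ (variation the model does not explain), and $p_\theta$ (a predictive distribution) are assumptions introduced here for illustration; they are not part of the notation above.

```latex
% Deterministic reading: a model maps baseline state, action, and interval to the follow-up state.
\[
S(t + \Delta t) = f_\theta\bigl(S(t),\, A,\, \Delta t\bigr) + \varepsilon
\]
% Probabilistic reading: the model outputs a distribution over follow-up states.
\[
S(t + \Delta t) \sim p_\theta\bigl(\,\cdot \mid S(t),\, A,\, \Delta t\bigr)
\]
```

The probabilistic form leaves room for uncertainty to be reported alongside each predicted transition, instead of a single point estimate.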
This is fundamentally different from a static medical database.
A biomedical world model does not only need:
- medical images,
- electronic health records,
- omics profiles,
- drug-target databases,
- clinical notes,
- literature graphs.
Those are useful, but insufficient.
It needs structured longitudinal data describing:
what the biological state was,
what action was taken,
what changed afterward,
over what time scale,
with what evidence,
and with what uncertainty.
This is why medicine needs something like a Biomedical TransitionNet.
Not a direct copy of ImageNet.
A new infrastructure designed for biological state transitions.
5. What should one data unit look like?
A conventional supervised learning sample may look like:
x → y
Examples:
image → diagnosis label
clinical note → ICD code
genomic variant → risk category
A biomedical world-model sample should look more like:
- state_before
- intervention
- state_after
- time_interval
- evidence_chain
- uncertainty
A simplified schema might look like this:
{
  "baseline_state": {
    "molecular": "...",
    "clinical": "...",
    "phenotype": "...",
    "lifestyle": "...",
    "context": "..."
  },
  "action": {
    "type": "...",
    "dose": "...",
    "frequency": "...",
    "duration": "...",
    "mechanism": "..."
  },
  "follow_up_state": {
    "molecular": "...",
    "clinical": "...",
    "phenotype": "...",
    "adverse_events": "..."
  },
  "transition": {
    "direction": "...",
    "magnitude": "...",
    "time_scale": "...",
    "confidence": "..."
  },
  "evidence_chain": {
    "target": "...",
    "pathway": "...",
    "biomarker": "...",
    "phenotype": "...",
    "validation": "..."
  }
}
This is obviously simplified.
But the principle matters:
A biomedical world model should learn not only:
what this sample is
but:
how this biological system changed after a defined intervention
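As a rough illustration of how such a data unit could be checked before it enters a dataset, here is a minimal Python sketch. The class name `TransitionRecord`, the field `time_interval_days`, and the `problems` method are hypothetical names chosen for this example; they are not part of any existing standard or of the schema above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TransitionRecord:
    """One state-intervention-transition sample (hypothetical structure)."""
    baseline_state: Dict[str, Any]
    action: Dict[str, Any]
    follow_up_state: Dict[str, Any]
    time_interval_days: float
    evidence_chain: Dict[str, Any] = field(default_factory=dict)
    confidence: Optional[float] = None  # None means "not assessed"

    def problems(self) -> List[str]:
        """Return reasons this record cannot serve as a transition sample."""
        issues = []
        if not self.baseline_state:
            issues.append("missing baseline state")
        if not self.action:
            issues.append("missing action")
        if not self.follow_up_state:
            issues.append("missing follow-up state")
        if self.time_interval_days <= 0:
            issues.append("time interval must be positive")
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            issues.append("confidence must lie in [0, 1]")
        return issues
```

A record with an empty follow-up state is rejected here by design: without a measured "after", the sample is a static observation, not a transition.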
6. Five layers of a biomedical ImageNet
If we want to build an ImageNet-like infrastructure for biomedical world models, it should include at least five layers.
6.1 State representation
The first question is:
What is the biological state?
A patient state is not just a diagnosis label.
Terms such as:
diabetes
hypertension
aging
inflammation
fatigue
frailty
are useful, but they are high-level descriptions.
A real biological state may include:
- genome,
- DNA methylation,
- transcriptome,
- proteome,
- metabolome,
- immune state,
- inflammatory state,
- organ function,
- microbiome,
- sleep,
- activity,
- diet,
- medication history,
- environmental exposure,
- clinical background.
A simplified representation may be:
individual_state =
- molecular_state
- pathway_state
- organ_state
- phenotype_state
- lifestyle_context
- clinical_context
Without a state representation, a biomedical world model does not know what it is simulating.
6.2 Action ontology
A world model needs actions.
In medicine, actions are complex.
They may include:
- drugs,
- supplements,
- diet,
- exercise,
- sleep intervention,
- stress management,
- cell therapy,
- gene therapy,
- regenerative medicine,
- combination therapy,
- N-of-1 personalized intervention.
Even a drug intervention requires many parameters:
drug name
dose
frequency
route
duration
combination
adherence
contraindications
adverse events
Exercise intervention also requires:
type
intensity
frequency
duration
heart-rate zone
recovery condition
baseline fitness
If actions are not standardized, the model cannot learn meaningful transitions.
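To make the standardization point concrete, the sketch below encodes a drug action with a subset of the parameters listed above and normalizes the dose so that records written in different units become comparable. The class `DrugAction`, its field names, and the small unit table are illustrative assumptions, not an established ontology; the drug name used in the usage note is a placeholder.

```python
from dataclasses import dataclass

# Conversion factors to milligrams: an illustrative subset, not a standard.
_TO_MG = {"mg": 1.0, "g": 1000.0, "ug": 0.001}


@dataclass(frozen=True)
class DrugAction:
    """A drug intervention with explicit, machine-readable parameters (hypothetical)."""
    name: str
    dose: float
    dose_unit: str
    doses_per_day: float
    route: str
    duration_days: int

    def daily_dose_mg(self) -> float:
        """Total daily dose in milligrams; rejects units outside the table."""
        if self.dose_unit not in _TO_MG:
            raise ValueError(f"unsupported dose unit: {self.dose_unit!r}")
        return self.dose * _TO_MG[self.dose_unit] * self.doses_per_day
```

With this encoding, `DrugAction("drug_x", 0.5, "g", 2, "oral", 90)` and `DrugAction("drug_x", 500, "mg", 2, "oral", 90)` yield the same daily dose, so a model sees them as the same action rather than two different ones.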
6.3 Transition record
The core of a biomedical world model is the transition:
before → after
Examples:
inflammatory state before intervention → inflammatory state after intervention
DNA methylation age before intervention → DNA methylation age after intervention
metabolic state before intervention → metabolic state after intervention
tumor state before treatment → tumor state after treatment
Without follow-up measurement, there is no transition.
Without transition, there is no world model.
Many medical datasets still capture only a single time point per subject:
one-time measurement
Biomedical world models need:
longitudinal measurement
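The difference between the two can be shown with a small sketch that turns repeated measurements of one biomarker into before/after pairs. The function name `build_transitions` and the output keys are hypothetical, and the rule that a measurement taken on the intervention start date counts as baseline is an assumption made for this example.

```python
from datetime import date


def build_transitions(measurements, intervention_start):
    """Pair the last measurement before an intervention with each one after it.

    measurements: list of (date, value) tuples for one biomarker in one person.
    Returns a list of dicts; empty if there is no baseline or no follow-up.
    """
    ordered = sorted(measurements, key=lambda m: m[0])
    # Assumption: a measurement on the start date itself is treated as baseline.
    before = [m for m in ordered if m[0] <= intervention_start]
    after = [m for m in ordered if m[0] > intervention_start]
    if not before or not after:
        return []  # a one-time measurement yields no transition
    base_date, base_value = before[-1]
    return [
        {
            "baseline": base_value,
            "follow_up": value,
            "delta": value - base_value,
            "interval_days": (when - base_date).days,
        }
        for when, value in after
    ]
```

A single measurement, however detailed, returns an empty list: there is nothing for a transition model to learn from.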
6.4 Evidence chain
A medical model should not only output a probability.
If a model says:
This intervention may help.
That is not enough.
It should also answer:
Which targets are involved?
Which pathways are affected?
Which abnormal state does this address?
Which biomarkers can validate the response?
Which evidence comes from experiments?
Which evidence comes from clinical data?
Which part is only model inference?
Which risks should be monitored?
In medicine, prediction alone is not sufficient.
A safer output should look more like:
prediction + mechanism + validation + uncertainty
This is especially important because medical AI should not become an uninspectable black box.
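One small, hedged illustration of what "inspectable" could mean in practice: tag every supporting claim with its source, then flag any evidence chain that rests on model inference alone. The source labels and the function `evidence_profile` are assumptions made for this sketch, not an existing vocabulary.

```python
from collections import Counter

# Hypothetical source labels, mirroring the questions listed above.
EVIDENCE_SOURCES = ("experiment", "clinical_data", "model_inference")


def evidence_profile(claims):
    """Count supporting claims by source and flag chains with no measured support.

    claims: list of dicts, each with a "source" key from EVIDENCE_SOURCES.
    """
    counts = Counter(c["source"] for c in claims)
    unknown = set(counts) - set(EVIDENCE_SOURCES)
    if unknown:
        raise ValueError(f"unknown evidence source(s): {sorted(unknown)}")
    measured = counts["experiment"] + counts["clinical_data"]
    return {
        "counts": {s: counts[s] for s in EVIDENCE_SOURCES},
        "inference_only": measured == 0 and counts["model_inference"] > 0,
    }
```

An `inference_only` flag does not make a prediction wrong; it tells a reviewer which part of the chain still needs experimental or clinical support.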
6.5 Benchmark task
ImageNet helped computer vision because different models could be compared on shared tasks.
Biomedical world models need benchmarks too.
Possible benchmark tasks include:
- cellular perturbation response prediction,
- gene expression response after drug perturbation,
- tumor state simulation after treatment,
- metabolic biomarker response prediction,
- inflammatory state transition prediction,
- aging-related biomarker transition prediction,
- N-of-1 intervention response direction prediction.
But the metrics cannot be copied directly from image classification.
Useful metrics may include:
directional accuracy
mechanistic consistency
biomarker validation
uncertainty calibration
risk awareness
cross-context generalization
This is much harder than top-1 accuracy.
But medicine requires it.
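Two of these metrics are simple enough to sketch. Directional accuracy is taken here to mean the fraction of transitions whose predicted sign of change matches the observed sign, which is one possible definition among several. For uncertainty, the Brier score is used as a stand-in: it is a standard proper scoring rule that reflects both calibration and discrimination, not a pure calibration measure. The function names are chosen for this example.

```python
def directional_accuracy(predicted_deltas, observed_deltas):
    """Fraction of transitions whose predicted sign of change matches the observed sign."""
    def sign(x):
        return (x > 0) - (x < 0)

    pairs = list(zip(predicted_deltas, observed_deltas))
    if not pairs:
        raise ValueError("no transitions to score")
    hits = sum(sign(p) == sign(o) for p, o in pairs)
    return hits / len(pairs)


def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes (lower is better)."""
    pairs = list(zip(predicted_probs, outcomes))
    if not pairs:
        raise ValueError("no predictions to score")
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)
```

A model can score well on direction while being badly overconfident, which is why the two numbers are reported separately rather than merged into a single accuracy figure.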
7. Related progress: promising, but still early
To be scientifically careful, we should not pretend that complete biomedical world models already exist.
They do not.
But several related directions are emerging.
7.1 ImageNet as an infrastructure analogy
ImageNet and ILSVRC showed how large-scale, standardized datasets and benchmarks can accelerate a field.
However, ImageNet is a benchmark for image classification and detection.
It is not equivalent to what biomedicine needs.
Here, ImageNet is used only as an infrastructure analogy.
The biomedical version must be longitudinal, dynamic, intervention-aware, and mechanism-sensitive.
7.2 World Models in AI
Ha and Schmidhuber’s World Models is a representative work in AI world modeling.
Its key idea is that an agent can learn an internal model of the environment and use it to simulate future states.
Medicine cannot directly copy this setting.
A human body is not a game environment.
Clinical intervention cannot be freely explored by trial and error.
But the abstraction:
state + action → future state
is still useful for thinking about medical AI.
7.3 Virtual cells and perturbation response
Arc Institute’s State model is a recent example of virtual-cell modeling.
It aims to predict how cells respond to drugs, cytokines, or genetic perturbations. Public descriptions indicate that State was trained on large-scale observational and perturbational single-cell data.
This is important because it directly touches the pattern:
cell state + perturbation → cellular response
However, State is primarily a cellular-level model.
It should not be confused with a complete patient-level biomedical world model.
7.4 Medical World Model for tumor evolution
Recent work using the term Medical World Model, such as MeWM, explores generative simulation of tumor evolution under treatment conditions.
This is relevant because it moves medical AI from static recognition toward treatment-conditioned disease dynamics.
But this direction is still early.
It should not be interpreted as a general solution to biomedical world modeling.
7.5 Digital twins and virtual physiological systems
Long before today’s AI world-model terminology, fields such as computational physiology, systems biology, virtual physiological systems, and digital twins already tried to connect biological structure, mechanism, dynamics, and measurable outputs.
That tradition matters.
A good biomedical world model should not be just a black-box predictor.
It should connect:
state
mechanism
dynamic change
measurement
feedback
Today’s biomedical world models can be seen as an extension of this older systems-modeling tradition into the era of AI, multi-omics, real-world data, and large-scale computation.
8. Why steerability matters
A biomedical world model that only predicts is not enough.
A model may predict that a patient’s risk is increasing.
But medicine needs more than that.
It needs to ask:
Which state can be measured?
Which abnormality can be explained?
Which intervention can be described?
Which transition can be tested?
Which deviation can be traced?
Which failure can be corrected?
This is why I emphasize steerability.
Going forward, I will use the name:
SteeraMed: A Steerable Biomedical World Model
Website:
https://SteeraMed.com
The earlier preprint name was:
SEWO / Steerable Medicine World Model
or in Chinese:
可驾驭医学世界模型
Whenever I mention SEWO / 可驾驭医学世界模型, it refers to the same framework under its new unified name:
SteeraMed: A Steerable Biomedical World Model
The idea behind SEWO / SteeraMed is that biomedical world models should not only pursue predictive accuracy. They should also support:
- state definition,
- intervention description,
- transition hypothesis,
- mechanism audit,
- deviation tracing,
- uncertainty inspection,
- expert steering,
- and iterative correction.
The related ideas were introduced in the preprint:
World Models for Biomedicine: A Steerability Framework
and are also presented at:
https://steerable.world
Important clarification:
SEWO / SteeraMed is not a clinically validated treatment system.
It is not a medical device.
It is better understood as a structural framework and evidence-chain design principle for future biomedical world models.
The key question is not only:
Can the model predict?
but:
Can researchers and clinicians inspect, question, correct, and steer the model within clearly defined boundaries?
9. Why longevity medicine may be one entry point
Biomedical world models could start from many areas:
- oncology,
- cardiovascular disease,
- metabolic disease,
- immunology,
- neurodegeneration,
- drug discovery,
- virtual cells,
- longevity medicine.
Longevity medicine is not the only entry point.
But it is an interesting one.
Why?
9.1 Aging is a continuous state
Aging is not a single disease label.
It is a continuous, multi-system biological process involving:
- inflammation,
- metabolism,
- immunity,
- epigenetics,
- mitochondrial function,
- proteostasis,
- stem-cell exhaustion,
- cellular senescence,
- organ function decline.
That makes it naturally suitable for state modeling.
9.2 Longevity medicine requires repeated measurement
Longevity medicine is not a one-time diagnostic event.
It depends on repeated measurement over time.
A useful intervention must be evaluated through:
baseline state → intervention → follow-up state
This is exactly the structure needed for biomedical world modeling.
9.3 Interventions are diverse
Longevity-related interventions may include:
- diet,
- exercise,
- sleep,
- supplements,
- drugs,
- cell therapy,
- regenerative medicine,
- stress management,
- environmental exposure management.
This provides a rich action space.
9.4 Individual responses vary
The same intervention may produce different responses in different people.
That means longevity medicine cannot rely only on average effects.
It needs N-of-1 style transition data:
individual state → intervention → individual transition
Each well-structured N-of-1 intervention can be seen as a small world-model experiment.
10. Engineering implications
From an engineering perspective, the biomedical ImageNet is not just a dataset.
It is a data infrastructure problem.
It requires:
- data collection,
- data standardization,
- multimodal integration,
- time-series modeling,
- intervention encoding,
- causal confounding control,
- privacy protection,
- benchmark design,
- safety boundaries,
- evidence-chain tracking.
A simplified loop may look like:
measure state
↓
standardize state representation
↓
record intervention
↓
measure follow-up state
↓
construct transition sample
↓
train / evaluate world model
↓
generate testable hypothesis
↓
repeat and correct
This is not a static dataset.
It is a data flywheel.
11. Main challenges
This is scientifically and technically difficult.
Some of the main challenges include:
11.1 Biological state is complex
A human state cannot be compressed into one label.
We need ways to represent multi-omics, clinical metrics, imaging, lifestyle, symptoms, environmental exposure, and medical history as computable state variables.
11.2 Interventions are hard to standardize
Drugs, exercise, diet, sleep, supplements, and cell therapies all have complex parameters.
Without action standardization, transition learning will be noisy.
11.3 Follow-up data is scarce
Most medical data is not collected as structured pre- and post-intervention transition data.
This requires new data collection workflows.
11.4 Causal confounding is serious
In the real world, people often change many things at once:
diet
exercise
sleep
medication
supplements
stress
Attributing a state change to one factor is difficult.
This requires careful study design and statistical methods.
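A data pipeline cannot solve confounding, but it can at least record when it is present. The sketch below compares the tracked factors at two time points and labels the transition by how many of them changed. The function names and status labels are hypothetical. A "single_factor" label is not evidence of causation, since unrecorded factors may also have changed.

```python
def changed_factors(before, after):
    """Names of tracked factors whose recorded value differs between two time points."""
    keys = sorted(set(before) | set(after))
    return [k for k in keys if before.get(k) != after.get(k)]


def attribution_status(before, after):
    """Label a transition by how many recorded factors changed alongside it."""
    changed = changed_factors(before, after)
    if not changed:
        return "no_recorded_change", changed
    if len(changed) == 1:
        return "single_factor", changed
    return "multiple_factors", changed
```

For example, a person who started exercising and also began sleeping longer would be labeled `multiple_factors`, signaling that any observed biomarker change should not be credited to exercise alone.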
11.5 Safety and ethics are central
A biomedical world model cannot freely experiment like a game-playing agent.
Any intervention-related model must clearly distinguish:
research hypothesis
health-management suggestion
clinical decision support
medical recommendation
validated therapy
Clinical use would require prospective validation, safety evaluation, ethical review, regulatory review where applicable, and professional oversight.
11.6 Open standards and business incentives may conflict
If everything is closed, the field cannot build shared benchmarks.
If everything is open, companies may lack incentives to invest.
A practical ecosystem will need a balance among:
open benchmarks
privacy protection
commercial incentives
scientific collaboration
12. A minimal viable direction
A biomedical ImageNet should not begin by trying to simulate the entire human body.
A more realistic path is to start with minimal viable tasks.
Examples:
- cellular perturbation response prediction,
- tumor state change after treatment,
- metabolic biomarker response prediction,
- inflammatory state transition prediction,
- DNA methylation age transition,
- N-of-1 longevity intervention tracking.
A minimal task should define:
1. state variables
2. intervention variables
3. follow-up time
4. transition metrics
5. benchmark task
6. safety boundary
Start narrow.
Make it measurable.
Make it repeatable.
Make it auditable.
Then scale.
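The six components of a minimal task can be written down as a small specification that refuses to be called complete until every part is filled in. This is a sketch under assumed names: `MinimalTask`, its fields, and the `missing` method are invented for this example and simply mirror the numbered list above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class MinimalTask:
    """Specification of one narrow benchmark task (hypothetical structure)."""
    state_variables: List[str]
    intervention_variables: List[str]
    follow_up_days: int
    transition_metrics: List[str]
    benchmark_task: str
    safety_boundary: str

    def missing(self) -> List[str]:
        """Names of components left empty or zero, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A draft that names its state variables but leaves the follow-up time, metrics, and safety boundary blank would report those fields as missing, which makes the gap visible before any data is collected.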
13. Whoever defines state, action, and transition may define the field
Medical AI will still need better models.
But bigger models alone cannot solve the problem of biomedical state transition learning.
The scarce asset is the infrastructure that allows models to learn:
how life systems change after intervention
Future platform-level medical AI companies may not be the ones with the largest language models.
They may be the ones that can build the strongest data flywheel:
measure biological state
standardize interventions
record follow-up changes
construct mechanism evidence chains
evaluate transition models
repeat
Whoever defines state defines what medical AI can see.
Whoever defines action defines how medical AI understands intervention.
Whoever defines transition defines how medical AI learns biological change.
Whoever defines the benchmark defines how the field measures progress.
Conclusion
ImageNet helped machines learn to see the world.
A biomedical ImageNet should help AI learn how life responds to intervention.
That does not mean replacing clinicians.
It means building a scientific infrastructure where models can learn:
how states form
how interventions act
how systems transition
how evidence is validated
The next decade of medical AI may not be limited by model size alone.
It may be limited by the lack of a shared infrastructure for biological state transitions.
That is the real opportunity.
References
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L.
ImageNet: A Large-Scale Hierarchical Image Database. CVPR. 2009.
https://ieeexplore.ieee.org/document/5206848

Russakovsky O, Deng J, Su H, et al.
ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision. 2015.
https://arxiv.org/abs/1409.0575

ImageNet official website.
https://www.image-net.org/

Ha D, Schmidhuber J.
World Models. 2018.
https://worldmodels.github.io/

Arc Institute.
Arc Institute’s first virtual cell model: State.
https://arcinstitute.org/news/virtual-cell-model-state

Predicting cellular responses to perturbation across diverse contexts with State. bioRxiv. 2025.
https://www.biorxiv.org/content/10.1101/2025.06.26.661135v1

Yang Y, Wang ZY, Liu Q, et al.
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv.
https://arxiv.org/abs/2506.02327

IEEE Transactions on Biomedical Engineering.
Digital Twins / AI World Models.
https://www.embs.org/tbme/research-highlights/digital-twins-ai-world-models/

Acosta JN, Falcone GJ, Rajpurkar P, Topol EJ.
Multimodal biomedical AI. Nature Medicine. 2022.
https://www.nature.com/articles/s41591-022-01981-2

Xiong J.
World Models for Biomedicine: A Steerability Framework. Preprints.org. 2026.
https://www.preprints.org/manuscript/202605.0366
DOI: https://doi.org/10.20944/preprints202605.0366.v1

SteeraMed: A Steerable Biomedical World Model.
https://steerable.world
Disclaimer
This article is for research, technical, and industry discussion only.
It is not medical advice, diagnostic advice, or treatment advice.
Any biomedical world model intended for clinical use would require prospective validation, safety evaluation, ethical review, regulatory review where applicable, and professional clinical oversight.