JXIONG

Posted on May 17

Medical World Models: Why Healthcare AI Needs Steerability, Not Just Harness Engineering

#ai #steeramed #healthtech #worldmodel

Large language models, RAG systems, AI agents, and tool-calling workflows are rapidly entering healthcare and biomedical applications.

This raises an important architectural question:

Is medical AI safety mainly a problem of better prompts, guardrails, workflows, and human review?

These are important. But they are not enough.

In this article, I want to distinguish two related but fundamentally different ideas:

Harness engineering
Steerable biomedical world models

Harness engineering controls an AI system from the outside.

A steerable biomedical world model structures biomedical reasoning from the inside.

The difference matters because medical AI is not only a text-generation problem. It is ultimately about representing biological states, modeling interventions, reasoning about possible state transitions, and inspecting failures when expected changes do not occur.

That requires more than external guardrails.

It requires a state-action-transition-feedback architecture.

1. What is harness engineering?

In modern AI engineering, we rarely expose a raw model directly to users.

Instead, we wrap it with layers of control:

prompt templates
system prompts
RAG pipelines
tool calling
function calling
workflow orchestration
output validators
safety filters
rule engines
human-in-the-loop review
audit logs
sandbox execution
permission control

This broad pattern can be described as harness engineering.

A typical harnessed AI system may look like this:


User Input

↓

Input Filter / Intent Classifier

↓

Prompt Template / System Prompt

↓

LLM / Agent

↓

Tool Calling / RAG / External APIs

↓

Output Validator

↓

Safety Filter

↓

Human Review, if needed

↓

Final Output

In software engineering, this is extremely useful.

For example, a code generation system may be wrapped with:


LLM Code Generator

↓

Static Analysis

↓

Unit Tests

↓

Type Checker

↓

Sandbox Execution

↓

Human Review

↓

Merge / Deploy

The core idea is simple:

We do not assume the model is perfectly reliable. Instead, we build external systems that make its behavior more controlled, inspectable, and reversible.

This approach is essential for deploying AI systems safely.

But in healthcare, it is not sufficient.

2. Medical AI absolutely needs harness engineering

Medical AI carries higher risk than many other AI application domains.

A poorly constrained medical AI system may:

mislead patients
fabricate medical references
overstep into diagnosis
suggest unsafe treatments
ignore contraindications
confuse health education with medical advice
overinterpret lab results
exaggerate efficacy
miss emergency red flags
delay proper care

So a medical AI system needs strong external controls.

For example:


Medical Knowledge Base
+ RAG Retrieval
+ Safety Rules
+ Output Validation
+ High-Risk Intent Detection
+ Clinical Review
+ Disclaimer Layer
+ Audit Logging
+ Permission Control

A basic medical AI workflow may look like:


User Question

↓

Risk Classification

↓

Medical Knowledge Retrieval

↓

LLM Response Generation

↓

Medical Safety Validation

↓

Diagnosis / Prescription / Emergency Risk Check

↓

Human Escalation, if needed

↓

Final Response
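As a rough illustration, the workflow above can be sketched in a few lines of Python. Everything here is a hypothetical placeholder: the keyword lists, the function names (`classify_risk`, `generate_response`, `validate`, `run_harness`), and the stubbed generator that stands in for an LLM call. It is a sketch of the control flow, not a usable safety system.

```python
from dataclasses import dataclass

# Hypothetical keyword lists; a real system would use validated classifiers.
EMERGENCY_TERMS = ("chest pain", "cannot breathe", "overdose")
BOUNDARY_TERMS = ("you should take", "your diagnosis is", "i prescribe")


@dataclass
class HarnessResult:
    risk: str        # "emergency" or "routine"
    text: str        # text returned to the user
    escalated: bool  # True if routed to a human or emergency path


def classify_risk(question: str) -> str:
    """Flag questions that contain emergency red-flag terms."""
    q = question.lower()
    return "emergency" if any(t in q for t in EMERGENCY_TERMS) else "routine"


def generate_response(question: str, passages: list[str]) -> str:
    """Stand-in for the LLM call: it only echoes the retrieved passages."""
    return " ".join(passages) if passages else "No supporting source found."


def validate(draft: str) -> bool:
    """Reject drafts that cross the diagnosis / prescription boundary."""
    d = draft.lower()
    return not any(t in d for t in BOUNDARY_TERMS)


def run_harness(question: str, passages: list[str]) -> HarnessResult:
    risk = classify_risk(question)
    if risk == "emergency":
        return HarnessResult(risk, "Please contact emergency services.", True)
    draft = generate_response(question, passages)
    if not validate(draft):
        return HarnessResult(risk, "This question needs clinician review.", True)
    return HarnessResult(risk, draft, False)
```

Note that every check in this sketch inspects the question or the draft text. None of them inspects a biological state, which is the gap the rest of this article is about.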

This kind of harness engineering is necessary.

But it mainly addresses AI behavior risk:


Is the AI hallucinating?

Is it overstepping?

Is it fabricating citations?

Is it giving unsafe advice?

Is it violating the product boundary?

These are important questions.

But they are not the whole problem.

Medical AI also faces a deeper problem:


Is the underlying biomedical reasoning valid?

That is where harness engineering is not enough.

3. Harnessed medical AI is not the same as a medical world model

Many medical AI agent systems look sophisticated:


LLM
+ Medical Knowledge Base
+ RAG
+ Tool Calling
+ Multi-Agent Workflow
+ Report Generation
+ Safety Filtering
+ Clinician Review

Such systems can be useful.

They can help summarize records, review literature, explain medical terminology, generate reports, retrieve guidelines, and support administrative workflows.

But they are not necessarily medical world models.

Why?

Because they may not explicitly answer the following questions:


1. What is the current biological state?
2. How is an intervention represented as an action?
3. Given the current state and action, how might the state change?
4. How do alternative actions compare counterfactually?
5. If the expected change does not occur, where did the reasoning fail?

In short:


Workflow ≠ world model

RAG ≠ state representation

Agent ≠ transition model

Guardrail ≠ steerability

A medical world model is not just about making AI outputs safer.

It is about making biological state transitions more representable, inspectable, testable, and correctable.

4. What is a steerable biomedical world model?

A generic world model can be expressed as:


current state + action → predicted or hypothesized next state

Or:


S(t), A → S(t + Δt)

Where:

S(t) is the current state
A is an action
S(t + Δt) is the future state after that action

In robotics, this might mean:


robot position + motor command → next robot position

In a game environment, it might mean:


current frame + player action → next frame
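To make the generic form concrete, here is a minimal Python sketch of a transition function and a rollout over a sequence of actions. The names `Transition`, `rollout`, and `robot_step` are illustrative assumptions, and the one-dimensional robot is a toy, not a real dynamics model.

```python
from typing import Callable, TypeVar

S = TypeVar("S")  # state type
A = TypeVar("A")  # action type

# A world model maps (current state, action) to a hypothesized next state.
Transition = Callable[[S, A], S]


def rollout(step: Transition, state: S, actions: list[A]) -> list[S]:
    """Apply actions in order and return the full trajectory of states."""
    trajectory = [state]
    for action in actions:
        state = step(state, action)
        trajectory.append(state)
    return trajectory


def robot_step(position: float, velocity_command: float) -> float:
    """Toy 1-D robot: position advances by the commanded velocity per tick."""
    return position + velocity_command


print(rollout(robot_step, 0.0, [1.0, 1.0, -0.5]))  # [0.0, 1.0, 2.0, 1.5]
```

The same signature applies to any domain. What changes is how hard it is to define the state, the action, and the step function, and how much trust the predicted next state deserves.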

In medicine, a more careful expression is:


current molecular / functional state + intervention → testable hypothesis about biological state transition

This distinction is important.

In early biomedical systems, the “next state” should usually be interpreted as a testable transition hypothesis, not as a validated clinical outcome prediction.

A steerable biomedical world model should not be framed as:


This model can predict which treatment will work.

A more scientifically cautious framing is:


Given a current biological state, a candidate action, and mechanistic constraints, the model generates an auditable and testable hypothesis about the direction of state change.

In other words, early biomedical world models are better understood as:


state-action-transition hypothesis systems

not as:


validated clinical decision systems

This is the central idea behind the steerability framework I proposed in the preprint World Models for Biomedicine: A Steerability Framework.

The framework argues that biomedical world models should not merely forecast likely trajectories. They should support steerability through state representation, capability quantification, intervention-response semantics, counterfactual transition, and quality-control feedback.

Here, “steerability” does not mean that an AI model automatically controls the human body or replaces clinical judgment.

It means that the model makes state, action, transition hypotheses, mechanism evidence, uncertainty, and feedback inspection explicit enough for humans to examine, challenge, and revise.

5. Harness engineering vs steerable world modeling

The key difference can be summarized as:


Harness engineering controls the AI system from outside.

Steerable world modeling structures biomedical reasoning from inside.

Here is a more detailed comparison:

Dimension	Harness Engineering	Steerable Biomedical World Model
Core question	How do we make AI outputs safer?	How do we represent and inspect biological state transitions?
Main object	Model behavior, tools, workflows, outputs	Biological state, intervention action, transition hypothesis, feedback
Common components	Prompt, RAG, validator, guardrail, workflow	State, action, transition, counterfactual, QC
Risk addressed	AI behavior risk	Biomedical reasoning risk
Failure diagnosis	Did the model violate rules? Did a tool fail?	Was the state wrong? Was the action semantics wrong? Was the transition assumption wrong?
Engineering layer	External safety layer	Internal world-model layer
Sufficient for medical world modeling?	No	One core requirement
Typical goal	Safer output	More inspectable biomedical reasoning

Or more simply:


Harness Engineering:

LLM → safer output

Steerable World Model:

biological state → action → transition hypothesis → feedback

These are not competing approaches.

Medical AI needs both.

But they operate at different layers.

6. The core architecture: state-action-transition-feedback

A medical AI system that approaches world-model behavior should include more than:


LLM + RAG + guardrails

It should include at least the following components:


State Representation

Action Representation

Transition Estimation

Counterfactual Reasoning

Mechanism Evidence Chain

Quality-Control Feedback

A high-level architecture may look like this:


Patient Data

↓

State Representation

S(t)

↓

Candidate Action Representation

A

↓

Transition Hypothesis Estimation

Ŝ(t + Δt | A)

↓

Mechanism Evidence Chain

↓

Uncertainty / Confidence

↓

Quality-Control Feedback

↓

Next Iteration

Let’s unpack the main components.

6.1 State representation

The first question is:


What is the current biological state?

In medicine, this state should not be reduced to:


Disease = diabetes

Disease = depression

Disease = rheumatoid arthritis

A disease label is a phenotype-level description.

It is not yet a world-model state space.

A richer state representation may include:

DNA methylation state
transcriptomic state
proteomic state
metabolomic state
immune state
inflammatory state
organ function
pathway activity
network module state
aging-related module state
individual longitudinal trajectory

For example:


S(t) = [
    immune_module_state,
    mitochondrial_module_state,
    metabolic_module_state,
    inflammation_resolution_state,
    organ_function_state,
    ...
]

This representation could be a vector, graph, hierarchy, or multimodal embedding.

The form may vary.

The key requirement is:

The model must define what biological state it is trying to simulate.
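One possible way to make the state explicit is a small typed container that keeps the disease label separate from the module-level state. This is a minimal sketch under stated assumptions: the class names, the score scale, and the sample values are hypothetical, and it is not the mIC-vector representation from the preprint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleState:
    name: str           # e.g. "immune_module_state"
    value: float        # illustrative score; the scale is an assumption
    uncertainty: float  # measurement uncertainty on the same scale


@dataclass(frozen=True)
class BiologicalState:
    subject_id: str
    time_days: float                     # time of measurement, t
    modules: tuple[ModuleState, ...]     # module-level state, S(t)
    disease_labels: tuple[str, ...] = () # phenotype labels, kept separate

    def as_vector(self) -> list[float]:
        """Flatten module values in a fixed order (sorted by module name)."""
        return [m.value for m in sorted(self.modules, key=lambda m: m.name)]


state = BiologicalState(
    subject_id="subject-001",
    time_days=0.0,
    modules=(
        ModuleState("metabolic_module_state", 0.4, 0.1),
        ModuleState("immune_module_state", 0.7, 0.05),
    ),
    disease_labels=("diabetes",),
)
print(state.as_vector())  # [0.7, 0.4]
```

Keeping `disease_labels` as a separate field reflects the point above: the label describes a phenotype, while `modules` is the state the model reasons over.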

In the Capomics / mIC-vector framework described in my preprint, an individual can be represented as a combination of module-level intrinsic capability states, rather than as a single biological age or risk score.

6.2 Action representation

In medicine, an action is not merely a label.


drug A

exercise

nutrition intervention

sleep improvement

behavioral therapy

These are surface names.

A world model needs to represent an action in terms of biological semantics:


A = {
    target_modules,
    mechanism,
    direction,
    dose,
    timing,
    duration,
    sequence,
    context,
    uncertainty
}

For example, the same “exercise intervention” may have different biological meanings in different individuals:


Individual 1: improves insulin sensitivity

Individual 2: increases inflammatory burden

Individual 3: improves mitochondrial adaptation

Individual 4: causes recovery failure due to overtraining

Therefore, medical actions must enter an intervention-response semantics layer.

They cannot remain simple database labels.
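The action fields listed in this section can be sketched as a typed record. The field types, the `+1` / `-1` direction encoding, and the two sample actions are assumptions made for illustration only; they are not clinical claims about exercise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    label: str                       # surface name, e.g. "exercise"
    target_modules: tuple[str, ...]
    mechanism: str
    direction: dict[str, int]        # module -> +1 (increase) or -1 (decrease)
    dose: str
    timing: str
    duration_days: int
    sequence: int                    # position in an intervention sequence
    context: str                     # baseline assumed for this semantics
    uncertainty: float               # 0.0 (certain) to 1.0 (unknown)


def same_label_different_semantics(a: Action, b: Action) -> bool:
    """True when two actions share a surface name but differ biologically."""
    return a.label == b.label and (
        a.direction != b.direction or a.target_modules != b.target_modules
    )


exercise_sensitizing = Action(
    label="exercise", target_modules=("metabolic_module_state",),
    mechanism="improved insulin sensitivity (illustrative)",
    direction={"metabolic_module_state": +1},
    dose="moderate", timing="morning", duration_days=84, sequence=1,
    context="baseline with recovery capacity", uncertainty=0.5,
)
exercise_overtraining = Action(
    label="exercise", target_modules=("inflammation_resolution_state",),
    mechanism="recovery failure due to overtraining (illustrative)",
    direction={"inflammation_resolution_state": -1},
    dose="high", timing="daily", duration_days=84, sequence=1,
    context="baseline with impaired recovery", uncertainty=0.5,
)
print(same_label_different_semantics(exercise_sensitizing, exercise_overtraining))  # True
```

Both records carry the label "exercise", yet they target different modules in different directions. A database keyed only on the label could not tell them apart.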

6.3 Transition estimation

The core question of a world model is:


Given current state S(t) and action A, how might the state change?

Formally:


S(t), A → Ŝ(t + Δt | A)

But in medicine, this must be handled carefully.

An early system should not claim:


The model predicts this treatment will work.

A more rigorous expression is:


The model proposes a mechanistically constrained and auditable hypothesis about the direction of state transition.

This can be described as:


knowledge-constrained transition tendency

Meaning:

Based on available biological mechanisms, network structure, individual state, and intervention semantics, the model estimates a possible direction of state change, which still requires experimental, longitudinal, or clinical validation.

This is one of the main boundaries between ordinary medical question-answering and medical world modeling.

A medical QA system asks:


What does the literature say?

A medical world model asks:


If the current state is like this, and this action is applied, which direction might the biological state move?

Can this transition hypothesis be tested?

If it fails, where might the failure occur?
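The idea of a knowledge-constrained transition tendency from this section can be sketched as follows. This is a toy under stated assumptions: the function name `propose_transitions`, the dictionary-based inputs, and the rule "no mechanism evidence means no proposed change" are all hypothetical design choices, not the method from the preprint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionHypothesis:
    module: str
    direction: int           # +1, -1, or 0 (no hypothesized change)
    evidence: str            # mechanism note supporting the direction
    validated: bool = False  # stays False until tested experimentally


def propose_transitions(
    state: dict[str, float],
    action_direction: dict[str, int],
    mechanism_evidence: dict[str, str],
) -> list[TransitionHypothesis]:
    """Propose a direction of change per module, only where evidence exists."""
    hypotheses = []
    for module in sorted(state):
        direction = action_direction.get(module, 0)
        evidence = mechanism_evidence.get(module)
        # Knowledge constraint: without mechanism evidence, propose no change.
        if evidence is None:
            direction, evidence = 0, "no mechanism evidence available"
        hypotheses.append(TransitionHypothesis(module, direction, evidence))
    return hypotheses


hypotheses = propose_transitions(
    state={"inflammation": 0.8, "metabolic": 0.4},
    action_direction={"inflammation": -1, "metabolic": +1},
    mechanism_evidence={"inflammation": "hypothetical pathway note"},
)
for h in hypotheses:
    print(h.module, h.direction, h.validated)
```

The output is a list of directional hypotheses, each flagged `validated=False`. Nothing in the sketch claims that the treatment works; it only records what would need to be tested.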

6.4 Counterfactual reasoning

Medical decision-making is naturally counterfactual.

We often want to ask:


What if we do A instead of B?

What if we sequence A before B?

What if we do nothing?

What if the same intervention is applied to different baseline states?

This cannot be solved by retrieval alone.

A world model should be able to represent:


Ŝ(t + Δt | A)

Ŝ(t + Δt | B)

Ŝ(t + Δt | no intervention)

Then compare those possible trajectories.

At the current stage, such comparisons should be understood as comparisons between counterfactual transition hypotheses, not as validated individualized clinical outcome predictions.

That distinction is critical.
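A minimal sketch of such a comparison is shown below. It assumes a deliberately simple additive toy transition, where each candidate action is a hypothesized per-module shift; the function names and all numbers are illustrative assumptions, not a biological model.

```python
def hypothesized_next(state: dict[str, float], effect: dict[str, float]) -> dict[str, float]:
    """Toy transition: add the hypothesized per-module shift to the state."""
    return {module: value + effect.get(module, 0.0) for module, value in state.items()}


def compare_counterfactuals(
    state: dict[str, float],
    candidate_effects: dict[str, dict[str, float]],
) -> dict[str, dict[str, float]]:
    """Return one hypothesized next state per candidate, plus a no-intervention arm."""
    candidates = {"no intervention": {}, **candidate_effects}
    return {name: hypothesized_next(state, effect) for name, effect in candidates.items()}


current = {"inflammation": 1.0, "metabolic": 0.5}
arms = compare_counterfactuals(
    current,
    {
        "A": {"inflammation": -0.25},
        "B": {"inflammation": -0.5, "metabolic": 0.25},
    },
)
for name, next_state in arms.items():
    print(name, next_state)
# no intervention {'inflammation': 1.0, 'metabolic': 0.5}
# A {'inflammation': 0.75, 'metabolic': 0.5}
# B {'inflammation': 0.5, 'metabolic': 0.75}
```

The explicit "no intervention" arm matters: without it, there is no baseline against which the hypothesized shifts of A and B can be compared.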

6.5 Quality-control feedback

A medical world model should not only generate hypotheses.

It should also help inspect failure.

If an expected transition does not occur, the system should ask:


1. Was the state measured incorrectly?
2. Was the action defined incorrectly?
3. Did the expected module response fail to occur?
4. Did the state fail to move in the expected direction?
5. Did downstream phenotype propagation fail?
6. Was the time window inappropriate?
7. Was the dose or sequence inappropriate?
8. Was the individual baseline different from assumed?

This can be represented as:


Expected Transition

↓

Observed Transition

↓

Deviation Detected

↓

Failure Localization

↓

Model Revision / Hypothesis Revision

This is where steerability becomes important.

A conventional model may only report:


prediction error

A steerable biomedical world model should help localize the failure:


failure occurred at state measurement

failure occurred at action semantics

failure occurred at transition assumption

failure occurred at downstream propagation

In the preprint, I describe this as a shift from a “what-if simulator” toward a “why-not steering system.”

The model should not only ask:


What if we do this?

It should also ask:


Why did the expected transition not happen?
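Failure localization of this kind can be sketched as an ordered series of checks, where the first failing check names the stage to inspect. The check names, the `FailureStage` enum, and the ordering are assumptions for illustration; a real system would need evidence-based criteria for each stage.

```python
from enum import Enum


class FailureStage(Enum):
    STATE_MEASUREMENT = "state measurement"
    ACTION_SEMANTICS = "action semantics"
    TRANSITION_ASSUMPTION = "transition assumption"
    DOWNSTREAM_PROPAGATION = "downstream propagation"
    NONE = "no deviation localized"


# Checks are ordered from upstream to downstream in the reasoning chain.
ORDERED_CHECKS = [
    ("state_measured_correctly", FailureStage.STATE_MEASUREMENT),
    ("action_delivered_as_defined", FailureStage.ACTION_SEMANTICS),
    ("module_responded_as_expected", FailureStage.TRANSITION_ASSUMPTION),
    ("phenotype_changed_as_expected", FailureStage.DOWNSTREAM_PROPAGATION),
]


def localize_failure(checks: dict[str, bool]) -> FailureStage:
    """Return the earliest stage whose check did not pass.

    A missing check is treated as not passed, which is the conservative choice.
    """
    for key, stage in ORDERED_CHECKS:
        if not checks.get(key, False):
            return stage
    return FailureStage.NONE


observed = {
    "state_measured_correctly": True,
    "action_delivered_as_defined": True,
    "module_responded_as_expected": False,
    "phenotype_changed_as_expected": False,
}
print(localize_failure(observed).value)  # transition assumption
```

Compared with a single prediction-error number, the returned stage tells a researcher which assumption to revisit first.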

7. Example: drug ranking vs steerable state transition

Suppose a patient shows inflammation-related abnormalities.

A basic biomedical knowledge system may output:


This drug is related to inflammatory pathways.

A harnessed medical AI system may output:


This drug is related to inflammatory pathways, but this is not medical advice. Please consult a physician.

That response is safer.

But it is still not a world model.

A steerable biomedical world model should ask further:


1. What is the current inflammatory module state?
2. Are the patient-specific abnormalities located in the pathway affected by this drug?
3. What are the drug action’s targets, direction, and module-response semantics?
4. Is the action hypothesized to move the state toward a desired biological direction?
5. Is the predicted shift molecular, functional, or phenotypic?
6. If the intervention fails, where might the failure occur?

A comparison:

System type	Output behavior	World-model capability?
Medical knowledge base	Provides related knowledge	No
RAG medical QA	Retrieves and summarizes	No
Guardrailed medical agent	Produces safer answers	Not sufficient
Drug ranking system	Produces candidate rankings	Not sufficient
Steerable biomedical world model	Builds a state-action-transition-feedback evidence chain	Begins to approach world modeling

I use “begins to approach” intentionally.

A true medical world model also requires data quality, longitudinal validation, intervention datasets, uncertainty calibration, safety boundaries, task-specific evaluation, and clinical assessment.

8. What should SteeraMed point toward?

If SteeraMed is understood as a research, method, or platform direction for steerable medical AI, then it should not be limited to:


Medical LLM + RAG + safety guardrails

That is useful, but it is mainly an application-layer medical AI system.

The deeper question is:


How can medical AI become steerable rather than merely constrained?

In other words:


How can medical AI move from external control toward structured biomedical reasoning?

Architecturally, SteeraMed could be designed as two layers:


SteeraMed Architecture

1. Harness Layer
    - permission control
    - safety boundaries
    - compliance rules
    - output validation
    - human review
    - audit logging
2. Steerability Layer
    - state representation
    - action semantics
    - counterfactual transition
    - mechanism evidence chain
    - quality-control feedback

The first layer asks:


Can the AI speak safely?

The second layer asks:


Can the biomedical state be represented, simulated, inspected, and corrected?

Both are necessary.

9. A layered architecture for serious medical AI

A serious medical AI system may require at least five layers:


┌────────────────────────────────────┐
│ Human Oversight Layer              │
│ Clinicians, researchers, users     │
└────────────────────────────────────┘
                  ↑
┌────────────────────────────────────┐
│ Clinical Governance Layer          │
│ Scope, responsibility, regulation  │
└────────────────────────────────────┘
                  ↑
┌────────────────────────────────────┐
│ Harness Engineering Layer          │
│ Prompt / RAG / Guardrail / Audit   │
└────────────────────────────────────┘
                  ↑
┌────────────────────────────────────┐
│ Steerable World Model Layer        │
│ State / Action / Transition / QC   │
└────────────────────────────────────┘
                  ↑
┌────────────────────────────────────┐
│ Biomedical Data Layer              │
│ Omics / EHR / Wearables / Imaging  │
└────────────────────────────────────┘

Each layer addresses a different problem:

Layer	Problem addressed
Biomedical Data Layer	Where the data come from
Steerable World Model Layer	How biological state and transition are modeled
Harness Engineering Layer	How AI behavior is constrained and validated
Clinical Governance Layer	Whether the system is appropriate for a real context
Human Oversight Layer	How humans interpret and decide

This helps avoid a common mistake:

External guardrails cannot replace an internal biomedical world model.

And a world-model concept cannot replace governance, validation, and safe deployment.

10. A checklist for medical AI builders

If you are building a medical AI system, here is a simple checklist.

Harness engineering checklist


[ ] Is there a system prompt?

[ ] Are medical safety boundaries defined?

[ ] Is high-risk intent detected?

[ ] Are diagnosis and prescription boundaries enforced?

[ ] Are RAG sources traceable?

[ ] Is output validation implemented?

[ ] Is human review available?

[ ] Are audit logs retained?

[ ] Is uncertainty expressed?

[ ] Is there a medical disclaimer?

Steerable world model checklist


[ ] Is the patient state space defined?

[ ] Can the system distinguish disease labels from biological states?

[ ] Can interventions be represented as actions?

[ ] Are intervention-response semantics defined?

[ ] Can transition direction be estimated?

[ ] Can counterfactual paths be compared?

[ ] Can mechanism evidence chains be generated?

[ ] Can uncertainty be represented?

[ ] Can failures be localized?

[ ] Can feedback revise the next hypothesis?

If a system only satisfies the first checklist, it is a harnessed medical AI system.

If it also begins to satisfy the second checklist, it starts moving toward a medical world model.

11. Scientific boundary and risk clarification

A steerable biomedical world model is not a clinical automation system.

It should not be interpreted as:


AI can replace physicians.

AI can automatically recommend treatment.

AI can predict individual treatment efficacy.

AI can be directly used for clinical decisions.

A more accurate framing is:


A research architecture for generating, organizing, and testing biomedical state-transition hypotheses.

Any clinical use would require:

prospective validation
clinical studies
safety evaluation
real-world follow-up
physician oversight
regulatory review
clearly defined scope
responsibility boundaries

At the current stage, steerability is best understood as:


research framework

engineering architecture

mechanism-reasoning system

hypothesis-generation system

not as a validated clinical product capability.

Disclaimer:

This article is for research and technical discussion only. It does not provide medical advice, diagnosis, or treatment recommendations. Any biomedical world model intended for clinical use would require prospective validation, safety evaluation, regulatory review where applicable, and clinical oversight.

12. Conclusion

Harness engineering and steerable biomedical world modeling are both important.

But they solve different problems.

Harness engineering asks:


How do we make AI systems safer, more controllable, and more compliant?

Steerable biomedical world modeling asks:


How do we make medical states, interventions, and state transitions representable, inspectable, testable, and correctable?

Put simply:


Harness engineering makes medical AI safer to use.

Steerable world modeling makes biomedical reasoning more inspectable.

The future of medical AI should not be only:


larger models
+ more medical literature
+ more complex agent workflows

It also needs:


explicit state representation
+ explicit action semantics
+ testable transition hypotheses
+ auditable mechanism chains
+ diagnosable feedback loops

The first truly useful medical world models may not be the ones that claim to predict every treatment outcome.

They may be more modest, more auditable, and more falsifiable systems:

They do not claim to know the future.

They make every assumption about state, action, transition, mechanism, and uncertainty explicit enough to be tested.

That may be the key step from predictive medical AI toward steerable medical AI.

References

Xiong J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. DOI: https://doi.org/10.20944/preprints202605.0366.v1

SEWO / Steerable Medicine World Model: https://steerable.world

SteeraMed concept site: https://steeramed.com

SteeraMed concept site: https://steeramed.org

DeepOMe / DeepOMe Biology and Longevity AI: https://deepome.com
