DEV Community

JXIONG

Building a Biomedical World Model in Python: SteeraMed Core

What if you could ask: "Which compound is most likely to reverse this specific patient's molecular aging?" — and get a 4-layer auditable evidence chain, not a black-box recommendation?

That's what SteeraMed Core does. It's an open-source Python package that applies the "world model" concept from reinforcement learning to biomedicine: quantify how an individual's biology deviates from normal, then simulate which compounds can steer it back.

No PyTorch. No TensorFlow. No GPU. Just numpy, pandas, scipy, and matplotlib.

pip install steeramed-core

Why This Matters

Epigenetic clocks (Horvath 2013, Hannum 2013) can measure your "biological age" from DNA methylation. But measuring aging is only step one — the real challenge is intervention: how do you steer molecular state back toward a younger profile?

The Hallmarks of Aging framework (López-Otín et al., Cell 2013, >15k citations) defined 9 hallmarks of aging, expanded to 12 in the 2023 update (Hallmarks of aging: An expanding universe). SteeraMed operationalizes this framework using the MSigDB Hallmark 50 gene sets (Liberzon et al., Cell Systems 2015) — 50 curated biological pathway gene sets covering aging, cancer, immunity, and metabolism.

Note: MSigDB "Hallmark" (50 pathway gene sets) ≠ Hallmarks of Aging (12 aging pillars). SteeraMed uses the former as its functional module definitions.

What Is a Biomedical World Model?

In reinforcement learning, a world model (Ha & Schmidhuber, 2018) is an internal simulator that predicts the consequences of different actions. Think of AlphaGo mentally simulating "if I play here, what will my opponent do?"

Applied to biomedicine:

  • State representation: Quantify how an individual deviates across 50 biological pathway modules using DNA methylation
  • Action simulation: Simulate "if we apply compound X, which disrupted modules get corrected" on the PPI network
  • Auditable reasoning: Generate a 4-layer traceable evidence chain, not a black-box output

| | Traditional Systems Biology / AI Drug Discovery | Biomedical World Model |
|---|---|---|
| Unit of analysis | Population mean | Individual (N-of-1) |
| Inference direction | Forward (drug → effect) | Reverse (deviation → corrective drug) |
| Output | Drug repurposing candidates | 4-layer individualized evidence chain |
| Confidence | Clinical trial statistics | Bootstrap resampling confidence |

Project Architecture

steeramed_core/
├── __init__.py              # Entry: EvidenceChain + load_example_patient
├── __main__.py              # CLI: interactive case selector + batch mode
├── core/                    # Core algorithms
│   ├── config.py            # Global config + disease presets
│   ├── delta.py             # N-of-1 delta vector computation
│   ├── evidence_chain.py    # 4-layer evidence chain data structures
│   └── semo.py              # SA scoring + compound ranking
├── presets/                 # Pre-computed data (3 real clinical cases)
│   ├── catalog.json         # Case catalog
│   ├── datasets.json        # GEO dataset metadata
│   ├── positive_controls.json # Known drug ground truth
│   └── example_patients/    # 3 JSON patient files
├── viz/                     # Visualization (Nature-style theme)
│   ├── theme.py             # Color palette + rcParams
│   ├── hallmark_bar.py      # Hallmark perturbation bar chart
│   ├── drug_ranking.py      # Top-10 compound ranking chart
│   ├── evidence_network.py  # Drug-PPI-Hallmark network graph
│   ├── patient_card.py      # Single-page patient summary card
│   ├── evidence_view.py     # Scientist view (Paper Fig 6/8)
│   └── patient_view.py      # Patient view (Paper Fig 4/7)
└── examples/                # Reproduction scripts
    ├── reproduce_aging_patient_view.py
    ├── reproduce_ra_evidence_chain.py
    └── reproduce_dep_evidence_chain.py

Minimal dependencies:

dependencies = [
    "numpy>=1.21",
    "pandas>=1.3",
    "scipy>=1.7",
    "matplotlib>=3.5",
]

The 4-Layer Evidence Chain

Layer 1: Which Hallmark Pathways Are Disrupted?

Map DNA methylation to PPI (protein-protein interaction) network modules. Evaluate all 50 modules and flag those that deviate significantly from age- and sex-matched controls.

# core/delta.py — N-of-1 Delta vector
def compute_n1_delta(patient_genes, control_genes):
    """
    Δ_i = x_i - x̄_matched
    Matched controls: same sex, age ±5 years, K=10
    """
    matched_mean = control_genes.mean(axis=0)
    return patient_genes - matched_mean
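
The matching step named in the docstring (same sex, age within 5 years, K=10) is not part of the excerpt. A minimal plain-Python sketch of how it could work is below. The function names `select_matched_controls` and `n1_delta`, the record layout, and the closest-age tie-break are illustrative assumptions, not the package's code.

```python
from typing import Dict, List

def select_matched_controls(patient: Dict, controls: List[Dict],
                            k: int = 10, caliper: float = 5.0) -> List[Dict]:
    """Pick up to k same-sex controls within +/- caliper years of the patient."""
    eligible = [
        c for c in controls
        if c["sex"] == patient["sex"] and abs(c["age"] - patient["age"]) <= caliper
    ]
    # Assumption: prefer the closest ages when more than k controls qualify.
    eligible.sort(key=lambda c: abs(c["age"] - patient["age"]))
    return eligible[:k]

def n1_delta(patient_values: List[float], matched: List[Dict]) -> List[float]:
    """Per-feature delta: patient value minus the matched-control mean."""
    n = len(matched)
    means = [sum(c["values"][j] for c in matched) / n
             for j in range(len(patient_values))]
    return [x - m for x, m in zip(patient_values, means)]

# Toy records; every value here is made up for illustration.
patient = {"age": 51, "sex": "M", "values": [0.60, 0.20]}
controls = [
    {"age": 49, "sex": "M", "values": [0.50, 0.25]},
    {"age": 55, "sex": "M", "values": [0.40, 0.35]},
    {"age": 52, "sex": "F", "values": [0.90, 0.90]},  # excluded: sex mismatch
    {"age": 60, "sex": "M", "values": [0.10, 0.10]},  # excluded: outside caliper
]
matched = select_matched_controls(patient, controls)
delta = n1_delta(patient["values"], matched)
```

Only the two male controls aged 49 and 55 survive the filter, so the delta is computed against their mean alone.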

Example findings in the aging case:

  • NAD+ metabolism module disrupted (Loss of NAD+)
  • Inflammatory modules upregulated (TNFα/NF-κB, IL-6/JAK/STAT3)
  • Protein homeostasis disturbed (Unfolded Protein Response)
  • Some modules remain normal (Hedgehog, Notch signaling)
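
The post does not say which test flags a module as significantly deviating. One simple possibility, sketched here purely as an assumption, is a one-sample t statistic of each module's gene deltas against zero, with a cutoff on its absolute value. The function name, the toy deltas, the module names, and the cutoff of 2.0 are all hypothetical.

```python
import math
import statistics
from typing import Dict, List

def module_t_scores(delta: Dict[str, float],
                    modules: Dict[str, List[str]]) -> Dict[str, float]:
    """One-sample t statistic of each module's gene deltas against zero."""
    scores = {}
    for name, genes in modules.items():
        values = [delta[g] for g in genes if g in delta]
        if len(values) < 2:
            continue  # not enough genes to estimate a variance
        sd = statistics.stdev(values)
        if sd == 0:
            continue  # constant deltas: the statistic is undefined
        scores[name] = statistics.mean(values) / (sd / math.sqrt(len(values)))
    return scores

# Made-up per-gene deltas: module_A is shifted, module_B hovers around zero.
delta = {"g1": 0.30, "g2": 0.35, "g3": 0.25, "g4": 0.02, "g5": -0.03, "g6": 0.01}
modules = {"module_A": ["g1", "g2", "g3"], "module_B": ["g4", "g5", "g6"]}

scores = module_t_scores(delta, modules)
CUTOFF = 2.0  # hypothetical threshold
disrupted = [name for name, t in scores.items() if abs(t) > CUTOFF]
```

With these toy numbers only `module_A` is flagged, mirroring the pattern above where some modules are disrupted and others remain normal.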

Layer 2: Which Compounds Can "Steer Back"?

Compute a Steerability Alignment Score (SA Score) — essentially a Welch-type contrast statistic comparing methylation deltas of compound target genes vs. non-target genes.

# core/semo.py
from scipy import stats

def compute_sa_score(delta, target_genes, all_genes):
    """
    SA Score = Welch t-statistic
    Compare compound target genes vs non-targets in disrupted modules
    """
    # delta: pandas Series of per-gene deltas, indexed by gene symbol
    is_target = delta.index.isin(target_genes)
    target_delta = delta[is_target]
    non_target_delta = delta[~is_target]
    # welch_t is not defined in this excerpt; SciPy's unequal-variance t-test stands in for it
    t_stat, _ = stats.ttest_ind(target_delta, non_target_delta, equal_var=False)
    return t_stat
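
For reference, Welch's t statistic is the difference of the two group means divided by a standard error that does not pool the group variances. The helper below is a standard-library sketch of that formula, not the package's own implementation, and the delta values are made up.

```python
import math
import statistics
from typing import Sequence

def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t: (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Illustrative deltas for a compound's target genes and for all other genes.
target_delta = [0.30, 0.40, 0.50]
non_target_delta = [0.00, 0.10, -0.10, 0.05, -0.05]
score = welch_t(target_delta, non_target_delta)
```

A large absolute value means the compound's targets sit in a part of the delta distribution that differs clearly from the background. How the sign maps onto "steering back" depends on conventions the post does not spell out.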

Compound-target data comes from the STITCH database. The ranking uses importance voting across bootstrap samples:

from collections import defaultdict

def rank_compounds_by_importance(sa_matrix, compounds):
    """
    Importance Voting: each sample's top-1 compound gets 1 vote
    sa_matrix is assumed to be a DataFrame (rows = samples, columns = compounds)
    """
    votes = defaultdict(int)
    # iterrows() yields one row per sample; iterating the DataFrame itself yields column labels
    for _, sample_sa in sa_matrix.iterrows():
        top_compound = sample_sa.idxmax()
        votes[top_compound] += 1
    return sorted(votes.items(), key=lambda x: -x[1])
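
To make the voting rule concrete, here is a small self-contained sketch that uses plain dictionaries instead of a DataFrame. `rank_by_votes` and the SA scores are illustrative only, and `compound_x` is a placeholder name.

```python
from collections import Counter
from typing import Dict, List, Tuple

def rank_by_votes(sa_rows: List[Dict[str, float]]) -> List[Tuple[str, int]]:
    """Each row maps compound -> SA score; the row's best compound gets one vote."""
    votes = Counter(max(row, key=row.get) for row in sa_rows)
    return votes.most_common()

# Three samples with made-up SA scores.
sa_rows = [
    {"niacin": 2.1, "colchicine": 1.8, "compound_x": 0.3},
    {"niacin": 1.2, "colchicine": 1.9, "compound_x": 0.1},
    {"niacin": 2.5, "colchicine": 1.1, "compound_x": 0.4},
]
ranking = rank_by_votes(sa_rows)
print(ranking)  # [('niacin', 2), ('colchicine', 1)]
```

Note that a compound which never wins a sample receives no votes and therefore does not appear in the ranking at all.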

Aging case results: Niacin ranks #1 (targets NAD+ metabolism) and Colchicine ranks #2 (anti-inflammatory). Two of the top five hits are known geroprotectors.

Layer 3: Mechanism Traceability

Trace each compound's mechanism: compound targets → PPI network neighbors → hub genes → corresponding Hallmark pathway.

Example: Niacin → NAMPT/NAPRT → NAD+ metabolism module → Loss of NAD+
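
The trace can be pictured as a lookup across three mappings: compound to targets, gene to PPI neighbors, and module to member genes. The sketch below is an assumption about the shape of that lookup, not the package's code. Gene names prefixed `HYPOTHETICAL` are placeholders, only one-hop neighbors are followed, and the hub-gene step is omitted.

```python
from typing import Dict, List, Set

def trace_mechanism(compound: str,
                    targets: Dict[str, Set[str]],
                    ppi: Dict[str, Set[str]],
                    modules: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Per module, list the genes reached via direct targets or their PPI neighbors."""
    direct = targets.get(compound, set())
    reached = set(direct)
    for gene in direct:
        reached |= ppi.get(gene, set())  # one-hop neighbors only
    return {name: sorted(genes & reached)
            for name, genes in modules.items() if genes & reached}

targets = {"niacin": {"NAMPT", "NAPRT"}}           # from the example above
ppi = {"NAMPT": {"HYPOTHETICAL_GENE_1"}}           # placeholder neighbor
modules = {
    "NAD+ metabolism": {"NAMPT", "NAPRT", "HYPOTHETICAL_GENE_1"},
    "Notch signaling": {"HYPOTHETICAL_GENE_2"},    # placeholder member
}
trace = trace_mechanism("niacin", targets, ppi, modules)
```

Modules with no reached genes are dropped from the result, so the output names only the pathways a compound can plausibly touch.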

Layer 4: Bootstrap Confidence

Ranking stability is tested with 1000 bootstrap resamples. The top-1 compound's retention rate, meaning the share of resamples in which it stays ranked first, determines the evidence level:

| Level | Bootstrap Stability | Meaning |
|---|---|---|
| STRONG | ≥80% | Robust recommendation |
| MODERATE | 50% to <80% | Reasonable evidence |
| EXPLORATORY | <50% | Hypothesis-generating only |
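
A rough sketch of how a retention rate and its evidence level could be computed is shown below. It assumes per-sample SA scores stored as dictionaries and resamples those samples with replacement; the function names, the seed, and treating exactly 80% and 50% as belonging to the higher level are assumptions.

```python
import random
from collections import Counter
from typing import Dict, List

def top1_retention(sa_rows: List[Dict[str, float]], n_boot: int = 1000,
                   seed: int = 0) -> float:
    """Fraction of bootstrap resamples whose vote winner matches the full-data winner."""
    def winner(rows: List[Dict[str, float]]) -> str:
        votes = Counter(max(r, key=r.get) for r in rows)
        return votes.most_common(1)[0][0]

    rng = random.Random(seed)
    reference = winner(sa_rows)
    kept = sum(
        winner(rng.choices(sa_rows, k=len(sa_rows))) == reference
        for _ in range(n_boot)
    )
    return kept / n_boot

def evidence_level(retention: float) -> str:
    """Map a top-1 retention rate in [0, 1] to the levels in the table."""
    if retention >= 0.80:
        return "STRONG"
    if retention >= 0.50:
        return "MODERATE"
    return "EXPLORATORY"

print(evidence_level(0.245))  # EXPLORATORY
```

The 24.5% stability reported for the depression case falls below the 50% threshold, which is why it is labeled EXPLORATORY.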

The evidence chain is a clean dataclass:

# core/evidence_chain.py
@dataclass
class EvidenceChain:
    patient_id: str
    disease: str
    perturbed_modules: List[PPIModule]     # Layer 1
    top_compounds: List[CompoundMatch]     # Layer 2
    mechanism_map: dict                    # Layer 3
    bootstrap_stability: dict              # Layer 4

    def summary(self) -> str: ...
    def to_dict(self) -> dict: ...
    def to_json(self, path: str) -> None: ...

    @classmethod
    def from_dict(cls, data: dict) -> 'EvidenceChain':
        # Backward compatible: ignores unknown fields
        ...
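
The "ignores unknown fields" behavior of `from_dict` can be achieved in several ways. One common pattern, shown here on a hypothetical stand-in class `MiniChain` rather than the real `EvidenceChain`, filters the incoming dictionary against the dataclass's declared fields.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import List

@dataclass
class MiniChain:  # hypothetical, simplified stand-in for EvidenceChain
    patient_id: str
    disease: str
    top_compounds: List[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "MiniChain":
        known = {f.name for f in fields(cls)}
        # Drop any keys this version of the class does not define.
        return cls(**{k: v for k, v in data.items() if k in known})

payload = {
    "patient_id": "demo",
    "disease": "RA",
    "top_compounds": ["pentoxifylline"],
    "added_in_future_version": 1,  # unknown key, silently ignored
}
chain = MiniChain.from_dict(payload)
```

Filtering on declared fields lets files written by a newer version load in an older one, at the cost of silently discarding data the older version cannot represent.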

Quick Start

Current version: SteeraMed Core is a proof-of-concept demo with 3 built-in real clinical cases from GEO. Custom data upload (450K/EPIC methylation arrays) is coming in future releases. Follow updates at steerable.world.

Interactive Mode

pip install steeramed-core
python -m steeramed_core

This starts the interactive case selector:

SteeraMed Core — N-of-1 Evidence Chain Explorer
=================================================

Select a patient case:
  [1] Aging · Population Screening
  [2] RA · 51M · T-cell Perturbation
  [3] Depression · 52M · Innate Immunity

Enter choice [1-3]:

Batch Mode

python -m steeramed_core --all             # Generate all cases
python -m steeramed_core --case ra_303     # Specific case
python -m steeramed_core --list            # List available cases

Python API — Load & Inspect

from steeramed_core import EvidenceChain, load_example_patient

patient = load_example_patient("ra_patient_303")
print(patient.summary())

# Inspect the 4 layers
print(f"Disrupted modules: {len(patient.perturbed_modules)}")
print(f"Top compound: {patient.top_compounds[0].compound_id}")
print(f"Bootstrap stability: {patient.bootstrap_stability}")

Python API — Generate All Charts

from steeramed_core import load_example_patient
from steeramed_core.viz.patient_card import plot_patient_card
from steeramed_core.viz.drug_ranking import plot_drug_ranking
from steeramed_core.viz.hallmark_bar import plot_hallmark_bar
from steeramed_core.viz.evidence_network import plot_evidence_network

data = load_example_patient("ra_patient_303").to_dict()

# 4 charts: hallmark perturbation, drug ranking, network, patient card
for fn, name in [
    (plot_hallmark_bar, "hallmark_bar"),
    (plot_drug_ranking, "drug_ranking"),
    (plot_evidence_network, "evidence_network"),
    (plot_patient_card, "patient_card"),
]:
    fig = fn(data)
    fig.savefig(f"{name}.png", dpi=300, bbox_inches="tight")

Python API — Publication-Grade Figures

from steeramed_core import load_example_patient
from steeramed_core.viz.evidence_view import plot_evidence_chain
from steeramed_core.viz.patient_view import plot_patient_view

# Load the example case as a plain dict, the input the plot functions take
data = load_example_patient("ra_patient_303").to_dict()

# Scientist view: 3-panel evidence chain (Paper Fig 6/8 style)
fig = plot_evidence_chain(data)
fig.savefig("evidence_chain.png", dpi=300, bbox_inches="tight")

# Patient view: 3-panel card (Paper Fig 4/7 style)
fig = plot_patient_view(data)
fig.savefig("patient_view.png", dpi=300, bbox_inches="tight")

Validation Results

Retrospective positive control validation on 3 GEO datasets:

| Cohort | Disease | N | Key Finding | Evidence |
|---|---|---|---|---|
| GSE40279 (Hannum) | Aging | 656 | Niacin #1, 2/5 geroprotectors | MODERATE |
| GSE42861 | Rheumatoid Arthritis | 689 | 6/10 known RA drugs recovered, pentoxifylline #1 | STRONG |
| GSE128235 | Depression (MDD) | 533 | Creatine #1, innate immunity dominant | EXPLORATORY |

The RA cohort is the strongest result: known RA drugs recovered at 5.8x random expectation, confirming that PPI module-level alignment captures meaningful drug-disease matches.
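
The post does not state how the 5.8x figure was computed. If it is read as the top-10 hit rate (6/10) divided by the rate expected from random draws, which is an assumption, the arithmetic can be inverted to see what background rate that implies. Both function names are illustrative.

```python
def fold_enrichment(hits: int, top_n: int, background_rate: float) -> float:
    """Observed hit rate in the top-N list divided by the rate expected at random."""
    return (hits / top_n) / background_rate

def implied_background_rate(hits: int, top_n: int, fold: float) -> float:
    """Invert the ratio: the background rate that would produce the given fold."""
    return (hits / top_n) / fold

rate = implied_background_rate(hits=6, top_n=10, fold=5.8)
print(f"implied background rate: {rate:.3f}")  # implied background rate: 0.103
```

Under that reading, roughly one in ten compounds in the candidate pool would be a known RA drug. A different definition of "random expectation" would give a different number.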

The depression cohort's top-1 compound (creatine) has only 24.5% bootstrap stability — honestly flagged as EXPLORATORY. This reflects the high heterogeneity and weak methylation signal in MDD.

Configuration

All hyperparameters live in core/config.py:

# PPI network
PPI_SCORE_CUTOFF = 400    # STRING confidence threshold
PPI_MIN_SIZE = 20         # Min genes per module
PPI_MAX_SIZE = 800        # Max genes per module

# Compound targets
STITCH_SCORE = 200        # STITCH confidence threshold
# Target count range: [60, 300]

# Bootstrap
BOOTSTRAP_N1 = 200        # N-of-1 resampling iterations
BOOTSTRAP_GROUP = 100     # Group-level iterations

# Matching
MATCH_K = 10              # Number of matched controls
MATCH_CALIPER = 5         # Age matching window (years)
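
As a sketch of how such thresholds are typically applied, the snippet below filters PPI edges by score and gene groups by size. The helper names and the inclusive boundaries are assumptions; only the constant values come from the configuration above.

```python
from typing import Dict, List, Set, Tuple

PPI_SCORE_CUTOFF = 400
PPI_MIN_SIZE, PPI_MAX_SIZE = 20, 800
TARGET_COUNT_RANGE = (60, 300)

Edge = Tuple[str, str, int]  # (gene_a, gene_b, confidence score)

def filter_edges(edges: List[Edge], cutoff: int = PPI_SCORE_CUTOFF) -> List[Edge]:
    """Keep PPI edges whose confidence score is at or above the cutoff."""
    return [(a, b, s) for a, b, s in edges if s >= cutoff]

def filter_by_size(groups: Dict[str, Set[str]], lo: int, hi: int) -> Dict[str, Set[str]]:
    """Keep gene groups (modules or compound target sets) with lo <= size <= hi."""
    return {name: genes for name, genes in groups.items() if lo <= len(genes) <= hi}

# Toy inputs with placeholder gene and module names.
edges = [("A", "B", 900), ("A", "C", 400), ("B", "C", 150)]
modules = {
    "too_small": {f"g{i}" for i in range(5)},
    "ok": {f"g{i}" for i in range(25)},
}
kept_edges = filter_edges(edges)
kept_modules = filter_by_size(modules, PPI_MIN_SIZE, PPI_MAX_SIZE)
```

The same size filter could be reused for compound target sets by passing `*TARGET_COUNT_RANGE` as the bounds.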

Visualization API

| Function | Chart Type | Size (inches) | Purpose |
|---|---|---|---|
| plot_hallmark_bar() | Horizontal bar | 8 x N | Hallmark perturbation magnitude |
| plot_drug_ranking() | Horizontal bar | 8 x N | Top-10 compound ranking |
| plot_evidence_network() | Bipartite network | 10 x 7 | Drug-Hallmark alignment |
| plot_patient_card() | Card layout | 8 x 10 | Single-page patient summary |
| plot_evidence_chain() | 3-panel | 7.2 x 9.0 | Scientist view (Paper Fig 6/8) |
| plot_patient_view() | 3-panel card | 7.5 x 10.5 | Patient view (Paper Fig 4/7) |

All viz functions use matplotlib.use('Agg') — works on headless servers and CI.

Honest Limitations

  1. Retrospective validation only — positive control recovery, not prospective clinical trials
  2. Bootstrap confidence varies — depression case is only 24.5% stable (EXPLORATORY)
  3. Single omics — DNA methylation only; transcriptomics/proteomics coming in future versions
  4. Simple matching — age ±5 years + sex, doesn't control for cell composition
  5. Demo stage — 3 preset cases, custom data upload coming soon

References

  1. López-Otín C, et al. The hallmarks of aging. Cell, 2013, 153(6): 1194-1217.
  2. López-Otín C, et al. Hallmarks of aging: An expanding universe. Cell, 2023, 186(2): 243-278.
  3. Horvath S. DNA methylation age of human tissues and cell types. Genome Biology, 2013, 14: R115.
  4. Hannum G, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Molecular Cell, 2013, 49(2): 359-367.
  5. Ha D, Schmidhuber J. World models. arXiv:1803.10122, 2018.
  6. Liberzon A, et al. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Systems, 2015, 1: 417-425.
  7. Xiong J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. DOI: 10.20944/preprints202605.0366.v1
  8. Xiong J. SteeraMed: A Biomedical World Model for N-of-1 Intervention Reasoning. Preprints.org, 2026. DOI: 10.20944/preprints202605.1578.v1


SteeraMed is developed by the DeepoMe team. If you're interested in computational longevity, epigenetic clocks, precision medicine, or AI-driven drug discovery — let's connect!
