What if you could ask: "Which compound is most likely to reverse this specific patient's molecular aging?" — and get a 4-layer auditable evidence chain, not a black-box recommendation?
That's what SteeraMed Core does. It's an open-source Python package that applies the "world model" concept from reinforcement learning to biomedicine: quantify how an individual's biology deviates from normal, then simulate which compounds can steer it back.
No PyTorch. No TensorFlow. No GPU. Just numpy, pandas, scipy, and matplotlib.
pip install steeramed-core
Why This Matters
Epigenetic clocks (Horvath 2013, Hannum 2013) can measure your "biological age" from DNA methylation. But measuring aging is only step one — the real challenge is intervention: how do you steer molecular state back toward a younger profile?
The Hallmarks of Aging framework (López-Otín et al., Cell 2013, >15k citations) defined 9 hallmarks of aging, expanded to 12 in the 2023 update (Hallmarks of aging: An expanding universe). SteeraMed operationalizes this framework using the MSigDB Hallmark 50 gene sets (Liberzon et al., Cell Systems 2015) — 50 curated biological pathway gene sets covering aging, cancer, immunity, and metabolism.
Note: MSigDB "Hallmark" (50 pathway gene sets) ≠ Hallmarks of Aging (12 aging pillars). SteeraMed uses the former as its functional module definitions.
What Is a Biomedical World Model?
In reinforcement learning, a world model (Ha & Schmidhuber, 2018) is an internal simulator that predicts the consequences of different actions. Think of AlphaGo mentally simulating "if I play here, what will my opponent do?"
Applied to biomedicine:
- State representation — Quantify how an individual deviates across 50 biological pathway modules using DNA methylation
- Action simulation — Simulate "if we apply compound X, which disrupted modules get corrected" on the PPI network
- Auditable reasoning — Generate a 4-layer traceable evidence chain, not a black-box output
| | Traditional Systems Biology / AI Drug Discovery | Biomedical World Model |
|---|---|---|
| Unit of analysis | Population mean | Individual (N-of-1) |
| Inference direction | Forward (drug → effect) | Reverse (deviation → corrective drug) |
| Output | Drug repurposing candidates | 4-layer individualized evidence chain |
| Confidence | Clinical trial statistics | Bootstrap resampling confidence |
Project Architecture
steeramed_core/
├── __init__.py # Entry: EvidenceChain + load_example_patient
├── __main__.py # CLI: interactive case selector + batch mode
├── core/ # Core algorithms
│ ├── config.py # Global config + disease presets
│ ├── delta.py # N-of-1 delta vector computation
│ ├── evidence_chain.py # 4-layer evidence chain data structures
│ └── semo.py # SA scoring + compound ranking
├── presets/ # Pre-computed data (3 real clinical cases)
│ ├── catalog.json # Case catalog
│ ├── datasets.json # GEO dataset metadata
│ ├── positive_controls.json # Known drug ground truth
│ └── example_patients/ # 3 JSON patient files
├── viz/ # Visualization (Nature-style theme)
│ ├── theme.py # Color palette + rcParams
│ ├── hallmark_bar.py # Hallmark perturbation bar chart
│ ├── drug_ranking.py # Top-10 compound ranking chart
│ ├── evidence_network.py # Drug-PPI-Hallmark network graph
│ ├── patient_card.py # Single-page patient summary card
│ ├── evidence_view.py # Scientist view (Paper Fig 6/8)
│ └── patient_view.py # Patient view (Paper Fig 4/7)
└── examples/ # Reproduction scripts
    ├── reproduce_aging_patient_view.py
    ├── reproduce_ra_evidence_chain.py
    └── reproduce_dep_evidence_chain.py
Minimal dependencies:
dependencies = [
"numpy>=1.21",
"pandas>=1.3",
"scipy>=1.7",
"matplotlib>=3.5",
]
The 4-Layer Evidence Chain
Layer 1: Which Hallmark Pathways Are Disrupted?
Map DNA methylation to PPI (protein-protein interaction) network modules. Evaluate all 50 modules and flag those that deviate significantly from age- and sex-matched controls.
# core/delta.py — N-of-1 Delta vector
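def compute_n1_delta(patient_genes, control_genes):
    """
    Δ_i = x_i - x̄_matched
    Matched controls: same sex, age ±5 years, K=10
    """
    # Assumes control_genes already holds only the matched controls
    # (rows = controls, columns = genes), so axis=0 averages per gene.
    matched_mean = control_genes.mean(axis=0)
    return patient_genes - matched_mean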
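The snippet above starts from controls that are already matched. A minimal sketch of the matching step itself, assuming a metadata table with `age` and `sex` columns and nearest-age-first selection (the helper names, the schema, and the tie-breaking rule are illustrative assumptions, not the package's API), might look like this:

```python
import pandas as pd

def select_matched_controls(patient_age, patient_sex, controls, k=10, caliper=5):
    """Return up to k same-sex controls within +/- caliper years, nearest age first.

    `controls` is assumed to carry 'age' and 'sex' columns (hypothetical schema).
    """
    same_sex = controls[controls["sex"] == patient_sex]
    age_gap = (same_sex["age"] - patient_age).abs()
    eligible = age_gap[age_gap <= caliper].sort_values(kind="stable")
    return same_sex.loc[eligible.index[:k]]

def n1_delta(patient_values, control_values, matched_ids):
    """Delta_i = x_i minus the per-gene mean of the matched controls."""
    return patient_values - control_values.loc[matched_ids].mean(axis=0)

# Toy data: five controls, two genes (all values are made up for illustration).
controls_meta = pd.DataFrame(
    {"age": [48, 52, 60, 50, 51], "sex": ["M", "M", "M", "F", "M"]},
    index=["c1", "c2", "c3", "c4", "c5"],
)
control_values = pd.DataFrame(
    {"GENE_A": [0.10, 0.20, 0.90, 0.50, 0.30],
     "GENE_B": [0.40, 0.40, 0.10, 0.70, 0.40]},
    index=controls_meta.index,
)
patient_values = pd.Series({"GENE_A": 0.50, "GENE_B": 0.40})

matched = select_matched_controls(51, "M", controls_meta, k=2)
delta = n1_delta(patient_values, control_values, matched.index)
print(list(matched.index))
print(delta)
```

With K=2 the 60-year-old and the female control are excluded, and the delta is taken against the two closest male controls only.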
Example findings in the aging case:
- NAD+ metabolism module disrupted (Loss of NAD+)
- Inflammatory modules upregulated (TNFα/NF-κB, IL-6/JAK/STAT3)
- Protein homeostasis disturbed (Unfolded Protein Response)
- Some modules remain normal (Hedgehog, Notch signaling)
Layer 2: Which Compounds Can "Steer Back"?
Compute a Steerability Alignment Score (SA Score) — essentially a Welch-type contrast statistic comparing methylation deltas of compound target genes vs. non-target genes.
# core/semo.py
def compute_sa_score(delta, target_genes, all_genes):
    """
    SA Score = Welch t-statistic
    Compare compound target genes vs non-targets in disrupted modules
    """
    # delta is a pandas Series indexed by gene symbol.
    target_delta = delta[delta.index.isin(target_genes)]
    non_target_delta = delta[~delta.index.isin(target_genes)]
    # welch_t is a helper not shown in this excerpt.
    return welch_t(target_delta, non_target_delta)
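The `welch_t` helper is not shown in the excerpt. As a rough sketch of what a Welch-type statistic computes, here is a standard-library version on toy deltas; the function bodies, the gene placeholders, and the numbers are assumptions for illustration, and `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the same statistic:

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic: mean difference scaled by unpooled standard error."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two values")
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def sa_score(delta, target_genes):
    """Contrast the deltas of a compound's targets against all other genes."""
    targets = [v for gene, v in delta.items() if gene in target_genes]
    others = [v for gene, v in delta.items() if gene not in target_genes]
    return welch_t(targets, others)

# Toy deltas; GENE_X/Y/Z are placeholders, not real measurements.
delta = {"NAMPT": 2.0, "NAPRT": 4.0, "GENE_X": 0.0, "GENE_Y": 1.0, "GENE_Z": 2.0}
score = sa_score(delta, {"NAMPT", "NAPRT"})
print(round(score, 4))
```

Because the variances are not pooled, a small target set with high spread does not borrow precision from the much larger non-target set.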
Compound-target data comes from the STITCH database. The ranking uses importance voting across bootstrap samples:
from collections import defaultdict

def rank_compounds_by_importance(sa_matrix, compounds):
    """
    Importance Voting: each sample's top-1 compound gets 1 vote
    """
    # sa_matrix: iterable of per-sample Series indexed by compound
    # (iterating a DataFrame directly would yield column labels instead).
    votes = defaultdict(int)
    for sample_sa in sa_matrix:
        top_compound = sample_sa.idxmax()
        votes[top_compound] += 1
    return sorted(votes.items(), key=lambda x: -x[1])
Aging case results: Niacin #1 (targets NAD+ metabolism), Colchicine #2 (anti-inflammatory). 2/5 top hits are known geroprotectors.
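To make the voting rule concrete, here is a small self-contained sketch on toy SA scores. The scores and the dict-per-sample layout are illustrative assumptions; only the "top-1 compound gets one vote" rule comes from the description above.

```python
from collections import Counter

def importance_votes(sa_by_sample):
    """Each sample votes for its highest-scoring compound; most votes first."""
    votes = Counter(max(scores, key=scores.get) for scores in sa_by_sample)
    return votes.most_common()

# Four toy bootstrap samples with made-up SA scores.
sa_by_sample = [
    {"niacin": 2.1, "colchicine": 1.0},
    {"niacin": 1.5, "colchicine": 1.9},
    {"niacin": 3.0, "colchicine": 0.4},
    {"niacin": 2.2, "colchicine": 2.0},
]
ranking = importance_votes(sa_by_sample)
print(ranking)
```

Voting on the per-sample winner rather than averaging scores keeps a single extreme sample from dominating the ranking.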
Layer 3: Mechanism Traceability
Trace each compound's mechanism: compound targets → PPI network neighbors → hub genes → corresponding Hallmark pathway.
Example: Niacin → NAMPT/NAPRT → NAD+ metabolism module → Loss of NAD+
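The trace can be pictured as a few set operations over lookup tables. The sketch below assumes plain dicts for compound targets, PPI neighbors, and module membership; the function name, the `GENE_*` placeholders, and the omission of hub-gene ranking are simplifications for illustration rather than the package's implementation.

```python
def trace_mechanism(compound, compound_targets, ppi_neighbors, module_genes):
    """Follow compound -> targets -> first-order PPI neighbors -> modules."""
    targets = set(compound_targets.get(compound, ()))
    reached = set(targets)
    for gene in targets:
        reached |= set(ppi_neighbors.get(gene, ()))
    modules = {}
    for name, genes in module_genes.items():
        hits = sorted(reached & set(genes))
        if hits:  # keep only modules the compound can reach
            modules[name] = hits
    return {"compound": compound, "targets": sorted(targets), "modules": modules}

# Toy lookup tables; GENE_A/B/C are placeholders, not curated interactions.
compound_targets = {"Niacin": {"NAMPT", "NAPRT"}}
ppi_neighbors = {"NAMPT": {"GENE_A"}, "NAPRT": {"GENE_B"}}
module_genes = {
    "NAD+ metabolism": {"NAMPT", "NAPRT", "GENE_A"},
    "Unrelated module": {"GENE_C"},
}
trace = trace_mechanism("Niacin", compound_targets, ppi_neighbors, module_genes)
print(trace)
```

Every entry in the result points back to a named gene, which is what makes the chain auditable.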
Layer 4: Bootstrap Confidence
1000 bootstrap resamples to test ranking stability. Top-1 compound retention rate determines the evidence level:
| Level | Bootstrap Stability | Meaning |
|---|---|---|
| STRONG | ≥80% | Robust recommendation |
| MODERATE | 50% to <80% | Reasonable evidence |
| EXPLORATORY | <50% | Hypothesis-generating only |
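A sketch of how a top-1 retention rate could be turned into one of these levels is shown below. It resamples per-sample score dicts with replacement, which is an assumption about the resampling unit; the helper names are hypothetical, and only the thresholds come from the table.

```python
import random
from collections import Counter

def top1(sa_by_sample):
    """Compound that wins the most per-sample votes."""
    votes = Counter(max(scores, key=scores.get) for scores in sa_by_sample)
    return votes.most_common(1)[0][0]

def top1_retention(sa_by_sample, n_boot=1000, seed=0):
    """Fraction of bootstrap resamples in which the original winner stays #1."""
    rng = random.Random(seed)
    reference = top1(sa_by_sample)
    kept = 0
    for _ in range(n_boot):
        resample = rng.choices(sa_by_sample, k=len(sa_by_sample))
        kept += top1(resample) == reference
    return reference, kept / n_boot

def evidence_level(stability):
    """Map a retention rate in [0, 1] to the levels in the table above."""
    if not 0.0 <= stability <= 1.0:
        raise ValueError("stability must be a fraction between 0 and 1")
    if stability >= 0.80:
        return "STRONG"
    if stability >= 0.50:
        return "MODERATE"
    return "EXPLORATORY"

# Toy input: every sample agrees, so the winner is retained in all resamples.
unanimous = [{"compound_a": 2.0, "compound_b": 1.0}] * 5
winner, stability = top1_retention(unanimous, n_boot=200)
print(winner, stability, evidence_level(stability))
```

A stability of 0.245, as reported for the depression case, falls into the EXPLORATORY band under these thresholds.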
The evidence chain is a clean dataclass:
# core/evidence_chain.py
from dataclasses import dataclass
from typing import List

@dataclass
class EvidenceChain:
    patient_id: str
    disease: str
    perturbed_modules: List[PPIModule]   # Layer 1
    top_compounds: List[CompoundMatch]   # Layer 2
    mechanism_map: dict                  # Layer 3
    bootstrap_stability: dict            # Layer 4

    def summary(self) -> str: ...
    def to_dict(self) -> dict: ...
    def to_json(self, path: str) -> None: ...

    @classmethod
    def from_dict(cls, data: dict) -> 'EvidenceChain':
        # Backward compatible: ignores unknown fields
        ...
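One common way to get the "ignores unknown fields" behavior is to filter the input against the dataclass's declared fields. The sketch below uses a reduced stand-in class (`MiniChain` is hypothetical and carries only three of the fields) to show the pattern; it is not the package's actual `from_dict`.

```python
from dataclasses import asdict, dataclass, field, fields

@dataclass
class MiniChain:
    """Reduced stand-in for EvidenceChain, for illustration only."""
    patient_id: str
    disease: str
    bootstrap_stability: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "MiniChain":
        # Drop keys the current class does not declare, so files written by
        # a newer version with extra fields still load.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})

payload = {"patient_id": "p1", "disease": "RA", "future_field": 123}
chain = MiniChain.from_dict(payload)
print(chain.to_dict())
```

Required fields still raise a `TypeError` when missing, so the tolerance only applies to extra keys.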
Quick Start
Current version: SteeraMed Core is a proof-of-concept demo with 3 built-in real clinical cases from GEO. Custom data upload (450K/EPIC methylation arrays) is coming in future releases. Follow updates at steerable.world.
Interactive Mode
pip install steeramed-core
python -m steeramed_core
SteeraMed Core — N-of-1 Evidence Chain Explorer
=================================================
Select a patient case:
[1] Aging · Population Screening
[2] RA · 51M · T-cell Perturbation
[3] Depression · 52M · Innate Immunity
Enter choice [1-3]:
Batch Mode
python -m steeramed_core --all # Generate all cases
python -m steeramed_core --case ra_303 # Specific case
python -m steeramed_core --list # List available cases
Python API — Load & Inspect
from steeramed_core import EvidenceChain, load_example_patient
patient = load_example_patient("ra_patient_303")
print(patient.summary())
# Inspect the 4 layers
print(f"Disrupted modules: {len(patient.perturbed_modules)}")
print(f"Top compound: {patient.top_compounds[0].compound_id}")
print(f"Bootstrap stability: {patient.bootstrap_stability}")
Python API — Generate All Charts
from steeramed_core import load_example_patient
from steeramed_core.viz.patient_card import plot_patient_card
from steeramed_core.viz.drug_ranking import plot_drug_ranking
from steeramed_core.viz.hallmark_bar import plot_hallmark_bar
from steeramed_core.viz.evidence_network import plot_evidence_network
data = load_example_patient("ra_patient_303").to_dict()
# 4 charts: hallmark perturbation, drug ranking, network, patient card
for fn, name in [
    (plot_hallmark_bar, "hallmark_bar"),
    (plot_drug_ranking, "drug_ranking"),
    (plot_evidence_network, "evidence_network"),
    (plot_patient_card, "patient_card"),
]:
    fig = fn(data)
    fig.savefig(f"{name}.png", dpi=300, bbox_inches="tight")
Python API — Publication-Grade Figures
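from steeramed_core import load_example_patient
from steeramed_core.viz.evidence_view import plot_evidence_chain
from steeramed_core.viz.patient_view import plot_patient_view
data = load_example_patient("ra_patient_303").to_dict()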
# Scientist view — 3-panel evidence chain (Paper Fig 6/8 style)
fig = plot_evidence_chain(data)
fig.savefig("evidence_chain.png", dpi=300, bbox_inches="tight")
# Patient view — 3-panel card (Paper Fig 4/7 style)
fig = plot_patient_view(data)
fig.savefig("patient_view.png", dpi=300, bbox_inches="tight")
Validation Results
Retrospective positive control validation on 3 GEO datasets:
| Cohort | Disease | N | Key Finding | Evidence |
|---|---|---|---|---|
| GSE40279 (Hannum) | Aging | 656 | Niacin #1, 2/5 geroprotectors | MODERATE |
| GSE42861 | Rheumatoid Arthritis | 689 | 6/10 known RA drugs recovered, pentoxifylline #1 | STRONG |
| GSE128235 | Depression (MDD) | 533 | Creatine #1, innate immunity dominant | EXPLORATORY |
The RA cohort is the strongest result: known RA drugs recovered at 5.8x random expectation, confirming that PPI module-level alignment captures meaningful drug-disease matches.
The depression cohort's top-1 compound (creatine) has only 24.5% bootstrap stability — honestly flagged as EXPLORATORY. This reflects the high heterogeneity and weak methylation signal in MDD.
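For readers unfamiliar with the "x random expectation" figure quoted for the RA cohort, fold enrichment is usually the observed hit rate in the top-k list divided by the hit rate expected from a random draw. The sketch below shows that arithmetic with deliberately hypothetical counts; it does not reproduce the 5.8x value, whose underlying library size and positive-control count are not stated here.

```python
def fold_enrichment(hits_in_top_k, k, known_positives, library_size):
    """Observed hit rate in the top-k divided by the random-draw hit rate."""
    if min(k, known_positives, library_size) <= 0:
        raise ValueError("k, known_positives and library_size must be positive")
    # (hits / k) / (known / library), rearranged to a single division.
    return (hits_in_top_k * library_size) / (k * known_positives)

# Hypothetical counts: 3 known drugs in a top-10 list, drawn from a library
# of 400 compounds that contains 20 known drugs.
fold = fold_enrichment(hits_in_top_k=3, k=10, known_positives=20, library_size=400)
print(fold)
```

A value of 1.0 would mean the ranking recovers known drugs no better than chance.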
Configuration
All hyperparameters live in core/config.py:
# PPI network
PPI_SCORE_CUTOFF = 400 # STRING confidence threshold
PPI_MIN_SIZE = 20 # Min genes per module
PPI_MAX_SIZE = 800 # Max genes per module
# Compound targets
STITCH_SCORE = 200 # STITCH confidence threshold
# Target count range: [60, 300]
# Bootstrap
BOOTSTRAP_N1 = 200 # N-of-1 resampling iterations
BOOTSTRAP_GROUP = 100 # Group-level iterations
# Matching
MATCH_K = 10 # Number of matched controls
MATCH_CALIPER = 5 # Age matching window (years)
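As an illustration of how thresholds like these are typically applied, the sketch below filters scored PPI edges and drops modules outside the size range. The helper functions, the inclusive comparisons, and the toy edges are assumptions; only the constant names and values come from the config above.

```python
PPI_SCORE_CUTOFF = 400  # STRING confidence threshold
PPI_MIN_SIZE = 20       # Min genes per module
PPI_MAX_SIZE = 800      # Max genes per module

def filter_edges(edges, cutoff=PPI_SCORE_CUTOFF):
    """Keep (gene_a, gene_b, score) edges at or above the confidence cutoff."""
    return [(a, b, score) for a, b, score in edges if score >= cutoff]

def keep_modules_by_size(modules, lo=PPI_MIN_SIZE, hi=PPI_MAX_SIZE):
    """Drop modules whose gene count falls outside [lo, hi]."""
    return {name: genes for name, genes in modules.items() if lo <= len(genes) <= hi}

# Toy inputs with placeholder gene names.
edges = [("A", "B", 900), ("A", "C", 400), ("B", "C", 150)]
modules = {
    "too_small": {f"G{i}" for i in range(5)},
    "in_range": {f"G{i}" for i in range(25)},
}
kept_edges = filter_edges(edges)
kept_modules = keep_modules_by_size(modules)
print(kept_edges, sorted(kept_modules))
```

Whether the real pipeline treats the cutoffs as inclusive is not stated, so the `>=` and `<=` comparisons here are a guess.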
Visualization API
| Function | Chart Type | Size | Purpose |
|---|---|---|---|
| plot_hallmark_bar() | Horizontal bar | 8xN in | Hallmark perturbation magnitude |
| plot_drug_ranking() | Horizontal bar | 8xN in | Top-10 compound ranking |
| plot_evidence_network() | Bipartite network | 10x7 in | Drug-Hallmark alignment |
| plot_patient_card() | Card layout | 8x10 in | Single-page patient summary |
| plot_evidence_chain() | 3-panel | 7.2x9.0 in | Scientist view (Paper Fig 6/8) |
| plot_patient_view() | 3-panel card | 7.5x10.5 in | Patient view (Paper Fig 4/7) |
All viz functions select the non-interactive Agg backend via matplotlib.use('Agg'), so they run on headless servers and in CI.
Honest Limitations
- Retrospective validation only — positive control recovery, not prospective clinical trials
- Bootstrap confidence varies — depression case is only 24.5% stable (EXPLORATORY)
- Single omics — DNA methylation only; transcriptomics/proteomics coming in future versions
- Simple matching — age ±5 years + sex, doesn't control for cell composition
- Demo stage — 3 preset cases, custom data upload coming soon
References
- López-Otín C, et al. The hallmarks of aging. Cell, 2013, 153(6): 1194-1217.
- López-Otín C, et al. Hallmarks of aging: An expanding universe. Cell, 2023, 186(2): 243-278.
- Horvath S. DNA methylation age of human tissues and cell types. Genome Biology, 2013, 14: R115.
- Hannum G, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Molecular Cell, 2013, 49(2): 359-367.
- Ha D, Schmidhuber J. World models. arXiv:1803.10122, 2018.
- Liberzon A, et al. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Systems, 2015, 1: 417-425.
- Xiong J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. DOI: 10.20944/preprints202605.0366.v1
- Xiong J. SteeraMed: A Biomedical World Model for N-of-1 Intervention Reasoning. Preprints.org, 2026. DOI: 10.20944/preprints202605.1578.v1
Links
- GitHub: github.com/DeepoMe/SteeraMed
- Live Demo: agent.steerable.world
- Project Page: steeramed.com
- Website: steerable.world
- Team: deepome.com
SteeraMed is developed by the DeepoMe team. If you're interested in computational longevity, epigenetic clocks, precision medicine, or AI-driven drug discovery — let's connect!