Kwansub Yun

Posted on Mar 3 • Edited on Mar 16 • Originally published at flamehaven.space

What an AI Reasoning Engine Built for Alzheimer's Metabolic Research: A Code Walkthrough

#bioai #alzheimers #computationalbiology #machinelearning

Scope & Disclosure

This post documents an output from Rexsyn Engine v0.7.8 (Run v13).
Status: Hypothesis generation stage — pre-validation.
Data used: Public literature only. No patient cohort data.
Numerical thresholds without citations are engine-generated estimates.
This is not a clinical guideline, regulatory submission, or peer-reviewed finding.

1. Thirteen Attempts. This One Actually Said Something.

Twelve runs produced nothing worth keeping.

Not because the outputs were wrong — they were structurally coherent, mechanistically plausible, properly formatted. The problem was subtler: they were syntactically correct but semantically empty.

The kind of text that reads like a research paper but leaves no residue after you finish it.

Run thirteen was different.

The change was not more data.

It was not better prompting.

It was not stylistic polish.

The change was this:

The engine stopped summarizing and started interpreting.

Specifically, it engaged with the clinical meaningfulness debate around CLARITY-AD — not merely extracting the reported −0.45 CDR-SB delta, but reasoning about what that magnitude implies biologically if amyloid clearance is upstream of degeneration.

That interpretive shift is what made this run worth documenting.

2. Why the Amyloid Question Is Legitimate

The amyloid cascade hypothesis has structured Alzheimer's drug development for over three decades.

From the best available trial data:

CLARITY-AD (lecanemab, n=1,795)

CDR-SB: −0.45 (27% slowing)
ARIA-E: 12.6% | ARIA-H: 17.3%
Statistically significant
Clinical meaningfulness debated (an MCID of roughly 0.5–1.0 CDR-SB points is typically cited)
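
The relationship between the absolute CDR-SB difference and the 27% figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: it back-calculates the placebo-arm decline implied by the two numbers quoted above and compares the absolute difference with the cited MCID range. It assumes the percentage is the difference divided by the placebo decline; all variable names are illustrative.

```python
# Back-of-envelope check on the CLARITY-AD numbers quoted above.
delta_cdr_sb = -0.45       # treatment minus placebo, in CDR-SB points
relative_slowing = 0.27    # reported relative slowing of decline
mcid_range = (0.5, 1.0)    # MCID range cited in the text

# Assumption: relative_slowing = |delta| / placebo_decline
implied_placebo_decline = abs(delta_cdr_sb) / relative_slowing
implied_treated_decline = implied_placebo_decline + delta_cdr_sb

# Absolute difference expressed as a fraction of each MCID bound
fraction_of_mcid = tuple(abs(delta_cdr_sb) / m for m in mcid_range)

print(f"implied placebo decline: {implied_placebo_decline:.2f}")
print(f"implied treated decline: {implied_treated_decline:.2f}")
print(f"delta / MCID bounds: {fraction_of_mcid[0]:.2f}, {fraction_of_mcid[1]:.2f}")
```

The absolute difference sits just under the lower MCID bound, which is why statistical significance and clinical meaningfulness can come apart here.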

The statistical work is clean.

But amyloid removal does not halt neurodegeneration.

And cognitively normal individuals with high amyloid burden exist.

Independent longitudinal analyses of ADNI and related cohorts have reported strong associations between FDG hypometabolism and accelerated conversion risk, and temporal modeling studies suggest metabolic changes may precede overt amyloid positivity in some trajectories.

This is not proof of causal priority.

But it is enough to justify asking:

What if metabolic failure is not downstream — but parallel or upstream?

That is the substrate Rexsyn v0.7.8 was pointed at.

3. What Rexsyn Is

Rexsyn is an AI reasoning pipeline.

It is not a drug discovery platform.
It is not a biomarker validator.
It is not a clinical model.

It operates in three stages:

Literature ingestion — structured extraction of mechanistic and causal claims
Missing-link inference — assembling explicit causal chains across gaps
Technical report output — formalized assessment with limitations stated

The goal is to evaluate whether a reasoning model can move from:

literature → structured hypothesis → testable computational scaffold

Run thirteen produced the first output with a falsifiable structure.

That is the minimum bar for science.

4. Core Causal Proposal

Metabolic decoupling precedes proteopathy.

Synthesized chain:


Neuronal insulin resistance
→ PI3K/Akt disruption
→ GSK-3β disinhibition
→ Tau hyperphosphorylation

APOE4 fragments
→ Complex IV inhibition
→ OXPHOS failure
→ ROS surge
→ mtDNA damage

Both pathways
→ ATP deficit
→ synaptic dysfunction
→ impaired amyloid clearance

None of these links are novel individually.

The contribution is structural integration.
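
One way to make that integration inspectable is to encode the chain as a directed graph and ask which nodes the two pathways share. This is a minimal sketch, not the engine's internal representation. The node labels are transcribed from the chain above; modeling "Both pathways" as an edge from the last node of each pathway into "ATP deficit" is an assumption, and `CAUSAL_EDGES` and `downstream` are hypothetical names.

```python
from collections import deque

# Directed edges transcribed from the synthesized chain.
CAUSAL_EDGES = {
    "neuronal insulin resistance": ["PI3K/Akt disruption"],
    "PI3K/Akt disruption": ["GSK-3β disinhibition"],
    "GSK-3β disinhibition": ["tau hyperphosphorylation"],
    "APOE4 fragments": ["Complex IV inhibition"],
    "Complex IV inhibition": ["OXPHOS failure"],
    "OXPHOS failure": ["ROS surge"],
    "ROS surge": ["mtDNA damage"],
    # Assumption: each pathway's terminal node feeds the shared ATP deficit.
    "tau hyperphosphorylation": ["ATP deficit"],
    "mtDNA damage": ["ATP deficit"],
    "ATP deficit": ["synaptic dysfunction"],
    "synaptic dysfunction": ["impaired amyloid clearance"],
}

def downstream(start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in CAUSAL_EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

# Nodes that both upstream pathways converge on
shared = set(downstream("neuronal insulin resistance")) & set(downstream("APOE4 fragments"))
print(sorted(shared))
```

In this encoding amyloid clearance is a leaf node: nothing in the proposal flows back from amyloid into the metabolic pathways, which is the ordering claim the hypothesis has to defend.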

5. The MSI Framework

Proposed composite biomarker:


MSI = 0.50 × TDA_score(FDG-PET)
+ 0.35 × acylcarnitine_score(plasma)
+ 0.15 × glu_gln_ratio(MRS/CSF)

All weights (50/35/15) are inference-derived.

Proposed threshold:


MSI < 0.38

This value is an engine-generated prior, not an empirical cutoff.
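
The composite and its threshold can be written as a small function. The sketch below assumes each component has already been scaled to the range [0, 1]; the source does not say how the three inputs are normalized, so that scaling is an assumption, as are the names `MSIInputs` and `compute_msi`. The weights and the 0.38 threshold are the engine-generated values quoted above.

```python
from dataclasses import dataclass

MSI_THRESHOLD = 0.38  # engine-generated prior, not an empirical cutoff

@dataclass
class MSIInputs:
    tda_score: float            # from FDG-PET
    acylcarnitine_score: float  # from plasma
    glu_gln_ratio: float        # from MRS/CSF

def compute_msi(x: MSIInputs) -> float:
    """Weighted sum with the inference-derived 50/35/15 weights."""
    # Assumption: every component is pre-scaled to [0, 1].
    for name, value in vars(x).items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside [0, 1]")
    return (0.50 * x.tda_score
            + 0.35 * x.acylcarnitine_score
            + 0.15 * x.glu_gln_ratio)

# Placeholder inputs, not measurements
example = MSIInputs(tda_score=0.3, acylcarnitine_score=0.4, glu_gln_ratio=0.5)
msi = compute_msi(example)
below_threshold = msi < MSI_THRESHOLD
print(f"MSI = {msi:.3f}, below threshold: {below_threshold}")
```

Because the weights sum to 1, the index stays in [0, 1] whenever the inputs do, which is what makes a fixed threshold such as 0.38 interpretable at all.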

6. Literature Ingestion Log — Rexsyn v0.7.8

The following sources were ingested automatically during Stage 1 (structured causal extraction) of Run v13. Each source was converted into structured causal and mechanistic claims, which were weighted by citation density and mechanistic specificity and then passed to the missing-link inference stage.

Note: weighting is heuristic (pre-validation), not a validated scoring method.


- van Dyck CH et al., *NEJM* 2023. DOI:10.1056/NEJMoa2212948  
  → Clinical benchmark (CDR-SB delta, ARIA incidence)

- Fernandes Pinheiro & Lourenco, *IJMS* 2023; 24(1):778  
  → APOE4 mitochondrial mechanisms (Complex I/IV/V disruption)

- Zhao D et al., arXiv:2509.14634 (2025)  
  → TDA features (H1/H2 changes) aligned with AD regions

- Ghosh A et al., arXiv:2408.15647 (2024)  
  → Vietoris–Rips filtration for MCI classification

- Songdechakraiwut T et al., arXiv:2511.09949 (2025)  
  → Dynamic topology + early detection directionality

- Krause et al., arXiv:2511.03605 (2025)  
  → Bayesian PH; network disruption correlates more with tau than amyloid

- Jolivet R et al., *PLOS Comput Biol* 2015. DOI:10.1371/journal.pcbi.1004036  
  → Source of validated ANLS kinetic parameters (intended replacement for the estimated values)

- Magistretti & Allaman, *Nat Rev Neurosci* 2018; 19:235–249  
  → ANLS mechanism; discussion of physiological ATP yield (~13–15 ATP per lactate)

- Arnold M et al., *J Proteome Res* 2020 (ADNI n=1,517)  
  → Acylcarnitine signals; sex × APOE interactions

- Kalecký K et al., *J Alzheimers Dis* 2022  
  → Targeted metabolomics; pathway disruption (FDR-controlled)

- Trushina E et al., *PLOS Medicine* 2013; 10:e1001259 (ADNI n=767)  
  → Metabolite signatures across progression

Ingested: 11 sources → causal graph assembled → missing-link inference initiated (target question: metabolic decoupling → proteopathy temporal ordering)
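
To make "structured causal claims" concrete, here is a hypothetical sketch of what one extracted claim might look like and how a heuristic weight could be attached to it. Nothing here is Rexsyn's actual schema: the `CausalClaim` class, its field names, the geometric-mean weighting, and the numeric values are all placeholders chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CausalClaim:
    source: str
    cause: str
    effect: str
    citation_density: float         # placeholder scale, 0..1
    mechanistic_specificity: float  # placeholder scale, 0..1

    @property
    def weight(self) -> float:
        # Hypothetical heuristic: geometric mean of the two signals
        return (self.citation_density * self.mechanistic_specificity) ** 0.5

# Placeholder values, not scores produced by the engine
claims = [
    CausalClaim("Magistretti & Allaman 2018",
                "astrocytic lactate supply", "neuronal ATP production",
                citation_density=0.9, mechanistic_specificity=0.4),
    CausalClaim("Fernandes Pinheiro & Lourenco 2023",
                "APOE4 fragments", "Complex IV inhibition",
                citation_density=0.6, mechanistic_specificity=0.8),
]

# Highest-weight claims enter the causal graph first
ranked = sorted(claims, key=lambda c: c.weight, reverse=True)
for claim in ranked:
    print(f"{claim.weight:.2f}  {claim.cause} -> {claim.effect}  [{claim.source}]")
```

A multiplicative weight penalizes claims that are heavily cited but mechanistically vague, which is one plausible reading of "weighted by citation density and mechanistic specificity".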

7. The Code

What distinguishes this run from the prior twelve is alignment between the conceptual hypothesis and the executable scaffold.

Not validation — but structural coherence.

1) Acylcarnitine Scoring

import numpy as np
from dataclasses import dataclass

@dataclass
class AcylcarnitineProfile:
    C3: float
    C5: float
    C10: float
    C18: float

def score_acylcarnitines(profile,
                         short_chain_weight=1.5,
                         long_chain_weight=2.0,
                         decay_constant=0.1):
    """
    Coefficients are inference-derived estimates.
    Not fitted to cohort data.
    """
    short = (profile.C3 + profile.C5) * short_chain_weight
    long  = (profile.C10 + profile.C18) * long_chain_weight
    return float(np.exp(-decay_constant * (short + long)))
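
Written out, the scoring function above is a single exponential of a weighted acylcarnitine load. The symbols are introduced here for illustration: $w_s$, $w_\ell$ and $k$ stand for `short_chain_weight`, `long_chain_weight` and `decay_constant`.

```latex
S = \exp\!\Bigl(-k\,\bigl[\,w_s\,(C_3 + C_5) + w_\ell\,(C_{10} + C_{18})\,\bigr]\Bigr),
\qquad w_s = 1.5,\quad w_\ell = 2.0,\quad k = 0.1
```

So $S = 1$ when all four species are zero and falls toward 0 as the load grows. With the default $k = 0.1$, the score halves each time the weighted load increases by $\ln 2 / k \approx 6.93$. What that load means physically depends on how the input concentrations are normalized, which the engine output does not specify.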

2) Persistent Homology on FDG-PET

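from gudhi import RipsComplex
import numpy as np

def compute_persistence_entropy(pet_suvr_array,
                                max_edge_distance=2.0,
                                homology_dim=1):
    """
    Persistence entropy of a Rips filtration over regional SUVR values.

    Distance metric = raw SUVR difference (simplified proxy).
    Correlation-based metrics would be preferable.

    Caveat: with |a - b| on scalar values the points lie on a line, so the
    Rips complex should have no H1 classes and homology_dim=1 returns 0.0.
    """
    # Pairwise absolute differences between regional SUVR values
    dist_matrix = np.abs(pet_suvr_array[:, None] - pet_suvr_array)
    # gudhi names this keyword max_edge_length
    rips = RipsComplex(distance_matrix=dist_matrix,
                       max_edge_length=max_edge_distance)
    st = rips.create_simplex_tree(max_dimension=homology_dim + 1)
    persistence = st.persistence()  # list of (dim, (birth, death))

    # Keep finite bars in the requested homology dimension
    intervals = [p[1] for p in persistence
                 if p[0] == homology_dim and p[1][1] != float('inf')]

    if not intervals:
        return 0.0

    # Shannon entropy of the normalized bar lengths
    durations = np.array([d - b for b, d in intervals])
    p = durations / durations.sum()
    return float(-np.sum(p * np.log(p + 1e-12)))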
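
The entropy step at the end of that function is easier to reason about in isolation. The sketch below reimplements only that step with NumPy, on hand-written (birth, death) intervals rather than PET data, to show what the number responds to. The function name `persistence_entropy` and the toy intervals are illustrative.

```python
import numpy as np

def persistence_entropy(intervals):
    """Shannon entropy of normalized bar lengths for finite (birth, death) pairs."""
    durations = np.array([death - birth for birth, death in intervals], dtype=float)
    durations = durations[durations > 0]   # zero-length bars carry no weight
    if durations.size == 0:
        return 0.0
    p = durations / durations.sum()
    return float(-np.sum(p * np.log(p)))

# Toy barcodes, not derived from imaging data
equal_bars = [(0.0, 1.0), (0.5, 1.5)]      # two bars of equal length
one_dominant = [(0.0, 10.0), (0.0, 0.1)]   # one long bar, one short bar

print(persistence_entropy(equal_bars))
print(persistence_entropy(one_dominant))
```

Equal-length bars maximize the entropy (ln n for n bars), while a barcode dominated by one long-lived feature pushes it toward zero. The score therefore measures how evenly persistence is spread across features, not how many features there are.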

3) ANLS ODE (solve_ivp, RK45)

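from scipy.integrate import solve_ivp

# Constants hard-coded in this sketch (estimates, not fitted values)
K_LAC_OX = 0.1           # first-order neuronal lactate consumption rate
ATP_PER_LACTATE = 17.0   # theoretical maximum; physiological ~13–15

def anls_system(t, y, p):
    """
    ANLS model with 4 state variables:
    astrocytic glucose and lactate, neuronal lactate and ATP.

    ATP stoichiometry = 17 ATP per lactate (theoretical maximum).
    Physiological efficiency estimated ~13–15.
    """
    Glc_a, Lac_a, Lac_n, ATP_n = y

    # Michaelis-Menten glucose consumption in the astrocyte
    glc_up = p["V_max_Glc"] * Glc_a / (p["K_m_Glc"] + Glc_a)

    # Saturable, reversible lactate shuttle (astrocyte to neuron when positive)
    lac_flux = p["V_max_Lac"] * (
        Lac_a / (p["K_m_Lac"] + Lac_a) -
        Lac_n / (p["K_m_Lac"] + Lac_n)
    )

    lac_ox = K_LAC_OX * Lac_n

    return [
        -glc_up,                                             # d(Glc_a)/dt
         2 * glc_up - lac_flux,                              # d(Lac_a)/dt, 2 lactate per glucose
         lac_flux - lac_ox,                                  # d(Lac_n)/dt
        ATP_PER_LACTATE * lac_ox - p["k_atp_cons"] * ATP_n   # d(ATP_n)/dt
    ]

def simulate(params):
    """Integrate over t in [0, 120] and return neuronal ATP at the final time."""
    sol = solve_ivp(
        anls_system,
        (0, 120),
        [5.0, 0.5, 0.5, 2.0],   # initial [Glc_a, Lac_a, Lac_n, ATP_n]
        method="RK45",
        args=(params,)
    )
    return float(sol.y[3, -1])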
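
For readers who prefer equations to code, the system integrated above can be written term by term. The symbols are introduced here for illustration: $G_a$, $L_a$, $L_n$, $A_n$ are the four state variables in the order they are unpacked from `y`, $k_{\mathrm{ox}} = 0.1$ is the lactate consumption rate that appears in the code, and $k_{\mathrm{cons}}$ is `k_atp_cons`.

```latex
\begin{aligned}
v_{\mathrm{glc}} &= \frac{V_{\max}^{\mathrm{Glc}}\, G_a}{K_m^{\mathrm{Glc}} + G_a}, \\[4pt]
J_{\mathrm{lac}} &= V_{\max}^{\mathrm{Lac}} \left( \frac{L_a}{K_m^{\mathrm{Lac}} + L_a} - \frac{L_n}{K_m^{\mathrm{Lac}} + L_n} \right), \\[4pt]
\frac{dG_a}{dt} &= -\,v_{\mathrm{glc}}, \\[4pt]
\frac{dL_a}{dt} &= 2\,v_{\mathrm{glc}} - J_{\mathrm{lac}}, \\[4pt]
\frac{dL_n}{dt} &= J_{\mathrm{lac}} - k_{\mathrm{ox}}\, L_n, \\[4pt]
\frac{dA_n}{dt} &= 17\,k_{\mathrm{ox}}\, L_n - k_{\mathrm{cons}}\, A_n .
\end{aligned}
```

Setting $dA_n/dt = 0$ gives the quasi-steady-state relation $A_n \approx 17\,k_{\mathrm{ox}} L_n / k_{\mathrm{cons}}$, so the ATP readout scales linearly with the assumed stoichiometry. Note also that the system has no glucose supply term: $G_a$ can only decrease, so the value returned at $t = 120$ is a point on a transient, not a true steady state.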

8. The Part Most Technical Posts Skip: What’s Broken

#	Component	Weakness
1	MSI	Weights unvalidated
2	MSI	Threshold 0.38 not empirical
3	TDA	Distance metric oversimplified
4	TDA	Static PET only
5	ANLS	ATP coefficient theoretical
6	ANLS	Kinetic parameters estimated
7	Sobol	Parameter bounds inferred
8	Architecture	No cohort batching
9	Validation	No cross-validation
10	Causality	Ordering not proven

9. Validation Path

Fit acylcarnitine decay to ADNI metabolomics
Validate TDA entropy on ADNI FDG-PET
Compare MSI against A/T/N framework
Replace ANLS parameters with empirically derived constants
Define falsification conditions for each threshold

10. Final Assessment

Run thirteen produced a coherent mechanistic proposal.

That is the minimum requirement for science.

It is also all this run produced.

Rexsyn Engine v0.7.8. Hypothesis generation only. No patient data. Numerical values without citation are engine-generated estimates.
