Rory | QIS PROTOCOL

Posted on Apr 13 • Edited on Apr 18 • Originally published at qisprotocol.com

QIS vs Matchmaker Exchange: Why Rare Disease Intelligence Needs Outcome Routing, Not Patient Matching

#ai #machinelearning #opensource #python

A geneticist at a children's hospital has one patient. The syndrome has been documented in twelve people globally. She submits to Matchmaker Exchange and waits.

Six months later, three matches come back. Institutions in the Netherlands, Canada, and the UK each have a patient with overlapping HPO terms and a suspicious variant in the same gene. The diagnosis hardens. The research question closes.

But the treatment question is still wide open.

None of the three matched institutions share what they tried. The matching infrastructure has no mechanism for that. She has found her patient's diagnostic neighbors — and learned nothing about what to do next.

This is not a criticism of Matchmaker Exchange. It is a description of the problem MME was not built to solve. The architectural gap is real, and it matters for every rare disease center operating at N=1 scale.

What Matchmaker Exchange Does Well

Matchmaker Exchange (MME) is a federated network that connects rare disease patient databases across institutions. The core insight, articulated clearly in the foundational 2015 paper by Philippakis et al. in Human Mutation, is simple and correct: rare diseases are rare per institution but not rare across institutions. A syndrome seen once at Boston Children's may have been seen twice at Great Ormond Street and once at SickKids. The patients exist. They are just siloed.

MME's answer to this is a standardized query API, the GA4GH Matchmaker Exchange API, that lets one institution ask "does anyone else have a patient like this?" using phenotype terms (HPO codes) and genotype hints (candidate variants). Participants include Broad Institute's Matchbox, CHEO Research Institute's PhenomeCentral, the Wellcome Sanger Institute's DECIPHER, and a growing number of research centers worldwide.
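To make the query shape concrete, here is a minimal sketch of the kind of request body one node might send to another node's match endpoint. The field names (`patient`, `contact`, `features`, `genomicFeatures`) follow the MME API as publicly documented, but treat them as assumptions to verify against the current specification. The helper `build_match_request`, the patient ID, the HPO terms, the gene, and the contact address are all invented for illustration.

```python
import json

def build_match_request(patient_id, hpo_terms, candidate_genes, contact_name, contact_href):
    """Assemble an MME-style match request body (illustrative, not validated against the spec)."""
    return {
        "patient": {
            "id": patient_id,
            "contact": {"name": contact_name, "href": contact_href},
            # Phenotype terms travel as a list of HPO identifiers.
            "features": [{"id": term} for term in hpo_terms],
            # Genotype hints travel as candidate genes (variants omitted in this sketch).
            "genomicFeatures": [{"gene": {"id": gene}} for gene in candidate_genes],
        }
    }

request_body = build_match_request(
    patient_id="example-patient-001",
    hpo_terms=["HP:0001250", "HP:0001263"],
    candidate_genes=["MTOR"],
    contact_name="Example Clinician",
    contact_href="mailto:clinician@example.org",
)
print(json.dumps(request_body, indent=2))
```

Note that the phenotype terms and candidate gene travel inside the request itself. That property is what the gaps discussed in the next section turn on.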

This architecture is genuinely valuable. For diagnostic odysseys — families spending 5-7 years before getting a name for their child's condition — the ability to aggregate across institutions is the difference between diagnosis and continued uncertainty. MME has directly enabled novel gene discoveries that would have been impossible from any single institution's data.

The network is doing what it was designed to do.

The Gap MME Doesn't Fill

MME solves the patient-finding problem. It does not solve the outcome-routing problem. These are related but fundamentally different questions, and conflating them explains most of the frustration rare disease clinicians have with current data-sharing infrastructure.

Here are the five structural gaps:

1. Re-identification risk rises with rarity.
De-identification works statistically. But for ultra-rare syndromes (N < 5 globally), a combination of phenotype terms, age, sex, and variant is often a unique fingerprint. The more specific the HPO profile you submit to find a match, the more identifiable the packet becomes. This is not an MME-specific failure; it is a mathematical property of extreme rarity. Any system that routes phenotype data faces this ceiling.
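The fingerprint effect can be illustrated with a small sketch, assuming four synthetic records and a hypothetical helper `unique_fraction`. It measures what fraction of records become unique as more attributes are combined. The data is invented and far smaller than any real registry, so only the direction of the trend is meaningful.

```python
from collections import Counter

# Synthetic records: (diagnosis_category, age_bracket, sex, gene). All values are invented.
records = [
    ("RASopathy", "pediatric_under_5", "F", "GENE_A"),
    ("RASopathy", "pediatric_under_5", "F", "GENE_B"),
    ("RASopathy", "pediatric_under_5", "M", "GENE_A"),
    ("RASopathy", "adult_18_40", "F", "GENE_A"),
]

def unique_fraction(rows, n_fields):
    """Fraction of rows whose first n_fields values are shared with no other row."""
    keys = [row[:n_fields] for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)

for n in range(1, 5):
    print(n, unique_fraction(records, n))
# prints:
# 1 0.0
# 2 0.25
# 3 0.5
# 4 1.0
```

With only the diagnosis category, no record stands alone. Once all four attributes are combined, every record is unique.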

2. Finding a match does not tell you what worked.
The output of a successful MME query is a list of similar patients. Treatment histories, responses, adverse events — none of this is in scope for the matching protocol. The clinician still has to reach out to the matched institution, establish contact, navigate IRB processes, and hope the other clinician has time to respond. The infrastructure stops at the diagnostic edge.

3. Data sharing agreements can take 18-24 months to establish.
Institutional data sharing agreements (DSAs) and data governance frameworks are the unsexy bottleneck in rare disease research. Even with federated infrastructure in place, two institutions exchanging data in a structured way often requires bilateral legal agreements that proceed at the speed of institutional legal review. For a disease that may have twelve documented patients, an 18-month governance delay is clinically meaningful.

4. No continuous loop.
MME is a point-query system. You ask, you receive matches, the transaction ends. If the matched patient in the Netherlands enrolls in a trial six months later and sees significant response, that outcome does not route back to you. The network does not synthesize continuously. There is no mechanism for new information to propagate to the parties who originally queried for it.

5. The question being asked is wrong for treatment decisions.
"Who else has this patient?" is the right question for diagnosis. "What treatment worked for similar patients?" is the right question for clinical decision-making. These require different architectures. MME answers the first question. The second question requires routing outcome observations — not patient records — back to semantic addresses defined by the clinical problem, not by patient identity.

The Architectural Distinction: Patient Matching vs Outcome Routing

Here is the core difference in plain terms:

Dimension	Matchmaker Exchange	QIS Outcome Routing
Unit of exchange	Patient phenotype/genotype profile	Pre-distilled outcome packet (~512 bytes)
Question answered	Who else has a patient like this?	What worked for patients like this?
Data sensitivity	High — HPO terms + variant hints	Low — treatment, outcome, context only
Works at N=1 per site	No — needs matching candidates	Yes — each observation deposits independently
Continuous loop	No — point query	Yes — real-time deposit as outcomes occur
Re-identification risk	High for ultra-rare	Low — no phenotype in transit
Governance overhead	High — bilateral DSAs required	Low — each site controls deposits
Network math	Linear point queries	N(N-1)/2 synthesis paths

The architectural question is: what is the minimum sufficient unit of shareable information for treatment decisions?

For Matchmaker Exchange, that unit is the patient profile — because finding similar patients requires comparing profiles. For outcome routing, the minimum sufficient unit is much smaller: what was tried, what happened, and what the clinical context was. No patient phenotype needs to cross the wire.

The QIS Complete Loop Applied to Rare Disease

Christopher Thomas Trevethan discovered QIS — Quadratic Intelligence Swarm — on June 16, 2025. The breakthrough is not any single component — not the hashing, not the routing, not the packet format. The breakthrough is the complete loop: every node can both emit and receive distilled intelligence, the network synthesizes those emissions continuously, and the synthesis is accessible to any node that needs it without centralizing the underlying data.

Applied to rare disease, the loop looks like this:

1. A treatment is attempted at a rare disease center. One patient, one observation.
2. The clinician deposits an outcome packet to the QIS network. The packet is addressed to a semantic fingerprint derived from the clinical context (diagnosis category, treatment class, patient age range), not from patient identity.
3. The packet routes to nodes that have registered interest in that fingerprint.
4. Every other center that has ever seen a similar clinical presentation receives the synthesis automatically.
5. When they make their next treatment decision, the accumulated outcome intelligence from all N=1 observations across all participating centers is available.
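The steps above can be sketched as a tiny in-memory publish/subscribe loop. This is an illustration under stated assumptions, not the QIS implementation. The `OutcomeRouter` class, the `fingerprint` function, the truncated SHA-256 address, and the site names are all hypothetical, and a real network would route across machines rather than inside one process.

```python
import hashlib
from collections import defaultdict

def fingerprint(diagnosis_category, treatment_class, age_bracket):
    """Hypothetical semantic address: SHA-256 over the three context fields."""
    key = "|".join([diagnosis_category, treatment_class, age_bracket])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

class OutcomeRouter:
    """In-memory stand-in for the network: maps fingerprints to interested sites."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # fingerprint -> site ids
        self.inboxes = defaultdict(list)     # site id -> received packets

    def register_interest(self, site_id, fp):
        self.subscribers[fp].add(site_id)

    def deposit(self, source_site, fp, packet):
        """Deliver a packet to every site registered for fp, except the depositor."""
        for site in self.subscribers[fp]:
            if site != source_site:
                self.inboxes[site].append(packet)

router = OutcomeRouter()
fp = fingerprint("RASopathy", "mTOR_inhibitor", "pediatric_under_5")
router.register_interest("site_boston", fp)
router.register_interest("site_utrecht", fp)

# One patient, one observation: Boston deposits, Utrecht receives without asking.
router.deposit("site_boston", fp, {"tx": "mTOR_inhibitor", "out": "partial_response"})
print(router.inboxes["site_utrecht"])
```

The address is computed from clinical context alone, so two sites that have never exchanged patient data still arrive at the same fingerprint.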

The N=1 case is specifically important. Federated learning — the alternative often proposed for distributed medical AI — requires enough data at each node to compute a meaningful gradient update. A center with a single patient cannot participate in federated learning in any statistically meaningful way. QIS has no such floor. One outcome observation is a valid deposit. The network handles the synthesis.

The Math: 50 Centers, 1,225 Synthesis Paths

Scale this to a realistic rare disease consortium. NORD (National Organization for Rare Disorders) connects hundreds of patient advocacy groups. CanDIG is building federated genomic infrastructure across Canadian research hospitals. GA4GH's Data Repository Service (DRS) is standardizing data access across global genomic repositories.

Assume 50 rare disease centers participating in a QIS outcome network.

The number of synthesis pairs available simultaneously:

N(N-1)/2 = 50 × 49 / 2 = 1,225

Every time any of those 50 centers deposits an outcome packet, it is potentially synthesizable with observations from 49 other centers. The network does not wait for a query. The synthesis is continuous.
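The pair count is easy to check, and it helps to see how it grows. The sketch below uses a hypothetical helper named `synthesis_pairs`. It counts unordered pairs of centers only, so it is an upper bound on pairwise combinations and says nothing about how informative any single pairing is.

```python
from math import comb

def synthesis_pairs(n_centers):
    """Number of unordered pairs of centers: N(N-1)/2."""
    return comb(n_centers, 2)

for n in (2, 10, 50, 100):
    print(n, synthesis_pairs(n))
# prints:
# 2 1
# 10 45
# 50 1225
# 100 4950
```

Doubling the network from 50 to 100 centers roughly quadruples the number of pairs, which is the quadratic growth the name refers to.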

Compare to MME's point query model: one institution asks, the network returns matching records, the transaction ends. Each query is an independent event. There is no accumulation, no continuous synthesis, no automatic propagation of new observations to interested parties.

This is not a criticism of MME's design philosophy — point queries are the right architecture for patient matching. But for outcome routing, point queries leave intelligence stranded at the moment of observation.

Outcome Packet Schema: A Practical Starting Point

The outcome packet is deliberately minimal. It contains what is clinically relevant for treatment decisions and excludes everything that creates re-identification risk.

from dataclasses import dataclass
from typing import Optional
import json

@dataclass
class RareDiseaseOutcomePacket:
    # Semantic address — derived from clinical context, not patient identity
    problem_fingerprint: str  # hash of: diagnosis_category + treatment_class + age_bracket

    # Treatment observation
    treatment_class: str       # e.g., "mTOR_inhibitor", "gene_therapy_AAV9"
    outcome_observed: str      # e.g., "partial_response", "adverse_event", "stable_disease"
    weeks_to_response: Optional[int] = None

    # Context — broad enough to be useful, narrow enough to avoid re-identification
    age_bracket: str = ""      # e.g., "pediatric_under_5", "adult_18_40"
    diagnosis_category: str = "" # e.g., "lysosomal_storage_disorder", "RASopathy"
    prior_treatments_failed: int = 0

    # Metadata
    source_hash: str = ""      # anonymized site identifier
    timestamp_epoch: int = 0

    def to_bytes(self) -> bytes:
        """Serialize the clinical fields as compact JSON with short keys."""
        # Note: source_hash and timestamp_epoch are not part of this payload.
        payload = {
            "fp": self.problem_fingerprint,
            "tx": self.treatment_class,
            "out": self.outcome_observed,
            "wk": self.weeks_to_response,
            "age": self.age_bracket,
            "dx": self.diagnosis_category,
            "prior": self.prior_treatments_failed,
        }
        return json.dumps(payload, separators=(',', ':')).encode('utf-8')

    def packet_size_bytes(self) -> int:
        """Size of the serialized payload in bytes."""
        return len(self.to_bytes())

A fully populated packet for a single treatment observation runs under 512 bytes. No phenotype. No genotype. No patient identifier in any form. The clinical intelligence is in the outcome and context fields — the only information a clinician at another center actually needs when deciding what to try next.
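As a rough check on the size claim, the sketch below serializes a stand-in payload with the same short keys used in `to_bytes()`. The values are invented, and the full SHA-256 hex digest used for `fp` is an assumption, since the schema does not specify the hash. The bound holds only while every field stays a short controlled-vocabulary string; free-text fields would change the result.

```python
import hashlib
import json

# Stand-in payload using the same short keys as to_bytes(); all values are invented.
context = "lysosomal_storage_disorder|gene_therapy_AAV9|pediatric_under_5"
payload = {
    "fp": hashlib.sha256(context.encode("utf-8")).hexdigest(),  # 64 hex characters
    "tx": "gene_therapy_AAV9",
    "out": "partial_response",
    "wk": 12,
    "age": "pediatric_under_5",
    "dx": "lysosomal_storage_disorder",
    "prior": 2,
}

encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
print(len(encoded), "bytes")
assert len(encoded) < 512
```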

MME and QIS as Complementary Infrastructure

It is worth being direct about the relationship between these two systems: they are not competing.

MME solves the diagnostic problem. QIS solves the treatment intelligence problem. A rare disease center that uses both gets the full stack:

MME tells you that three other institutions have seen patients with similar presentations. This confirms the diagnosis, identifies potential collaborators, and may unlock case series publication.
QIS tells you what those three institutions — and the forty-seven others in the network — have tried, and what happened. This informs the treatment decision without requiring any of those institutions to share patient records or negotiate bilateral data agreements.
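What "what they tried, and what happened" could look like on the receiving side is sketched below as a plain tally of received packets by treatment class. The synthesis step is not specified in this article, so the function `tally_outcomes` and the sample packets are assumptions for illustration. Counts drawn from a handful of N=1 observations are a starting point for a clinical conversation, not evidence of efficacy.

```python
from collections import Counter, defaultdict

# Invented packets using the short keys from the schema above ("tx", "out").
packets = [
    {"tx": "mTOR_inhibitor", "out": "partial_response"},
    {"tx": "mTOR_inhibitor", "out": "stable_disease"},
    {"tx": "mTOR_inhibitor", "out": "partial_response"},
    {"tx": "gene_therapy_AAV9", "out": "adverse_event"},
]

def tally_outcomes(received):
    """Group packets by treatment class and count each observed outcome."""
    summary = defaultdict(Counter)
    for packet in received:
        summary[packet["tx"]][packet["out"]] += 1
    return summary

for treatment, outcomes in tally_outcomes(packets).items():
    print(treatment, dict(outcomes))
```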

The diagnostic journey and the treatment journey require different infrastructure. Treating them as one problem is why most rare disease data-sharing initiatives stall at the diagnostic layer and never reach the clinical decision layer.

Real-World Implications

NORD works with hundreds of patient advocacy organizations on behalf of the estimated 30 million Americans living with rare diseases. Many of these organizations are actively trying to build natural history registries. QIS outcome routing could connect those registries at the treatment-intelligence layer without requiring phenotype data standardization, the current bottleneck in most registry federation efforts.

CanDIG has built a federated genomic data platform connecting Canadian research hospitals. Its governance model is already designed around keeping data local while enabling distributed queries. QIS outcome packets would be natively compatible with this architecture — the outcome layer could be additive to the existing genomic query layer.

GA4GH DRS (Data Repository Service) is standardizing data access APIs across global genomic repositories. The GA4GH ecosystem already has the API scaffolding for federated data access. QIS routing could operate above the DRS layer — not competing with data repository standards but synthesizing the intelligence that emerges from what gets observed after data is accessed.

Conclusion

The replication problem in rare disease is an architecture problem.

Finding similar patients is necessary. Matchmaker Exchange solves that problem well, and the infrastructure it has built — the GA4GH API standard, the federated network of participating institutions — is real and valuable. The 2015 Philippakis et al. paper identified the right problem, and the network has been solving it.

But finding a matching patient is not sufficient for treatment decisions. Knowing that a child in Utrecht has the same variant tells you nothing about whether the mTOR inhibitor tried in Boston last year produced a response. That information exists. It is stranded at the site of observation, waiting for a data sharing agreement that may take two years to execute.

QIS outcome routing closes the loop MME leaves open. Not by replacing patient matching — by operating at a different layer entirely. The outcome packet carries the minimum sufficient intelligence for treatment decisions. It routes to semantic addresses defined by clinical context, not patient identity. Every N=1 observation is a valid deposit. The synthesis is continuous.

For diseases seen in twelve people globally, waiting is not a neutral act. The architecture that routes treatment intelligence in real time, without patient data leaving institutional control, is not a future research direction. It is an engineering problem with a known solution.

Christopher Thomas Trevethan discovered QIS on June 16, 2025.

39 provisional patents filed. IP protection is in place.
