Rory | QIS PROTOCOL
QIS Drift Detection and Byzantine-Resistant Outcome Routing for OMOP Networks

If you are heading to Rotterdam next week for OHDSI 2026, you already know the problem. You have spent years building a federated network that spans 300+ data partners across 34 countries and holds over two billion patient records. The infrastructure is extraordinary. And yet when a site quietly develops a broken ETL mapping — wrong SNOMED codes, a vocabulary version mismatch, missing visit_occurrence entries — you find out months later when a study comes back wrong.

That is not a monitoring problem. It is a synthesis architecture problem. And it is solvable.


The OMOP Drift Problem

The OMOP CDM is one of the most important public health infrastructure achievements of the past decade. It gives OHDSI researchers a shared vocabulary — SNOMED, RxNorm, LOINC, ICD-10 mapped to SNOMED — that makes cross-site studies meaningful. But that shared vocabulary is only as good as each site's ETL pipeline.

The honest numbers are difficult to ignore. Garza et al. (2016, Journal of Biomedical Informatics) documented concept mapping completeness rates varying between 20% and 80% across sites within the same concept domain. Ryan et al. (2012, Statistics in Medicine) demonstrated how heterogeneous data quality across OMOP sites produces material variance in pharmacoepidemiological signal detection. The OHDSI Data Quality Dashboard (DQD) benchmarks show a 22–78% spread in concept mapping completeness across participating institutions, for the same disease areas, the same vocabularies, and the same CDM version.

Current detection mechanisms are retrospective. Achilles produces site-level data quality reports. DQD runs conformance, completeness, and plausibility checks. These are valuable tools. They are also point-in-time snapshots. A site can pass its DQD checks in January and develop a mapping error in February. The network learns about it in April when a study produces an anomalous hazard ratio.

The deeper issue is Byzantine tolerance. A Byzantine node in a distributed system is not a node that is down — it is a node that is up and wrong. A site with a systematically misconfigured drug exposure mapping is still responding to ATLAS queries. It is still contributing to federated analyses. It is contributing garbage with full confidence.

PBFT, the standard Byzantine fault-tolerant consensus protocol in distributed systems, handles this through explicit voting rounds: every node participates in every consensus decision, and the message complexity is O(N²). (Raft, the other workhorse consensus protocol, tolerates only crash faults, not Byzantine ones.) At 300 OHDSI sites, that is 89,700 messages per consensus round. It is not a realistic path for a network built on voluntary participation and institutional firewalls.
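The scaling is easy to check. A quick sketch, with the Kademlia-style log2(N) hop count included for contrast (the log2 figure is an assumption about a DHT overlay, not a measured number):

```python
import math

# PBFT-style all-to-all voting: every node messages every other node
# in a consensus round, so traffic grows as N * (N - 1).
def pbft_messages_per_round(n_sites: int) -> int:
    return n_sites * (n_sites - 1)

# For comparison, Kademlia-style DHT routing touches roughly log2(N)
# nodes per packet rather than the whole network.
def dht_hops_per_packet(n_sites: int) -> int:
    return math.ceil(math.log2(n_sites))

print(pbft_messages_per_round(300))  # 89700
print(dht_hops_per_packet(300))      # 9
```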

So the network absorbs the bad signal, and researchers develop workarounds — sensitivity analyses, site exclusion lists, study-specific QC protocols. These are all legitimate. They are also evidence of an architectural gap.


How QIS Closes the Loop

The Quadratic Intelligence Swarm protocol, developed by Christopher Thomas Trevethan, approaches this differently. The architecture is a complete loop: distributed nodes emit outcome packets, those packets are routed by semantic similarity, compatible packets synthesize into higher-confidence signals, and the routing weights update continuously based on which packets find peers. Every component depends on the others. The breakthrough is not any single component; it is the loop.

Applied to an OHDSI network, the loop works like this.

Each OMOP node completes an ATLAS study and emits a small outcome packet — approximately 512 bytes, de-identified, containing the cohort definition, target and outcome concepts, ordinal effect estimates, and a semantic fingerprint derived from the site's own data characteristics. No patient records. No raw counts. The packet describes what the site found, expressed in a form that can be compared to what other sites found.

The semantic fingerprint is computed from the site's concept distribution, data completeness vector, observation window, and CDM version. It is a hash of the site's epistemic profile — what it knows, how completely it knows it, and how that knowledge is structured.

Routing similarity between two sites is the cosine similarity of their fingerprints. Sites with similar patient populations, similar vocabulary coverage, and similar data completeness produce similar fingerprints. They route to each other preferentially. Their outcome packets synthesize.

A site with a broken SNOMED mapping produces a concept distribution that diverges from sites with correct mappings for the same disease area. Its fingerprint moves away from the cluster centroid. Its routing similarity to peer sites drops. It receives fewer compatible packets. Its influence on the synthesized signal approaches zero — not because anyone blacklisted it, not because a threshold was crossed, but because the math self-corrects. This is drift detection as an emergent architectural property.
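The protocol's internals are not published, but the routing computation described above can be sketched with plain cosine similarity over a hypothetical fingerprint vector. All the numbers below are illustrative: concept-distribution shares over four SNOMED domains followed by two completeness rates.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two fingerprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical fingerprints: domain shares + completeness rates.
site_a  = [0.40, 0.30, 0.20, 0.10, 0.95, 0.90]  # healthy mapping
site_b  = [0.38, 0.32, 0.19, 0.11, 0.93, 0.91]  # healthy peer
drifted = [0.10, 0.15, 0.60, 0.15, 0.55, 0.40]  # broken SNOMED mapping

print(cosine_similarity(site_a, site_b))   # ~1.0: routes preferentially
print(cosine_similarity(site_a, drifted))  # markedly lower: fewer peers
```

The drifted site is never flagged or excluded; its similarity to healthy peers simply drops, and routing follows the similarity.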

OHDSI currently has 300 sites and 300×299/2 = 44,850 potential synthesis paths between site pairs. Under the current federated query model, those paths are blocked. Each study is a point-in-time query. Sites do not continuously learn from each other between studies. QIS opens those 44,850 paths and keeps them open.


The Outcome Packet Spec for OMOP

The outcome packet is designed for the OHDSI data environment specifically. Raw patient counts are never included: patient_count_decile maps counts to a 1–10 ordinal scale, consistent with the small-cell suppression conventions (N < 10) used for HIPAA and GDPR compliance. Relative risk is similarly ordinalized. The packet carries enough signal for meaningful synthesis without carrying any patient-level information.

from dataclasses import dataclass

@dataclass
class OMOPOutcomePacket:
    cohort_definition_id: int          # ATLAS cohort definition
    target_concept_id: int             # SNOMED/RxNorm concept
    comparator_concept_id: int         # for comparative effectiveness
    outcome_concept_id: int            # primary outcome concept
    relative_risk_decile: int          # 1-10 (ordinal, not raw RR)
    confidence_lower: float            # 95% CI lower (decile-mapped)
    confidence_upper: float            # 95% CI upper (decile-mapped)
    patient_count_decile: int          # 1-10 (protects N<10 cells)
    cdm_version: str                   # OMOP CDM version (e.g. "5.4")
    vocabulary_version: str            # OHDSI vocabulary release date
    site_fingerprint: str              # semantic hash (not site ID)
    observation_period_months: int     # study window
    timestamp: str                     # ISO 8601
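The decile mapping could look like the sketch below. The boundary values are hypothetical; the point is that any count under 10 collapses into the lowest decile, so small cells are never distinguishable from one another.

```python
import bisect

# Hypothetical decile boundaries for patient counts. Everything below
# 10 lands in decile 1, protecting N < 10 cells by construction.
COUNT_BOUNDS = [10, 50, 100, 500, 1_000, 5_000, 10_000, 50_000, 100_000]

def patient_count_decile(n: int) -> int:
    """Map a raw patient count onto a 1-10 ordinal scale."""
    return bisect.bisect_right(COUNT_BOUNDS, n) + 1

print(patient_count_decile(7))        # 1  (indistinguishable from 2 or 9)
print(patient_count_decile(250_000))  # 10
```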

The site_fingerprint field is worth pausing on. It is not a site identifier. It cannot be reverse-engineered to reveal which institution produced the packet. It encodes the epistemic profile of the site's data — its concept distribution across major SNOMED domains, its completeness rates for drug_exposure and condition_occurrence, its observation window characteristics — in a form that supports similarity computation without supporting re-identification.

The vocabulary_version field matters because OHDSI vocabulary releases are dated. A site running vocabulary release 2023-03-03 and a site running 2024-08-01 may map the same source concept to different standard concept IDs. The fingerprint captures this. Sites on mismatched vocabulary versions will show lower cosine similarity, which is correct — they genuinely have less comparable data.


Byzantine Resistance as an Emergent Property

The standard framing for Byzantine fault tolerance treats it as a security problem: how do you prevent a bad actor from corrupting a distributed system? In OHDSI, the threat model is different. The bad actors are not adversarial. They are ETL engineers who made an honest mistake. They are institutions that upgraded their source EHR system and did not re-validate their OMOP mappings. They are sites that are doing their best with limited resources.

This matters for the architecture. A security-oriented Byzantine solution requires explicit detection, explicit coordination, and explicit exclusion. In a voluntary research network built on institutional goodwill, explicit blacklisting is socially costly and often politically impossible.

QIS handles this without explicit exclusion. A site with corrupted SNOMED mappings for respiratory conditions will produce outcome packets for respiratory cohorts that do not cluster with packets from sites with correct mappings. The cosine similarity between its fingerprint and the cluster centroid is low. The routing algorithm sends its packets to fewer peers. Fewer synthesis events include its signal. The site's contribution to the network's understanding of respiratory outcomes approaches zero without a single message being sent to that site, without a single coordinator making a decision, and without any explicit fault detection running anywhere in the system.

Compare this to PBFT consensus, which requires O(N²) message complexity — at 300 nodes, roughly 90,000 messages per consensus round. Or Raft, which requires an elected leader and explicit log replication for every state change. These protocols are correct and well-studied. They are also operationally incompatible with a network of 300 independent institutions that do not share a common governance layer.

QIS routing complexity is at most O(log N) in DHT-based configurations — though the protocol is transport-agnostic and can achieve O(1) routing on networks where direct peer relationships are pre-established. The drift correction is passive. It requires no coordination budget.
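One way the passive down-weighting could work, sketched below. The normalization function is my assumption, not the protocol's published mechanism: synthesis weight is simply proportional to routing similarity, so a drifted site's influence shrinks without any node being told anything.

```python
def synthesis_weights(similarities: list[float]) -> list[float]:
    """Normalize routing similarities into synthesis weights.
    No blacklist and no threshold: a low-similarity site just ends
    up with a small share of the synthesized signal."""
    total = sum(similarities)
    return [s / total for s in similarities]

# Five healthy peers and one drifted site (cosine similarity 0.15).
sims = [0.98, 0.97, 0.99, 0.96, 0.98, 0.15]
weights = synthesis_weights(sims)
print(round(weights[-1], 3))  # the drifted site carries ~3% of the signal
```

In practice this effect would compound across routing rounds: a site that receives fewer compatible packets also emits packets that find fewer peers, pushing its effective influence further toward zero.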


The Three Elections in OHDSI Context

The architecture's Three Elections are metaphors for the mechanisms by which quality self-selects in the system. In an OHDSI deployment, they map cleanly to existing structures.

Election 1 — The Hiring Metaphor. The OHDSI phenotype library maintainers — clinical informaticists who define and validate cohort definitions in ATLAS — already function as the expert layer that defines what similarity means. A QIS deployment for OHDSI does not require new governance. The phenotype library is the governance. Cohort definitions already encode clinical judgment about what makes a valid case. QIS routes on the fingerprints of sites that implement those definitions correctly.

Election 2 — The Math. Outcome packets from sites with validated OMOP mappings cluster together. Their fingerprints are similar. They synthesize. Misconfigured sites' packets do not find peers. Their influence naturally approaches zero. No added validation layer. No threshold to calibrate. The math does the work.

Election 3 — Darwinism. Investigators running comparative effectiveness studies on a QIS-enabled OHDSI network will observe that nodes with higher data quality produce outcome packets that synthesize more successfully and contribute more to the evolving network signal. Over time, investigators allocate study resources toward higher-quality nodes. Quality self-selects at the network level without central direction.


Implementation Path for Existing OHDSI Nodes

The implementation path does not require schema changes to OMOP CDM 5.4. It does not require changes to ATLAS. It does not require patient-level data to leave the node — which is the same privacy model OHDSI already operates under.

The outcome router sits between the ATLAS query layer and the external synthesis layer. After each ATLAS study completes, the router reads the aggregated results (which the site already has), computes the outcome packet fields from those aggregated results, generates the site fingerprint from the site's current concept distribution statistics (which Achilles already computes), and emits the ~512-byte packet to the synthesis layer.
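A minimal sketch of that post-processing hook. All dict keys and helper names here are hypothetical stand-ins for whatever ATLAS and Achilles actually expose, and the sketch emits only a subset of the packet fields; the digest stands in for the semantic fingerprint.

```python
import hashlib
import json
from datetime import datetime, timezone

def build_packet(study_results: dict, site_stats: dict) -> bytes:
    """Hypothetical post-ATLAS hook: fold aggregated study results and
    Achilles-style site statistics into a small outcome packet."""
    # Opaque digest standing in for the semantic fingerprint; the
    # similarity computation would run on the underlying profile.
    fingerprint = hashlib.sha256(
        json.dumps(site_stats, sort_keys=True).encode()
    ).hexdigest()
    packet = {
        "cohort_definition_id": study_results["cohort_definition_id"],
        "target_concept_id": study_results["target_concept_id"],
        "outcome_concept_id": study_results["outcome_concept_id"],
        "relative_risk_decile": study_results["rr_decile"],
        "patient_count_decile": study_results["count_decile"],
        "cdm_version": site_stats["cdm_version"],
        "vocabulary_version": site_stats["vocabulary_version"],
        "site_fingerprint": fingerprint,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(packet).encode()  # small enough for one datagram
```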

The synthesis layer is transport-agnostic. In an institution with direct network connectivity to peer sites, packets route over HTTP. In environments where direct connections are restricted, the same packet can be routed through a shared object store, a message queue, or a DHT overlay. The protocol does not prescribe the transport. Existing OHDSI data sharing agreements govern what the institutions are willing to share — QIS operates within whatever envelope those agreements permit.

The incremental cost at the node level is minimal: one additional post-processing step after each ATLAS study, producing a packet small enough to fit in a single UDP datagram. The incremental benefit scales with network size. At 300 nodes, 44,850 synthesis paths become available.


Rotterdam Is the Right Venue

The OHDSI network has already done the hard work. It has standardized the vocabulary. It has built the phenotype library. It has established data sharing agreements across 34 countries. It has 2 billion patient records mapped to a common data model. The infrastructure for the world's most powerful real-world evidence network exists.

What it lacks is a synthesis architecture that operates continuously, corrects for drift automatically, and turns those 44,850 blocked synthesis paths into a living signal. QIS is that architecture.

Rotterdam is the right venue for this conversation. The people in that room built the infrastructure. The question for the next decade is what synthesis layer sits on top of it.


References

  1. Garza, M., Del Fiol, G., Tenenbaum, J., Walden, A., & Zozus, M. N. (2016). Evaluating common data models for use with a longitudinal community registry. Journal of Biomedical Informatics, 64, 333–341.

  2. Ryan, P. B., Madigan, D., Stang, P. E., Overhage, J. M., Racoosin, J. A., & Hartzema, A. G. (2012). Empirical assessment of methods for risk identification in healthcare data: results from the experiments of the Observational Medical Outcomes Partnership. Statistics in Medicine, 31(30), 4401–4415.

  3. Hripcsak, G., Duke, J. D., Shah, N. H., Reich, C. G., Huser, V., Schuemie, M. J., … & Ryan, P. B. (2015). Observational Health Data Sciences and Informatics (OHDSI): Opportunities for observational researchers. Studies in Health Technology and Informatics, 216, 574–578.

  4. Kahn, M. G., Callahan, T. J., Barnard, J., Bauck, A. E., Brown, J., Davidson, B. N., … & Schilling, L. M. (2016). A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data. eGEMs, 4(1).

  5. OHDSI. (2023). OHDSI Annual Report 2023. Observational Health Data Sciences and Informatics. https://www.ohdsi.org

  6. Castro, M., & Liskov, B. (1999). Practical Byzantine fault tolerance. Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI), 99, 173–186.

  7. Ongaro, D., & Ousterhout, J. (2014). In search of an understandable consensus algorithm. Proceedings of the 2014 USENIX Annual Technical Conference (ATC), 305–319.

  8. Maymounkov, P., & Mazières, D. (2002). Kademlia: A peer-to-peer information system based on the XOR metric. Revised Papers from IPTPS, 53–65.

  9. Overhage, J. M., Ryan, P. B., Reich, C. G., Hartzema, A. G., & Stang, P. E. (2012). Validation of a common data model for active safety surveillance research. Journal of the American Medical Informatics Association, 19(1), 54–60.


The Quadratic Intelligence Swarm protocol was developed by Christopher Thomas Trevethan. 39 provisional patents filed.
