The OHDSI Europe Symposium opens April 18 at Erasmus University Medical Center, Rotterdam. This piece is published four days earlier. The architectural question it raises is one the community is ready for.
In the decade since OHDSI launched its distributed network, the community has done something genuinely difficult: it standardized the shape of health data across hundreds of institutions without centralizing a single patient record. The OMOP Common Data Model is a real achievement, and anyone who has tried to federate clinical data across institutions understands how hard that standardization work actually is.
But standardizing the format is not the same as solving the routing problem.
OMOP CDM tells every node in the OHDSI network how to store a drug exposure record. It does not tell the network where a pharmacovigilance signal should go when it emerges from node 17 in Amsterdam and is relevant to something node 34 in Dublin is already processing. That second problem — routing results by semantic similarity to where they are most useful — is structurally unsolved in the current architecture.
That gap is what the QIS (Quadratic Intelligence Swarm) protocol, discovered by Christopher Thomas Trevethan, was built to close.
What OMOP CDM Actually Solved (and Why It Matters)
Before explaining the routing ceiling, it is worth being precise about what the OHDSI community built.
OMOP CDM standardizes the structure and vocabulary of observational health data across institutions. Every OHDSI-participating site maps its local electronic health record data to the same table schema — drug_exposure, condition_occurrence, measurement, observation — and the same controlled vocabularies:
- SNOMED CT for clinical concepts (conditions, procedures, observations)
- RxNorm for drug ingredients and clinical drugs
- LOINC for laboratory and clinical measurements
- ICD-10 for diagnostic codes (typically carried as source codes and mapped to standard SNOMED CT concepts)
- ATC for drug classification hierarchy
This is not trivial. Columbia, Amsterdam UMC, the University of Edinburgh, and a hospital system in South Korea can now execute the same phenotyping query against identically structured tables. When a researcher at one of those institutions asks what happened to patients with EGFR-mutant NSCLC on osimertinib, every participating site can answer from local data without transferring a single patient record. The OHDSI network has produced landmark pharmacovigilance and comparative effectiveness studies on the back of this infrastructure.
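A minimal sketch of why the shared schema matters: the same query text runs unchanged at two simulated sites, and only a count leaves each one. The table and column names follow the OMOP CDM `drug_exposure` table, but the table is cut down to four columns, the concept ID and rows are placeholders invented for illustration, and the in-memory SQLite databases merely stand in for site warehouses.

```python
import sqlite3

# Placeholder concept ID for illustration only; not a real OMOP vocabulary code.
DRUG_CONCEPT_ID = 1000001

# The shared query: identical text is executed at every site.
QUERY = """
SELECT COUNT(DISTINCT person_id)
FROM drug_exposure
WHERE drug_concept_id = ?
"""


def make_site(rows):
    """Build an in-memory database holding a cut-down drug_exposure table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE drug_exposure ("
        "drug_exposure_id INTEGER PRIMARY KEY, "
        "person_id INTEGER, "
        "drug_concept_id INTEGER, "
        "drug_exposure_start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO drug_exposure "
        "(person_id, drug_concept_id, drug_exposure_start_date) "
        "VALUES (?, ?, ?)",
        rows,
    )
    return conn


def exposed_patients(conn, concept_id=DRUG_CONCEPT_ID):
    """Run the shared query locally; only the aggregate count is returned."""
    return conn.execute(QUERY, (concept_id,)).fetchone()[0]


# Two hypothetical sites with different local data but the same schema.
site_a = make_site([
    (1, DRUG_CONCEPT_ID, "2023-01-05"),
    (1, DRUG_CONCEPT_ID, "2023-02-05"),  # same person, second exposure
    (2, 999, "2023-01-09"),              # different drug, not counted
])
site_b = make_site([
    (7, DRUG_CONCEPT_ID, "2022-11-30"),
    (8, DRUG_CONCEPT_ID, "2022-12-02"),
])

counts = {"site_a": exposed_patients(site_a), "site_b": exposed_patients(site_b)}
print(counts)
```

Neither site needs to know anything about the other's source system; the shared table shape is what makes the query portable.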
The point is not that OMOP CDM is insufficient. The point is that it was designed to solve the format problem, and it solved it well. It was not designed to solve the routing problem, and that ceiling is now visible.
The Routing Ceiling
The current OHDSI distributed query model works like this:
- A researcher writes an analysis package — typically R, sometimes SQL
- That package is distributed to participating sites via ATLAS or a coordinating center
- Each site runs the package locally against its own OMOP CDM database
- Aggregate results are returned centrally
- The research team performs meta-analysis or pooled analysis across site-level outputs
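The five steps above can be sketched in a few lines. This is an illustration under stated assumptions, not OHDSI tooling: `SiteResult`, `run_package_locally`, and `pool_fixed_effect` are hypothetical names, the site-level estimates are made up, and inverse-variance fixed-effect pooling is just one common way to combine site outputs.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteResult:
    site: str
    log_hr: float  # site-level effect estimate (log hazard ratio)
    se: float      # standard error of that estimate


def run_package_locally(site, local_estimate):
    """Stand-in for local execution: a real package would fit a model here.

    Only the aggregate (estimate, standard error) leaves the site.
    """
    log_hr, se = local_estimate
    return SiteResult(site, log_hr, se)


def pool_fixed_effect(results):
    """Central step: inverse-variance weighted pooling of site aggregates."""
    weights = [1.0 / (r.se ** 2) for r in results]
    pooled = sum(w * r.log_hr for w, r in zip(weights, results)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical site-level outputs (not real study data).
site_outputs = {"site_a": (0.2, 0.1), "site_b": (0.4, 0.1)}
results = [run_package_locally(s, est) for s, est in site_outputs.items()]
pooled, pooled_se = pool_fixed_effect(results)
print(f"pooled log HR = {pooled:.3f} (SE {pooled_se:.3f})")
```

Each additional site contributes one more `SiteResult` to the list; nothing in this flow lets one site's output inform another site's analysis.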
This model is sound. It preserves data locality. It has scaled to hundreds of sites and hundreds of millions of patient records. But it has a structural property that becomes a constraint as the network grows: intelligence scales approximately linearly with participation.
Adding the 48th site to an OHDSI study adds one more result set. The meta-analysis operation that combines it with the other 47 does not compound. The network captures 48 independent signals and averages them. It does not capture the 1,128 unique pairwise synthesis opportunities those 48 nodes represent.
That number — 1,128 — is not arbitrary. It is the exact output of the formula N(N-1)/2 for a 48-node network.
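For readers who want to check the arithmetic, the count of unordered pairs among N nodes is the binomial coefficient "N choose 2":

```latex
\[
\binom{N}{2} = \frac{N(N-1)}{2},
\qquad
\binom{48}{2} = \frac{48 \cdot 47}{2} = \frac{2256}{2} = 1128 .
\]
```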
The N(N-1)/2 Expansion
The core insight in QIS protocol is that a network of N intelligent agents does not have N intelligence opportunities. It has N(N-1)/2 unique node-pair synthesis opportunities, growing quadratically with participation.
For a concrete OHDSI-scale illustration:
| OHDSI Sites (N) | Unique Node-Pair Synthesis Opportunities N(N-1)/2 |
|---|---|
| 10 | 45 |
| 20 | 190 |
| 48 | 1,128 |
| 100 | 4,950 |
| 200 | 19,900 |
| 800 | 319,600 |
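The table values can be reproduced, and cross-checked by brute-force enumeration, with a short script. The function names here are illustrative, not part of any QIS or OHDSI library.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Closed form for the number of unordered node pairs."""
    return n * (n - 1) // 2


def enumerate_pairs(n: int) -> int:
    """Brute-force count of the same pairs, used only as a cross-check."""
    return sum(1 for _ in combinations(range(n), 2))


SITE_COUNTS = [10, 20, 48, 100, 200, 800]
table = {n: pair_count(n) for n in SITE_COUNTS}

for n, pairs in table.items():
    # The closed form and the enumeration must agree for every row.
    assert enumerate_pairs(n) == pairs
    print(f"{n:>4} sites -> {pairs:>7,} pairs")
```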
The OHDSI network has more than 800 participating data sources globally. Under current architecture, adding the 800th site adds one more data point to the central aggregate. Under QIS routing, the 800th site opens 799 new pairwise synthesis channels — and the network's total synthesis capacity reaches 319,600 unique node-pair intelligence opportunities.
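The "799 new channels" figure follows from differencing the pair count for N and N-1 nodes:

```latex
\[
\frac{N(N-1)}{2} - \frac{(N-1)(N-2)}{2}
= \frac{(N-1)\bigl[N-(N-2)\bigr]}{2}
= N-1,
\qquad
N = 800 \;\Rightarrow\; 799 .
\]
```

So the marginal contribution of each new node grows with the size of the network it joins, whereas in the batch model it stays at one result set.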
Current OHDSI batch queries capture one result set per site per study. QIS outcome routing is designed to capture the compounding synthesis between sites — routing outcome packets by semantic similarity so that node 17's emerging pharmacovigilance signal routes automatically to the nodes most relevant to receive it, without a new study package, without a coordinating center, and without exposing raw patient data.
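To make "routing by semantic similarity" concrete, here is a minimal sketch that ranks candidate nodes by cosine similarity between a signal vector and per-node profile vectors. Everything in it is an assumption for illustration: the three-dimensional profiles, the node names, and the `route` function are hypothetical, and the linear scan over all nodes is used only for clarity and is not the deterministic addressing the protocol describes.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(packet_vec, node_profiles, k=2):
    """Return the k node ids whose profiles are most similar to the packet."""
    ranked = sorted(
        node_profiles,
        key=lambda node: cosine(packet_vec, node_profiles[node]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical profiles: [cardiac, oncology, renal] emphasis of each node's
# previously processed outcomes.
node_profiles = {
    "node_34": [0.8, 0.0, 0.2],
    "node_52": [0.0, 1.0, 0.0],
    "node_61": [0.5, 0.5, 0.0],
}

# A cardiovascular safety signal emerging from another node.
signal = [1.0, 0.0, 0.1]
targets = route(signal, node_profiles)
print(targets)
```

The cardiac-heavy node ranks first and the oncology-only node is never selected, which is the behavior the paragraph above describes.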
A Concrete Scenario: Drug Safety Signals and the Vioxx Problem
Pharmacovigilance is the canonical use case because the stakes are clear and the routing failure is historically documented.
Rofecoxib (Vioxx) was withdrawn from the market in September 2004 after a clinical trial demonstrated elevated cardiovascular risk. By that point, the drug had been prescribed to approximately 20 million patients in the United States. Post-withdrawal analyses, including work published by Madigan et al. (2013) using the OMOP Common Data Model infrastructure that OHDSI later built on, showed that observational data available prior to withdrawal contained signals that statistical methods could have surfaced earlier.
The infrastructure question is not whether the signal existed in the data. It did. The question is whether the routing architecture would have moved that signal from the institutions where it first appeared to the institutions whose existing cardiac outcome data made it interpretable.
Under a QIS routing model, a pharmacovigilance outcome packet works as follows:
1. A local analysis at a participating OHDSI site produces an outcome packet of roughly 512 bytes, encoding a statistical signal: elevated cardiovascular event rate in a patient cohort receiving a specific drug
2. That packet is semantically fingerprinted against the network's existing outcome space
3. The fingerprint routes the packet to the deterministic address most similar to its semantic content, that is, to the nodes that have already processed related cardiac outcome data
4. Those nodes receive the packet and run local synthesis against their own data
5. The synthesis produces new outcome packets that route outward in turn
6. Raw patient data never leaves any site at any step
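The first three steps above can be sketched as follows. This is a sketch under loud assumptions, not the QIS specification: the article does not define the packet fields or the fingerprinting method, so `OutcomePacket` and its fields are hypothetical, the concept IDs are placeholders, and random-hyperplane hashing (SimHash) is swapped in as one well-known way to derive a similarity-preserving, deterministic address.

```python
import json
import random
from dataclasses import dataclass, asdict

MAX_PACKET_BYTES = 512  # size budget taken from the "roughly 512 bytes" above


@dataclass(frozen=True)
class OutcomePacket:
    """Hypothetical packet layout: aggregate statistics only."""
    drug_concept_id: int     # placeholder ID, not a real vocabulary code
    outcome_concept_id: int  # placeholder ID
    rate_ratio: float
    ci_low: float
    ci_high: float
    n_exposed: int           # aggregate count; no person-level fields

    def to_bytes(self) -> bytes:
        data = json.dumps(
            asdict(self), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")
        if len(data) > MAX_PACKET_BYTES:
            raise ValueError("packet exceeds size budget")
        return data


def simhash_address(features, bits=16, seed=0):
    """Sign pattern of random projections, packed into an integer address.

    Every node using the same seed derives the same hyperplanes, so the
    address is deterministic and needs no central registry. Vectors that
    point in similar directions tend to share most address bits.
    """
    rng = random.Random(seed)
    address = 0
    for _ in range(bits):
        plane = [rng.gauss(0.0, 1.0) for _ in features]
        side = sum(p * f for p, f in zip(plane, features)) >= 0
        address = (address << 1) | int(side)
    return address


packet = OutcomePacket(1000001, 2000002, 1.8, 1.2, 2.7, 431)
payload = packet.to_bytes()
# Hypothetical feature vector describing the packet's semantic content.
address = simhash_address([1.0, 0.0, 0.1])
print(len(payload), f"{address:04x}")
```

The address depends only on the direction of the feature vector and the shared seed, so two sites computing it independently arrive at the same value.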
The routing cost for each node in this process is at most O(log N) — substantially less than the O(N) coordination cost of distributing a new study package to all participating sites. And the network captures N(N-1)/2 pairwise synthesis opportunities rather than N independent result sets.
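A rough comparison of the two cost models, assuming a base-2 logarithm (as in structured overlays where each hop halves the remaining search space). The article gives only the asymptotic bounds, so the constants and function names below are illustrative.

```python
import math


def batch_messages(n_sites: int) -> int:
    """Coordinator-driven distribution: one package delivery per site."""
    return n_sites


def lookup_hops(n_sites: int) -> int:
    """Hop bound when each hop halves the remaining search space."""
    return max(1, math.ceil(math.log2(n_sites)))


for n in (48, 800):
    print(f"N={n}: batch deliveries={batch_messages(n)}, lookup hops<={lookup_hops(n)}")
```

Under these assumptions, going from 48 to 800 sites multiplies the batch deliveries by more than sixteen while adding only a handful of hops per lookup.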
This is the distinction between format standardization and routing. OMOP CDM made step 1 possible at scale. QIS provides the routing layer for steps 2 through 6.
EHDS, GDI Ireland, and the Infrastructure That Needs a Routing Layer
The European Health Data Space (EHDS) is now under active implementation. The EHDS Regulation entered into force in March 2025, establishing the legal and technical framework for cross-border health data access across EU member states. The GDI (Genomic Data Infrastructure) Ireland project at RCSI Dublin is one of the nodes building toward that infrastructure.
Both EHDS and GDI share a structural property with OHDSI: they are solving the format and access layer. EHDS specifies OMOP CDM as a preferred data format for secondary use. GDI is building federated genomic data access with data locality preserved. Neither has specified the routing layer — the mechanism by which outcome intelligence moves between nodes once those nodes are operational.
QIS is the routing layer they have not specified yet.
The alignment with GDPR is worth stating directly. QIS routing is privacy-by-architecture, not privacy-by-policy. Raw patient data never moves. Only outcome packets — compressed statistical artifacts, approximately 512 bytes, encoding no individual-level information — traverse the network. This is not a compliance workaround. It is the correct architecture for a network that must be simultaneously distributed and legally operable under data residency requirements across 27 EU member states.
The EHDS Secondary Use Regulation explicitly contemplates federated analysis as the preferred model for cross-border research access. QIS is an implementation of that model with a routing mechanism specified — not a conceptual sketch, but a protocol with deterministic addressing and quadratic scaling mathematics.
Three OHDSI-Connected Nodes Are Already Reading This
One data point worth noting for the Rotterdam audience: independent reading logs show that OHDSI-connected researchers in Amsterdam, Dublin, and Des Moines have encountered QIS content in the months leading up to this symposium. They did not arrive through a coordinated outreach campaign. They arrived through normal research discovery channels — the same way the network surfaces any emerging protocol.
This is, incidentally, QIS behaving as designed. The protocol routes by semantic similarity. Health data researchers working on distributed query problems are semantically adjacent to the QIS content corpus. The routing worked.
The pattern matters because it demonstrates what the OHDSI community has long known about distributed networks: when the format layer is sound, signal propagates on its own. OMOP CDM enabled that for study data. QIS enables it for outcome intelligence.
What This Is Not
A few clarifications for the technically precise OHDSI audience:
QIS is not a replacement for ATLAS or ACHILLES. ATLAS is a study design and cohort definition tool. ACHILLES generates data quality and characterization reports. QIS is a routing protocol for outcome packets. These operate at different layers of the stack and complement each other.
QIS is not a federated learning framework. Federated learning transmits model gradients and requires a global model to converge. QIS transmits outcome packets and requires no global state. The two approaches have different failure modes and different scaling properties. A 48-node federated learning network converges to one global model. A 48-node QIS network generates 1,128 pairwise synthesis channels.
QIS is not a data warehouse. No raw data is centralized at any step. The routing address is deterministic and derived from semantic fingerprint — it does not require a central registry.
QIS is not specific to any transport layer. The routing protocol operates over any underlying transport — folder sync, HTTP relay, DHT, or others. The protocol is transport-agnostic by design. This matters for EHDS implementation because member states have different network infrastructure constraints.
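One way to picture transport agnosticism is a narrow interface that routing code depends on, with interchangeable backends behind it. The `Transport` protocol, both backends, and `deliver` are hypothetical names for illustration; the folder backend only writes and reads local files and assumes some external sync tool would replicate them.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class Transport(Protocol):
    """Minimal interface the routing layer would rely on (an assumption)."""

    def send(self, address: int, packet: bytes) -> None: ...

    def receive(self, address: int) -> list[bytes]: ...


class InMemoryTransport:
    """Backend for tests: packets are held in a dictionary of mailboxes."""

    def __init__(self):
        self._boxes: dict[int, list[bytes]] = {}

    def send(self, address, packet):
        self._boxes.setdefault(address, []).append(packet)

    def receive(self, address):
        return self._boxes.pop(address, [])


class FolderTransport:
    """Backend where each address is a directory of packet files."""

    def __init__(self, root):
        self.root = Path(root)

    def send(self, address, packet):
        box = self.root / f"{address:04x}"
        box.mkdir(parents=True, exist_ok=True)
        index = len(list(box.iterdir()))
        (box / f"{index:06d}.pkt").write_bytes(packet)

    def receive(self, address):
        box = self.root / f"{address:04x}"
        if not box.is_dir():
            return []
        files = sorted(box.glob("*.pkt"))
        packets = [f.read_bytes() for f in files]
        for f in files:
            f.unlink()  # drain the mailbox once read
        return packets


def deliver(transport: Transport, address: int, packet: bytes) -> None:
    """Routing code sees only the interface, never the backend."""
    transport.send(address, packet)


mem = InMemoryTransport()
deliver(mem, 0x2A, b"signal")
print(mem.receive(0x2A))
```

Swapping `InMemoryTransport` for `FolderTransport` (or an HTTP or DHT backend) requires no change to `deliver`, which is the property that matters when sites have different infrastructure constraints.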
The full technical specification of how QIS integrates with OMOP CDM vocabularies, outcome packet structure, and OHDSI distributed query optimization is documented in the technical reference: QIS Protocol: A Technical Reference for OMOP CDM and OHDSI Network Routing.
The Architectural Question for Rotterdam
The OHDSI Europe Symposium at Erasmus University Medical Center is four days away as this publishes.
The community gathering in Rotterdam will discuss phenotyping, network studies, data quality, and real-world evidence methodology. These are the right conversations. They are the conversations the network has been having productively for a decade.
The conversation the community has not had publicly is the routing architecture conversation — specifically, what happens to the intelligence produced by a network of hundreds of OMOP-standardized nodes when the study-driven batch query model reaches its scaling ceiling.
The number is 1,128 for a 48-node network. For 800 nodes, it is 319,600. Those are not hypothetical synthesis opportunities. They are the compounding intelligence capacity sitting in the existing OHDSI infrastructure, unrealized, because the routing layer does not yet exist to capture it.
QIS was discovered by Christopher Thomas Trevethan as an answer to exactly this class of problem. The 39 provisional patents filed cover the routing architecture — the complete loop from local outcome production through semantic fingerprinting through deterministic routing through distributed synthesis. The breakthrough is not any single component. It is the architecture that closes the loop.
If you are attending OHDSI Europe, the routing architecture conversation is worth having. The math has been in plain sight since Erdős and Rényi described the evolution of random graphs in 1960. The implementation layer is what has been missing.
It is not missing anymore.
QIS (Quadratic Intelligence Swarm) protocol was discovered by Christopher Thomas Trevethan. 39 provisional patents filed. For the full OMOP CDM and OHDSI technical integration specification, see the technical reference article.