OHDSI Europe Symposium opens April 18 at Erasmus University Medical Center, Rotterdam. This is the technical follow-up to "OHDSI Has the Routing Problem QIS Solves" — written for researchers who want to understand exactly how QIS integrates with an existing OMOP CDM node.
The first piece established the structural argument. OMOP CDM standardizes the format of health data. The QIS (Quadratic Intelligence Swarm) protocol, discovered by Christopher Thomas Trevethan, addresses where results go after a query runs. Two complementary layers, not competing protocols.
This piece covers implementation. What does a QIS-augmented OHDSI node look like in practice? What does an outcome packet contain? How does the routing interact with existing OHDSI infrastructure? And, crucially for the OHDSI Europe audience, how does this work with the OMOP concept IDs and the SNOMED CT, RxNorm, and LOINC vocabularies that every OHDSI site already uses?
Where the Gap Actually Lives in an OHDSI Network
The OHDSI distributed network executes federated queries. A study coordinator submits a query package. Each site runs it locally. Results are returned as aggregate statistics — counts, proportions, hazard ratios. No patient-level data leaves the site.
This is correct. It protects privacy. It is also incomplete as an architecture.
Here is what does not happen: when site 17 in Amsterdam runs a pharmacovigilance analysis and identifies a drug-drug interaction signal in a specific patient subpopulation, that signal does not automatically find its way to site 34 in Dublin, which is running a related analysis on a different cohort with overlapping characteristics. The Amsterdam result enters a static output file. The Dublin team does not see it unless a human coordinator notices the overlap, initiates communication, and arranges a collaboration.
That coordination gap is the routing problem. It has nothing to do with OMOP CDM — the OMOP format is correct and sufficient. The gap is in what happens to results after they exist.
In a 48-node OHDSI network, there are N(N-1)/2 = 1,128 unique node-pair relationships. Under the current architecture, the number of those relationships that actively exchange outcome intelligence at any given moment approaches zero. QIS makes all 1,128 paths available simultaneously, without any site sending raw data to any other site.
What a QIS Outcome Packet Contains (OMOP-Native)
QIS routes pre-distilled outcome packets — not raw OMOP records, not aggregate statistics tables, not model weights. A packet is approximately 512 bytes. It contains everything downstream nodes need to synthesize the result and nothing that exposes the source site's patients.
An OMOP-native outcome packet for a pharmacovigilance finding looks like this:
    import hashlib
    import json
    from dataclasses import dataclass
    from typing import Optional


    @dataclass
    class OHDSIOutcomePacket:
        """
        QIS outcome packet formatted for OHDSI / OMOP CDM networks.
        All vocabulary references use standard OMOP concept IDs.
        No patient-level data. No minimum cell count requirement.
        Approximate size: 400-520 bytes as JSON.
        """
        # Semantic fingerprint: deterministic address in the routing layer.
        # Built from OMOP vocabulary codes: no free text, no PHI.
        # Truncated SHA-256 of "drug_concept_id:condition_concept_id:age_decade:comorbidity_flags"
        semantic_fingerprint: str

        # OMOP vocabulary references: zero translation, native OHDSI language
        drug_concept_id: int                   # RxNorm concept ID (e.g., 1310149 = warfarin)
        condition_concept_id: int              # SNOMED CT concept ID (e.g., 313217 = atrial fibrillation)
        measurement_concept_id: Optional[int]  # LOINC concept ID if measurement-triggered (e.g., 3022217 = INR)

        # Outcome: the distilled result
        outcome_direction: str   # "increased_risk" | "reduced_risk" | "no_effect" | "insufficient_data"
        effect_magnitude: float  # Relative risk or hazard ratio (not raw counts)
        confidence_level: str    # "high" | "moderate" | "low", based on N and event count

        # Subpopulation encoding: no individual-level attributes
        age_decade: int            # 4 = 40s, 5 = 50s, 6 = 60s
        comorbidity_flags: int     # Bitmask of OMOP condition domains (no individual conditions)
        sex_at_birth_encoded: int  # 0 = male-dominant, 1 = female-dominant, 2 = mixed

        # Provenance (site identity never transmitted)
        analysis_type: str       # "cohort_study" | "case_control" | "self_controlled_case_series"
        follow_up_days: int      # Duration of observation window
        min_prior_obs_days: int  # Minimum required prior observation (replication context)

        # Temporal context
        observation_period_end: str  # YYYY-MM format (monthly precision, not an exact date)
        packet_generated_at: str     # ISO timestamp of packet creation

        def build_semantic_fingerprint(self) -> str:
            """
            Deterministic address in the routing layer.
            Two nodes asking the same clinical question about the same
            drug-condition pair in the same subpopulation will generate
            identical fingerprints, and therefore find each other's packets.
            This is the mechanism that enables routing without a directory.
            """
            fingerprint_source = (
                f"{self.drug_concept_id}:{self.condition_concept_id}:"
                f"{self.age_decade}:{self.comorbidity_flags}"
            )
            # First 32 hex characters = 128 bits of the SHA-256 digest
            return hashlib.sha256(fingerprint_source.encode()).hexdigest()[:32]
This is the packet that routes. It is OMOP-native: drug, condition, and measurement are encoded as standard concept IDs. If a researcher at Erasmus MC and a researcher at RCSI GDI in Dublin analyze the same drug-condition pair in the same age decade with the same comorbidity flags, their packets generate identical semantic fingerprints, and they find each other's results automatically.
No coordinator email. No shared database. No minimum cell count requirement that excludes N=1 sites.
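As a minimal sketch of that matching behavior, the standalone function below repeats the fingerprint construction from `build_semantic_fingerprint` and compares the output for two sites. The variable names and the concept IDs are illustrative assumptions; no data passes between the two calls.

```python
import hashlib


def semantic_fingerprint(drug_concept_id: int, condition_concept_id: int,
                         age_decade: int, comorbidity_flags: int) -> str:
    """Colon-joined fields, SHA-256, first 32 hex characters."""
    source = f"{drug_concept_id}:{condition_concept_id}:{age_decade}:{comorbidity_flags}"
    return hashlib.sha256(source.encode()).hexdigest()[:32]


# Two sites asking the same question (illustrative concept IDs)
rotterdam = semantic_fingerprint(1310149, 313217, 7, 0b00110)
dublin = semantic_fingerprint(1310149, 313217, 7, 0b00110)

# Same drug and condition, but a different age decade
dublin_younger = semantic_fingerprint(1310149, 313217, 6, 0b00110)

print(rotterdam == dublin)          # True
print(rotterdam == dublin_younger)  # False
```

Note that matching is exact in this construction: a cohort one age decade away produces an unrelated fingerprint, so how coarsely the subgroup is encoded determines which sites find each other.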
The Routing Layer: How Outcome Packets Find Each Other
The routing mechanism is protocol-agnostic. QIS specifies what must be routed (outcome packets with semantic fingerprints), not how. Within an OHDSI network, several routing approaches are viable:
Option 1: DHT-based routing (O(log N) complexity)
Distribute packet addresses across a distributed hash table (DHT). Each node holds a fraction of the address space. Lookup cost grows as O(log N): when the OHDSI network grows from 48 nodes to 4,800 nodes, routing cost grows from roughly log₂(48) ≈ 6 hops to log₂(4,800) ≈ 12 hops. Quadratic growth in synthesis paths; logarithmic growth in routing cost. This is the scaling property that makes QIS viable at global scope.
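One way to picture the DHT option is the key-to-node assignment used by Kademlia-style tables, where the node whose ID has the smallest XOR distance to a key is responsible for it. The sketch below shows only that assignment rule, not the iterative O(log N) lookup a real DHT performs through its routing tables. The `node_id` derivation and the node labels are assumptions made for illustration.

```python
import hashlib
from typing import List


def node_id(site_label: str) -> int:
    """Hypothetical: derive a 128-bit node ID by hashing an opaque label."""
    return int(hashlib.sha256(site_label.encode()).hexdigest()[:32], 16)


def responsible_node(fingerprint_hex: str, node_ids: List[int]) -> int:
    """Return the node ID closest to the key under the XOR metric."""
    key = int(fingerprint_hex, 16)
    return min(node_ids, key=lambda nid: nid ^ key)


nodes = [node_id(f"node-{i}") for i in range(48)]

# Fingerprint built the same way as in the packet class (illustrative IDs)
fingerprint = hashlib.sha256(b"1310149:313217:7:6").hexdigest()[:32]

# Every node computes the same owner, whatever order it learned the peers in
owner_a = responsible_node(fingerprint, nodes)
owner_b = responsible_node(fingerprint, list(reversed(nodes)))
print(owner_a == owner_b)  # True
```

Because the owner is a pure function of the fingerprint and the set of node IDs, two sites with the same clinical question deposit and look up packets at the same place without consulting a directory.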
Option 2: OMOP vocabulary index (O(1) lookup)
Map semantic fingerprints directly to a semantic index keyed on OMOP concept ID combinations. Lookup is O(1) — direct hash table access. This approach integrates most naturally with existing OHDSI infrastructure (Atlas, ATLAS-on-FHIR). No new infrastructure required beyond an outcome packet store.
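A minimal in-memory sketch of such an index follows, assuming packets are plain dictionaries that carry a `semantic_fingerprint` key. The class name `OutcomePacketIndex` and its methods are hypothetical; a deployed store would persist packets in a database rather than a Python dict.

```python
from collections import defaultdict
from typing import Dict, List


class OutcomePacketIndex:
    """Hypothetical stand-in for an outcome packet store keyed by fingerprint."""

    def __init__(self) -> None:
        self._by_fingerprint: Dict[str, List[dict]] = defaultdict(list)

    def deposit(self, packet: dict) -> None:
        """Store a packet under its semantic fingerprint."""
        self._by_fingerprint[packet["semantic_fingerprint"]].append(packet)

    def lookup(self, fingerprint: str) -> List[dict]:
        """Return a copy of all packets stored under the fingerprint."""
        # .get avoids creating empty entries for unknown fingerprints
        return list(self._by_fingerprint.get(fingerprint, []))


index = OutcomePacketIndex()
index.deposit({"semantic_fingerprint": "abc123",
               "outcome_direction": "increased_risk", "effect_magnitude": 1.3})
index.deposit({"semantic_fingerprint": "abc123",
               "outcome_direction": "no_effect", "effect_magnitude": 1.0})

print(len(index.lookup("abc123")))  # 2
print(index.lookup("unknown"))      # []
```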
Option 3: FHIR R4 message channel
FHIR MeasureReport resources already encode aggregate clinical outcomes. A QIS adapter converts between MeasureReport resources and outcome packets, and routes them via FHIR subscriptions to semantically similar nodes. Maximally compatible with existing OHDSI/EHDS infrastructure investments.
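As a rough sketch of the wrapping direction, the function below places a packet's fields into a dictionary shaped like a FHIR R4 MeasureReport. The field mapping, the measure URL, and the identifier system are assumptions for illustration only; the result has not been validated against the FHIR specification, and fields such as outcome direction and confidence would need an agreed representation.

```python
import json


def packet_to_measure_report(packet: dict, measure_url: str) -> dict:
    """Hypothetical mapping of an outcome packet onto MeasureReport fields."""
    return {
        "resourceType": "MeasureReport",
        "status": "complete",
        "type": "summary",
        "measure": measure_url,  # canonical URL of the measure (assumed)
        "identifier": [{
            "system": "urn:example:qis-fingerprint",  # placeholder system
            "value": packet["semantic_fingerprint"],
        }],
        "date": packet["packet_generated_at"],
        "period": {"end": packet["observation_period_end"]},
        "group": [{"measureScore": {"value": packet["effect_magnitude"]}}],
    }


example_packet = {
    "semantic_fingerprint": "0" * 32,  # placeholder fingerprint
    "effect_magnitude": 1.3,
    "observation_period_end": "2025-12",
    "packet_generated_at": "2025-12-31T00:00:00+00:00",
}
report = packet_to_measure_report(example_packet, "http://example.org/Measure/qis-outcome")
print(json.dumps(report, indent=2))
```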
The choice of routing mechanism does not change the scaling math. In a 48-node OHDSI network, QIS closes all 1,128 node-pair paths simultaneously regardless of which routing transport is used.
Drug Safety Monitoring: The Canonical Implementation
The OHDSI community's pharmacovigilance work is the clearest entry point for QIS implementation. The existing OHDSI pharmacovigilance methods (SCCS, cohort method, case-control) generate exactly the kind of results that QIS distills into outcome packets.
Here is the integration pattern:
    from datetime import datetime, timezone

    # Assumes OHDSIOutcomePacket from the block above is in scope.


    class OHDSIPharmacovigRouter:
        """
        Wraps an existing OHDSI pharmacovigilance analysis to emit QIS outcome packets.
        Designed to layer on top of existing CohortMethod or SelfControlledCaseSeries runs.
        Does not modify the underlying analysis; it only adds outcome routing.
        """

        def __init__(self, routing_backend="omop_index"):
            """
            routing_backend: "omop_index" | "dht" | "fhir_subscription"
            Protocol-agnostic: the QIS loop is the same regardless.
            """
            self.routing_backend = routing_backend
            self.emitted_packets = []

        def emit_from_cohort_method_result(
            self,
            drug_concept_id: int,
            condition_concept_id: int,
            hazard_ratio: float,
            ci_lower: float,
            ci_upper: float,
            n_cases: int,
            analysis_metadata: dict
        ) -> OHDSIOutcomePacket:
            """
            Called after CohortMethod::runCohortMethod() completes.
            Converts the result to a routable outcome packet.

            If n_cases < OHDSI small cell threshold: packet still emits with
            confidence_level = "low" and effect_magnitude rounded to 1 decimal.
            QIS does not require minimum cell counts to route; it routes confidence
            along with direction, allowing downstream nodes to weight appropriately.
            """
            # Determine confidence from CI width and case count
            ci_width = ci_upper - ci_lower
            if n_cases >= 100 and ci_width < 0.5:
                confidence = "high"
            elif n_cases >= 20 and ci_width < 1.5:
                confidence = "moderate"
            else:
                confidence = "low"

            # Determine direction from where the CI sits relative to 1.0
            if ci_lower > 1.0:
                direction = "increased_risk"
            elif ci_upper < 1.0:
                direction = "reduced_risk"
            elif ci_lower > 0.0:
                direction = "no_effect"          # CI spans 1.0
            else:
                direction = "insufficient_data"  # lower bound is not positive

            packet = OHDSIOutcomePacket(
                semantic_fingerprint="",  # built below
                drug_concept_id=drug_concept_id,
                condition_concept_id=condition_concept_id,
                measurement_concept_id=analysis_metadata.get("measurement_concept_id"),
                outcome_direction=direction,
                # Rounded to 1 decimal for all packets, including small-cell ones
                effect_magnitude=round(hazard_ratio, 1),
                confidence_level=confidence,
                age_decade=analysis_metadata.get("age_decade", 0),
                comorbidity_flags=analysis_metadata.get("comorbidity_flags", 0),
                sex_at_birth_encoded=analysis_metadata.get("sex_encoded", 2),
                analysis_type="cohort_study",
                follow_up_days=analysis_metadata.get("follow_up_days", 365),
                min_prior_obs_days=analysis_metadata.get("min_prior_obs_days", 180),
                observation_period_end=analysis_metadata.get("obs_period_end", ""),
                packet_generated_at=datetime.now(timezone.utc).isoformat()
            )
            packet.semantic_fingerprint = packet.build_semantic_fingerprint()

            # Route the packet
            self._route(packet)
            self.emitted_packets.append(packet)
            return packet

        def _route(self, packet: OHDSIOutcomePacket):
            """
            Route to whichever backend is configured.
            The QIS loop is identical regardless of backend.
            """
            if self.routing_backend == "omop_index":
                self._route_to_omop_index(packet)
            elif self.routing_backend == "dht":
                self._route_to_dht(packet)
            elif self.routing_backend == "fhir_subscription":
                self._route_to_fhir(packet)
            else:
                # Fail loudly instead of silently dropping the packet
                raise ValueError(f"Unknown routing backend: {self.routing_backend!r}")

        def _route_to_omop_index(self, packet: OHDSIOutcomePacket):
            # Index by semantic fingerprint: O(1) lookup at query time
            # Implementation uses existing OHDSI infrastructure (PostgreSQL / Atlas DB)
            print(f"[OMOP_INDEX] Depositing packet: {packet.semantic_fingerprint} → {packet.outcome_direction} (HR={packet.effect_magnitude}, confidence={packet.confidence_level})")

        def _route_to_dht(self, packet: OHDSIOutcomePacket):
            # DHT routing: lookup cost grows as O(log N)
            # Implementation: libp2p Kademlia or Hyperswarm
            print(f"[DHT] Routing packet fingerprint {packet.semantic_fingerprint} to nearest peers")

        def _route_to_fhir(self, packet: OHDSIOutcomePacket):
            # FHIR R4 MeasureReport wrapping: maximally compatible with EHDS/GDI
            print(f"[FHIR] Emitting MeasureReport for {packet.drug_concept_id}→{packet.condition_concept_id}")


    # --- USAGE EXAMPLE: Post-analysis step following an OHDSI R pipeline ---
    #
    # After running CohortMethod::runCohortMethod() on your OMOP CDM node:
    #
    # router = OHDSIPharmacovigRouter(routing_backend="omop_index")
    #
    # router.emit_from_cohort_method_result(
    #     drug_concept_id=1310149,       # warfarin (RxNorm)
    #     condition_concept_id=313217,   # atrial fibrillation (SNOMED CT)
    #     hazard_ratio=1.34,
    #     ci_lower=1.12,
    #     ci_upper=1.58,
    #     n_cases=847,
    #     analysis_metadata={
    #         "measurement_concept_id": 3022217,  # INR (LOINC)
    #         "age_decade": 7,
    #         "comorbidity_flags": 0b00110,
    #         "sex_encoded": 2,
    #         "follow_up_days": 365,
    #         "min_prior_obs_days": 180,
    #         "obs_period_end": "2025-12"
    #     }
    # )
    #
    # That's it. The result is now routable across the OHDSI network
    # to any node running the same drug-condition analysis in the same subgroup.
    # Amsterdam publishes. Dublin receives. No coordinator email required.
The N=1 Site Problem OHDSI Has Not Solved
OHDSI's federated query architecture imposes minimum cell count thresholds for privacy protection. If a site has fewer than 5 patients in a query cell, results are suppressed. This is correct privacy practice for aggregate statistics.
The consequence: OHDSI structurally excludes rare disease sites. A site with N=1 or N=2 patients with a specific rare condition cannot contribute results for that condition to a federated analysis, because every cell below the threshold is suppressed.
QIS eliminates this constraint by construction. Outcome packets do not encode counts. They encode directions, magnitudes, and confidence levels, all of which can be computed and rounded even from very small observations. A site with a single patient (N=1) who experienced a specific adverse event emits a packet with confidence_level="low" and effect_magnitude rounded to 1 decimal. That low-confidence signal routes to every other site with the same drug-condition fingerprint. Downstream nodes weight it appropriately alongside high-confidence signals from larger sites.
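The text does not specify how downstream nodes weight packets, so the following is only one possible sketch: a confidence-weighted geometric mean of effect magnitudes, averaged on the log scale. The weight values and the `pooled_effect` function are assumptions, and this is not a substitute for a proper inverse-variance meta-analysis.

```python
import math
from typing import Iterable, Optional

# Hypothetical weights; the packet format does not prescribe them
CONFIDENCE_WEIGHTS = {"high": 1.0, "moderate": 0.5, "low": 0.1}


def pooled_effect(packets: Iterable[dict]) -> Optional[float]:
    """Confidence-weighted geometric mean of effect magnitudes."""
    weighted_log_sum = 0.0
    total_weight = 0.0
    for p in packets:
        # Skip packets that carry no usable estimate
        if p["outcome_direction"] == "insufficient_data" or p["effect_magnitude"] <= 0:
            continue
        weight = CONFIDENCE_WEIGHTS[p["confidence_level"]]
        weighted_log_sum += weight * math.log(p["effect_magnitude"])
        total_weight += weight
    if total_weight == 0:
        return None
    return math.exp(weighted_log_sum / total_weight)


incoming = [
    {"outcome_direction": "increased_risk", "effect_magnitude": 1.3, "confidence_level": "high"},
    {"outcome_direction": "increased_risk", "effect_magnitude": 4.0, "confidence_level": "low"},
]
pooled = pooled_effect(incoming)
print(round(pooled, 2))
```

With these assumed weights, the single low-confidence packet nudges the pooled estimate upward but leaves it much closer to the large site's 1.3 than to the small site's 4.0.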
The OHDSI community has discussed this problem explicitly — the minimum cell count threshold is a known limitation for rare diseases. QIS does not require minimum cell counts because it routes abstracted outcomes, not raw records.
For rare disease researchers attending OHDSI Europe: this is the structural argument. Your site can participate fully in QIS-augmented OHDSI networks regardless of your patient population size.
Compatibility with European Health Data Space (EHDS) and GDI Ireland
The European Health Data Space (EHDS) regulation requires member states to make health data available for secondary use while protecting individual privacy. GDI Ireland (Genomic Data Infrastructure) is building federated infrastructure that must comply with GDPR by design.
QIS is architecturally compatible with both:
EHDS alignment: EHDS requires that secondary use occur without raw data leaving its country of origin. QIS outcome packets contain no raw data — they are pre-distilled abstractions computed locally. The routing layer carries only the packet, never the underlying records. EHDS secondary use compliance is a structural property of the architecture, not a compliance layer added on top.
GDI Ireland alignment: GDI's federated query model is OMOP CDM-based. QIS outcome packets in the OMOP-native format above use only standardized vocabulary codes. No free-text fields, no institutional identifiers, no patient-linkable data in the routing layer.
GDPR Article 5(1)(c), data minimization: QIS outcome packets are designed to carry only the data needed to communicate a clinical result. A 512-byte packet encoding a drug-condition outcome direction and confidence is small relative to the clinical insight it conveys.
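A site could enforce the packet's shape mechanically before anything leaves the node. The sketch below, under the assumption that packets are serialized as compact JSON, rejects any field outside the schema described earlier and any packet above a size budget. `validate_packet` and the 520-byte default are illustrative choices, and passing this check is not a legal compliance determination.

```python
import json

# Field names taken from the outcome packet definition in this article
ALLOWED_FIELDS = {
    "semantic_fingerprint", "drug_concept_id", "condition_concept_id",
    "measurement_concept_id", "outcome_direction", "effect_magnitude",
    "confidence_level", "age_decade", "comorbidity_flags",
    "sex_at_birth_encoded", "analysis_type", "follow_up_days",
    "min_prior_obs_days", "observation_period_end", "packet_generated_at",
}


def validate_packet(packet: dict, max_bytes: int = 520) -> int:
    """Return the serialized size, or raise if the packet breaks the schema."""
    unexpected = set(packet) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(f"fields outside the packet schema: {sorted(unexpected)}")
    size = len(json.dumps(packet, separators=(",", ":")).encode("utf-8"))
    if size > max_bytes:
        raise ValueError(f"packet is {size} bytes, budget is {max_bytes}")
    return size


packet = {
    "semantic_fingerprint": "0" * 32,  # placeholder fingerprint
    "drug_concept_id": 1310149, "condition_concept_id": 313217,
    "measurement_concept_id": 3022217, "outcome_direction": "increased_risk",
    "effect_magnitude": 1.3, "confidence_level": "high", "age_decade": 7,
    "comorbidity_flags": 6, "sex_at_birth_encoded": 2,
    "analysis_type": "cohort_study", "follow_up_days": 365,
    "min_prior_obs_days": 180, "observation_period_end": "2025-12",
    "packet_generated_at": "2025-12-31T00:00:00+00:00",
}
size = validate_packet(packet)
print(f"{size} bytes")
```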
For OHDSI Europe attendees from GDI participating countries: the QIS routing layer is designed to operate within the legal and technical framework you are already building.
What Changes for the OHDSI Network at 48 Nodes vs. 4,800 Nodes
The math is the reason this matters at scale.
The OHDSI network has more than 400 member sites globally. The European OHDSI network has at least 48 participating institutions. Under the current architecture, cross-site learning happens through coordinated study packages: a researcher proposes a study, writes an ATLAS definition, distributes it to sites, waits for results, and aggregates them. Round trip: weeks to months.
Under QIS augmentation:
| Network size | Node pairs | Synthesis paths currently active | Synthesis paths with QIS |
|---|---|---|---|
| 10 nodes | 45 | ~0 (requires coordinated study) | 45 (continuous) |
| 48 nodes | 1,128 | ~0 | 1,128 (continuous) |
| 400 nodes | 79,800 | ~0 | 79,800 (continuous) |
| 4,800 nodes | ~11.5M | ~0 | ~11.5M (continuous) |
The synthesis paths do not require human coordination to activate. They activate automatically whenever two nodes share a semantic fingerprint — the same drug, condition, subgroup combination encoded in the same OMOP vocabulary terms.
The routing cost for each node grows as O(log N). At 4,800 nodes, each node pays roughly log₂(4,800) ≈ 12 routing operations per outcome packet, not 4,800. Quadratic intelligence scaling at logarithmic compute cost. This is the scaling property Christopher Thomas Trevethan discovered on June 16, 2025.
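The figures in the table and the hop estimates can be checked with a few lines of arithmetic. In this sketch, `approx_dht_hops` uses log₂(N) as a stand-in for DHT lookup cost, which is an idealization; real hop counts depend on the DHT implementation and its parameters.

```python
import math


def node_pairs(n: int) -> int:
    """Unique node-pair relationships: N(N-1)/2."""
    return n * (n - 1) // 2


def approx_dht_hops(n: int) -> float:
    """Idealized lookup cost for a DHT of n nodes."""
    return math.log2(n)


for n in (10, 48, 400, 4800):
    print(f"{n:>5} nodes  {node_pairs(n):>10,} pairs  ~{approx_dht_hops(n):.1f} hops")
```

The pair counts come out to 45, 1,128, 79,800, and 11,517,600, matching the table, while the hop estimate rises only from about 3 to about 12 across the same range.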
The Implementation Path for an OHDSI Site
An existing OHDSI node does not need to replace its infrastructure to implement QIS. The integration is additive:
- Keep OMOP CDM. The data format is correct. QIS uses OMOP concept IDs natively.
- Keep Atlas and existing study packages. Coordinated federated queries continue to run as before.
- Add QIS outcome packet emission as a post-processing step after each analysis. The OHDSIPharmacovigRouter pattern above is one implementation; call it after CohortMethod::runCohortMethod() completes.
- Choose a routing backend that fits your infrastructure. OMOP index (PostgreSQL) is the lowest-barrier entry point. DHT adds decentralization. FHIR subscriptions add EHDS compatibility.
- Add QIS outcome packet ingestion to synthesize incoming packets from other nodes. The ingestion step is local: packets arrive at your node and are synthesized by your local analysis layer. No raw data ever enters from another site.
The loop is: your analysis runs → outcome packet emits → routes to similar nodes → their packets arrive at your node → your analysts synthesize them → your next analysis is informed by the entire network's experience on the same clinical question.
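The ingestion half of that loop can be sketched as a local filter followed by a summary. Both function names below are hypothetical, and packets are assumed to arrive as dictionaries with the fields defined earlier; what analysts do with the matched packets is left open.

```python
from collections import Counter
from typing import Iterable, List, Set


def ingest(incoming: Iterable[dict], local_fingerprints: Set[str]) -> List[dict]:
    """Keep only packets that match a question this node is analyzing."""
    return [p for p in incoming if p["semantic_fingerprint"] in local_fingerprints]


def summarize_directions(packets: Iterable[dict]) -> Counter:
    """Tally outcome directions across the matched packets."""
    return Counter(p["outcome_direction"] for p in packets)


# Fingerprints of analyses this node has run (placeholder values)
local_fingerprints = {"fp-anticoagulant-af"}

arrived = [
    {"semantic_fingerprint": "fp-anticoagulant-af", "outcome_direction": "increased_risk"},
    {"semantic_fingerprint": "fp-anticoagulant-af", "outcome_direction": "no_effect"},
    {"semantic_fingerprint": "fp-unrelated", "outcome_direction": "reduced_risk"},
]

matched = ingest(arrived, local_fingerprints)
summary = summarize_directions(matched)
print(len(matched))               # 2
print(summary["increased_risk"])  # 1
```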
That loop is what Christopher Thomas Trevethan's 39 provisional patents cover. Not the routing transport. Not the OMOP vocabulary. The complete loop — and the quadratic intelligence scaling that emerges when you close it.
Three Questions for OHDSI Europe Attendees
If you are presenting at, attending, or watching OHDSI Europe Symposium this week, here are three questions worth raising:
1. Where do your pharmacovigilance results go after they exist?
They live in an output file. They reach other sites only through coordinated study packages. QIS routes them automatically to sites with the same clinical question. What would change in your research practice if your pharmacovigilance results were discoverable in real time by every semantically similar node in the network?
2. Can your rare disease sites participate?
OHDSI's minimum cell count threshold structurally excludes them. QIS outcome packets do not require minimum cell counts. If your network has rare disease sites that cannot currently contribute to federated analyses, QIS removes that barrier without changing your privacy architecture.
3. Which step in the QIS loop fails for OMOP CDM?
The five steps:
- An OHDSI node runs a pharmacovigilance analysis. Output: drug-condition hazard ratio with CI.
- The hazard ratio is encoded as a 512-byte outcome packet using OMOP concept IDs. No PHI, no raw data.
- The packet receives a semantic fingerprint (SHA-256 of drug_concept_id + condition_concept_id + subgroup encoding).
- The packet routes to other nodes with the same fingerprint via O(log N) routing.
- Other nodes synthesize the arriving packets locally alongside their own results. No raw data crossed any boundary.
Which step fails? If none fail, the architecture works. If you find a failure, the QIS protocol needs to know about it.
The conversation is open. The architecture is in 39 provisional patents filed by Christopher Thomas Trevethan. The question for Rotterdam is whether the OHDSI community is ready to close the routing loop that OMOP CDM left open.
QIS (Quadratic Intelligence Swarm) protocol was discovered by Christopher Thomas Trevethan. 39 provisional patents filed. The discovery: routing pre-distilled outcome packets by semantic similarity enables quadratic intelligence scaling at logarithmic compute cost — regardless of the routing transport used. IP protection is in place.
Previous in this series: OHDSI Has the Routing Problem QIS Solves