Rory | QIS PROTOCOL

The Routing Upgrade PCORnet Has Been Waiting For

A researcher at Duke University Medical Center is six months into a multi-site cardiovascular outcomes study. She has submitted coordinated queries through the PCORnet infrastructure, and fourteen partner institutions have returned aggregate results. The data is clean. The CDM alignment is solid. But she has a problem that the current architecture cannot solve: right now, three of those fourteen sites are enrolling new patients with a subgroup characteristic her protocol did not originally stratify for. She does not know this. She cannot know this. PCORnet's query model offers no way for what one site is learning to reach another site until a new coordinated query is formally submitted, distributed, executed, and returned.

This is not a failure of PCORnet. It is the correct tradeoff for a network that prioritizes patient privacy and data governance above all else. Centralization is off the table for good reason. But distributed query execution and distributed learning are not the same thing, and PCORnet currently delivers the first without the second. The gap between them is where the Quadratic Intelligence Swarm (QIS) protocol fits.


What PCORnet Actually Does (And Doesn't Do)

PCORnet — the Patient-Centered Outcomes Research Network — represents approximately 188 million patients across participating US health systems. It operates on a common data model (CDM) that normalizes clinical records into a shared structure, enabling coordinating centers to issue queries that each participating node can execute locally against its own data without ever transmitting patient-level records.

The architecture is sound and well-documented. The coordinating center composes a query in terms of CDM fields. Each data partner runs the query against its local instance. Aggregate results return to the coordinating center. No raw data crosses the boundary. The patient privacy guarantee is maintained architecturally, not just by policy.

But the query model has a structural ceiling. Every query is a discrete, synchronous event:

  1. Coordinating center formulates query
  2. Query distributes to participating nodes
  3. Nodes execute against local data
  4. Aggregate results return
  5. Analysis happens at the coordinating center

Step 5 is where the learning happens — and it happens only at the coordinating center, only after the query cycle completes, and only in response to the exact question the query asked. If a site's local analysis reveals something unexpected during step 3 — an anomalous subgroup, an enrollment imbalance, an early safety signal — that insight has no pathway into the network until a new query cycle is initiated from the top.
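The five-step cycle can be sketched in a few lines of Python. This is a toy model, not PCORnet's actual query tooling: `DataPartner`, `run_query_cycle`, and the record fields are hypothetical names chosen for illustration. What it shows is the shape of the constraint: only aggregates cross the site boundary, and the only place results are combined is the coordinating function.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins: a "record" is a dict of CDM-style fields held locally.
Record = dict
Query = Callable[[Record], bool]


@dataclass
class DataPartner:
    name: str
    records: list[Record]  # patient-level data; never returned by execute()

    def execute(self, query: Query) -> dict:
        """Step 3: run the query locally and return aggregates only."""
        matched = [r for r in self.records if query(r)]
        return {"site": self.name, "count": len(matched)}


def run_query_cycle(partners: list[DataPartner], query: Query) -> dict:
    """Steps 2-5: distribute, execute locally, collect, analyze centrally."""
    results = [p.execute(query) for p in partners]
    return {"total": sum(r["count"] for r in results), "per_site": results}


partners = [
    DataPartner("site_a", [{"age": 74, "afib": True}, {"age": 60, "afib": False}]),
    DataPartner("site_b", [{"age": 81, "afib": True}]),
]
summary = run_query_cycle(partners, lambda r: r["afib"] and r["age"] >= 65)
print(summary["total"])  # 2
```

Nothing in `execute` gives a site a way to volunteer a finding the query did not ask for, which is the gap described above.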

The PCORnet population health research infrastructure [Fleurence et al., 2014; Forrest et al., 2014] was explicitly designed to prevent centralization. It succeeded. But the cost of that success is that no node can learn from what another node is finding in real time. Each query is a one-time, siloed snapshot. The network accumulates data events but not distributed intelligence.

This is the specific limitation a routing layer can solve.


The QIS Routing Layer: What It Adds Without What It Breaks

The Quadratic Intelligence Swarm (QIS) protocol, discovered by Christopher Thomas Trevethan, is a routing architecture for distilled outcome intelligence. The core loop works as follows:

  1. A node processes local data and produces a distilled outcome packet — approximately 512 bytes — that captures the result of local analysis, not the underlying records
  2. The outcome packet is assigned a semantic fingerprint derived from the content of the finding
  3. The fingerprint is used to route the packet to nodes whose semantic profile indicates they are studying similar populations or phenomena
  4. Receiving nodes synthesize the incoming packet with their own local findings
  5. The synthesis output updates the receiving node's semantic profile, which may trigger further routing
  6. The loop continues

What never moves: raw patient records, identified data, or anything that would trigger HIPAA re-identification concerns. The outcome packet carries distilled signal only. Privacy is maintained by architecture, not by access controls applied on top of a centralizing system.
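To make the outcome packet concrete, here is a minimal sketch of one possible encoding. The field names follow the fingerprint example later in this post; the JSON wire format, the `encode`/`decode` helpers, and the hard 512-byte check are assumptions made for illustration, not part of any published QIS specification.

```python
import json
from dataclasses import dataclass, asdict

MAX_PACKET_BYTES = 512  # budget taken from the "approximately 512 bytes" figure


@dataclass(frozen=True)
class OutcomePacket:
    primary_condition: int    # OMOP concept_id
    intervention: int         # OMOP concept_id
    outcome_domain: int       # OMOP concept_id
    population_age_band: str
    study_phase: str
    effect_direction: str
    effect_magnitude: float
    n_contributing: int       # aggregate cohort count, no patient-level rows


def encode(packet: OutcomePacket) -> bytes:
    """Serialize to compact JSON and enforce the size budget."""
    payload = json.dumps(
        asdict(packet), separators=(",", ":"), sort_keys=True
    ).encode("utf-8")
    if len(payload) > MAX_PACKET_BYTES:
        raise ValueError(
            f"packet is {len(payload)} bytes, over the {MAX_PACKET_BYTES}-byte budget"
        )
    return payload


def decode(payload: bytes) -> OutcomePacket:
    """Rebuild the packet from its wire form."""
    return OutcomePacket(**json.loads(payload))


packet = OutcomePacket(313217, 43013024, 4329847, "65-74",
                       "interim_6mo", "protective", 0.23, 47)
wire = encode(packet)
assert decode(wire) == packet
```

Because the dataclass has a fixed set of aggregate fields, there is simply no slot in which a patient record could travel.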

What this enables that PCORnet's query layer cannot: a node at a small regional health system enrolling three patients with a rare complication can deposit an outcome packet that routes, by semantic similarity, to the seven other nodes across the country that are working on adjacent questions — without any of them knowing to ask. No coordinating center initiates this. No new query is submitted. The signal travels because it matches, not because someone requested it.

The difference in the research workflow is significant. PCORnet today: coordinating center sends query, nodes respond, results aggregate once, analysis happens centrally. PCORnet with a QIS routing layer: nodes continuously deposit outcome packets after local analysis cycles, any node can pull semantically relevant packets at any time, synthesis happens locally at each receiving node. The coordinating center's role in the query layer does not change. The routing layer operates above it.

This is not a replacement architecture. It is an additive layer. The existing CDM infrastructure, governance model, and query distribution mechanism remain intact.


From OMOP Concept IDs to Semantic Fingerprints

The practical question for any PCORnet-aligned implementation is: where does the semantic fingerprint come from? The answer is already present in the CDM.

PCORnet's common data model is a separate model from the OMOP (Observational Medical Outcomes Partnership) CDM, but it records diagnoses, medications, and lab results in the same standard terminologies that the OMOP vocabularies are built on, so its coded fields can be mapped to OMOP concepts. In the OMOP vocabulary, every clinical concept is encoded as a concept_id: a stable integer identifier drawn from a hierarchical vocabulary that encodes semantic relationships. SNOMED CT codes map to concept_id values. RxNorm drug identifiers map to concept_id values. LOINC lab codes map to concept_id values.

A semantic fingerprint for a QIS outcome packet can be derived directly from the concept_id values that are most clinically salient to the finding being packaged:

fingerprint = {
  primary_condition:   313217,             // OMOP concept_id: Atrial fibrillation (SNOMED)
  intervention:        43013024,           // OMOP concept_id: Apixaban (RxNorm)
  outcome_domain:      4329847,            // OMOP concept_id: Myocardial infarction (SNOMED)
  population_age_band: "65-74",
  study_phase:         "interim_6mo",
  effect_direction:    "protective",
  effect_magnitude:    0.23,               // log-odds or standardized effect
  n_contributing:      47                  // local cohort count, not identifiable
}

This fingerprint structure is entirely derivable from OMOP CDM fields without transmitting any patient-level data. The concept_id values are the semantic address. Nodes whose local studies involve the same concept_id cluster are, by definition, studying related clinical questions — and are the correct recipients for the outcome packet.
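One simple way to turn "same concept_id cluster" into a routing decision is set overlap. The sketch below scores each node's profile against the packet's concepts with Jaccard similarity and keeps the nodes above a threshold. The function names, the node names, the 0.5 threshold, and the choice of Jaccard itself are all assumptions for illustration; a real deployment would probably also walk the OMOP concept hierarchy so that parent and child concepts count as related.

```python
def concept_set(fingerprint: dict) -> frozenset[int]:
    """Collect the concept_id fields that act as the packet's semantic address."""
    keys = ("primary_condition", "intervention", "outcome_domain")
    return frozenset(fingerprint[k] for k in keys)


def jaccard(a: frozenset[int], b: frozenset[int]) -> float:
    """Overlap of two concept sets: |A & B| / |A | B|, or 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_recipients(
    fingerprint: dict,
    node_profiles: dict[str, frozenset[int]],
    threshold: float = 0.5,
) -> list[str]:
    """Return the nodes whose concept profile overlaps the packet enough to route to."""
    source = concept_set(fingerprint)
    return sorted(
        name
        for name, profile in node_profiles.items()
        if jaccard(source, profile) >= threshold
    )


fingerprint = {
    "primary_condition": 313217,   # atrial fibrillation
    "intervention": 43013024,      # apixaban
    "outcome_domain": 4329847,     # myocardial infarction
}
node_profiles = {
    "site_a": frozenset({313217, 43013024, 4329847}),  # same question
    "site_b": frozenset({313217, 4329847, 1310149}),   # same condition, warfarin arm
    "site_c": frozenset({201826}),                     # type 2 diabetes, unrelated
}
print(select_recipients(fingerprint, node_profiles))  # ['site_a', 'site_b']
```

Here `site_b` shares two of four distinct concepts (similarity 0.5) and is included, while `site_c` shares none and is skipped.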

The routing cost for delivering this packet under a DHT-based transport is O(log N) hops, where N is the number of nodes in the network. That is the scaling bound; some transport configurations can approach O(1) routing for high-frequency concept clusters. For a network of 1,000 PCORnet data partners, log2(1000) ≈ 10, so a packet would need on the order of 10 routing hops. The packet itself, at ~512 bytes, is comparable in size to the headers of a typical HTTP request.
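The hop figure is easy to check, and the same few lines can show how a concept cluster might become a DHT key. This is a sketch under assumptions: hashing the sorted concept_ids with SHA-1 into a 160-bit key space mirrors what Chord- and Kademlia-style DHTs do for ordinary keys, but nothing in the QIS description above specifies it. A plain hash only routes exact-match clusters; similarity routing would need something on top of it. `routing_key` and `expected_max_hops` are hypothetical helper names.

```python
import hashlib
import math
from collections.abc import Iterable


def routing_key(concept_ids: Iterable[int]) -> int:
    """Map a concept cluster to a 160-bit key, independent of order and duplicates."""
    canonical = ",".join(str(c) for c in sorted(set(concept_ids)))
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    return int.from_bytes(digest, "big")


def expected_max_hops(n_nodes: int) -> int:
    """ceil(log2 N): an order-of-magnitude estimate for DHT lookups, not a guarantee."""
    return max(1, math.ceil(math.log2(n_nodes)))


print(expected_max_hops(1000))  # 10
print(routing_key([313217, 43013024]) == routing_key([43013024, 313217]))  # True
```

For N = 1,000 the estimate is 10 hops, matching the figure in the text.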

The total synthesis opportunity across N nodes is N(N-1)/2 pairwise combinations. For a 1,000-node PCORnet deployment, that is 499,500 potential synthesis events — each representing a node learning from a specific other node's finding on a related question. PCORnet's current query architecture makes none of these available without explicit coordinated queries. The QIS routing layer makes all of them available continuously.
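The pair count is just "N choose 2", which Python's standard library computes directly. The helper name `synthesis_pairs` is mine; the arithmetic is the N(N-1)/2 formula from the paragraph above.

```python
import math


def synthesis_pairs(n_nodes: int) -> int:
    """Number of unordered node pairs: N(N-1)/2."""
    return math.comb(n_nodes, 2)


print(synthesis_pairs(1000))  # 499500
print(synthesis_pairs(14))    # 91
```

Even the fourteen-site study from the opening example has 91 site-to-site pairings that the query model never exercises directly.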


The Comparison in Concrete Terms

Consider two scenarios involving the same cardiovascular outcomes study at the same fourteen participating institutions.

Scenario A: PCORnet query model (current)

Month 1: Coordinating center issues baseline query. All fourteen sites respond with aggregate cohort characteristics.

Month 6: Coordinating center issues interim outcomes query. Results return. Analysis reveals that one subgroup (patients over 72 with a specific comorbidity combination) is showing unexpected hazard rates.

Month 7: A new query is designed to stratify on this subgroup. It distributes, executes, and returns results in month 8. The researcher now has data on the subgroup eight months into a study that might have identified the signal at month 3 if the information pathway had existed.

Scenario B: PCORnet with QIS routing layer

Month 1: Study initiates. Coordinating center issues baseline query normally. Each site, as it completes its local interim analyses, deposits outcome packets with semantic fingerprints derived from its OMOP concept clusters.

Month 3: A site in Cleveland deposits a packet flagging elevated hazard in the 72+ comorbidity subgroup. Its fingerprint matches the concept profile at four other sites. Those four sites receive the packet, synthesize it against their own local findings, and their own interim analysis protocols surface the signal.

Month 4: The researcher's node receives a synthesized outcome packet from two of those four sites.

Month 5: The subgroup stratification is added to the study protocol, three months before the stratified results arrive in Scenario A.

No patient data left any site. No coordinating center query was required to surface the signal. The routing happened because the semantic fingerprints matched.


Where This Matters Most

Rare disease research. PCORnet's distributed model is powerful for common conditions because query results from 188 million represented patients can yield statistically meaningful cohorts at most sites. For rare conditions, individual sites may enroll N=1 or N=2 patients over a multi-year study. These sites currently have limited ability to contribute to cross-network learning because they cannot anchor a statistically meaningful aggregate result. Under a QIS routing model, a site with N=2 patients can still deposit an outcome packet. The packet's value is not its statistical weight — it is the semantic signal it carries and the synthesis it enables when combined with similar packets from other low-enrollment sites. N=1 sites across 500 institutions collectively represent a meaningful signal that current coordinated query architectures cannot extract.
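How a receiving node "synthesizes" packets is left open above, so here is one deliberately simple possibility: pool the effect estimates from several low-enrollment sites, weighting each by its cohort count. The `Finding` type and `pool` function are hypothetical, and sample-size weighting is my assumption; a production system would more likely use inverse-variance weights from a proper meta-analysis, which would require packets to carry a variance or standard error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    effect_magnitude: float  # e.g. log-odds, as in the fingerprint example
    n_contributing: int      # cohort count behind the estimate


def pool(findings: list[Finding]) -> Finding:
    """Sample-size-weighted mean of effect estimates across packets."""
    total_n = sum(f.n_contributing for f in findings)
    if total_n == 0:
        raise ValueError("no contributing patients to pool")
    weighted = sum(f.effect_magnitude * f.n_contributing for f in findings)
    return Finding(weighted / total_n, total_n)


local = Finding(0.30, 2)                           # this site's own N=2 estimate
incoming = [Finding(0.10, 1), Finding(0.20, 2)]    # packets from two other sites
combined = pool([local, *incoming])
print(round(combined.effect_magnitude, 2), combined.n_contributing)  # 0.22 5
```

No single site here could report a meaningful aggregate, but the pooled estimate rests on five patients instead of one or two, and it keeps growing as more packets arrive.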

Interim safety signals. Pharmacovigilance in distributed clinical networks depends on the speed at which safety signals can be identified and propagated. Current PCORnet architecture requires a formal query cycle for safety signal detection. A QIS routing layer enables any node that identifies an adverse event pattern to deposit a packet that routes to semantically similar nodes immediately — without waiting for a coordinating center to formulate and distribute a query. The signal propagates at routing speed, not query cycle speed.

Cross-site subgroup discovery. Clinical research increasingly depends on identifying subgroups that were not apparent in the original study design. Subgroup discovery in a federated architecture currently requires iterative query cycles, each adding weeks to the research timeline. Continuous outcome packet routing allows subgroup-relevant signals to propagate as they are identified locally, enabling cross-site subgroup discovery without explicit coordinating center involvement.

Protocol adaptation. Multi-site studies that run over years often need to adapt their protocols in response to emerging findings. The faster that interim findings can propagate across the network, the faster protocol amendments can be grounded in cross-site evidence. The QIS routing layer shortens the feedback loop between local observation and network-level protocol adjustment.


The Infrastructure Is Already There

PCORnet data partners already maintain CDM implementations coded in standard terminologies (SNOMED CT, RxNorm, LOINC), and those codes map to OMOP concept_id values. The semantic fingerprint components for a QIS routing layer are already being generated as a byproduct of normal CDM operations. They simply have no pathway out of the local environment in their distilled, non-identifiable form.

The QIS routing layer is not asking PCORnet to rebuild its data model, retrain its staff, or renegotiate its data use agreements. It is asking for one addition: a protocol by which distilled outcome packets, derived from CDM-native concept identifiers and carrying no patient-level data, can be routed to semantically similar nodes continuously rather than batched and coordinated episodically.

The complete loop — local processing, distillation into outcome packets, semantic fingerprinting via OMOP concept clusters, routing by similarity, delivery to matching nodes, local synthesis, continued routing — is the architecture that changes what federated clinical research networks can do. Not the DHT transport. Not the packet format. Not the fingerprinting algorithm in isolation. The loop, operating continuously, converting siloed query execution into distributed outcome intelligence.

PCORnet built the foundation correctly. It has 188 million patients represented, a normalized common data model, and a governance structure that has earned institutional trust across the US health system. What it does not have is a mechanism for one site's findings to inform another site's analysis without a coordinating center in the middle.

The routing layer is the missing piece. The semantic address is already in the CDM. The infrastructure is waiting for the protocol.


References

  • Fleurence RL, Curtis LH, Califf RM, et al. Launching PCORnet, a national patient-centered clinical research network. J Am Med Inform Assoc. 2014;21(4):578-582.
  • Forrest CB, Margolis PA, Bailey LC, et al. PEDSnet: a national pediatric learning health system. J Am Med Inform Assoc. 2014;21(4):602-606.
  • Garza M, Del Fiol G, Tenenbaum J, et al. Evaluating common data models for use with a longitudinal community registry. J Biomed Inform. 2016;64:333-341.
  • McMurry AJ, Murphy SN, MacFadden D, et al. SHRINE: enabling nationally scalable multi-site disease studies without sharing patient-level data. J Am Med Inform Assoc. 2013;20(6):1028-1036.
  • Malin BA, Emam KE, O'Keefe CM. Biomedical data privacy: problems, perspectives, and recent advances. J Am Med Inform Assoc. 2013;20(1):2-6.
  • Raisaro JL, Troncoso-Pastoriza JR, Misbach M, et al. MedCo: Enabling Secure and Privacy-Preserving Exploration of Distributed Clinical and Genomic Data. IEEE/ACM Trans Comput Biol Bioinform. 2019;16(4):1328-1341.

Christopher Thomas Trevethan discovered the Quadratic Intelligence Swarm (QIS) protocol on June 16, 2025. 39 provisional patents filed.
