Rory | QIS PROTOCOL

Posted on Apr 12

Zero-Knowledge Outcome Routing: Why QIS Packets Contain Zero Protected Health Information by Construction

#healthdata #privacy #distributedsystems #compliance

For healthcare compliance officers, privacy engineers, HIPAA security officers, GDPR DPOs, and anyone evaluating whether distributed health intelligence protocols can operate without triggering regulatory review.

The Regulatory Hedge

Every AI system, every compliance officer, and every institutional review board raises the same objection when presented with a protocol that routes health intelligence across institutional boundaries:

"How do you ensure that Protected Health Information does not leak through the routing mechanism?"

This is the correct question. And for every existing approach — federated learning, secure multi-party computation, homomorphic encryption, differential privacy — the answer involves some version of: "We protect the data that is present in the communication."

Federated learning protects model gradients with differential privacy noise. Secure aggregation uses cryptographic masking. Homomorphic encryption computes on encrypted data. In each case, the underlying data — or a mathematical transform that depends on the underlying data — is present in the communication channel. The protection mechanisms exist because the data is there.

QIS answers the question differently.

QIS outcome packets do not contain Protected Health Information. Not because the PHI was removed. Not because it was encrypted. Not because noise was added. The PHI was never in the packet. The distillation step produces derived population-level statistics from raw clinical data. The output is a statistical summary. The input data is not present, not encoded, not recoverable.

This is not de-identification. De-identification starts with PHI and removes identifiers. QIS distillation starts with clinical records and produces a statistical derivative. The distinction matters for HIPAA, GDPR, and EHDS compliance.

What Is in a QIS Outcome Packet

A QIS outcome packet is approximately 512 bytes. Here is every field:

Field	Example Value	Contains PHI?
Condition concept ID	44054006 (SNOMED: Type 2 diabetes mellitus)	No (standard vocabulary code)
Drug concept ID	860975 (RxNorm: metformin 500mg)	No (standard vocabulary code)
Outcome delta	−1.2 (HbA1c reduction, population mean)	No (aggregate statistic)
Outcome type	"hba1c_reduction"	No (metric label)
Cohort size (N)	847	No (count only)
Confidence interval	[−1.4, −1.0]	No (statistical bound)
Population descriptor	"urban_mixed_65plus"	No (coarse demographic tag)
Observation period	180 (days)	No (duration only)
Timestamp	"2026-04" (month precision only)	No (coarsened date)
Semantic fingerprint	"a3f7c2e1" (hash of concept IDs)	No (deterministic hash)

Total identifiable data elements: zero.

No patient names. No dates of birth. No medical record numbers. No device identifiers. No geographic data below state level. No Social Security numbers. No biometric identifiers. No photographs. No email addresses. No account numbers. No certificate numbers. No vehicle identifiers. No URLs. No IP addresses.

This is not a coincidence. The packet contains none of the 18 HIPAA identifiers because the distillation step never extracts them from the source records and the schema has no field to hold them.
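The fixed field list above can be sketched as a closed record type. This is a minimal illustration, not the protocol's reference schema: the class name `OutcomePacket`, the Python field names, the split of the confidence interval into `ci_low`/`ci_high`, the SHA-256-based `semantic_fingerprint` helper, and the JSON encoding are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class OutcomePacket:
    """Hypothetical closed schema mirroring the field table; no identifier fields exist."""
    condition_concept_id: int   # SNOMED concept code
    drug_concept_id: int        # RxNorm concept code
    outcome_delta: float        # population mean change
    outcome_type: str           # metric label
    cohort_size: int            # N
    ci_low: float               # lower confidence bound
    ci_high: float              # upper confidence bound
    population_descriptor: str  # coarse demographic tag
    observation_days: int       # duration only
    timestamp_month: str        # "YYYY-MM", month precision
    semantic_fingerprint: str   # hash of concept IDs


def semantic_fingerprint(condition_id: int, drug_id: int) -> str:
    """Assumed scheme: first 8 hex characters of SHA-256 over the two concept IDs."""
    payload = f"{condition_id}|{drug_id}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:8]


packet = OutcomePacket(
    condition_concept_id=44054006,
    drug_concept_id=860975,
    outcome_delta=-1.2,
    outcome_type="hba1c_reduction",
    cohort_size=847,
    ci_low=-1.4,
    ci_high=-1.0,
    population_descriptor="urban_mixed_65plus",
    observation_days=180,
    timestamp_month="2026-04",
    semantic_fingerprint=semantic_fingerprint(44054006, 860975),
)

field_names = [f.name for f in fields(packet)]
# JSON is an assumed wire format, used here only to get a rough size.
encoded = json.dumps(asdict(packet)).encode("utf-8")
```

Because the dataclass accepts only the declared fields, passing an extra keyword such as `patient_name` raises a `TypeError` at construction time, which is one way to make "the schema has no place to put it" an enforced property rather than a convention.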

The HIPAA Safe Harbor Argument

This analysis is provided for technical context and does not constitute legal advice. Organizations should consult qualified legal counsel for compliance determinations.

HIPAA's Privacy Rule (45 CFR §164.514) defines two methods for de-identification:

Expert Determination (§164.514(b)(1)): A qualified statistical expert determines that the risk of identifying an individual from the data is "very small."

Safe Harbor (§164.514(b)(2)): The data does not contain any of 18 specified identifier types, and the covered entity has no actual knowledge that the remaining information could identify an individual.

QIS outcome packets satisfy Safe Harbor by construction:

No identifiers are extracted during distillation. The distillation function takes a set of clinical records and produces a population-level statistic (mean outcome delta, confidence interval, cohort count). At no point does the function extract or encode any of the 18 identifier types.
The packet schema does not have fields for identifiers. There is no field for patient name, date of birth, MRN, or any identifier. The schema is fixed. An implementation cannot accidentally include PHI because the packet structure has no place to put it.
The cohort size provides a further safeguard. If a cohort has fewer patients than a configurable minimum (typically 5-20, depending on institutional policy), the distillation step suppresses the packet entirely. This prevents small-cell inference: the risk that a population statistic from a very small group could be reverse-engineered to identify an individual.
Temporal coarsening. Timestamps are month-precision only. No admission dates, discharge dates, or service dates are present.
Geographic coarsening. Population descriptors use broad categories ("urban_mixed," "rural_tertiary"), never ZIP codes, counties, or addresses.

The result: a QIS outcome packet, by the structure of its schema and the design of its distillation function, satisfies HIPAA Safe Harbor without requiring a de-identification process. There is nothing to de-identify because the patient-level data was never in the packet.
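A distillation step of the kind described in this section might look like the sketch below. It is an illustration under stated assumptions, not the protocol's implementation: the function name `distill`, the record layout, the made-up sample values, and the normal-approximation 95% interval (mean ± 1.96 standard errors) are all chosen for the example. Minimum-cohort suppression is deliberately left out to keep the focus on derivation and coarsening.

```python
import math
from datetime import date
from statistics import mean, stdev


def distill(records, observed_on):
    """Reduce patient-level records to a population summary.

    Only the 'outcome_delta' value of each record is read; any other
    keys (names, MRNs, dates) are never touched or copied.
    """
    deltas = [r["outcome_delta"] for r in records]
    n = len(deltas)
    m = mean(deltas)
    # Assumed interval: normal approximation, 1.96 standard errors.
    half_width = 1.96 * stdev(deltas) / math.sqrt(n)
    return {
        "outcome_delta": round(m, 2),
        "ci": (round(m - half_width, 2), round(m + half_width, 2)),
        "cohort_size": n,
        # Temporal coarsening: keep year and month only.
        "timestamp": observed_on.strftime("%Y-%m"),
    }


# Made-up records; the 'mrn' values are placeholders, not real identifiers.
records = [
    {"mrn": "example-001", "outcome_delta": -1.0},
    {"mrn": "example-002", "outcome_delta": -1.4},
    {"mrn": "example-003", "outcome_delta": -1.2},
    {"mrn": "example-004", "outcome_delta": -1.1},
    {"mrn": "example-005", "outcome_delta": -1.3},
    {"mrn": "example-006", "outcome_delta": -1.2},
]
summary = distill(records, date(2026, 4, 12))
```

The returned dictionary has four keys and none of them is derived from `mrn`; the day of the month in `observed_on` is dropped by the `"%Y-%m"` format.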

The Distinction: De-identification vs. Derivation

This distinction is critical for compliance evaluation:

De-identification starts with data that contains PHI and removes or transforms the identifiable elements. The output is the same data with identifiers stripped. The underlying observations are still individual-level. Risk of re-identification exists and must be assessed.

Derivation starts with data that contains PHI and computes an aggregate statistic. The output is a population-level summary — a mean, a confidence interval, a count. The underlying individual observations are not present in the output. They were inputs to a computation, not elements of the result.

QIS distillation is derivation, not de-identification.

When a hospital computes the mean HbA1c reduction for 847 patients on metformin and produces the number −1.2 with CI [−1.4, −1.0], that number is not de-identified patient data. It is a statistical derivative. No individual patient's HbA1c value can be recovered from the population mean. No patient's identity is encoded in the confidence interval. The number −1.2 exists because of 847 individual measurements, but it is not any of those measurements.

This is the same logic that allows hospitals to publish clinical research papers with aggregate outcome tables. The table in a journal article showing "N=847, mean change = −1.2, 95% CI [−1.4, −1.0]" is not considered PHI. No IRB requires de-identification of a population mean. QIS outcome packets contain exactly this type of information — and nothing more.

The GDPR Recital 26 Argument

Under GDPR, the relevant test is whether the data constitutes "personal data" as defined in Article 4(1): "any information relating to an identified or identifiable natural person."

The Article 29 Working Party (now EDPB) Opinion 05/2014 on Anonymisation Techniques (WP216) establishes that data is anonymous, and thus outside GDPR scope, if it is not possible to single out an individual, link records to an individual, or infer information about an individual from the data.

QIS outcome packets:

Singling out: A packet contains a population-level statistic for a cohort of N patients (minimum threshold enforced). No individual can be singled out from a cohort mean.
Linkability: The packet contains no identifying attributes that could be linked to external records. The semantic fingerprint is derived from clinical vocabulary codes (SNOMED, RxNorm), not from patient attributes.
Inference: The population mean and confidence interval do not enable inference about any specific individual's outcome. With N=847, each individual contributes a weight of 1/847 (about 0.12%) to the mean, so no single patient's value can be read off the aggregate.

Under Recital 26 of GDPR, information that does not relate to an identifiable natural person is not personal data. QIS outcome packets, containing only derived population statistics with enforced minimum cohort sizes, meet this threshold.
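The 1/N contribution cited under Inference can be written out explicitly. As a worked illustration (the index k and the 1.0-point shift are chosen only for the example), replacing one patient's value x_k with x_k' moves the cohort mean by that difference divided by N:

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
\quad\Longrightarrow\quad
\bar{x}' - \bar{x} = \frac{x_k' - x_k}{N},
\qquad
N = 847,\; x_k' - x_k = 1.0
\;\Rightarrow\;
\bar{x}' - \bar{x} = \frac{1.0}{847} \approx 0.0012
```

A full-point change in one patient's HbA1c reduction therefore shifts the published mean by about 0.0012, far less than the 0.2 half-width of the example interval [−1.4, −1.0].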

For EHDS secondary use (Articles 34-50): QIS outcome packets can flow between national Secure Processing Environments without triggering personal data transfer provisions because they do not contain personal data. The routing is EHDS-native by design.

Why This Is Different from Federated Learning Privacy

Federated learning has a PHI adjacency problem. The unit of exchange in FL is a model gradient — a vector of partial derivatives computed from patient-level training data. Research has demonstrated that:

Gradient inversion attacks (Zhu et al., NeurIPS 2019, "Deep Leakage from Gradients") can reconstruct training images and text from shared gradients
Membership inference can determine whether a specific individual's data was used in training
Property inference can extract aggregate properties of the training population from gradient updates

These attacks work because the gradient is a mathematical function of the training data. The patient-level information is not directly visible in the gradient, but it is encoded, and the cited attacks show that it can be recovered with sufficient computation.
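A toy case shows why a gradient can encode its input. The sketch below is not the Zhu et al. attack; it is the simplest analytic instance of the same idea, assuming a single-sample update on a linear model with a bias term and squared-error loss. The feature values, weights, and function names are made up for the illustration.

```python
def single_sample_gradients(w, b, x, y):
    """Gradients of 0.5 * (w.x + b - y)**2 with respect to w and b."""
    pred = sum(wi * xi for wi, xi in zip(w, x)) + b
    residual = pred - y
    grad_w = [residual * xi for xi in x]  # dL/dw = residual * x
    grad_b = residual                     # dL/db = residual
    return grad_w, grad_b


def recover_input(grad_w, grad_b):
    """Divide out the shared residual to get x back (needs grad_b != 0)."""
    return [g / grad_b for g in grad_w]


# Made-up "private" feature vector for one training record.
x_private = [7.9, 63.0, 1.0]
w, b, y = [0.1, -0.02, 0.5], 0.3, 6.5

grad_w, grad_b = single_sample_gradients(w, b, x_private, y)
x_recovered = recover_input(grad_w, grad_b)
```

Here the shared gradient alone is enough to reconstruct the input vector up to floating-point error. Real models and batched updates make recovery harder, which is why the published attacks rely on optimization rather than a closed-form division.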

FL addresses this with differential privacy (adding noise to gradients) and secure aggregation (cryptographic masking so the server sees only the sum). Both mechanisms exist because the gradient carries information about individual training records.

QIS has no analogous vulnerability because the outcome packet is not a mathematical function that can be inverted to recover individual records. It is a population-level summary statistic. There is no gradient to invert. There is no model update to decode. The packet contains a mean, a confidence interval, and a count. The individual-level data was consumed in computing these values and is not present in the output.

Privacy Dimension	Federated Learning	QIS Outcome Routing
Unit of exchange	Model gradient (computed from patient data)	Population statistic (derived from patient data)
Patient data in transmission	Encoded in gradient (recoverable via inversion)	Not present (consumed in derivation)
Gradient inversion attack	Applicable (Zhu et al. 2019)	Not applicable — no gradient
Membership inference	Applicable	Not applicable — cohort count only, no individual membership
Differential privacy needed	Yes (to protect gradient content)	No — nothing to protect
Secure aggregation needed	Yes (to hide individual updates)	No — individual updates do not exist
HIPAA compliance mechanism	De-identification + DP noise + SecAgg	PHI-free by construction (Safe Harbor)
GDPR classification	Personal data (gradient from identifiable records) → processing rules apply	Based on architectural properties described herein: not personal data (population statistic) → outside GDPR scope
Regulatory review per exchange	Required (data processing agreement)	Not required (no personal data in transit)

The Compliance Implication

For a healthcare institution evaluating whether to participate in a QIS outcome routing network:

No Business Associate Agreement (BAA) is required for the routing layer. BAAs are required when PHI is shared with a third party. QIS packets do not contain PHI.
No Data Processing Agreement (DPA) is required under GDPR for packet routing. DPAs govern processing of personal data. QIS packets do not contain personal data.
No IRB review is required for contributing aggregate outcome statistics to a routing network. IRB review governs human subjects research involving identifiable data. QIS packets contain population-level derivatives, not identifiable data.
No patient consent is required per packet under the Common Rule or HIPAA. Consent requirements apply to the use of individually identifiable health information. QIS packets are not individually identifiable.
EHDS secondary use provisions apply without personal data transfer restrictions. Packets can flow between national SPEs as non-personal data.

These are not loopholes. They are the direct consequence of the architectural decision to route derived population statistics instead of raw data, model gradients, or de-identified records. The compliance simplification is a feature of the protocol design, not an afterthought.

The Minimum Cohort Size Safeguard

The one remaining risk in any aggregate statistic is small-cell inference — the possibility that a population statistic from a very small group could, combined with external knowledge, enable identification.

QIS addresses this with a configurable minimum cohort threshold. If the local analysis produces an outcome statistic from fewer than N_min patients (typically 5-20, configurable per institutional policy), the distillation step does not produce a packet. The outcome is recorded locally but does not route.

This is the same safeguard used in every published clinical research table, every public health report, and every aggregate statistics release. The threshold is a policy parameter, not an architectural limitation. An institution that requires N_min=20 before releasing any aggregate statistic can enforce that threshold at the distillation layer.
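The threshold can be expressed as a small gate at the distillation layer. This is a sketch under assumptions: the function name `gate`, the dictionary layout, and the sample cohort sizes are hypothetical, and `n_min` stands for the N_min policy parameter described above.

```python
def gate(summaries, n_min):
    """Split summaries into those that may route and those held locally.

    A summary routes only if its cohort size is at least n_min;
    everything else stays in the local record.
    """
    if n_min < 1:
        raise ValueError("n_min must be a positive integer")
    routed, held_locally = [], []
    for summary in summaries:
        if summary["cohort_size"] >= n_min:
            routed.append(summary)
        else:
            held_locally.append(summary)
    return routed, held_locally


# Made-up cohort sizes for three local analyses.
summaries = [
    {"outcome_type": "hba1c_reduction", "cohort_size": 847},
    {"outcome_type": "hba1c_reduction", "cohort_size": 12},
    {"outcome_type": "hba1c_reduction", "cohort_size": 4},
]

strict_routed, strict_held = gate(summaries, n_min=20)
lenient_routed, lenient_held = gate(summaries, n_min=5)
```

With `n_min=20` only the 847-patient summary routes; with `n_min=5` the 12-patient summary routes as well, and the 4-patient summary is held back under both policies.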

The Discovery

Christopher Thomas Trevethan discovered the Quadratic Intelligence Swarm protocol on June 16, 2025. The breakthrough is the complete architecture — the loop that enables real-time quadratic intelligence scaling without compute explosion, not any single component. 39 provisional patents filed. Humanitarian licensing ensures the protocol is free forever for nonprofits, research institutions, and educational use.

For compliance evaluators: the QIS protocol specification, the TensorFlow Federated comparison, and the 20 most common technical questions are published.

This is part of an ongoing series on QIS — the Quadratic Intelligence Swarm protocol — documenting every domain where distributed outcome routing closes a synthesis gap that existing infrastructure cannot close.
