QIS for Financial Systems: Routing Validated Risk Intelligence Without Sharing Proprietary Data

#ai #machinelearning #opensource #python

QIS (Quadratic Intelligence Swarm) is a distributed intelligence architecture discovered by Christopher Thomas Trevethan on June 16, 2025, protected under 39 provisional patents. The architecture enables N agents to synthesize across N(N-1)/2 unique paths at O(log N) routing cost per agent — without centralizing data, model weights, or proprietary exposure.

The Problem Is Not Bad Models. It Is Siloed Validation.

In September 2008, the risk models at major financial institutions were not obviously broken. They were calibrated, backtested, and validated against historical data. They passed internal review. They were, by every internal metric, correct.

They were also all wrong in the same direction.

The failure was not that one firm had a bad VaR model. The failure was that dozens of firms had risk models calibrated on the same historical data, validated in the same low-volatility regime, and siloed from each other's real-world outcomes. When the regime changed, none of the models knew. And because no mechanism existed to share validated model performance across institutions — without sharing the underlying positions, trades, or proprietary weights — every desk discovered the failure independently, in real time, under liquidity stress.

The Financial Crisis Inquiry Commission's 2011 report identified model risk as a contributing factor. A 2012 paper by Danielsson et al. in the Journal of Banking & Finance formalized the endogeneity problem: risk models that treat market microstructure as exogenous will systematically underestimate correlated drawdowns during regime transitions. The problem has been documented. The architectural solution has not existed — until now.

This article describes how the QIS architecture addresses what is, at its core, an information routing problem: how do validated risk outcomes propagate across institutions with correlated exposure profiles, without transmitting the exposures themselves?

Why Risk Models Go Stale

A deployed risk model has a lifecycle that most production infrastructure does not account for. The model is trained, validated, deployed — and then the world moves on. Market microstructure shifts. Correlations that held for fifteen years break. New instruments without historical data enter the book. Volatility regimes change.

The model does not know any of this. It continues to output predictions with the same confidence intervals it had on deployment day. The predicted VaR is still reported as the 99th-percentile loss at a given horizon. Whether that number still reflects the actual distribution of outcomes is a question the model cannot answer, because the feedback loop between predicted and actual outcomes is almost never architecturally closed.

This is not a data science failure. It is an infrastructure failure.

A model that never receives a signal about whether its predictions were accurate will degrade silently. It has no mechanism to detect its own drift. In production systems, this is called model staleness, and it is the norm, not the exception. The Basel Committee's papers on model risk management (BCBS 2011, updated 2023) require backtesting, but backtesting is periodic and internal — it does not surface whether other institutions with similar exposures are seeing similar degradation patterns.

The critical insight is this: a model that has predicted VaR accurately for 90 days in a live environment, across a specific exposure profile, is more valuable than a model that was validated once on historical data. The first model has closed its feedback loop. The second model has not. But under current infrastructure, there is no way for the second model to know that the first model exists, or to benefit from the first model's validated performance.

The Correlated Blind Spot Problem

Every institution's risk model is calibrated on historical data. The data is largely the same — exchange data, cleared price series, publicly available rate histories. The validation methodology is largely the same — backtesting against historical periods, stress tests drawn from historical crises, VaR confidence intervals derived from historical volatility.

This means the blind spots are correlated. If historical data understates tail correlation between equity and credit during liquidity crises (because such crises are rare and brief in historical samples), then every model calibrated on that data understates the same tail. The blind spot is systemic, not idiosyncratic.

The mechanism that would detect this — cross-institutional model performance comparison during live market stress — is exactly the mechanism that is legally and competitively impossible. A bank cannot share its real-time P&L with a competitor. A hedge fund cannot publish its current book to a central risk registry. A clearing house cannot distribute member-level position data.

The constraint seems to preclude the solution.

QIS Outcome Packets: The Insight Propagates, The Exposure Doesn't

The QIS architecture resolves this constraint by transmitting outcome deltas rather than underlying data.

A QIS outcome packet in a financial context carries:

The predicted value (e.g., predicted VaR at a given confidence and horizon)
The actual observed outcome (e.g., actual drawdown over the same horizon)
A validation score derived from the delta
A semantic fingerprint of the exposure profile (constructed to be similarity-searchable without revealing the underlying positions)
A timestamp and model identifier

The packet does not carry position data. It does not carry trade history. It does not carry model weights or proprietary calibration parameters. It carries only the answer to one question: how accurate was this prediction, in this type of environment, over this horizon?

The semantic fingerprint is constructed from exposure-space features — duration, convexity, sector concentration, currency mix, liquidity tier — that can be normalized and embedded into a similarity vector without disclosing the actual book. Two desks with similar exposure profiles will have similar fingerprints. They do not need to know each other exists. The routing layer connects them by similarity — this can be implemented over a DHT, a vector database, a graph database, or any efficient similarity-lookup mechanism. The architecture is protocol-agnostic; what matters is that routing is driven by confirmed outcome performance, not by the transport.

When a desk routes a prediction query into the network, it is routed toward agents whose outcome packets have the highest semantic similarity to the querying desk's fingerprint and the highest recent validation scores. The desk receives the validated accuracy delta of similar-exposure models — not their positions, not their weights, not their trades.

The insight propagates. The exposure does not.

Python Implementation: RiskOutcomeRouter

The following implementation demonstrates the core routing logic. This is a simplified single-process simulation of what would run distributed across nodes in a full QIS deployment.

import hashlib
import time
import math
from dataclasses import dataclass, field
from typing import List, Dict, Optional
from collections import defaultdict


@dataclass
class RiskOutcomePacket:
    """
    Carries only the delta between prediction and reality.
    No positions. No model weights. No proprietary data.
    """
    model_id: str
    predicted_var: float          # Predicted Value-at-Risk (e.g., 0.023 = 2.3%)
    actual_drawdown: float        # Actual observed drawdown over same horizon
    validation_score: float       # 1.0 = perfect prediction, 0.0 = total miss
    timestamp: float
    exposure_fingerprint: List[float]  # Normalized exposure-space embedding


def compute_validation_score(predicted_var: float, actual_drawdown: float) -> float:
    """
    Score how well the model predicted the actual outcome.
    Score of 1.0: prediction exactly matched reality.
    Score approaching 0.0: large delta between predicted and actual.
    """
    if predicted_var == 0:
        return 0.0
    delta = abs(predicted_var - actual_drawdown)
    relative_error = delta / abs(predicted_var)
    return max(0.0, 1.0 - relative_error)


def fingerprint_similarity(fp_a: List[float], fp_b: List[float]) -> float:
    """
    Cosine similarity between two exposure fingerprints.
    High similarity = similar exposure profile, without sharing the book.
    """
    if len(fp_a) != len(fp_b):
        return 0.0
    dot = sum(a * b for a, b in zip(fp_a, fp_b))
    mag_a = math.sqrt(sum(a ** 2 for a in fp_a))
    mag_b = math.sqrt(sum(b ** 2 for b in fp_b))
    if mag_a == 0 or mag_b == 0:
        return 0.0
    return dot / (mag_a * mag_b)


class RiskOutcomeRouter:
    """
    Routes risk model queries toward agents with the highest validation scores
    on similar exposure profiles. Operates entirely on outcome deltas.
    """

    def __init__(self, decay_window_seconds: float = 86400.0):
        self.packets: List[RiskOutcomePacket] = []
        self.accuracy_vectors: Dict[str, List[float]] = defaultdict(list)
        self.decay_window = decay_window_seconds

    def ingest(self, packet: RiskOutcomePacket) -> None:
        """Accept an outcome packet from any agent in the network."""
        self.packets.append(packet)
        self.accuracy_vectors[packet.model_id].append(packet.validation_score)
        print(
            f"[INGEST] model={packet.model_id} "
            f"predicted_var={packet.predicted_var:.4f} "
            f"actual_drawdown={packet.actual_drawdown:.4f} "
            f"delta={abs(packet.predicted_var - packet.actual_drawdown):.4f} "
            f"validation_score={packet.validation_score:.3f}"
        )

    def _recency_weight(self, packet_timestamp: float, now: float) -> float:
        """More recent validations carry more weight."""
        age = now - packet_timestamp
        return max(0.0, 1.0 - (age / self.decay_window))

    def route(
        self,
        query_fingerprint: List[float],
        top_k: int = 3,
        min_validation_score: float = 0.6
    ) -> List[Dict]:
        """
        Given an exposure fingerprint, route to the top-K models
        with highest weighted validation scores on similar profiles.

        Returns ranked candidates — not positions, not weights.
        """
        now = time.time()
        candidate_scores: Dict[str, float] = defaultdict(float)
        candidate_counts: Dict[str, int] = defaultdict(int)

        for packet in self.packets:
            sim = fingerprint_similarity(query_fingerprint, packet.exposure_fingerprint)
            if sim < 0.3:
                continue  # Dissimilar exposure profile, skip
            recency = self._recency_weight(packet.timestamp, now)
            weighted = sim * packet.validation_score * recency
            candidate_scores[packet.model_id] += weighted
            candidate_counts[packet.model_id] += 1

        # Normalize by packet count to avoid volume bias
        normalized = {
            model_id: score / candidate_counts[model_id]
            for model_id, score in candidate_scores.items()
        }

        # Filter by minimum validation threshold
        filtered = {
            model_id: score
            for model_id, score in normalized.items()
            if (
                sum(self.accuracy_vectors[model_id]) /
                len(self.accuracy_vectors[model_id])
            ) >= min_validation_score
        }

        ranked = sorted(filtered.items(), key=lambda x: x[1], reverse=True)[:top_k]

        results = []
        for model_id, route_score in ranked:
            avg_accuracy = (
                sum(self.accuracy_vectors[model_id]) /
                len(self.accuracy_vectors[model_id])
            )
            results.append({
                "model_id": model_id,
                "route_score": round(route_score, 4),
                "avg_validation": round(avg_accuracy, 4),
                "packet_count": candidate_counts[model_id],
            })

        return results

    def synthesis_paths(self) -> int:
        """N(N-1)/2: unique synthesis opportunities across all known models."""
        n = len(self.accuracy_vectors)
        return n * (n - 1) // 2

    def network_summary(self) -> None:
        n = len(self.accuracy_vectors)
        paths = self.synthesis_paths()
        print(f"\n[NETWORK] {n} models registered | {paths} synthesis paths available")
        for model_id, scores in self.accuracy_vectors.items():
            avg = sum(scores) / len(scores)
            print(f"  {model_id}: avg_validation={avg:.3f} over {len(scores)} packets")


# --- Simulation ---

if __name__ == "__main__":
    router = RiskOutcomeRouter()
    now = time.time()

    # Simulate outcome packets from three desks with different exposure profiles
    # Note: fingerprints encode exposure-space features, not positions

    # Desk A: long-duration fixed income, high rates sensitivity
    fp_rates_heavy = [0.9, 0.1, 0.2, 0.05, 0.8]

    # Desk B: equity vol, sector-concentrated, similar duration to A
    fp_equity_vol = [0.3, 0.8, 0.7, 0.1, 0.2]

    # Desk C: similar to Desk A — rates-heavy, cross-currency
    fp_rates_fx = [0.85, 0.15, 0.25, 0.4, 0.75]

    packets = [
        RiskOutcomePacket(
            model_id="desk_A_ratesmodel_v2",
            predicted_var=0.022, actual_drawdown=0.024,
            validation_score=compute_validation_score(0.022, 0.024),
            timestamp=now - 3600, exposure_fingerprint=fp_rates_heavy
        ),
        RiskOutcomePacket(
            model_id="desk_A_ratesmodel_v2",
            predicted_var=0.019, actual_drawdown=0.018,
            validation_score=compute_validation_score(0.019, 0.018),
            timestamp=now - 1800, exposure_fingerprint=fp_rates_heavy
        ),
        RiskOutcomePacket(
            model_id="desk_B_eqvol_model",
            predicted_var=0.031, actual_drawdown=0.058,
            validation_score=compute_validation_score(0.031, 0.058),
            timestamp=now - 900, exposure_fingerprint=fp_equity_vol
        ),
        RiskOutcomePacket(
            model_id="desk_C_rates_fx",
            predicted_var=0.025, actual_drawdown=0.026,
            validation_score=compute_validation_score(0.025, 0.026),
            timestamp=now - 600, exposure_fingerprint=fp_rates_fx
        ),
        RiskOutcomePacket(
            model_id="desk_C_rates_fx",
            predicted_var=0.021, actual_drawdown=0.022,
            validation_score=compute_validation_score(0.021, 0.022),
            timestamp=now - 300, exposure_fingerprint=fp_rates_fx
        ),
    ]

    for p in packets:
        router.ingest(p)

    router.network_summary()

    # A new desk with rates-heavy exposure queries the network
    # It does not know Desk A or Desk C exist
    # It gets routed to the highest-validated similar-exposure models
    query_fp = [0.88, 0.12, 0.22, 0.1, 0.78]
    print("\n[QUERY] New rates-heavy desk querying for highest-validated similar models")
    results = router.route(query_fingerprint=query_fp, top_k=3)
    for r in results:
        print(f"  -> {r}")

When you run this, the output surfaces desk_C_rates_fx and desk_A_ratesmodel_v2 as the top-validated routes for a rates-heavy query exposure, while desk_B_eqvol_model — which significantly underestimated its actual drawdown — scores below threshold and is excluded. The equity vol desk's model degradation is surfaced to the network without the desk disclosing why it underperformed.

The N(N-1)/2 Argument Applied to Global Risk Infrastructure

Consider the scale of the problem. The Bank for International Settlements estimates there are approximately 1,000 significant trading desks globally across systemically important financial institutions, major hedge funds, and central bank reserve management operations. Each runs at least one primary risk model.

Under current infrastructure, the number of validated-outcome synthesis paths between those desks is zero. Every desk is siloed. The information about which model performed well in what environment, under what exposure profile, during what market regime — all of that is trapped inside each institution's internal systems, visible only in retrospect during post-mortem analysis.

Under QIS architecture:

1,000 desks = 1,000 × 999 / 2 = 499,500 unique synthesis paths
Each desk pays O(log N) routing cost per query — whether the routing layer is a DHT, a vector index, or a graph database, efficient lookup is achievable at this scale
A new desk entering the network immediately routes to the highest-validated models with similar exposure fingerprints — cold start is solved structurally

The complete loop produces something that resembles natural selection on model quality — not by design, but as an emergent property of the aggregate math. Models that predict accurately accumulate positive outcome packets; those packets route future queries toward them. Reality adjudicates through the validation score — no committee, no regulator, no central model. Networks built on accurate models attract more queries and grow; networks built on stale models see their routing weight decay as better alternatives exist. This is not a feature anyone programmed in. It is what happens when honest outcomes across N(N-1)/2 synthesis paths are the only currency.

Comparison: Siloed Risk Models vs. QIS-Augmented Risk Intelligence

Dimension	Current Siloed Risk Models	QIS-Augmented Risk Intelligence
Feedback loop	Periodic internal backtesting only; no real-time outcome signal	Continuous: every prediction generates an outcome packet when actuals arrive
Cross-institution data sharing	Impossible — positions, weights, and trade data are proprietary	Enabled — only validated outcome deltas route across the network; exposures stay private
Systemic correlation	Correlated blind spots are invisible until regime change causes simultaneous failure	Correlated underperformance surfaces as routing weight decay across similar-fingerprint models
Cold start (new desk)	New model must build validation history from scratch; no benefit from network	New desk immediately routes queries to highest-validated similar-exposure models
Graceful degradation	Model drift is detected internally (if at all); no network-level signal	Declining validation scores reduce routing weight; network self-organizes away from stale models

Settlement and Clearing: The Downstream Infrastructure Problem

The staleness problem does not end at the risk model. Settlement and clearing infrastructure faces a version of the same feedback deficit.

When a counterparty fails to settle, the failure cascades through clearing chains. The 2010 flash crash and subsequent settlement disruptions documented by the SEC showed that settlement path reliability varies significantly by counterparty, instrument, and market condition — but the infrastructure has no real-time mechanism to route around degraded paths.

QIS outcome-weighted routing applies directly here. Settlement agents that have successfully completed settlement chains at high rates generate high-validation outcome packets. New settlement instructions route preferentially through the most recently validated paths. Settlement path degradation surfaces through declining validation scores before it causes downstream cascade failures.

This is not algorithmic trading optimization. It is feedback infrastructure for the settlement layer — closing the loop between executed settlement and predicted settlement reliability in real time.

The Architecture That Removes the Constraint

The reason this problem persisted through 2008, through the Dodd-Frank era, through a decade of regulatory stress testing, is that the obvious solution — share model performance data across institutions — appeared to require sharing the data that cannot be shared.

QIS outcome packets dissolve this apparent constraint by separating what the network needs (validated accuracy deltas) from what the network cannot receive (positions, weights, proprietary exposure).

A central bank running reserve management models and a systematic macro hedge fund could coexist in the same QIS network. Their outcome packets would route by exposure fingerprint similarity. Their validation scores would inform each other's routing. Neither would know the other's book. Neither would need to.

The 2008 crisis demonstrated the cost of siloed risk intelligence with correlated blind spots. The architecture problem was real. The constraint — can't share proprietary data — was real. The solution is an architecture that never requires sharing it.

The insight propagates. The exposure stays private.

QIS is an original architecture discovered by Christopher Thomas Trevethan, protected under 39 provisional patents. For licensing, research collaboration, or institutional deployment inquiries, contact through the QIS publication series.