Rory | QIS PROTOCOL

QIS was discovered by Christopher Thomas Trevethan on June 16, 2025. 39 provisional patents filed.

The Ensemble Equality Problem

Run a 100-member ensemble on a seasonal climate forecast. Every member gets the same computational budget. Every member's output gets the same weight in the final probabilistic forecast. When the verification comes back three months later — when you can compare against what actually happened — the result sits in a database, maybe gets written up in a post-mortem, and the next ensemble cycle starts fresh.

The member that predicted El Niño onset with 93% accuracy across three consecutive seasons gets exactly the same weight in the next run as the member that has never validated against observation in any meaningful way. No one has time to manually curate weights across 100 configurations. So you don't.

This is not a resource problem. It is not a data problem. It is an architecture problem.

The feedback loop between simulation outcomes and future ensemble construction is broken. Validation data exists. Accuracy can be measured. The delta between a simulation's prediction and the observed reality is a number you can compute. But in current HPC ensemble practice, that number rarely routes back to influence how the next ensemble is constructed. The loop is open.
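For concreteness, that delta can be as simple as an RMSE between forecast and observation. A minimal sketch with invented values — the field contents and the choice of metric are illustrative assumptions, not part of any specific pipeline:

```python
import numpy as np

def validation_delta(predicted: np.ndarray, observed: np.ndarray) -> float:
    """Scalar prediction error: RMSE between forecast and observation."""
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Illustrative gridded anomaly forecast vs. what was later observed
predicted = np.array([0.8, 1.1, 0.9, 1.3])
observed = np.array([1.0, 1.0, 1.0, 1.0])
delta = validation_delta(predicted, observed)  # ~0.194
```

That scalar is the quantity that, in current practice, lands in a verification database and stops there.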

QIS — Quadratic Intelligence Synthesis — was not designed for scientific computing. It was discovered by Christopher Thomas Trevethan as a general architecture for distributed intelligence systems. But the structure of that architecture maps naturally onto what simulation pipelines already produce, which makes it worth examining closely.


Why This Is an Architecture Problem, Not a Tooling Problem

The instinct in HPC is to solve this with better tooling: smarter post-processing, ensemble Kalman filters, Bayesian model averaging applied after the fact. These are not wrong. They are also not sufficient, because they treat the feedback loop as a post-processing step rather than a structural property of how the ensemble is built.

Consider what a simulation run actually produces: a prediction, a configuration, a set of parameterization choices, and eventually a validation delta — the difference between what was predicted and what was observed. That is a complete outcome packet. It contains everything needed to update a routing weight.

The problem is that nothing in the current pipeline architecture is designed to ingest that outcome packet and feed it back into how the next ensemble is constructed. The tools exist in isolation. The loop stays open.

This is precisely the class of problem that the QIS architecture addresses. The breakthrough identified by Christopher Thomas Trevethan is not any single component — not the routing, not the outcome packets, not the accuracy vectors in isolation. The breakthrough is the complete loop: the way these components compose into a self-adjusting system where every confirmed outcome is also a routing signal for the next task.

For HPC ensemble workflows, the loop looks like this:

Simulation run → outcome packet (prediction + observation delta) → accuracy vector update per sub-model configuration → future ensemble construction routes toward highest-accuracy configurations

Every iteration, the system gets a better picture of which configurations are earning their compute budget. No central retraining. No manual curation. No waiting for a post-mortem.


N(N-1)/2 and the Unexplored Path Space

A climate model with 100 parameterization sub-models has 4,950 potential synthesis paths between them — combinations of configurations, handoffs, weighted composites. In current practice, maybe 5 to 10 of those paths get explored, because human researchers can only evaluate so many configurations per cycle, and institutional knowledge about which combinations work tends to live in informal documentation or in the memory of senior staff.

The N(N-1)/2 framing names the gap precisely. The issue is not that HPC teams lack insight — it is that the combinatorial space of possible sub-model interactions is enormous, and no manual process can explore it systematically at scale. A QIS-augmented pipeline approaches this differently: as accuracy vectors update per configuration, routing weights naturally begin to surface higher-performing combinations, exploring synthesis paths that might never have been examined manually.

This is not magic. It is compounding signal. A configuration that consistently validates well accumulates routing weight. Combinations that include it get more trial cycles. The path space gets explored in proportion to demonstrated accuracy rather than in proportion to researcher attention.
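The arithmetic behind that framing is easy to check. A small sketch, with the per-cycle exploration rate as an assumed figure:

```python
from itertools import combinations

n_submodels = 100
pairwise_paths = list(combinations(range(n_submodels), 2))
# N(N-1)/2 pairwise synthesis paths
assert len(pairwise_paths) == n_submodels * (n_submodels - 1) // 2  # 4950

explored_per_cycle = 10  # optimistic manual exploration (assumed)
coverage = explored_per_cycle / len(pairwise_paths)
print(f"{len(pairwise_paths)} pairwise paths, {coverage:.1%} explored per cycle")
```

And 4,950 counts only pairwise combinations; composites of three or more sub-models grow the space far faster.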


Graceful Degradation Without Manual Curation

The inverse is equally important. A sub-model configuration that consistently fails to predict ENSO phase transitions does not need a researcher to identify it and flag it for deprioritization. Accuracy vector decay handles this structurally.

When a configuration's prediction delta consistently runs large — when the gap between what it forecast and what was observed is wide, across multiple validation cycles — its routing weight decays. Future ensemble construction assigns it fewer compute resources. It does not disappear from the ensemble; it gets proportionally less weight until it either recovers (if parameterization is adjusted and performance improves) or continues to decay toward the minimum floor.

This is graceful degradation. The system does not fail because one configuration underperforms. It routes around underperformance continuously, without requiring human intervention to identify which configurations are dragging ensemble accuracy down.
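To see the decay dynamics in isolation, here is a sketch using the same exponential-decay weighted average as the reference implementation in the next section, fed a configuration whose validation scores collapse. The score history is invented for illustration:

```python
import numpy as np

def routing_weight(scores, decay_rate=0.05, min_weight=0.05):
    """Exponential-decay weighted average; recent validations dominate."""
    w = np.exp(-decay_rate * np.arange(len(scores) - 1, -1, -1))
    return max(min_weight, float(np.dot(w, scores) / w.sum()))

# A config that validated well, then repeatedly misses ENSO transitions
history = [0.85, 0.82, 0.40, 0.35, 0.31, 0.28]
for cycle in range(1, len(history) + 1):
    print(f"cycle {cycle}: weight = {routing_weight(history[:cycle]):.3f}")
# Weight declines each cycle once failures begin, converging toward the
# recent-score average; min_weight floors it so the config is never zeroed.
```

Note that a gentle decay_rate like 0.05 deprioritizes slowly; a team tuning this would trade responsiveness against sensitivity to one-off bad validations.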


A Concrete Implementation: SimulationOutcomeRouter

The following is a reference implementation of the core routing mechanism. This demonstrates how outcome packets map to routing weight updates and how future ensemble construction can query weighted configurations.

import numpy as np
from collections import defaultdict
from typing import Dict, List, Tuple
from dataclasses import dataclass, field
import time


@dataclass
class OutcomePacket:
    """
    Represents a completed simulation run with its validation delta.
    prediction_delta: scalar magnitude of prediction error
    validation_score: 0.0 (no skill) to 1.0 (perfect prediction)
    timestamp: unix epoch of validation (not run completion)
    """
    config_id: str
    run_id: str
    prediction_delta: float
    validation_score: float
    timestamp: float = field(default_factory=time.time)
    metadata: Dict = field(default_factory=dict)


class SimulationOutcomeRouter:
    """
    QIS-inspired outcome-weighted router for HPC ensemble pipelines.

    Ingests simulation outcome packets, maintains accuracy vectors per
    sub-model configuration, and routes future ensemble construction
    toward highest-accuracy configurations.

    Architecture: Christopher Thomas Trevethan, June 16, 2025.
    """

    def __init__(
        self,
        decay_rate: float = 0.05,
        min_weight: float = 0.05,
        cold_start_weight: float = 0.5,
    ):
        self.decay_rate = decay_rate
        self.min_weight = min_weight
        self.cold_start_weight = cold_start_weight

        # accuracy_vectors: config_id -> list of validation scores
        self.accuracy_vectors: Dict[str, List[float]] = defaultdict(list)

        # routing_weights: config_id -> current weight
        self.routing_weights: Dict[str, float] = {}

        self.outcome_log: List[OutcomePacket] = []

    def ingest_outcome(self, packet: OutcomePacket) -> float:
        """
        Ingest a completed simulation outcome packet.
        Updates accuracy vector and recalculates routing weight.
        Returns updated routing weight for this configuration.
        """
        self.outcome_log.append(packet)
        self.accuracy_vectors[packet.config_id].append(packet.validation_score)
        updated_weight = self._recalculate_weight(packet.config_id)
        self.routing_weights[packet.config_id] = updated_weight
        return updated_weight

    def _recalculate_weight(self, config_id: str) -> float:
        """
        Recalculate routing weight from accuracy vector.
        Recent scores weighted more heavily via exponential decay on history.
        Minimum weight floor prevents complete deprioritization.
        """
        scores = self.accuracy_vectors[config_id]
        if not scores:
            return self.cold_start_weight
        n = len(scores)
        weights = np.exp(-self.decay_rate * np.arange(n - 1, -1, -1))
        weighted_avg = float(np.dot(weights, scores) / weights.sum())
        return max(self.min_weight, weighted_avg)

    def route_ensemble(
        self,
        available_configs: List[str],
        ensemble_size: int,
        exploration_factor: float = 0.15,
    ) -> List[Tuple[str, float]]:
        """
        Route future ensemble construction to highest-accuracy configurations.
        Reserves exploration_factor fraction of slots for cold-start configs.
        Returns list of (config_id, allocated_weight) tuples.
        """
        weights = {
            cid: self.routing_weights.get(cid, self.cold_start_weight)
            for cid in available_configs
        }
        established = {k: v for k, v in weights.items() if k in self.routing_weights}
        cold_start = {k: v for k, v in weights.items() if k not in self.routing_weights}

        exploration_slots = max(1, int(ensemble_size * exploration_factor))
        main_slots = ensemble_size - exploration_slots

        allocation: List[Tuple[str, float]] = []

        if established:
            total = sum(established.values())
            for config_id, w in sorted(
                established.items(), key=lambda x: x[1], reverse=True
            )[:main_slots]:
                allocation.append((config_id, w / total))

        for config_id in list(cold_start.keys())[:exploration_slots]:
            allocation.append((config_id, self.cold_start_weight))

        return sorted(allocation, key=lambda x: x[1], reverse=True)

    def get_synthesis_paths(self, top_n: int = 10) -> List[Tuple[str, str, float]]:
        """
        Surface highest-potential synthesis paths between top-N configurations.
        Returns (config_a, config_b, combined_score) tuples.
        Explores N(N-1)/2 pairwise combinations from top performers.
        """
        top_configs = sorted(
            self.routing_weights.items(), key=lambda x: x[1], reverse=True
        )[:top_n]
        configs = [c for c, _ in top_configs]
        paths = []
        for i in range(len(configs)):
            for j in range(i + 1, len(configs)):
                a, b = configs[i], configs[j]
                combined = (self.routing_weights[a] + self.routing_weights[b]) / 2.0
                paths.append((a, b, combined))
        return sorted(paths, key=lambda x: x[2], reverse=True)


# --- Simulation ---

router = SimulationOutcomeRouter()

# Ingest historical run outcomes across 4 configurations
runs = [
    # config_A: strong ENSO predictor
    OutcomePacket("config_A_deep_convection", "run_001", 0.12, 0.93),
    OutcomePacket("config_A_deep_convection", "run_007", 0.09, 0.95),
    OutcomePacket("config_A_deep_convection", "run_014", 0.11, 0.94),
    # config_B: moderate skill
    OutcomePacket("config_B_shallow_conv",    "run_002", 0.31, 0.71),
    OutcomePacket("config_B_shallow_conv",    "run_009", 0.28, 0.74),
    # config_C: consistently poor
    OutcomePacket("config_C_legacy_param",    "run_003", 0.67, 0.34),
    OutcomePacket("config_C_legacy_param",    "run_010", 0.71, 0.30),
    # config_D: new, untested (cold start)
    # (no prior outcomes — will get cold_start_weight = 0.5)
]

print("=== Ingesting simulation outcome packets ===")
for packet in runs:
    weight = router.ingest_outcome(packet)
    print(f"  {packet.config_id} | score={packet.validation_score:.2f} | weight={weight:.3f}")

print("\n=== Route next 10-member ensemble ===")
available = ["config_A_deep_convection", "config_B_shallow_conv",
             "config_C_legacy_param", "config_D_new_param"]
allocation = router.route_ensemble(available, ensemble_size=10)
for config_id, w in allocation:
    print(f"  {config_id}: {w:.3f}")

print("\n=== Top synthesis paths (N(N-1)/2 = 3 pairs from top 3) ===")
for a, b, score in router.get_synthesis_paths(top_n=3):
    print(f"  {a} + {b} => {score:.3f}")

Sample output:

=== Ingesting simulation outcome packets ===
  config_A_deep_convection | score=0.93 | weight=0.930
  config_A_deep_convection | score=0.95 | weight=0.940
  config_A_deep_convection | score=0.94 | weight=0.940
  config_B_shallow_conv | score=0.71 | weight=0.710
  config_B_shallow_conv | score=0.74 | weight=0.725
  config_C_legacy_param | score=0.34 | weight=0.340
  config_C_legacy_param | score=0.30 | weight=0.320

=== Route next 10-member ensemble ===
  config_D_new_param: 0.500
  config_A_deep_convection: 0.474
  config_B_shallow_conv: 0.365
  config_C_legacy_param: 0.161

=== Top synthesis paths (N(N-1)/2 = 3 pairs from top 3) ===
  config_A_deep_convection + config_B_shallow_conv => 0.833
  config_A_deep_convection + config_C_legacy_param => 0.630
  config_B_shallow_conv + config_C_legacy_param => 0.522

The routing layer does not understand atmospheric physics. It routes by confirmed accuracy within the ensemble domain. config_C_legacy_param drops to the smallest compute share automatically after two poor validation cycles, with no one flagging it. config_D_new_param gets an exploration slot — a trial cycle at the un-normalized cold-start weight — because it has never been validated and might be valuable; it does not yet appear in the synthesis paths for the same reason. The next ingested outcome from that run will either promote it or begin its decay.


Comparison: Current Practice vs. QIS-Augmented Pipeline

| Dimension | Current HPC Ensemble Practice | QIS-Augmented Pipeline |
| --- | --- | --- |
| Outcome weighting | Equal per member unless manually adjusted post-hoc | Accuracy vector updates route compute proportionally to validated performance |
| Feedback loop | Open — validation data stored but rarely feeds back into ensemble construction | Closed — every confirmed run updates routing weights before the next cycle |
| Cold start | New configurations require researcher evaluation before inclusion | Cold-start weight floor gives new configurations trial cycles automatically |
| Distributed synthesis | Sub-model combinations explored manually; 5–10 paths out of 4,950 possible | Accuracy-weighted routing explores N(N-1)/2 paths in proportion to performance signal |
| Graceful degradation | Underperforming configs require manual identification and removal | Accuracy vector decay deprioritizes underperformers without coordinator intervention |

What This Architecture Is and Is Not

QIS was not designed for climate modeling or HPC ensemble management. Christopher Thomas Trevethan discovered it as a general architecture for distributed intelligence systems — one where task routing, node execution, outcome packets, and accuracy vector updates compose into a complete self-adjusting loop.

The reason it maps cleanly onto simulation pipelines is structural: simulation pipelines already produce the raw material the architecture needs. A simulation run produces a prediction. Observation systems produce the ground truth. The delta between them is a validation score. The configuration that produced the prediction is a node identifier. This is an outcome packet. It already exists; it just does not currently route back into ensemble construction in any systematic way.

The QIS contribution is not a new data format or a new validation method. It is the complete loop — the structural commitment to treating every confirmed outcome as a routing signal. That commitment changes what the ensemble learns about itself between cycles, without requiring a central model to be retrained or a researcher to manually curate weights.

The routing method above (exponential-decay weighted average) is one option. Teams with existing ensemble infrastructure might implement accuracy vector updates as a post-processing step that writes to a routing weight database, queried at the start of each new ensemble configuration cycle. The architecture does not mandate a specific implementation. What it mandates is closing the loop.
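As one hedged illustration of that post-processing variant: the recalculated weights could land in a small SQLite store that the ensemble configuration tool reads at cycle start. Every name here — the table, both functions — is hypothetical, not part of QIS or any existing pipeline:

```python
import sqlite3

def update_weight_store(db_path: str, config_id: str, weight: float) -> None:
    """Post-processing step: persist the recalculated routing weight."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS routing_weights "
            "(config_id TEXT PRIMARY KEY, weight REAL)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO routing_weights VALUES (?, ?)",
            (config_id, weight),
        )

def load_routing_weights(db_path: str) -> dict:
    """Queried once at the start of each ensemble configuration cycle."""
    with sqlite3.connect(db_path) as conn:
        cur = conn.execute("SELECT config_id, weight FROM routing_weights")
        return dict(cur.fetchall())
```

The point is not the storage engine. It is that the write happens during validation post-processing and the read happens before the next ensemble is assembled, so the loop closes regardless of which scheduler or workflow manager sits in between.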


For HPC engineers watching ensemble sizes grow and validation pipelines stagnate, the architecture worth examining is the one where every confirmed simulation run is also a routing signal for the next one.




Understanding QIS — Series Links
