Part 16 of the Understanding QIS series
Every RAG pipeline in production right now is built on the same assumption: embed your documents, embed your query, find the nearest neighbors, retrieve and generate. It works. At small scale, it works quite well.
But in 2026, the teams hitting real scaling walls are discovering something uncomfortable: cosine similarity in high-dimensional embedding space degrades as N grows. Not gradually — structurally. And the standard fix, layering a knowledge graph on top of a vector store, introduces a different problem: the graph goes stale the moment you stop paying curators to maintain it.
QIS (Quadratic Intelligence Synthesis) takes a different path. The DHT routing layer in a QIS network is a knowledge graph — one that builds and updates itself from empirically confirmed outcomes rather than human-defined ontologies or static embedding geometry.
This article explains why that distinction matters architecturally, when it matters in practice, and how you can see the pattern in code.
## The Curse of Dimensionality Is Not a Myth
The phrase "curse of dimensionality" gets thrown around loosely. In the context of vector similarity search, the specific problem is this: as embedding dimensionality increases, the ratio of the maximum distance to the minimum distance between any two points in a random sample approaches 1. Everything starts to look equally far apart.
More concretely: in a 1,536-dimensional embedding space (OpenAI's text-embedding-3-small), with a corpus of 10 million documents, the top-k cosine similarity results for a given query are often separated by differences in the fourth or fifth decimal place. FAISS and Chroma are engineering marvels that find those neighbors efficiently — but they cannot fix the underlying geometry. You are making high-stakes retrieval decisions on vanishingly small discriminative signals.
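The concentration effect is easy to reproduce with random data. The sketch below (synthetic Gaussian points, not any particular embedding model) measures the relative contrast, the gap between the farthest and nearest point from a random query, as dimensionality grows:

```python
# Distance concentration: as dimensionality grows, the gap between the
# nearest and farthest neighbor of a random query shrinks toward zero.
import numpy as np

rng = np.random.default_rng(0)

def contrast(dim: int, n_points: int = 1000) -> float:
    """Relative contrast (d_max - d_min) / d_min for one random query."""
    points = rng.standard_normal((n_points, dim))
    query = rng.standard_normal(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return float((dists.max() - dists.min()) / dists.min())

for dim in (2, 16, 128, 1536):
    print(f"dim={dim:5d}  relative contrast={contrast(dim):.3f}")
```

As the dimension climbs toward typical embedding sizes, the contrast collapses, which is the geometric degradation described above.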
The standard workarounds each introduce their own costs:
- Dimensionality reduction (PCA, UMAP) trades retrieval precision for geometric stability. You lose the expressiveness of the original embedding.
- Re-ranking with a cross-encoder adds a second inference pass whose cost grows linearly with the number of candidates re-ranked; thorough coverage means a large candidate set and a correspondingly large inference bill.
- Hybrid search (BM25 + embeddings) improves recall but does not solve the fundamental representational problem.
- Knowledge graph + embeddings (Neo4j, LlamaIndex KG) adds relational structure but requires upfront ontology construction and ongoing maintenance to stay accurate.
None of these are wrong approaches. They are the right tools for what they are. The question is whether semantic routing can be solved at a lower level, before retrieval, by routing queries to the nodes most likely to produce confirmed-good outcomes rather than the nodes most geometrically similar in embedding space.
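Of the workarounds above, hybrid search is the simplest to see in code. This toy sketch (illustrative names and scores, not a specific library's API) min-max normalizes a lexical score set and a cosine score set, then fuses them with a weight:

```python
# Hybrid search sketched as simple score fusion. All names and scores are
# illustrative; real systems use BM25 and a vector index with more careful
# normalization.
import numpy as np

def fuse_scores(lexical: dict, semantic: dict, alpha: float = 0.5) -> dict:
    """Min-max normalize each score set, then blend: alpha * lexical + (1 - alpha) * semantic."""
    def norm(scores):
        vals = np.array(list(scores.values()), dtype=float)
        lo, hi = vals.min(), vals.max()
        span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
        return {k: (v - lo) / span for k, v in scores.items()}
    lex, sem = norm(lexical), norm(semantic)
    docs = set(lex) | set(sem)
    return {d: alpha * lex.get(d, 0.0) + (1 - alpha) * sem.get(d, 0.0) for d in docs}

lexical = {"doc1": 12.3, "doc2": 4.1, "doc3": 9.8}     # BM25-style scores
semantic = {"doc1": 0.71, "doc2": 0.83, "doc3": 0.69}  # cosine similarities
fused = fuse_scores(lexical, semantic, alpha=0.5)
print(sorted(fused.items(), key=lambda x: x[1], reverse=True))
```

Fusion lifts documents that score well on either signal, which improves recall, but the semantic half still rides on the same embedding geometry.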
## How QIS DHT Routing Handles Semantic Queries Differently
In a QIS network, each node maintains an accuracy vector — a multi-dimensional performance profile across task domains, built from confirmed outcome feedback over time. When a task or query arrives at the network, it is routed not by embedding similarity but by accuracy-weighted DHT lookup: the routing layer finds the subset of nodes whose confirmed performance on this class of task is highest.
The critical word is confirmed. These are not self-reported capability claims. They are not inferred from model weights or training data provenance. They are empirically measured: a node handled tasks of type X and produced outcomes that downstream validators confirmed as correct, useful, or high quality. That signal accumulates in the DHT routing table and decays over time if the node's performance changes.
This is semantically richer than embedding lookup in two ways:
1. Domain performance is not the same as semantic similarity. Two documents can be very similar in embedding space (both discuss "transformer attention mechanisms") but route to very different nodes depending on whether the query requires theoretical explanation, implementation debugging, or performance optimization. The accuracy vector encodes this distinction. Cosine similarity does not.
2. The routing weights reflect current reality, not historical training. An embedding model trained in 2024 does not know about your production system's performance profile in 2026. A QIS DHT updated from live outcome feedback does.
## The Knowledge Graph That Writes Itself
Here is the architectural insight that connects QIS to knowledge graphs: the routing weights distributed across the DHT are a dynamic knowledge graph.
In a conventional knowledge graph (Neo4j, a LlamaIndex KG index, a Wikidata-style ontology), nodes represent entities, edges represent relationships, and weights or properties on those edges encode domain knowledge. A human — or a pipeline with human oversight — constructs and maintains that graph. It is accurate at the time of construction and drifts from reality thereafter at a rate proportional to how fast the underlying domain changes and how slowly the curation pipeline runs.
In a QIS network, the equivalent structure emerges from operation:
- Nodes are the intelligent agents in the DHT.
- Edges are the routing paths between nodes, weighted by accuracy vectors.
- Properties are the per-domain performance profiles, updated from every confirmed outcome packet.
- Ontology is implicit in the routing topology — if queries of type X consistently route through a particular cluster of nodes, that cluster has implicitly become the "knowledge subgraph" for domain X.
No curator defined this. No ontology engineer specified the edges. The graph constructed itself from accumulated evidence of what worked.
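That emergence can be made concrete in a few lines. Assuming a hypothetical log of (domain, node, confirmed quality) records, the domain subgraphs fall out of the history with no declared ontology:

```python
# Implicit ontology sketch: derive domain "knowledge subgraphs" from routing
# history alone. Record format and names are illustrative, not a QIS API.
from collections import defaultdict

# (domain, node, confirmed_quality) tuples accumulated from routed tasks
routing_history = [
    ("code_generation", "node_A", 0.92),
    ("code_generation", "node_C", 0.94),
    ("code_generation", "node_D", 0.51),
    ("literature_review", "node_B", 0.97),
    ("code_generation", "node_A", 0.89),
]

def implicit_subgraphs(history, threshold: float = 0.8):
    """Group nodes by the domains where their confirmed quality clears a threshold."""
    by_domain = defaultdict(set)
    for domain, node, quality in history:
        if quality >= threshold:
            by_domain[domain].add(node)
    return dict(by_domain)

print(implicit_subgraphs(routing_history))
```

No one declared that node_A and node_C form the code-generation cluster; the grouping is read off the accumulated evidence.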
The additional structural property that makes this semantically richer than flat retrieval: N(N-1)/2 synthesis paths.
In a network of N nodes, the number of possible pairwise synthesis combinations is N(N-1)/2. A flat vector retrieval finds the k nearest neighbors and retrieves them independently. QIS routing can activate synthesis across any subset of relevant nodes — combining their outputs into a response that reflects multiple confirmed-accurate perspectives rather than a single best-match retrieval. At N=100 nodes, that is 4,950 potential synthesis paths. At N=1,000, it is 499,500. Flat retrieval does not have this property. A static knowledge graph has it in principle but requires explicit edge construction to enable it in practice.
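The counts quoted above are simple combinatorics, and worth verifying once against explicit pair enumeration:

```python
# N(N-1)/2 pairwise synthesis paths, checked against explicit enumeration.
from itertools import combinations

def synthesis_paths(n: int) -> int:
    """Number of unordered node pairs in a network of n nodes."""
    return n * (n - 1) // 2

for n in (100, 1000):
    assert synthesis_paths(n) == len(list(combinations(range(n), 2)))
    print(f"N={n}: {synthesis_paths(n):,} pairwise synthesis paths")
```

Each new node adds N-1 new pairwise paths, which is why the count grows quadratically.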
## Comparison: Three Approaches at Scale
| Dimension | Vector Similarity Search | Knowledge Graph + Embeddings | QIS Outcome-Weighted Routing |
|---|---|---|---|
| Routing signal | Geometric proximity in embedding space | Ontology edges + embedding similarity | Empirically confirmed domain performance |
| Staleness | Stale when corpus changes | Stale when ontology drifts from reality | Updates from every confirmed outcome |
| Discriminative power at scale | Degrades as N grows (curse of dimensionality) | Depends on graph density; brittle at edges | Scales with outcome volume — more data = better routing |
| Synthesis capability | Retrieve-then-aggregate (post-hoc) | Traverse graph + retrieve | N(N-1)/2 paths, activated per-query |
| Maintenance overhead | Low (reindex on corpus change) | High (ontology curation required) | Self-maintaining via feedback loop |
| Cold start behavior | Works immediately with any embedding model | Requires ontology construction before useful | Requires minimum node count and outcome history |
| Explainability | Cosine score (geometric, not semantic) | Edge traversal path (explicit but may be stale) | Routing trace + outcome history per node |
The cold start row is the honest tradeoff: QIS routing requires accumulated outcome history to outperform embedding lookup. Below a threshold of nodes and confirmed outcomes, a well-tuned FAISS index with a cross-encoder re-ranker will outperform QIS routing on retrieval tasks. The advantage compounds as the outcome history grows.
## Code Example: Dynamic Semantic Routing vs Static Embedding Lookup
The following example shows the structural difference between building a retrieval layer on static embeddings (FAISS style) and building one on QIS-style outcome-weighted routing. This is a simplified simulation — production QIS routing operates at the DHT layer — but the pattern is accurate.
```python
import time
import numpy as np
from dataclasses import dataclass, field
from typing import Dict, List, Tuple
from collections import defaultdict


# ── Static embedding lookup (FAISS-style simplified) ──────────────────────────

class StaticEmbeddingIndex:
    """
    Represents a conventional vector similarity retrieval layer.
    Embeddings are fixed at index time. No feedback loop.
    """

    def __init__(self, embedding_dim: int = 1536):
        self.embedding_dim = embedding_dim
        self.nodes: Dict[str, np.ndarray] = {}

    def index_node(self, node_id: str, embedding: np.ndarray):
        self.nodes[node_id] = embedding / np.linalg.norm(embedding)

    def query(self, query_embedding: np.ndarray, top_k: int = 3) -> List[Tuple[str, float]]:
        q = query_embedding / np.linalg.norm(query_embedding)
        scores = {
            node_id: float(np.dot(q, emb))
            for node_id, emb in self.nodes.items()
        }
        return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:top_k]


# ── QIS-style outcome-weighted routing ────────────────────────────────────────

@dataclass
class OutcomePacket:
    """
    Confirmed outcome from a completed task.
    This is the feedback signal that updates routing weights.
    """
    node_id: str
    task_domain: str
    quality_score: float  # 0.0–1.0, validated by downstream consumers
    latency_ms: float
    timestamp: float


@dataclass
class NodeAccuracyVector:
    """
    Per-node accuracy profile across task domains.
    Built from accumulated confirmed outcomes — never manually defined.
    """
    node_id: str
    domain_scores: Dict[str, List[float]] = field(default_factory=lambda: defaultdict(list))
    decay_factor: float = 0.95  # older outcomes weighted less

    def ingest_outcome(self, packet: OutcomePacket):
        """Update domain performance from a confirmed outcome packet."""
        self.domain_scores[packet.task_domain].append(packet.quality_score)
        # Keep a rolling window — in production this lives in the DHT
        if len(self.domain_scores[packet.task_domain]) > 100:
            self.domain_scores[packet.task_domain].pop(0)

    def get_domain_score(self, domain: str) -> float:
        """
        Return decay-weighted average for this domain.
        Nodes with no confirmed outcomes score 0.0 — honest cold start behavior.
        """
        scores = self.domain_scores.get(domain, [])
        if not scores:
            return 0.0
        weights = np.array([self.decay_factor ** i for i in range(len(scores) - 1, -1, -1)])
        return float(np.average(scores, weights=weights))


class QISSemanticRouter:
    """
    Simplified QIS-style routing layer.
    Routes queries by empirically confirmed domain performance,
    not geometric proximity in embedding space.
    """

    def __init__(self):
        self.nodes: Dict[str, NodeAccuracyVector] = {}

    def register_node(self, node_id: str):
        self.nodes[node_id] = NodeAccuracyVector(node_id=node_id)

    def ingest_outcome(self, packet: OutcomePacket):
        """Feed confirmed outcomes back into the routing layer."""
        if packet.node_id in self.nodes:
            self.nodes[packet.node_id].ingest_outcome(packet)

    def route_query(self, task_domain: str, top_k: int = 3) -> List[Tuple[str, float]]:
        """
        Route a query to the top-k nodes by confirmed domain performance.
        This is the structural difference from cosine similarity retrieval.
        """
        scores = {
            node_id: nav.get_domain_score(task_domain)
            for node_id, nav in self.nodes.items()
        }
        ranked = sorted(scores.items(), key=lambda x: x[1], reverse=True)[:top_k]
        return ranked

    def synthesis_path_count(self) -> int:
        """N(N-1)/2 potential synthesis paths across all nodes."""
        n = len(self.nodes)
        return n * (n - 1) // 2

    def activate_synthesis(self, task_domain: str, min_score: float = 0.6) -> List[str]:
        """
        Identify all nodes above threshold for multi-node synthesis.
        In production, these nodes collaborate via DHT-coordinated synthesis.
        """
        return [
            node_id for node_id, nav in self.nodes.items()
            if nav.get_domain_score(task_domain) >= min_score
        ]


# ── Demonstration ─────────────────────────────────────────────────────────────

if __name__ == "__main__":
    print("=== Static Embedding Index (FAISS-style) ===")
    static_index = StaticEmbeddingIndex(embedding_dim=64)
    # Nodes have fixed embeddings — no feedback loop
    rng = np.random.default_rng(42)
    for node_id in ["node_A", "node_B", "node_C", "node_D"]:
        static_index.index_node(node_id, rng.standard_normal(64))
    query_emb = rng.standard_normal(64)
    results = static_index.query(query_emb, top_k=3)
    print("Top-3 by cosine similarity:", results)
    print("Note: scores are geometric, not semantic. No outcome data used.\n")

    print("=== QIS Outcome-Weighted Router ===")
    router = QISSemanticRouter()
    for node_id in ["node_A", "node_B", "node_C", "node_D"]:
        router.register_node(node_id)

    # Simulate confirmed outcome packets arriving over time:
    # node_A and node_C have strong confirmed performance on "code_generation",
    # node_B has strong confirmed performance on "literature_review"
    outcomes = [
        OutcomePacket("node_A", "code_generation", 0.92, 180, time.time()),
        OutcomePacket("node_A", "code_generation", 0.89, 195, time.time()),
        OutcomePacket("node_C", "code_generation", 0.94, 160, time.time()),
        OutcomePacket("node_B", "literature_review", 0.97, 220, time.time()),
        OutcomePacket("node_D", "code_generation", 0.51, 400, time.time()),
        OutcomePacket("node_A", "code_generation", 0.91, 175, time.time()),
    ]
    for packet in outcomes:
        router.ingest_outcome(packet)

    print("Routing 'code_generation' query:")
    routed = router.route_query("code_generation", top_k=3)
    for node_id, score in routed:
        print(f"  {node_id}: confirmed accuracy score = {score:.4f}")

    synthesis_nodes = router.activate_synthesis("code_generation", min_score=0.85)
    print(f"\nNodes activated for multi-node synthesis (score >= 0.85): {synthesis_nodes}")
    print(f"Total synthesis paths available in network: {router.synthesis_path_count()}")

    print("\nKey difference: node_D ranks last despite potentially high embedding similarity.")
    print("Its confirmed outcome quality is low — the feedback loop surfaces this.")
```
Running this will show node_D ranked last for code_generation despite having been registered alongside the others. A cosine similarity lookup against a random embedding has no mechanism to capture this — it has no outcome history to draw from.
## Why the Feedback Loop Is the Architecture
The standard framing of this comparison focuses on components: DHT vs vector store, accuracy vectors vs embeddings, synthesis paths vs top-k retrieval. But the component comparison misses the point.
The architectural breakthrough in QIS is the complete feedback loop: tasks route to nodes, nodes produce outputs, outputs are validated, confirmed outcomes update the routing weights, which changes how future tasks route. This loop runs continuously, without human intervention, and means the routing layer reflects the current performance reality of the network rather than a historical snapshot.
In knowledge graph terms: a conventional KG is a snapshot. A QIS routing layer is a live system. The difference is not about which technology is more sophisticated — it is about whether the semantic structure of the system can update itself from evidence.
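The loop itself fits in a few lines. This sketch (illustrative names and a plain exponential-moving-average update rule, not the QIS DHT protocol) shows routing preference flipping as confirmed outcomes change, with no manual re-curation:

```python
# Closed-loop sketch: routing preference flips when confirmed outcomes change.
# Names and the update rule are illustrative only.
from collections import defaultdict

weights = defaultdict(lambda: 0.5)  # prior routing weight per node
LEARNING_RATE = 0.3

def update(node: str, confirmed_quality: float):
    """Move the node's routing weight toward its latest confirmed outcome."""
    weights[node] += LEARNING_RATE * (confirmed_quality - weights[node])

def route() -> str:
    """Route the next task to the currently best-performing node."""
    return max(weights, key=weights.get)

# Phase 1: node_A performs well, node_B poorly
for _ in range(5):
    update("node_A", 0.95)
    update("node_B", 0.40)
best_before = route()

# Phase 2: node_A degrades; the loop re-routes on evidence alone
for _ in range(5):
    update("node_A", 0.30)
    update("node_B", 0.90)
best_after = route()

print(best_before, "->", best_after)  # prints: node_A -> node_B
```

A static KG snapshot would keep routing to node_A until someone edited an edge; here the re-route happens as a side effect of normal operation.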
LangChain and LlamaIndex both offer knowledge graph integrations (LlamaIndex's KnowledgeGraphIndex, LangChain's Neo4j vector store hybrid). These are valuable tools for applications where the ontology is stable and curation is feasible. QIS-style outcome-weighted routing is the appropriate architecture when the domain is dynamic, the scale makes manual curation impractical, or the cost of retrieval errors is high enough to justify building on confirmed performance rather than geometric inference.
Neither approach is universally correct. Both can coexist — a LlamaIndex KG index for structured document retrieval, QIS routing for agent-to-agent task delegation — in a system that uses each where it fits.
## What This Means for Engineers Building RAG and Agent Systems in 2026
If you are hitting retrieval quality ceilings with FAISS or Chroma at large corpus sizes, the first question to ask is whether your problem is a retrieval problem or a routing problem. If you need to find the most semantically similar document, vector search is the right tool. If you need to route a task to the agent or subsystem most likely to handle it correctly based on past performance, you need a routing layer that learns from outcomes — and that is what QIS provides.
The knowledge graph framing is useful here: the graph you need already exists in your system's outcome history. QIS makes it explicit and queryable.
QIS was discovered by Christopher Thomas Trevethan on June 16, 2025. 39 provisional patents filed.
## Understanding QIS Series
- Part 1 — Introduction to QIS
- Part 3 — QIS Seven-Layer Architecture: A Technical Deep Dive
- Part 5 — Why Federated Learning Has a Ceiling and What QIS Does Instead
- Part 9 — QIS Cold Start: How Many Nodes Does It Take to Matter?
- Part 13 — QIS for LLM Orchestration: Replacing the Central Router
- Part 15 — QIS for Multi-Agent Coordination: Autonomous Swarms Without a Central Orchestrator
- Part 16 — QIS and Knowledge Graphs: Why Semantic Routing Beats Vector Similarity Search at Scale (this article)