Naoki Higuchi (CSCT-NAIL)

A New AI Architecture Without Prior Distributions: Stream-Based AI and Compositional Inference

The Problem with Current AI

The foundation of current AI is the Transformer/Attention model—undeniably successful. But why it succeeds and where its limits lie remain subjects of debate.

Hallucination—generating outputs that deviate from training data—has no fundamental solution despite numerous proposed fixes. Causal reasoning and compositional inference over unseen combinations remain weak points with clear limitations.

A Different Foundation

I designed a new AI architecture based on different principles. It's grounded in theory but implemented as working code.

The core ideas are simple:

  • We live in a continuously moving world → Input must be a stream
  • Human cognition is the reference → better to assume no prior distribution
  • To approximate neurons → Various gate mechanisms are needed

These are designed to be describable as physical dynamics, and they actually work.

However, the resulting architecture is far from existing AI. Understanding it requires knowledge across multiple domains—semiotics, neuroscience, geometry, information theory—rather than statistics alone. Bayesian inference is not required.

The CSCT Engine

This architecture is governed by axioms called CSCT (Clock-Selected Compression Theory). The key rules:

"Input is a stream (continuous). Basic coefficients have non-negative constraints. There is an external input called an 'anchor', and gate opening/closing is controlled in synchronization with the anchor's change (velocity). This gate mechanism achieves discretization. Input information becomes discrete codes, reusable for inference even after learning stops."

Design Philosophy: Cognition as a Projected Dynamical System

The CSCT Engine is a neurodynamical simulator implementing the 5 axioms in computable form. It mathematically models cognition as a Projected Dynamical System (PDS) evolving on a Simplex.

Unlike standard deep learning approaches that operate in unconstrained vector spaces allowing negative weights (subtractive interference), the CSCT Engine enforces a non-negative constraint (C ≥ 0) with unit sum (Σp_k = 1).

This ensures the system cannot "rewind" entropy via mathematical cancellation, but must "construct" meanings within the Simplex defined by codebook vectors, thereby grounding internal symbols in physical causality.
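
To make the constraint concrete, here is a minimal NumPy sketch (not the CSCT Engine's actual code) of a single update step of a projected dynamical system: an unconstrained update is followed by a Euclidean projection back onto the probability simplex, so the state always satisfies C ≥ 0 and Σp_k = 1. The function and variable names are illustrative only.

```python
import numpy as np

def project_to_simplex(v: np.ndarray) -> np.ndarray:
    """Euclidean projection of v onto the probability simplex
    {p : p_k >= 0, sum_k p_k = 1}."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u) - 1.0                   # cumulative sums minus the target sum
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index satisfying the KKT condition
    tau = css[rho] / (rho + 1)                 # shift that enforces the unit sum
    return np.maximum(v - tau, 0.0)            # clip to the non-negative orthant

# One hypothetical Euler step of a projected dynamical system on the simplex:
# the raw update may leave the simplex; the projection restores C >= 0 and sum = 1.
p = np.full(4, 0.25)                           # current state on the simplex
drive = np.array([0.6, -0.4, 0.1, -0.5])       # illustrative, unconstrained update direction
p_next = project_to_simplex(p + 0.5 * drive)
print(p_next, p_next.sum())                    # non-negative entries summing to 1
```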

Architecture Diagram

Figure: Overview of the CSCT Engine architecture.

Two Architectures: SingleGate and MultiGate

The CSCT Engine has two main architectures:

| Aspect | SingleGate | MultiGate |
| --- | --- | --- |
| Biological correspondence | Peripheral processing | Central processing |
| Computational cost | Low | High |
| Application | Single-source waveform | Inter-channel relationships |
| Phase processing | None (velocity-based) | θ-phase temporal modulation |

SingleGate performs clock selection directly from input features (position, velocity, acceleration). The gate opens/closes according to the anchor's rate of change (velocity), but no explicit phase calculation is performed.
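
As a rough illustration of this rule (the function names, thresholds, and waveforms below are my own assumptions, not the engine's code), a velocity-gated discretizer might look like this: when the anchor's rate of change exceeds a threshold, the gate opens and the current feature vector is snapped to its nearest codebook entry as a discrete code.

```python
import numpy as np

def single_gate_codes(features, anchor, codebook, vel_threshold=0.05):
    """Toy illustration: emit a discrete code index whenever the anchor's
    rate of change (velocity) exceeds a threshold; otherwise emit None."""
    codes = []
    prev = anchor[0]
    for x, a in zip(features[1:], anchor[1:]):
        velocity = abs(a - prev)              # anchor velocity (finite difference)
        prev = a
        if velocity > vel_threshold:          # gate open: discretize
            k = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
            codes.append(k)                   # nearest codebook vector = discrete code
        else:                                 # gate closed: no code emitted
            codes.append(None)
    return codes

# Hypothetical stream: a slowly drifting anchor with one abrupt jump.
t = np.linspace(0, 1, 200)
anchor = np.where(t < 0.5, 0.0, 1.0) + 0.01 * np.sin(20 * t)
features = np.stack([np.sin(6 * t), np.cos(6 * t)], axis=1)
codebook = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
print([c for c in single_gate_codes(features, anchor, codebook) if c is not None][:10])
```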

MultiGate has three neurological gate mechanisms:

  • Na⁺ Gate: Corresponds to sodium channels—high-speed, sparse clock selection
  • θ Phase: Corresponds to hippocampal θ rhythm—time-dependent rhythm modulation
  • NMDA Gate: Corresponds to NMDA receptors—integration window opens at θ phase peak
combined_gate = Na⁺_activation × NMDA_activation

The NMDA gate opens and closes depending on θ phase. This matches the timing when LTP (Long-Term Potentiation) occurs in the hippocampus, making it a computational implementation of θ-γ coupling known in neuroscience.
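
A toy reading of this combination (all parameters and waveforms are assumptions for illustration, not the engine's actual values): the Na⁺ gate fires sparsely on fast anchor changes, and the NMDA window is a function of θ phase that is widest at the phase peak.

```python
import numpy as np

def multi_gate(velocity, theta_phase, na_threshold=0.1, nmda_width=0.5):
    """Toy reading of the MultiGate rule: a fast Na+ gate driven by the anchor's
    velocity, multiplied by an NMDA integration window that opens near the
    theta phase peak (wrapped phase = 0)."""
    na_activation = (np.abs(velocity) > na_threshold).astype(float)    # sparse, high-speed gate
    wrapped = np.angle(np.exp(1j * theta_phase))                       # fold phase into (-pi, pi]
    nmda_activation = np.exp(-(wrapped ** 2) / (2 * nmda_width ** 2))  # window peaked at phase 0
    return na_activation * nmda_activation                             # combined_gate

# Hypothetical stream: ~8 Hz theta rhythm, random anchor velocity.
dt, steps = 0.001, 2000
t = np.arange(steps) * dt
theta_phase = 2 * np.pi * 8.0 * t                  # accumulated theta phase
velocity = np.random.default_rng(0).normal(0, 0.1, steps)
gate = multi_gate(velocity, theta_phase)
print(f"gate open on {np.mean(gate > 0.5):.1%} of steps")
```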

Structural Differences from Attention Models

Time Structure

| Aspect | Attention Model | CSCT Engine |
| --- | --- | --- |
| Processing | Batch (static frames, text, video/audio via position encoding) | Stream (continuous time-series) |
| Self-reference | S created from S(t-1) | Forward flow only |
| Parallelism | Processes all information at once | Progresses through the stream |

In current Attention models, the self-referential state S is constructed from S(t-1). This backward-indexed structure lets static images, text, and video/audio (converted to position information) be handled uniformly.

In contrast, the CSCT Engine is stream-based, so it currently cannot handle static images as elegantly.

This may seem more constrained and slower. However, it's an architecture with high potential for processing while moving—a fundamentally different capability.

Distance Structure

| Aspect | Attention Model | CSCT Engine |
| --- | --- | --- |
| Norm | L2 (codes mix, hard to extract) | L1-like (codes separable) |
| Negative values | Allowed | Not allowed |
| Vector addition | Unrestricted | Constrained to the simplex |
| Convex hull | Local pseudo-hull only | Globally closed hull |

Why Transformer "Attention" Is Insufficient

The Transformer attention mechanism is defined as:

Attention(Q, K, V) = softmax(QK^T / √d_k) V

Since softmax outputs sum to unity, the result is a convex combination of value vectors V. Superficially, this appears to satisfy the convex hull constraint.
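
This is easy to verify numerically with plain NumPy (independent of the CSCT code): the softmax rows are non-negative and sum to one, so each output row lies in the convex hull of the rows of V.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(5, 8)), rng.normal(size=(7, 8)), rng.normal(size=(7, 8))

scores = Q @ K.T / np.sqrt(K.shape[1])
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax: non-negative, rows sum to 1
out = weights @ V                                # each output row is a convex combination of rows of V

print(np.allclose(weights.sum(axis=-1), 1.0), (weights >= 0).all())
# True True. The hull is only "local" to this layer's V, though; nothing ties
# successive layers to one globally shared simplex.
```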

However, this constitutes only a local pseudo-hull:

  • Each layer has its own local hull but doesn't form a globally closed simplex
  • No boundary means ever-increasing resources are needed to counteract entropy production

The CSCT Engine, by contrast, requires a geometric simplex structure, disallows negative values, and forbids unconstrained vector addition.

This constraint creates a geometric convex hull in information space, enabling compositional inference. Conversely, no amount of improvement to current AI can eliminate hallucination or achieve true compositional inference.

Experiment 8: Meaning Extraction

Purpose

EX8 tests whether a frozen system can infer unseen inputs after discrete codes acquire meaning through anchoring.

Experimental Design

Training Data:

  • Singles: A, B
  • Composites: A+B, A+C, B+C

Withheld: C alone (never shown as single)

Test: After stopping learning (no gradient), can the model infer C (never directly taught) given only anchor 'c'?

Three Geometric Conditions

We vary C's position to examine convex hull effects:

  • IN_HULL: C is a convex combination of A and B (inside hull)
  • OUT_HULL: C is orthogonal to both A and B (outside hull)
  • RANDOM: C is randomly initialized (baseline)
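
The sketch below shows one plausible way such conditions could be constructed (the dimension, scales, and function names are mine, not the published experiment code): IN_HULL places C on the segment between A and B, while OUT_HULL removes every component of C lying in the span of A and B.

```python
import numpy as np

def make_condition(condition: str, dim: int = 16, seed: int = 0):
    """Sketch of the three geometric placements of C relative to A and B."""
    rng = np.random.default_rng(seed)
    A, B = rng.normal(size=dim), rng.normal(size=dim)
    if condition == "IN_HULL":                    # C is a convex combination of A and B
        w = rng.uniform(0.2, 0.8)
        C = w * A + (1 - w) * B
    elif condition == "OUT_HULL":                 # C orthogonal to both A and B
        C = rng.normal(size=dim)
        B_perp = B - (B @ A) / (A @ A) * A        # Gram-Schmidt: orthogonalize the span basis
        for u in (A, B_perp):
            C -= (C @ u) / (u @ u) * u            # remove all components lying in span{A, B}
    else:                                         # RANDOM baseline
        C = rng.normal(size=dim)
    return A, B, C

A, B, C = make_condition("OUT_HULL")
print(round(float(C @ A), 6), round(float(C @ B), 6))   # ~0: orthogonal to both
```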

Results (30 seeds per condition, 90 total runs)

| Condition | Success Rate | Withheld Similarity |
| --- | --- | --- |
| IN_HULL | 96.7% (29/30) | 0.979 ± 0.025 |
| RANDOM | 53.3% (16/30) | 0.682 ± 0.370 |
| OUT_HULL | 16.7% (5/30) | 0.701 ± 0.242 |

Statistical test: Kruskal-Wallis H = 42.52, p = 5.85 × 10⁻¹⁰
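
For readers who want to run this kind of test themselves, the Kruskal-Wallis statistic can be computed with SciPy as below. The arrays here are placeholders drawn from the reported means and standard deviations, not the actual per-seed scores, so the printed values will not match the reported ones exactly.

```python
import numpy as np
from scipy.stats import kruskal

# Placeholder arrays standing in for the 30 per-seed withheld-similarity scores
# of each condition (the real values come from the EX8 runs).
rng = np.random.default_rng(0)
in_hull = np.clip(rng.normal(0.979, 0.025, 30), 0, 1)
random_ = np.clip(rng.normal(0.682, 0.370, 30), 0, 1)
out_hull = np.clip(rng.normal(0.701, 0.242, 30), 0, 1)

H, p = kruskal(in_hull, random_, out_hull)   # rank-based test, no normality assumption
print(f"H = {H:.2f}, p = {p:.2e}")
```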

Key Discovery: Ungrounded Symbol Acquisition

Interestingly, in OUT_HULL, C acquired a unique code in 80% of seeds. Yet reconstruction accuracy remained low.

We term this "Ungrounded Symbol Acquisition": a discrete code is assigned and manipulated, but it lacks representational content within the codebook's constructive capacity.

This provides a mathematical instantiation of Searle's "Chinese Room" argument—the system can "handle" the symbol without the symbol bearing reconstructable meaning.

Experiment 9: Syntax Emergence

Purpose

EX9 tests whether syntactic inference can arise once meaning has been discretized. It is the mirror of EX8:

  • EX8 (Meaning): Withheld primitive must be inferred from observed composites
  • EX9 (Syntax): Withheld composite must be inferred from observed primitives and composites

Experimental Design

Training Data:

  • Singles: A, B, C
  • Composites: A+B, A+C

Withheld: B+C (never trained)

Test: Given anchor 'b+c', can the model infer B+C (never directly taught)?

Results (30 seeds per condition, 90 total runs)

| Condition | Success Rate | Withheld Similarity |
| --- | --- | --- |
| IN_HULL | 66.7% (20/30) | 0.890 ± 0.138 |
| RANDOM | 33.3% (10/30) | 0.767 ± 0.253 |
| OUT_HULL | 13.3% (4/30) | 0.742 ± 0.168 |

Statistical test: Kruskal-Wallis H = 19.13, p = 7.00 × 10⁻⁵

Discussion: Syntax as Interpolation, Not Algebra

EX9 results show that syntax in CSCT emerges not as algebraic rule manipulation (combining symbols independent of content) but as discovery of barycentric coordinates on the codebook simplex.

Rather than learning subtraction, the system discovers barycentric coordinates:

f: y_{A+B} → ½v_A + ½v_B
f: y_{A+C} → ½v_A + ½v_C

Because this mapping rule is geometrically consistent, the system generalizes to the withheld input y_{B+C} → ½v_B + ½v_C.
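
The following minimal sketch illustrates this geometry under the simplifying assumption that a composite is the equal-weight mixture of its parts (the vectors and names are illustrative, not the experiment code): the withheld composite already has non-negative barycentric coordinates summing to one on the codebook, so interpolation alone reaches it.

```python
import numpy as np

rng = np.random.default_rng(0)
v_A, v_B, v_C = rng.normal(size=(3, 16))          # codebook vectors (illustrative)
codebook = np.stack([v_A, v_B, v_C])

# Trained composites, modeled here as equal-weight mixtures (the barycentric reading).
y_AB, y_AC = 0.5 * (v_A + v_B), 0.5 * (v_A + v_C)
y_BC = 0.5 * (v_B + v_C)                          # withheld: never trained on

# Recover barycentric coordinates of the withheld composite on the codebook simplex.
w, *_ = np.linalg.lstsq(codebook.T, y_BC, rcond=None)
print(np.round(w, 3), round(float(w.sum()), 3))   # ~[0, 0.5, 0.5], sum ~1: inside the hull

# Because the mapping "composite -> average of its parts" is geometrically
# consistent on the trained pairs, applying it to the untrained pair lands
# exactly on y_BC; no subtraction or symbol algebra is needed.
print(np.allclose(0.5 * v_B + 0.5 * v_C, y_BC))   # True
```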

Conclusion

From these results, I conclude that human cognition—especially intellectual activity—is built on constraints.

Through nine experiments, we demonstrated that:

  1. Discretization emerges reliably from continuous dynamics via clock selection (EX1-3)
  2. Irreversible anchors provide stability against noise and drift, outperforming reversible (self-referential) systems long-term (EX4)
  3. Binding is achieved as implicit synchronization through locking to a shared anchor (EX5)
  4. Internal time progresses proportionally to the signal's rate of change (flux), halting when no signal is present (EX7)
  5. Semantic grounding requires convex-hull membership; symbols assigned outside the hull remain ungrounded (EX6, EX8)
  6. Syntax emerges as barycentric interpolation, not algebraic abstraction; compositional inference degrades outside the trained hull (EX9)

Modern AI pursues "infinite expansion" through scaling. I propose the inverse: closed constraint may be a necessary condition for stable, efficient intelligence.

The CSCT Engine may be the first program that can describe (design) our cognitive activity as an "OS."

If you're interested, please run the code. There are no licensing restrictions, and this is just the beginning. This is not speculation—it's a working program.

