The Problem with Current AI
The foundation of current AI is the Transformer/Attention model—undeniably successful. But why it succeeds and where its limits lie remain subjects of debate.
Hallucination—generating outputs that deviate from training data—has no fundamental solution despite numerous proposed fixes. Causal reasoning and compositional inference over unseen combinations remain clear weak points.
A Different Foundation
I designed a new AI architecture based on different principles. It's grounded in theory but implemented as working code.
The core ideas are simple:
- We live in a continuously moving world → Input must be a stream
- Modeled on human cognition → It is better to assume no prior distribution
- To approximate neurons → Various gate mechanisms are needed
These are designed to be describable as physical dynamics, and they actually work.
However, the resulting architecture is far from existing AI. Understanding it requires knowledge across multiple domains—semiotics, neuroscience, geometry, information theory—rather than statistics alone. Bayesian inference is not required.
The CSCT Engine
This architecture is governed by axioms called CSCT (Clock-Selected Compression Theory). The key rules:
"Input is a stream (continuous). Basic coefficients have non-negative constraints. There is an external input called an 'anchor', and gate opening/closing is controlled in synchronization with the anchor's change (velocity). This gate mechanism achieves discretization. Input information becomes discrete codes, reusable for inference even after learning stops."
Design Philosophy: Cognition as a Projected Dynamical System
The CSCT Engine is a neurodynamical simulator implementing the 5 axioms in computable form. It mathematically models cognition as a Projected Dynamical System (PDS) evolving on a Simplex.
Unlike standard deep learning approaches that operate in unconstrained vector spaces allowing negative weights (subtractive interference), the CSCT Engine enforces a non-negative constraint (C ≥ 0) with unit sum (Σp_k = 1).
This ensures the system cannot "rewind" entropy via mathematical cancellation, but must "construct" meanings within the Simplex defined by codebook vectors, thereby grounding internal symbols in physical causality.
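The effect of this constraint can be sketched numerically. The snippet below is a minimal illustration, not the engine's actual code: it projects an arbitrary state vector onto the probability simplex using the standard Euclidean projection, so negative components are clipped at the boundary rather than allowed to cancel.

```python
import numpy as np

def project_to_simplex(v: np.ndarray) -> np.ndarray:
    """Euclidean projection onto {p : p_k >= 0, sum_k p_k = 1}."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

p = project_to_simplex(np.array([0.9, -0.3, 0.6]))
# p is non-negative and sums to exactly 1
```

Once states live on the simplex, "subtractive interference" is impossible by construction: every coefficient contributes additively or not at all.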
Architecture Diagram
Figure: Overview of the CSCT Engine Architecture.
Two Architectures: SingleGate and MultiGate
The CSCT Engine has two main architectures:
| Aspect | SingleGate | MultiGate |
|---|---|---|
| Biological Correspondence | Peripheral Processing | Central Processing |
| Computational Cost | Low | High |
| Application | Single source waveform | Inter-channel relationships |
| Phase Processing | None (velocity-based) | θ phase temporal modulation |
SingleGate performs clock selection directly from input features (position, velocity, acceleration). The gate opens/closes according to the anchor's rate of change (velocity), but no explicit phase calculation is performed.
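As a rough sketch of that behavior (hypothetical code; the function name and threshold are my assumptions, not taken from the repository), a velocity-synchronized gate can be written as:

```python
import numpy as np

def velocity_gate(anchor: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Open the gate wherever the anchor's rate of change exceeds a threshold."""
    velocity = np.abs(np.diff(anchor, prepend=anchor[0]))
    return (velocity > threshold).astype(float)

anchor = np.concatenate([np.zeros(50), np.ones(50)])  # a single step change
gate = velocity_gate(anchor)
# the gate opens only at the step; flat segments produce no clock ticks
```

This is the sense in which clock selection discretizes a continuous stream: time "ticks" only when the anchor moves.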
MultiGate has three neurological gate mechanisms:
- Na⁺ Gate: Corresponds to sodium channels—high-speed, sparse clock selection
- θ Phase: Corresponds to hippocampal θ rhythm—time-dependent rhythm modulation
- NMDA Gate: Corresponds to NMDA receptors—integration window opens at θ phase peak
combined_gate = Na⁺_activation × NMDA_activation
The NMDA gate opens and closes depending on θ phase. This matches the timing when LTP (Long-Term Potentiation) occurs in the hippocampus, making it a computational implementation of θ-γ coupling known in neuroscience.
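Read literally, the three mechanisms compose as in the sketch below (hypothetical Python; the frequency, thresholds, and names are illustrative assumptions, not the engine's actual implementation):

```python
import numpy as np

def multi_gate(anchor_velocity, t, theta_freq=8.0, na_thresh=0.1):
    """combined_gate = Na activation x NMDA activation, where the NMDA
    integration window opens only near the theta-phase peak."""
    na = (np.abs(anchor_velocity) > na_thresh).astype(float)   # fast, sparse
    theta_phase = np.cos(2.0 * np.pi * theta_freq * t)         # ~8 Hz rhythm
    nmda = (theta_phase > 0.9).astype(float)                   # near-peak window
    return na * nmda

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
g = multi_gate(np.ones_like(t), t)  # Na open everywhere: gate follows the NMDA window
# with a motionless anchor the gate never opens:
assert multi_gate(np.zeros_like(t), t).sum() == 0.0
```

The multiplicative composition matters: either gate alone being shut keeps the combined gate shut, which is what restricts integration to the theta-peak windows.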
Structural Differences from Attention Models
Time Structure
| Aspect | Attention Model | CSCT Engine |
|---|---|---|
| Processing | Batch (static frames, text, video/audio via position encoding) | Stream (continuous time-series) |
| Self-reference | S created from S(t-1) | Forward flow only |
| Parallelism | Processes all information at once | Progresses through stream |
In current Attention models, the state S is created from S(t-1). Because this self-referential structure can index backward into past positions, static images, text, and video/audio (converted to position information) can all be handled uniformly.
In contrast, the CSCT Engine is stream-based, so it currently cannot handle static images as elegantly.
This may seem more constrained and slower. However, it's an architecture with high potential for processing while moving—a fundamentally different capability.
Distance Structure
| Aspect | Attention Model | CSCT Engine |
|---|---|---|
| Norm | L2 (codes mix, hard to extract) | L1-like (codes separable) |
| Negative values | Allowed | Not allowed |
| Vector addition | Unrestricted | Constrained to simplex |
| Convex hull | Local pseudo-hull only | Globally closed hull |
Why Transformer "Attention" Is Insufficient
The Transformer attention mechanism is defined as:
Attention(Q, K, V) = softmax(QK^T / √d_k) V
Since softmax outputs sum to unity, the result is a convex combination of value vectors V. Superficially, this appears to satisfy the convex hull constraint.
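This can be checked numerically. The toy example below (illustrative, not from the CSCT codebase) confirms that each attention output row is a convex combination of the rows of V, with non-negative weights summing to one:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d_k = 8
Q, K, V = (rng.normal(size=(4, d_k)) for _ in range(3))
W = softmax(Q @ K.T / np.sqrt(d_k))   # attention weights
out = W @ V                           # convex combination of V's rows
# every output row lies in the (local) convex hull of the value vectors
assert np.all(W >= 0) and np.allclose(W.sum(axis=1), 1.0)
```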
However, this constitutes only a local pseudo-hull:
- Each layer has its own local hull but doesn't form a globally closed simplex
- No boundary means ever-increasing resources are needed to counteract entropy production
The CSCT Engine requires geometric simplex structure, disallows negative computation, and forbids simple addition.
This constraint creates a geometric convex hull in information space, enabling compositional inference. Conversely, no amount of improvement to current AI can eliminate hallucination or achieve true compositional inference.
Experiment 8: Meaning Extraction
Purpose
EX8 tests whether a frozen system can infer unseen inputs after discrete codes acquire meaning through anchoring.
Experimental Design
Training Data:
- Singles: A, B
- Composites: A+B, A+C, B+C
Withheld: C alone (never shown as single)
Test: After stopping learning (no gradient), can the model infer C (never directly taught) given only anchor 'c'?
Three Geometric Conditions
We vary C's position to examine convex hull effects:
- IN_HULL: C is a convex combination of A and B (inside hull)
- OUT_HULL: C is orthogonal to both A and B (outside hull)
- RANDOM: C is randomly initialized (baseline)
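The three conditions can be constructed as in this sketch (my reconstruction of the setup; the paper's exact recipe may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
A = rng.normal(size=d); A /= np.linalg.norm(A)
B = rng.normal(size=d)
B -= (B @ A) * A                 # make B orthogonal to A
B /= np.linalg.norm(B)

def make_C(condition: str) -> np.ndarray:
    if condition == "IN_HULL":   # convex combination of A and B
        w = rng.uniform(0.2, 0.8)
        return w * A + (1.0 - w) * B
    if condition == "OUT_HULL":  # orthogonal to both A and B
        C = rng.normal(size=d)
        C -= (C @ A) * A + (C @ B) * B
        return C / np.linalg.norm(C)
    return rng.normal(size=d)    # RANDOM baseline

C_out = make_C("OUT_HULL")
assert abs(C_out @ A) < 1e-9 and abs(C_out @ B) < 1e-9
```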
Results (30 seeds per condition, 90 total runs)
| Condition | Success Rate | Withheld Similarity |
|---|---|---|
| IN_HULL | 96.7% (29/30) | 0.979 ± 0.025 |
| RANDOM | 53.3% (16/30) | 0.682 ± 0.370 |
| OUT_HULL | 16.7% (5/30) | 0.701 ± 0.242 |
Statistical test: Kruskal-Wallis H = 42.52, p = 5.85 × 10⁻¹⁰
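For readers who want to reproduce the statistic, the comparison has the shape below. The arrays are synthetic stand-ins drawn from the reported means and standard deviations; only the actual per-seed similarity scores from the runs would reproduce H = 42.52.

```python
import numpy as np
from scipy.stats import kruskal

# Synthetic stand-ins for the 30 per-seed withheld-similarity scores
# in each condition; replace with the actual logged scores.
rng = np.random.default_rng(0)
in_hull = rng.normal(0.979, 0.025, 30)
random_ = rng.normal(0.682, 0.370, 30)
out_hull = rng.normal(0.701, 0.242, 30)

H, p = kruskal(in_hull, random_, out_hull)   # non-parametric 3-group test
```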
Key Discovery: Ungrounded Symbol Acquisition
Interestingly, in OUT_HULL, C acquired a unique code in 80% of seeds. Yet reconstruction accuracy remained low.
We term this "Ungrounded Symbol Acquisition": a discrete code is assigned and manipulated, but it lacks representational content within the codebook's constructive capacity.
This provides a mathematical instantiation of Searle's "Chinese Room" argument—the system can "handle" the symbol without the symbol bearing reconstructable meaning.
Experiment 9: Syntax Emergence
Purpose
EX9 tests whether syntactic inference can arise once meaning has been discretized. It is the mirror of EX8:
- EX8 (Meaning): Withheld primitive must be inferred from observed composites
- EX9 (Syntax): Withheld composite must be inferred from observed primitives and composites
Experimental Design
Training Data:
- Singles: A, B, C
- Composites: A+B, A+C
Withheld: B+C (never trained)
Test: Given anchor 'b+c', can the model infer B+C (never directly taught)?
Results (30 seeds per condition, 90 total runs)
| Condition | Success Rate | Withheld Similarity |
|---|---|---|
| IN_HULL | 66.7% (20/30) | 0.890 ± 0.138 |
| RANDOM | 33.3% (10/30) | 0.767 ± 0.253 |
| OUT_HULL | 13.3% (4/30) | 0.742 ± 0.168 |
Statistical test: Kruskal-Wallis H = 19.13, p = 7.00 × 10⁻⁵
Discussion: Syntax as Interpolation, Not Algebra
EX9 results show that syntax in CSCT emerges not as algebraic rule manipulation (combining symbols independent of content) but as discovery of barycentric coordinates on the codebook simplex.
Rather than learning subtraction, the system discovers barycentric coordinates:
f: y_{A+B} → ½v_A + ½v_B
f: y_{A+C} → ½v_A + ½v_C
Because this mapping rule is geometrically consistent, the system generalizes to the withheld input y_{B+C} → ½v_B + ½v_C.
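A minimal numerical illustration of this claim (my sketch, using an assumed midpoint composition rule rather than the engine's actual decoder): a linear map fitted only on the trained items already places the withheld composite at the correct barycentric coordinates.

```python
import numpy as np

rng = np.random.default_rng(1)
V = rng.normal(size=(3, 8))          # codebook vectors v_A, v_B, v_C

# Training pairs: observation -> barycentric weights on the simplex.
W_train = np.array([
    [1.0, 0.0, 0.0],                 # A
    [0.0, 1.0, 0.0],                 # B
    [0.0, 0.0, 1.0],                 # C
    [0.5, 0.5, 0.0],                 # A+B
    [0.5, 0.0, 0.5],                 # A+C
])
Y_train = W_train @ V                # observed signals

# Fit a linear decoder y -> w by least squares.
M, *_ = np.linalg.lstsq(Y_train, W_train, rcond=None)

# Withheld composite B+C: the decoder recovers (0, 1/2, 1/2)
# without ever having seen that combination.
y_bc = 0.5 * V[1] + 0.5 * V[2]
w_bc = y_bc @ M
assert np.allclose(w_bc, [0.0, 0.5, 0.5], atol=1e-6)
```

Generalization here costs nothing extra because the decoder is geometrically consistent over the simplex, which is the sense in which syntax emerges as interpolation rather than symbol algebra.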
Conclusion
From these results, I conclude that human cognition—especially intellectual activity—is built on constraints.
Through nine experiments, we demonstrated that:
- Discretization emerges reliably from continuous dynamics via clock selection (EX1-3)
- Irreversible anchors provide stability against noise and drift, outperforming reversible (self-referential) systems long-term (EX4)
- Binding is achieved as implicit synchronization through locking to a shared anchor (EX5)
- Internal time progresses proportionally to the signal's rate of change (flux), halting when no signal is present (EX7)
- Semantic grounding requires convex-hull membership; symbols assigned outside the hull remain ungrounded (EX6, EX8)
- Syntax emerges as barycentric interpolation, not algebraic abstraction; compositional inference degrades outside the trained hull (EX9)
Modern AI pursues "infinite expansion" through scaling. I propose the inverse: closed constraint may be a necessary condition for stable, efficient intelligence.
The CSCT Engine may be the first program that can describe (design) our cognitive activity as an "OS."
If you're interested, please run the code. There are no licensing restrictions, and this is just the beginning. This is not speculation—it's a working program.
Links
- Code (GitHub): https://github.com/CSCT-NAIL/CSCT
- Paper (Zenodo DOI): https://doi.org/10.5281/zenodo.18382368
- Project site: https://csct-nail.com
