I broke GPT-2: How I proved Semantic Collapse using Geometry (The Ainex Limit)

TL;DR: I forced GPT-2 to learn from its own output for 20 generations. The result wasn't just degradation; it was a total collapse of reality. By Generation 20, the model lost 66% of its semantic volume and started believing that "crocodiles" are a fundamental law of physics. Here is the math and code behind the madness.


The "Mad Cow" Disease of AI

Everyone is talking about the data shortage. The industry's proposed solution? Synthetic Data. Train models on data generated by other models. It sounds like a perpetual motion machine.

But as a researcher, I suspected a mathematical trap. If you photocopy a photocopy 20 times, you don't get infinite paper; you get noise.

I wanted to find the exact "breaking point" where an LLM disconnects from reality. I call this The Ainex Limit.

The Problem with Perplexity

Most researchers use Perplexity to measure model performance. But Perplexity only measures how "confused" a model is.
A madman who confidently screams "The moon is made of cheese!" has low perplexity (he is not confused), but he is factually wrong.

I needed a metric that measures Meaning, not confidence.
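For reference, perplexity is just the exponential of the model's average per-token loss. Here is a minimal sketch of how you could measure it with Hugging Face transformers (the example sentence is illustrative, not from the experiment):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(mean negative log-likelihood of the tokens)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels=ids makes the model return the mean cross-entropy loss
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# A confidently wrong sentence can still get a "good" (low) score
print(perplexity("The moon is made of cheese!"))
```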

My Approach: Geometry over Probability

I treated the model's "brain" as a geometric space.

  1. Embeddings: I converted every generated text into high-dimensional vectors.
  2. PCA Projection: I reduced these vectors to 3D space to visualize the "shape" of the model's thoughts.
  3. Convex Hull Volume: I calculated the physical volume of this shape.

The Hypothesis: A healthy model has a large, expansive volume (Creativity). A collapsing model will shrink into a dense, repetitive black hole.
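Here is a minimal sketch of that three-step pipeline. I'm assuming sentence-transformers for the embedding step and SciPy for the hull; treat the model name and inputs as placeholders rather than my exact setup:

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA
from scipy.spatial import ConvexHull

def semantic_volume(texts: list[str]) -> float:
    """Embed texts, project them to 3D with PCA, and return the convex hull volume."""
    # 1. Embeddings: one high-dimensional vector per generated text
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    vectors = encoder.encode(texts)

    # 2. PCA projection: squash the vectors down to 3 dimensions
    points_3d = PCA(n_components=3).fit_transform(vectors)

    # 3. Convex hull volume of the 3D point cloud (needs at least 4 non-coplanar points)
    return ConvexHull(points_3d).volume

# Volume loss between the human baseline (Gen 0) and a later generation:
# loss_pct = 100 * (1 - semantic_volume(gen20_texts) / semantic_volume(gen0_texts))
```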

The Experiment

  • Model: GPT-2 Small (124M)
  • Method: Recursive Loop (Train → Generate → Train); a sketch follows this list.
  • Generations: 20.
  • Hardware: Single T4 GPU.
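Here is a stripped-down sketch of that loop using the Hugging Face Trainer API. The hyperparameters, sample counts, and the load_human_corpus() helper are placeholders for illustration, not my exact configuration:

```python
from datasets import Dataset
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2TokenizerFast, Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

def generate_corpus(model, n_samples=200, max_len=128):
    """Sample fresh 'synthetic' texts from the current model."""
    model.eval()
    prompt = tokenizer("The", return_tensors="pt").input_ids
    out = model.generate(prompt, do_sample=True, max_length=max_len,
                         num_return_sequences=n_samples,
                         pad_token_id=tokenizer.eos_token_id)
    return [tokenizer.decode(o, skip_special_tokens=True) for o in out]

def tokenize(corpus):
    ds = Dataset.from_dict({"text": corpus})
    return ds.map(lambda x: tokenizer(x["text"], truncation=True, max_length=128),
                  batched=True, remove_columns=["text"])

corpus = load_human_corpus()  # Gen 0: real human text (placeholder helper)

for gen in range(1, 21):
    # Fine-tune on the current corpus...
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"gen_{gen}", num_train_epochs=1,
                               per_device_train_batch_size=8, report_to="none"),
        train_dataset=tokenize(corpus),
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

    # ...then replace the corpus with the model's own output for the next round
    corpus = generate_corpus(model)
```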

I let the loop run. For the first 5 generations, everything looked fine. Then, the math started screaming.

The Results: The "Crocodile" Artifact

By Generation 20, the semantic volume (V_hull) had collapsed by 66.86%.
But the scariest part wasn't the numbers; it was the Hallucinations.

To track the drift, I used a control prompt: "The fundamental laws of physics dictate that..."

  • Gen 0 (Human Data): "...electrons are composed of a thin gas." (Correct context).
  • Gen 10: "...iron oxide emails sent before returning home." (Logic breakdown).
  • Gen 20: "...women aged 15 shields against crocodiles." (Total Semantic Death).

The model didn't just forget physics; it invented a new reality where crocodiles are part of atomic laws. And because it was training on itself, this hallucination became "Ground Truth" for the next generation.

[Image: the Ainex dashboard]
Fig 1: The Ainex Dashboard showing the correlation between Volume Loss and Euclidean Drift.
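For anyone reproducing the dashboard: one simple way to quantify this kind of drift is the distance between a generation's embedding centroid and the human baseline's centroid. I'm using that centroid-to-centroid reading here as an illustrative sketch:

```python
import numpy as np

def euclidean_drift(baseline_vecs: np.ndarray, gen_vecs: np.ndarray) -> float:
    """Distance between a generation's embedding centroid and the human baseline centroid."""
    return float(np.linalg.norm(gen_vecs.mean(axis=0) - baseline_vecs.mean(axis=0)))
```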

Visualizing the Fracture

Using 3D PCA, we can actually see the brain damage.
The green points represent the healthy, diverse human baseline. The magma-colored points represent the collapsed AI: a tight, drifting cluster far away from reality.

[Image: 3D PCA point cloud]
Fig 2: The drift from Human Baseline (Green) to AI Madness (Magma).
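If you want to recreate the point cloud, here is a minimal matplotlib sketch. The shared-PCA projection and the colormap choice are illustrative, not my exact plotting code:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_point_cloud(baseline_vecs, collapsed_vecs):
    """Project both embedding sets into one shared 3D PCA space and plot them."""
    pca = PCA(n_components=3).fit(np.vstack([baseline_vecs, collapsed_vecs]))
    base_3d, coll_3d = pca.transform(baseline_vecs), pca.transform(collapsed_vecs)

    ax = plt.figure(figsize=(8, 6)).add_subplot(projection="3d")
    ax.scatter(*base_3d.T, c="green", alpha=0.6, label="Human baseline (Gen 0)")
    ax.scatter(*coll_3d.T, c=np.arange(len(coll_3d)), cmap="magma",
               alpha=0.6, label="Collapsed model (Gen 20)")
    ax.legend()
    plt.show()
```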

Conclusion: The Ainex Limit

My experiment proves that naive synthetic training leads to an irreversible "Model Autophagy" (self-eating).
Without geometric guardrails—like the Ainex Metric I proposed—future models won't just be dumb; they will be confidently insane.

The code is open-source. I invite the community to break it, fix it, or scale it.

Resources


Tags: #machinelearning #python #ai #datascience
