AI models are eating their own tail, and it's going to be a problem.
The entire premise of modern LLMs is that they're trained on human-generated content. Books, articles, research papers, Stack Overflow answers, GitHub repositories - billions of tokens of actual human knowledge. But that assumption is breaking down faster than anyone wants to admit.
The Core Issue
As we approach the end of 2025, the web is saturated with AI-generated content:
Stack Overflow answers copy-pasted from ChatGPT
GitHub repos with AI-generated documentation and comments
Blog posts churned out by content farms using GPT
Social media posts from bots
Technical articles written entirely by LLMs
Yet AI companies still scrape the web for training data. They can't reliably distinguish human content from synthetic content. Which means the next generation of models will inevitably train on the outputs of previous models.
This is model collapse. And it's not theoretical - it's measurable, reproducible, and already happening.
How Model Collapse Works
The feedback loop is straightforward:
Gen 1: Train on 95% human data, 5% AI slop → minor quality issues
Gen 2: Train on 80% human data, 20% AI content → noticeable degradation
Gen 3: Train on 60% human data, 40% AI outputs → significant problems
Gen 4: Train on majority AI-generated content → model collapse
Each generation compounds the problems:
Loss of diversity - outputs converge toward homogeneous, repetitive patterns
Amplified biases - quirks from previous models get magnified
Increased hallucinations - errors stack across generations
Tail knowledge disappears - rare but critical information gets filtered out first
It's the same principle as photocopying a photocopy. Each copy drifts a little further from the original.
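To make the photocopy analogy concrete, here's a toy simulation - a deliberately crude sketch, not a real training run. The "model" is just a frequency table over a vocabulary of facts, and each generation trains only on text sampled from the previous one. Rare facts vanish first, and they never come back:

```python
# Toy illustration of model collapse, assuming a drastically simplified
# "model": a probability table over a vocabulary of facts. Each generation
# is trained only on text sampled from the previous generation. Any fact
# that fails to appear in a sample is lost for good.
import random
from collections import Counter

random.seed(0)

VOCAB_SIZE = 1_000        # distinct "facts" in the original human corpus
SAMPLES_PER_GEN = 5_000   # tokens each generation gets to train on
GENERATIONS = 8

# Generation 0: human data with a long-tailed (Zipf-like) distribution.
vocab = list(range(VOCAB_SIZE))
weights = [1 / rank for rank in range(1, VOCAB_SIZE + 1)]

for gen in range(1, GENERATIONS + 1):
    # "Generate a corpus" from the current model, then "train" the next
    # model by re-estimating frequencies from that corpus alone.
    corpus = random.choices(vocab, weights=weights, k=SAMPLES_PER_GEN)
    counts = Counter(corpus)
    weights = [counts.get(tok, 0) for tok in vocab]

    surviving = sum(1 for w in weights if w > 0)
    print(f"gen {gen}: {surviving} of {VOCAB_SIZE} facts still representable")
```

Real LLM training is vastly more complicated, but the mechanism - finite sampling plus re-estimation wiping out the tails first - is the same one the model collapse papers describe.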
Why You Should Care
Code quality degradation
If Copilot trains on code that was itself generated by an earlier model, its suggestions degrade. You're not getting patterns from experienced developers anymore - you're getting averaged-out slop that "looks" like code.
Security implications
AI-assisted security tools trained on AI-generated vulnerability analyses will miss things. If the training data is full of hallucinated CVE details or incorrect exploit explanations, the model learns wrong information.
Knowledge erosion
Niche technical knowledge - the kind buried in obscure forum posts, old mailing lists, and forgotten documentation - disappears first. AI models optimise for common patterns. Rare but critical knowledge gets filtered out.
Trust degradation
You can't tell anymore if that blog post explaining a security vulnerability was written by someone who actually found and tested it, or by an LLM that pieced together fragments from six different sources and hallucinated the rest.
Proposed Solutions (And Why They're All Flawed)
Watermarking
Embed detectable statistical signals in AI outputs so synthetic text can be filtered out during training. Google and OpenAI are researching this. Problem: watermarks can be stripped - paraphrasing or translation erases the signal. It's an arms race.
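For context, here's a toy sketch of the general "green-list" idea behind most text watermarking proposals - not Google's or OpenAI's actual scheme. A keyed hash splits the vocabulary into green and red tokens per context, the generator leans towards green, and the detector runs a statistical test:

```python
# Toy sketch of green-list text watermarking (the general idea from the
# research literature), not any vendor's real implementation.
import hashlib

SECRET_KEY = b"demo-key"  # hypothetical key shared by generator and detector

def is_green(prev_token: str, token: str) -> bool:
    """Keyed hash of (previous token, candidate token) puts roughly half
    the vocabulary on the 'green list' for each context."""
    digest = hashlib.sha256(SECRET_KEY + prev_token.encode() + token.encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    """A watermarked generator biases sampling toward green tokens, so
    watermarked text shows a green fraction well above ~0.5; unmarked
    text hovers around 0.5."""
    tokens = text.split()
    if len(tokens) < 2:
        return 0.5
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

# Paraphrasing or translating the text reshuffles the tokens and erases
# the signal - which is exactly the stripping problem mentioned above.
print(green_fraction("the quick brown fox jumps over the lazy dog"))
```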
Provenance tracking
Track the origin of all training data. Only use verified human content. Problem: doesn't scale. The entire value proposition of LLMs is training on massive web-scale datasets.
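If you squint, provenance tracking looks something like this hypothetical sketch, assuming publishers signed their content with something like Ed25519 (real efforts such as C2PA are far more elaborate):

```python
# Minimal sketch of per-document provenance: a publisher signs the bytes of
# each article, and a training pipeline keeps only documents with a valid
# signature from a trusted key. Requires `pip install cryptography`.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

publisher_key = Ed25519PrivateKey.generate()
article = "A human actually wrote and tested this vulnerability analysis.".encode()

signature = publisher_key.sign(article)   # published alongside the article
public_key = publisher_key.public_key()   # published in the site's metadata

def is_verified(content: bytes, sig: bytes) -> bool:
    """Note the catch: this proves *who* published the bytes, not that a
    human wrote them, and it only covers publishers who bother to sign -
    which is why it doesn't scale to web-scale scraping."""
    try:
        public_key.verify(sig, content)
        return True
    except InvalidSignature:
        return False

print(is_verified(article, signature))        # True
print(is_verified(article + b"!", signature)) # False: content was altered
```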
Curated datasets
Stop scraping the web entirely. Build human-verified, high-quality datasets. Problem: expensive, slow, and fundamentally limits what the model can learn.
Adversarial filtering
Train models to detect and exclude AI-generated text. Problem: classic adversarial arms race. Detection improves, generation improves to evade detection, repeat forever.
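A crude sketch of what the filtering side of that arms race looks like - this heuristic is invented purely for illustration and would be trivially evaded, which is rather the point:

```python
# Crude sketch of adversarial filtering: score text on how repetitive and
# formulaic it looks, and drop anything above a threshold. Real pipelines
# use trained classifiers or perplexity under a reference model; this
# heuristic only exists to show the shape of the problem.
from collections import Counter

STOCK_PHRASES = ("delve into", "in today's fast-paced world", "it's important to note")

def slop_score(text: str) -> float:
    """Higher = more likely synthetic, per this (very fallible) heuristic."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    type_token_ratio = len(set(tokens)) / len(tokens)   # low = repetitive
    stock_hits = sum(phrase in text.lower() for phrase in STOCK_PHRASES)
    return (1 - type_token_ratio) + 0.5 * stock_hits

def keep_for_training(text: str, threshold: float = 0.8) -> bool:
    return slop_score(text) < threshold

# The arms race: as soon as a detector keys on these signals, generators
# get tuned (or prompted) to avoid them, and the detector has to move again.
sample = "We delve into the important topic of delving into topics."
print(slop_score(sample), keep_for_training(sample))
```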
Controlled synthetic mixing
Carefully balance the ratio of real to synthetic data. Problem: requires knowing the exact contamination threshold, which varies by domain and model architecture.
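In practice, "controlled mixing" boils down to something like this sketch, where the safe ratio (0.1 here) is a guess you can't verify:

```python
# Sketch of controlled synthetic mixing: build a training set that caps the
# synthetic fraction at a chosen ratio. The ratio itself is the hard part -
# the safe threshold isn't known and likely differs by domain and model.
import random

def build_training_set(human_docs, synthetic_docs, synthetic_ratio=0.1, size=100_000):
    n_synth = int(size * synthetic_ratio)
    n_human = size - n_synth
    # Sampling with replacement for the sketch. A real pipeline would also
    # need to *know* which documents are synthetic - the other hard part.
    return (random.choices(human_docs, k=n_human)
            + random.choices(synthetic_docs, k=n_synth))

corpus = build_training_set(["human doc"], ["synthetic doc"], synthetic_ratio=0.1, size=10)
print(corpus)
```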
None of these solve the core issue. And we might already be past the point of no return. The web is saturated with AI slop. Even if filtering started today, there are years of contamination already baked into datasets.
The Actual Problem
We're running a one-way experiment on the future of LLMs, and nobody knows the safe parameters.
No one knows what percentage of AI contamination causes collapse. No one knows if current models are already degraded. No one knows how to reverse contamination once it's in the dataset.
LLMs were built on the assumption of abundant, renewable human knowledge. But that assumption was wrong. We're strip-mining the web for training data, and the mine doesn't refill. Every piece of human writing that gets replaced with AI slop permanently degrades the training pool.
The Economic Incentive Problem
The economics make this worse. AI companies have no incentive to solve this:
Scraping is free (legally questionable, but free)
Filtering costs money
Competitive pressure rewards shipping now, not data quality five years from now
Investors reward shipping features, not long-term dataset integrity
Publishers can't win either. Paywalling content to prevent scraping also blocks legitimate human readers. Not paywalling means getting drained by RAG systems that plagiarise without attribution.
Content creators lose traffic and revenue to AI summaries. So they either stop producing content (reducing the pool of human knowledge) or start using AI to produce more content faster (contaminating the pool).
It's a race to the bottom, and every participant is incentivised to make it worse.
What Actually Needs to Happen
The realistic options are limited:
Legislation requiring training data transparency - companies must disclose what they trained on and prove licensing rights
Mandatory AI content labelling - cryptographic signatures that can't be easily stripped
Royalty systems for scraped content - similar to how music licensing works
Incentivise human-generated content - platforms that verify and reward genuine human writing
None of this will happen voluntarily. The industry is too profitable and moving too fast. Regulation would need to come first, and regulators barely understand the technology.
More likely: we hit model collapse in 3-5 years, everyone scrambles to fix it retroactively, and we end up with some half-baked solution that only partially works.
Final Thoughts
Model collapse is not a hypothetical future problem. It's happening now, measurably, in controlled experiments. The only question is whether we're already seeing it in production models.
The feedback loop is real. The economic incentives ensure it will continue. And the proposed solutions all have fundamental flaws that make them unlikely to work at scale.
I'm not saying LLMs are doomed. I'm saying the current trajectory is unsustainable, and nobody with the power to fix it has an incentive to do so. The companies building these models are optimising for next quarter's revenue, not training data quality in 2030.
This will either get fixed through heavy-handed regulation, or we'll collectively find out what happens when AI models train on increasingly degraded synthetic data. My money is on the latter.
The snake is already eating its tail. We're just waiting to see how far down it gets before someone notices.
Research:
Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data (Matthias Gerstgrasser et al., 2024)
The Curse of Recursion: Training on Generated Data Makes Models Forget (Ilia Shumailov et al., 2023)
AI models collapse when trained on recursively generated data (Ilia Shumailov et al., Nature, 2024)
New research warns of potential ‘collapse’ of machine learning models