Daniel Nwaneri

We're Creating a Knowledge Collapse and No One's Talking About It

"Hostile experts created the dataset for patient machines."

That line, from a comment by Vinicius Fagundes on my last article, won't leave my head.

Stack Overflow's traffic collapsed 78% in two years. Everyone's celebrating that AI finally killed the gatekeepers. But here's what we're not asking:

If we all stop contributing to public knowledge bases, what does the next generation of AI even train on?

We might be optimizing ourselves into a knowledge dead-end.

The Data We're Ignoring

Stack Overflow went from 200,000 questions per month at its peak to under 50,000 by late 2025. That's not a dip. That's a collapse.

Meanwhile, 84% of developers now use AI tools in their workflow, up from 76% just a year ago. Among professional developers, 51% use AI daily.

The shift is real. The speed is undeniable. But here's the uncomfortable part: 52% of ChatGPT's answers to Stack Overflow questions are incorrect.

The irony is brutal:

  • AI trained on Stack Overflow
  • Developers replaced Stack Overflow with AI
  • Stack Overflow dies from lack of new content
  • Future AI has... what, exactly?

The Wikipedia Problem

Here's something nobody's complaining about loudly enough: Wikipedia sometimes doesn't even appear on the first page of Google results anymore.

Let that sink in. The largest collaborative knowledge project in human history - free, community-curated, constantly updated, with 60+ million articles - is getting buried by AI-generated summaries and SEO-optimized content farms.

Google would rather show you an AI-generated answer panel (trained on Wikipedia) than send you to Wikipedia itself. The thing that created the knowledge gets pushed down. The thing that consumed the knowledge gets prioritized.

This is the loop closing in real-time:

  1. Humans build Wikipedia collaboratively
  2. AI trains on Wikipedia
  3. Google prioritizes AI summaries over Wikipedia
  4. People stop going to Wikipedia
  5. Wikipedia gets fewer contributions
  6. AI trains on... what, exactly?

We're not just moving from public to private knowledge. We're actively burying the public knowledge that still exists.

Stack Overflow isn't dying because it's bad. Wikipedia isn't disappearing because it's irrelevant. They're dying because AI companies extracted their value, repackaged it, and now we can't even find the originals.

The commons didn't just lose contributors. It lost visibility.

What We Actually Lost

PEACEBINFLOW captured something crucial:

"We didn't just swap Stack Overflow for chat, we swapped navigation for conversation."

Stack Overflow threads had timestamps, edits, disagreement, evolution. You could see how understanding changed as frameworks matured. Someone's answer from 2014 would get updated comments in 2020 when the approach became deprecated.

AI chats? Stateless. Every conversation starts from zero. No institutional memory. No visible evolution.

I can ask Claude the same question you asked yesterday, and neither of us will ever know we're solving the same problem. That's not efficiency. That's redundancy at scale.

As Amir put it:

"Those tabs were context, debate, and scars from other devs who had already been burned."

We traded communal struggle for what Ali-Funk perfectly named: "efficient isolation."

The Skills We're Not Teaching

Amir nailed something that's been bothering me:

"AI answers confidently by default, and without friction it's easy to skip the doubt step. Maybe the new skill we need to teach isn't how to find answers, but how to interrogate them."

The old way:
Bad docs forced skepticism accidentally. You got burned, so you learned to doubt. Friction built judgment naturally.

The new way:
AI is patient and confident. No friction. No forced skepticism. How do you teach doubt when there's nothing pushing back?

We used to learn to verify because Stack Overflow answers were often wrong or outdated. Now AI gives us wrong answers confidently, and we... trust them? Because the experience is smooth?

The Economics of Abundance

Doogal Simpson reframed the problem economically:

"We are trading the friction of search for the discipline of editing.
The challenge now isn't generating the code, but having the guts to
reject the 'Kitchen Sink' solutions the AI offers."

Old economy: Scarcity forced simplicity

Finding answers was expensive, so we valued minimal solutions.

New economy: Abundance requires discipline

AI generates overengineered solutions by default. The skill is knowing what to DELETE, not what to ADD.

This connects to Mohammad Aman's warning about stratification: those who develop the discipline to reject complexity become irreplaceable. Those who accept whatever AI generates become replaceable.

The commons didn't just lose knowledge. It lost the forcing function that taught us to keep things simple.

The Solver vs Judge Problem

Ben Santora has been testing AI models with logic puzzles designed to reveal reasoning weaknesses. His finding: most LLMs are "solvers" optimized for helpfulness over correctness.

When you give a solver an impossible puzzle, it tries to "fix" it to give you an answer. When you give a judge the same puzzle, it calls out the impossibility.

As Ben explained in our exchange:

"Knowledge collapse happens when solver output is recycled without a strong, independent judging layer to validate it. The risk is not in AI writing content; it comes from AI becoming its own authority."

This matters for knowledge collapse: if solver models (helpful but sometimes wrong) are the ones generating content that gets recycled into training data, we're not just getting model collapse - we're getting a specific type of collapse.

Confident wrongness compounds. And it compounds confidently.
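
To make that judging layer concrete, here's a minimal sketch - mine, not Ben's actual setup. `solver` and `judge` are hypothetical stand-ins for two separately prompted model passes, not real SDK calls. The point is only the shape of it: validation is a separate step from generation, and the judge is never asked to help.

```python
from typing import Callable

# Hypothetical stand-ins: each takes a prompt and returns model text.
# Neither is a real API call from any SDK.
AskFn = Callable[[str], str]

def answer_with_judge(problem: str, solver: AskFn, judge: AskFn) -> str:
    """Generate with one pass, validate with an independent one."""
    draft = solver(problem)

    # The judge is never asked to solve or repair the problem --
    # only to say whether the draft actually follows from it.
    verdict = judge(
        "Do not solve or fix anything. Reply VALID or INVALID, "
        "then one sentence of justification.\n\n"
        f"Problem:\n{problem}\n\n"
        f"Proposed answer:\n{draft}"
    )

    if verdict.strip().upper().startswith("INVALID"):
        return f"Rejected by judge: {verdict}"
    return draft
```

The string check is crude; what matters is that the second pass has a different objective than the first.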

The Verification Problem

Ben pointed out something crucial: some domains have built-in verification, others don't.

Cheap verification domains:

  • Code that compiles (Rust's strict compiler catches errors)
  • Bash scripts (either they run or they don't)
  • Math (verifiable proof)
  • APIs (test the endpoint, get immediate feedback)

Expensive verification domains:

  • System architecture ("is this the right approach?")
  • Best practices ("should we use microservices?")
  • Performance optimization ("will this scale?")
  • Security patterns ("is this safe?")

Here's the problem: AI solvers sound equally confident in both domains.

But in expensive verification domains, you won't know you're wrong until months later when the system falls over in production. By then, the confident wrong answer is already in blog posts, copied to Stack Overflow, referenced in documentation.

And the next AI trains on that.
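
To see what "cheap verification" buys you, here's a small hypothetical example (the helper and the test are mine, not from any library or from the article): the check is mechanical and the feedback arrives in seconds.

```python
from datetime import date

# Hypothetical helper an assistant might hand you. You don't have to trust it:
# in a cheap-verification domain you run a test and know immediately.
def days_until(deadline: date, today: date) -> int:
    return (deadline - today).days

def test_days_until():
    assert days_until(date(2025, 3, 1), date(2025, 2, 27)) == 2

# There is no equivalent one-liner for "should we use microservices?"
# That answer only gets verified when the system meets production traffic,
# long after the confident suggestion was accepted and written up.
```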

The Confident Wrongness Problem

Maame Afua and Richard Pascoe highlighted something worse than simple hallucination:

When AI gets caught being wrong, it doesn't admit error - it generates plausible explanations for why it was "actually right."

Example:

You: "Click the Settings menu"
AI: "Go to File > Settings"
You: "There's no Settings under File"
AI: "Oh yes, that menu was removed in version 3.2"
[You check - Settings was never under File]

This is worse than hallucination because it makes you doubt your own observations. "Wait, did I miss an update? Am I using the wrong version?"

Maame developed a verification workflow: use AI for speed, but check documentation to verify. She's doing MORE cognitive work than either method alone.

This is the verification tax. And it only works if the documentation still exists.

The Tragedy of the Commons

This is where it gets uncomfortable.

Individually, we're all more productive. I build faster with Claude than I ever did with Stack Overflow tabs. You probably do too.

But collectively? We're killing the knowledge commons.

The old feedback loop:

Problem → Public discussion → Solution → Archived for others

The new feedback loop:

Problem → Private AI chat → Solution → Lost forever

Ingo Steinke pointed out something I hadn't considered: even if AI companies train on our private chats, raw conversations are noise without curation.

Stack Overflow had voting. Accepted answers. Comment threads that refined understanding over time. That curation layer was the actual magic, not just the public visibility.

Making all AI chats public wouldn't help. We'd just have a giant pile of messy conversations with no way to know what's good.

Pascal CESCATO warned:

"Future generations might not benefit from such rich source material... we shouldn't forget that AI models are trained on years of documentation, questions, and exploratory content."

We're consuming the commons (Stack Overflow, Wikipedia, documentation) through AI but not contributing back. Eventually the well runs dry.

We're Feeling Guilty About the Wrong Thing

A commenter said: "I've been living with this guilty conscience for some time, relying on AI instead of doing it the old way."

I get it. I feel it too sometimes. Like we're cheating, somehow.

But I think we're feeling guilty about the wrong thing.

The problem isn't using AI. The tools are incredible. They make us faster, more productive, able to tackle problems we couldn't before.

The problem is using AI privately while the public knowledge base dies.

We've replaced "struggle publicly on Stack Overflow" with "solve privately with Claude." Individually optimal. Collectively destructive.

The guilt we feel? That's our instinct telling us something's off. Not because we're using new tools, but because we've stopped contributing to the commons.

One Possible Path Forward

Ali-Funk wrote about using AI as a "virtual mentor" while transitioning from IT Ops to Cloud Security Architect. But here's what he's doing differently:

He uses AI heavily:

  • Simulates senior architect feedback
  • Challenges his technical designs
  • Helps him think strategically

But he also:

  • Publishes his insights publicly on dev.to
  • Verifies AI output against official AWS docs
  • Messages real people in his network for validation
  • Has a rule: "Never implement what you can't explain to a non-techie"

As he put it in the comments:

"AI isn't artificial intelligence. It's a text generator connected to a library. You can't blindly trust AI... It's about using AI as a compass, not as an autopilot."

This might be the model: Use AI to accelerate learning, but publish the reasoning paths. Your private conversation becomes public knowledge. The messy AI dialogue becomes clean documentation that others can learn from.

It's not "stop using AI" - it's "use AI then contribute back."

The question isn't whether to use these tools. It's whether we can use them in ways that rebuild the commons instead of just consuming it.

Model Collapse

Peter Truchly raised the real nightmare scenario:

"I just hope that conversation data is used for training, otherwise the only entity left to build that knowledge base is AI itself."

Think about what happens:

  1. AI trains on human knowledge (Stack Overflow, docs, forums)
  2. Humans stop creating public knowledge (we use AI instead)
  3. New problems emerge (new frameworks, new patterns)
  4. AI trains on... AI-generated solutions to those problems
  5. Garbage in, garbage out, but at scale

This is model collapse. And we're speedrunning toward it while celebrating productivity gains.

GitHub is scraped constantly. Every public repo becomes training data. If people are using solver models to write code, pushing to GitHub, and that code trains the next generation of models... we're creating a feedback loop where confidence compounds regardless of correctness.

The domains with cheap verification stay healthy (the compiler catches it). The domains with expensive verification degrade silently.
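
Here's a toy sketch of the mechanism - a statistical caricature, not a claim about any specific model. Each generation trains only on a finite sample of the previous generation's output, and anything rare enough to miss one sampling round is gone from every round after it.

```python
import random

random.seed(42)

corpus = list(range(1000))   # 1,000 distinct "facts" in the human-written commons
data = corpus[:]             # generation 0 trains on all of them

for generation in range(1, 11):
    # Each new generation only ever sees a finite, with-replacement sample
    # of what the previous generation emitted.
    data = [random.choice(data) for _ in range(len(corpus))]
    print(f"gen {generation:2d}: {len(set(data)):4d} distinct facts remain")
```

Run it and the count never goes up: roughly a third of the distinct facts vanish in the first generation, and it keeps shrinking from there. The tails - the rare edge cases, the weird bug reports, the 2014 answer with the 2020 correction - are exactly what disappears first.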

The Corporate Consolidation Problem

webketje raised something I hadn't fully addressed:

"By using AI, you opt out of sharing your knowledge with the broader community
in a publicly accessible space and consolidate power in the hands of corporate
monopolists. They WILL enshittify their services."

This is uncomfortable but true.

We're not just moving from public to private knowledge. We're moving from commons to capital.

Stack Overflow was community-owned. Wikipedia is foundation-run. Documentation is open source. These were the knowledge commons - imperfect, often hostile, but fundamentally not owned by anyone.

Now we're consolidating around:

  • OpenAI (ChatGPT) - $157B valuation
  • Anthropic (Claude) - $60B valuation
  • Google (Gemini) - Alphabet's future

They own the models. They own the training data. They set the prices.

And as every platform teaches us: they WILL enshittify once we're dependent.

Remember when:

  • Twitter was free and open? Now it's X.
  • Google search was clean? Now it's ads and AI.
  • Reddit was community-first? Now it's IPO-driven.

The pattern is clear: Build user dependency → Extract maximum value → Users have nowhere else to go.

What happens when Claude costs $100/month? When ChatGPT paywalls advanced features? When Gemini requires Google Workspace Enterprise?

We'll pay. Because by then, we won't remember how to read documentation.

At least Stack Overflow never threatened to raise prices or cut off API access.

Sidebar: The Constraint Problem

Ben Santora argues that AI-assisted coding requires strong constraints - compilers that force errors to surface early, rather than permissive environments that let bad code slip through.

The same principle applies to knowledge: Stack Overflow's voting system was a constraint. Peer review was a constraint. Community curation was a constraint.

AI chats have no constraints. Every answer sounds equally confident, whether it's right or catastrophically wrong. And when there's no forcing function to catch the error...

The Uncomfortable Counter-Argument

Mike Talbot pushed back hard on my nostalgia:

"I fear Stack Overflow, dev.to etc are like manuals on how to look after your horse, when the world is soon going to be driving Fords."

Ouch. But maybe he's right?

Maybe we're not losing something valuable. Maybe we're just watching a skill set become obsolete. Just like:

  • Assembly programmers → High-level languages
  • Manual memory management → Garbage collection
  • Physical servers → Cloud infrastructure
  • Horse care manuals → Auto repair guides

Each generation thought they were losing something essential. Each generation was partially right.

But here's where the analogy breaks down: horses didn't build the knowledge base that cars trained on. Developers did.

If AI replaces developers, and future AI trains on AI output... who builds the knowledge base for the NEXT paradigm shift?

The horses couldn't invent cars. But developers invented AI. If we stop thinking publicly about hard problems (system design, organizational architecture, scaling patterns), does AI even have the data to make the next leap?

Or do we hit a ceiling where AI can maintain existing patterns but can't invent new ones?

I don't know. But "we're the horses" is the most unsettling framing I've heard yet.

What We Actually Need

I don't have clean answers. But here are questions worth asking:

Can we build Stack Overflow for the AI age?

Troels asked: "Perhaps our next 'Stack Overflow for the AI age' is yet to come. Perhaps it will be even better for us."

I really hope so. But what would that even look like?

From Stack Overflow (the good parts):

  • Public by default
  • Community curation (voting, accepted answers)
  • Searchable and discoverable
  • Evolves as frameworks change

From AI conversations (the good parts):

  • Patient explanation
  • Adapts to your context
  • Iterative dialogue
  • No judgment for asking "dumb" questions

What it can't be:

  • Just AI chat logs (too noisy)
  • Just curated AI answers (loses the reasoning)
  • Just documentation (loses the conversation)

Maybe it's something like: AI helps you solve the problem, then you publish the reasoning path - not just the solution - in a searchable, community-curated space.

Your messy conversation becomes clean documentation. Your private learning becomes public knowledge.

Should we treat AI conversations as artifacts?

When you solve something novel with AI, should you publish that conversation? Create new public spaces for AI-era knowledge? Find a curation mechanism that actually works?

Pascal suggested: "Using the solid answers we get from AI to build clean, useful wikis that are helpful both to us and to future AI systems."

This might be the direction. Not abandoning AI, but creating feedback loops from private AI conversations back to public knowledge bases.

How do we teach interrogation as a core skill?

Make "doubting AI" explicit in how we teach development. Build skepticism into the workflow. Stop treating AI confidence as correctness.

As Ben put it: "The human must always be in the loop - always and forever."

The Uncomfortable Truth

We're not just changing how we code. We're changing how knowledge compounds.

Stack Overflow was annoying. The gatekeeping was real. The "marked as duplicate" culture was hostile. As Vinicius perfectly captured:

"I started learning Linux in 2012. Sometimes I'd find an answer on Stack Overflow. Sometimes I'd get attacked for how I asked the question. Now I ask Claude and get a clear, patient explanation. The communities that gatekept knowledge ended up training the tools that now give it away freely."

Hostile experts created the dataset for patient machines.

But Stack Overflow was PUBLIC. Searchable. Evolvable. Future developers could learn from our struggles.

Now we're all having the same conversations in private. Solving the same problems independently. Building individual speed at the cost of collective memory.

Sophia Devy said it best:

"We're mid-paradigm shift and don't have the language for it yet."

That's exactly where we are. Somewhere between the old way dying and the new way emerging. We don't know if this is progress or just... change.

But the current trajectory doesn't work long-term.

If knowledge stays private, understanding stops compounding. And if understanding stops compounding, we're not building on each other anymore.

We're just... parallel processing.


Huge thanks to everyone who commented on my last article. This piece is basically a synthesis of your insights. Special shoutout to Vinicius, Ben, Ingo, Amir, PEACEBINFLOW, Pascal, Mike, Troels, Sophia, Ali, Maame, webketje, Doogal, and Peter for sharpening this thinking.

What's your take? Are we headed for knowledge collapse, or am I overthinking this? Drop a comment - let's keep building understanding publicly.

Top comments (2)

Maame Afua A. P. Fordjour

I’ve noticed that the friction of a broken script or a confusing doc is actually what forces me to understand the 'why.' When an AI gives a confident, polished answer, it’s tempting to skip that doubt step entirely. Developing that judging layer you mentioned feels like the most important thing I can focus on right now. Great follow-up piece!

Daniel Nwaneri

this is it exactly.

friction teaches the "why" accidentally. smooth AI answers skip straight to "what" and we miss the foundation.

the fact that you're consciously building that judging layer puts you ahead of most devs who just optimize for speed without realizing what they're losing.

curious: when you catch AI being confidently wrong now, does it make you more skeptical of future answers? or do you still have to fight the temptation to trust it?