
Alex Morgan

I tracked 332 AI releases this week. 85% were noise.

332 items. That's how many AI releases, papers, repos, and tool launches we ingested last week into the tracker I've been building.

I sat down this morning to review the weekly summary. After 6 months of running this, I have a number burned into my brain: 85%.

That's the percentage of ingest that gets filtered as noise before anything reaches a reader.


What "noise" actually means

Not fake. Not spam. Just... undifferentiated. Here's the breakdown from this week's batch:

  • 47 "we're excited to announce" blog posts — repackaged product updates with zero technical content
  • 61 GitHub repos with a README that says "coming soon" or "WIP"
  • 29 papers that are incremental variations on RLHF fine-tuning with no usable results section
  • 38 "model launches" that are fine-tunes of Llama or Mistral with different system prompts and a new brand name
  • 27 ecosystem announcements (funding rounds, team changes, partnerships) — real news, but not useful if you're building

That's 202 items out of 332. Gone.

What's left: 130 items that at least merit a look. Only 40-60 of those end up being worth a developer's attention (that's where the 85% comes from), and maybe 15 are genuinely important: things that change what you should be building or how you should think about the stack.


The classification problem nobody talks about

I didn't expect this ratio when I started. I thought AI news was signal-dense. The opposite is true. The field moves fast, but most of the "movement" is noise masquerading as signal.

The hard part isn't ingestion. It's classification. We've tried three approaches:

  1. Rule-based filtering — keyword blacklists, source trust scores. Works for obvious noise. Misses subtle stuff.
  2. LLM scoring — prompt asking "is this novel?" Gets fooled by confident-sounding fluff.
  3. Hybrid + human audit — where we are now. An LLM pre-filters; I spot-check 20-30 items per day for calibration (rough sketch below).
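
For the curious, here's the rough shape of the hybrid: cheap rules first, a score on whatever survives, and a human check on top. This is an illustrative sketch, not the real pipeline. The keyword list, the threshold, and the length-based `score_novelty` stand-in (which would be an LLM call in practice) are all placeholders.

```python
from dataclasses import dataclass

# Hypothetical shape of an ingested item; the real schema is richer than this.
@dataclass
class Item:
    title: str
    body: str
    source: str

# Stage 1: cheap rule-based pre-filter for the obvious noise categories.
NOISE_PHRASES = ("we're excited to announce", "coming soon", "wip")

def passes_rules(item: Item) -> bool:
    text = f"{item.title} {item.body}".lower()
    return not any(phrase in text for phrase in NOISE_PHRASES)

# Stage 2: score whatever survives. Placeholder heuristic (body length) so the
# sketch runs end to end; in practice this is an LLM call asking whether the
# item has substantive technical content, with the answer parsed to a 0-1 score.
def score_novelty(item: Item) -> float:
    return min(len(item.body) / 2000, 1.0)

def classify(item: Item, threshold: float = 0.6) -> str:
    if not passes_rules(item):
        return "noise"
    return "signal" if score_novelty(item) >= threshold else "noise"

# Usage: keep what the pipeline calls signal, hand a sample of everything to
# a human for the daily spot-check described below.
items = [
    Item("We're excited to announce v2.0", "Same product, new pricing page.", "vendor blog"),
    Item("KV-cache offloading in practice", "Benchmarks, code, failure modes. " * 60, "eng blog"),
]
signal = [i for i in items if classify(i) == "signal"]
```

The structure is the point: the rules stay cheap and brutal, and the expensive scoring only ever sees what gets past them.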

The calibration loop is the thing nobody writes about. Every week the distribution shifts slightly. A term that meant "genuine model architecture" last month now means "branding wrapper."
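
The loop itself is small. Continuing the sketch above (same caveat: the sample size and the agreement floor are illustrative numbers, not real settings):

```python
import random
from typing import Callable

# Daily calibration: sample some classified items, compare the pipeline's label
# with a human call, and track agreement over time. `classify` and `Item` come
# from the sketch above; `human_label` is whoever is doing the spot-check,
# reduced to a callable.
def spot_check(items: list[Item], human_label: Callable[[Item], str],
               sample_size: int = 25) -> float:
    sample = random.sample(items, min(sample_size, len(items)))
    if not sample:
        return 1.0
    agreed = sum(1 for item in sample if classify(item) == human_label(item))
    return agreed / len(sample)

# If average agreement slides under the floor, some rule or prompt phrase has
# drifted (the term that meant "genuine model architecture" last month and
# means "branding wrapper" now) and needs re-tuning.
def needs_recalibration(daily_agreement: list[float], floor: float = 0.85) -> bool:
    return sum(daily_agreement) / len(daily_agreement) < floor
```

That's the whole calibration loop: nothing clever, just enough human labels per day to notice when the vocabulary shifts under the classifier.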


The number that surprised me

After 6 months: the amount that matters hasn't actually changed.

The absolute count of real signal per week has been roughly stable at 40-60 items. The noise has grown. There are more repos, more launches, more papers, but the number of things that actually move the field each week has stayed flat.

Which means the job of the tracker is getting harder, not easier, even as the LLMs doing the filtering get better at it.


Wondering if others building in this space see the same pattern. Is your signal-to-noise ratio getting worse as the ecosystem scales, or is my classification schema just not adapting fast enough?
