Phil Rentier Digital

Originally published at rentierdigital.xyz

Every AI Got Dumber in 30 Days. And It's About to Get Worse.

This week, three different people told me the same thing, about three different AI tools. I only pay for one of the three, but the complaints reached me anyway.

TL;DR: this isn't Anthropic throttling Claude to cut costs. It's not OpenAI botching a release. It's not Gemini's routing acting up. Four labs are serving you a degraded experience right now, and the explanation is the same for all four. No changelog will spell it out. No CEO will announce it. And it's about to get much worse.

When four competitors get worse in the same thirty days, it is not four independent bugs. It is a signal. The rest of this piece walks through what the signal says, why the labs can't fix it, and what to do before your bill catches up.

I Use Claude Every Day. This Week, Three Friends Said the Same Thing About Claude Code, ChatGPT, and Perplexity.

I'm not one of those guys who pays for four subscriptions just to argue about which model is best. I'm a Claude guy. Claude Code all day, Claude for research, Claude for writing (this article too, yes). Boring. Settled.

So when complaints about other tools reach me, it's through the window, not through my wallet.

Monday. A friend who ships code for a living. He tells me Claude Code has gone lazy over the last two weeks or so. Skips steps it nailed in February. Loses the thread on multi-file refactors. Not a tantrum. A measured "something isn't holding."

Wednesday evening. "I feel like I'm paying for a junior intern instead of the senior I had." That was another friend, a non-dev who writes with ChatGPT every day. She didn't read an article that told her to feel that way. She noticed while working.

Thursday night, Discord. A marketer in our channel drops a screenshot: Perplexity answering a research query with a bullet list. Another query. Another bullet list. He'd been chewing on it for three weeks before saying anything, in case it was him.

And in the background, I'd been noticing something on my own Gemini account (yes, I keep one, for search cross-checks). Gemini 3 has started hallucinating on topics Gemini 2.5 used to handle in March. Not catastrophically. Just enough that I stopped trusting it for anything I couldn't verify on the spot.

Four labs. Thirty days. Four different people, four different workflows, the same shape of complaint: the tool is not what it was.

It's not you. It's not the model, not really. It's not even Anthropic or OpenAI being sloppy. Something else is going on, and I don't see anyone writing about the thing that actually ties it together.

Everyone Is Looking in the Wrong Place

Google any one of these complaints right now and you'll find thoughtful people explaining each one. One by one.

OpenAI has been in internal "Code Red" since December 1, 2025. Altman told employees in an internal memo (confirmed by The Information and WSJ) to drop everything and fix ChatGPT before Gemini 3 eats any more of their lunch. So if GPT feels off, well, they are reorganising in a panic. Makes sense.

Anthropic had six outages in April alone. The one on April 13 lasted close to an hour. The one on April 15 stretched close to three hours. The public status monitor reports 98.79% uptime over 90 days. For a company selling enterprise contracts on reliability, that's rough. So if Claude feels flaky, well, the infrastructure is limping. Makes sense.

Gemini is now at 650 million monthly active users according to Google's own October 2025 numbers. Load is way up. Quality, by the consensus of my feed, is down. Draw your own connection.

Perplexity has been squeezing the compute behind every query to protect margins since its last business-model pivot. Shorter answers, cheaper compute, less research depth. So if Perplexity feels thinner, well, that's the strategy. Makes sense.

Each explanation is tidy. Each one is local. Each one is about a specific lab, a specific bad quarter, a specific executive sweating about benchmarks.

Nobody connects the four.

Why? Because once you notice that four competitors are degrading in the same thirty days, you have to accept that the cause is not inside any of them. That answer is much more uncomfortable than "Altman is panicking" or "Anthropic needs better ops." It implicates something bigger than any single lab controls. So people write a dozen thinkpieces about each individual symptom, and nobody writes about the simultaneity.

I've caught myself on this reflex before. The same misdiagnosis showed up in the Bloomberg productivity panic piece, where the actual cause was not developers going soft on AI but something much more boring. Same reflex here. Everyone is pointing at a lab. The answer is not in a lab.

The simultaneity is the story.

Quality Isn't a Property of the Model. It's a Property of the Compute per Request.

Quality, the way you experience it, is not only a property of the model. It's a property of how much compute the lab is willing to spend on your specific request.

Same weights. Same architecture. Same public benchmark scores. The experience you get at 3pm on a Tuesday in April 2026 can be meaningfully worse than the one you got at 3am in February, and nothing in the release notes will tell you why.

The levers are boring and invisible.

  • Shorten the prompt cache TTL to save memory, and your model "forgets where you were" because the context from five minutes ago got evicted.
  • Batch your request with fifty others when GPUs are saturated, and each request gets a thinner slice of reasoning, which you feel as a skipped step.
  • Silently route your query to a smaller model when the frontier queue is full: you still see "Claude Opus 4.7" in the UI while getting Haiku underneath.
  • Cap the chain of thought at fewer tokens on models with variable reasoning depth, and the analysis you would have gotten in February stops three paragraphs earlier.
  • Shrink the max output length, and the paragraph becomes a bullet list.
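If it helps to see the shape of it, here is a minimal sketch of what a load-shedding policy on the serving side could look like. To be clear about assumptions: every name, threshold, and model ID here is hypothetical. No lab has published its actual policy; this only illustrates the levers listed above.

```python
from dataclasses import dataclass

# Hypothetical serving-side policy. None of these names, numbers, or
# model IDs come from a real lab; they just illustrate the levers.

@dataclass
class ServingConfig:
    model: str                 # which weights actually answer the request
    cache_ttl_s: int           # how long prompt-cache entries survive
    max_reasoning_tokens: int  # cap on chain-of-thought length
    max_output_tokens: int     # cap on the visible answer
    batch_size: int            # requests sharing one forward pass

def pick_config(gpu_utilization: float) -> ServingConfig:
    """Degrade per-request compute as the fleet saturates.

    The user-facing model name never has to change; only the
    portion of compute served per request does.
    """
    if gpu_utilization < 0.70:   # comfortable headroom: full portion
        return ServingConfig("frontier-large", 3600, 16_000, 8_000, 8)
    if gpu_utilization < 0.90:   # tight: thinner slices, shorter memory
        return ServingConfig("frontier-large", 600, 4_000, 2_000, 32)
    # saturated: silently route to a smaller model behind the same label
    return ServingConfig("frontier-small", 120, 1_000, 1_000, 64)

if __name__ == "__main__":
    for load in (0.50, 0.85, 0.97):
        print(f"{load:.0%} -> {pick_config(load)}")
```

Every branch of that function is invisible to the changelog and to the benchmark.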

None of this reaches the release notes, because it isn't a model change. It's a service change. The model on the shelf is the same. The portion they serve you got smaller.

None of it breaks the public benchmarks either. MMLU and HumanEval run in controlled conditions with generous compute. The numbers stay where they were. The evals tell you the model is fine. The model is fine. The model is not what you're buying anymore. You're buying a slice of the model's time, and the slice got thinner.

Anthropic can serve you Claude Opus 4.7 and give you a Haiku experience. You'll never see it in the changelog.

The Physical Numbers. They're Worse Than You Think.

[Chart] US Data Center Capacity: Growing Gap Between Announcements and Construction. 2026: 16 GW announced vs 5 GW under construction. 2027: 21.5 GW announced vs 6.3 GW under construction.

Eighteen months. That's how long it takes to build a data center from scratch, assuming you already have the land, the power contract, and the transformers.

Now the bad news. In its 2026 Data Center Outlook, Sightline Climate reports that 30 to 50% of US data centers planned to come online in 2026 will be delayed or cancelled. Bloomberg picked up the report at the end of March 2026. The chart is brutal. Of 16 gigawatts announced for 2026, only about 5 gigawatts are actually under construction right now. The rest sits in the "announced" stage with no clear path to the grid.

Why? Not money. Hyperscalers have budgeted over $700 billion of combined capex for 2026. The bottleneck is physical. Transformers. Switchgear. Batteries. The boring electrical hardware between the utility line and the GPU rack. US manufacturing capacity cannot keep up, and the upstream components (including raw materials for batteries) still mostly come from China. Tariffs haven't fixed it. Reshoring hasn't fixed it. Grid operators are flooded with speculative load requests they can't even evaluate.

It gets worse at the corporate level. In December 2025, Oracle pushed several of its OpenAI-dedicated Stargate data centers from 2027 to 2028 per Bloomberg, citing labor and material shortages. Two weeks ago, OpenAI paused the UK Stargate site in West London after six months of announcements. The Narvik site in Norway, originally Stargate, was transferred to Microsoft. Microsoft itself cancelled a batch of European leases in March.

And the electrical grid. The part nobody thinks about until the bill arrives. US grid operators have been publicly warning since 2024 that they can't energize new data center capacity at the pace it's being announced. That warning is on file. Nobody acted on it. Here we are.

Why April. Why Now.

So why April. Why all four at once.

Because demand crossed supply this quarter. And the crossing was not gradual.

Gemini 3 shipped on November 18, 2025, into an app Google was already pegging at 650 million monthly active users the month before. You don't push a new frontier model out to an audience that size without eating compute that was allocated to other workloads. Google's internal routing had to make choices.

ChatGPT hit "Code Red" on December 1, 2025. Altman told his team to drop everything and focus on ChatGPT quality. That means reallocating compute planned for other things (agents, Pulse, ads infrastructure). OpenAI was defending market share against Gemini 3 and shifting compute inside the same fixed budget.

On the Anthropic side, enterprise adoption of Claude Code and the newer Cowork agent went exponential in Q1 2026. Anthropic told the press in March that the company was signing up more than a million new users a day. A million a day. On infrastructure that was not provisioned for a million a day.

And the compute available to all four labs has not grown proportionally since late 2025. The expansion pipeline we just walked through is what was supposed to add the capacity. That pipeline is sputtering.

Something had to give. What gave is the part nobody measures, the part nobody publishes: the quality of the average request. Not the benchmark. The request. Yours.

Labs don't announce rationing. They do it.

And 2027 Is Already Worse Than 2026

The worst part isn't 2026. It's 2027.

Sightline's same report tracks the 2027 pipeline. Of 21.5 gigawatts announced for 2027, only 6.3 are currently under construction. The ratio is worse than 2026. Not the same. Worse. The delta between announcement and reality is widening, not narrowing.
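Run the two ratios side by side and the widening shows up in one line of arithmetic (figures from the Sightline report as cited above):

```python
# Under-construction share of announced US data center capacity
for year, announced_gw, building_gw in [(2026, 16.0, 5.0), (2027, 21.5, 6.3)]:
    share = building_gw / announced_gw
    print(f"{year}: {share:.0%} of announced gigawatts are under construction")
# 2026: 31%. 2027: 29%. The share shrinks while the headline number grows.
```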

And remember the build time. Twelve to eighteen months minimum. If a project isn't under construction today, it doesn't come online in 2027. It comes online in 2028. Maybe.

Between today and that line crossing itself again, demand is not holding still. AI agents are multiplying, every Fortune 500 is running some internal copilot, video generation eats tokens by the gigabyte, persistent agent workflows keep context windows open for hours instead of seconds. The average query of 2027 will consume more compute than the average query of 2025. Much more. How much exactly, nobody knows. The direction is not contested.

Supply sputters. Demand rises. The gap is structural.

This is not a cycle correcting itself. It's a cliff that 2026 started climbing and 2027 makes steeper.

Three Things to Do Before the Price Catches Up

Before the bill starts telling the truth, three things. Not ten. Not a listicle. Three.

First, a number the consumer press has not hammered on enough. The Information reported back in March 2025 that OpenAI was planning agent tiers at $2,000 per month for a "high-income knowledge worker" agent, $10,000 for a software developer agent, and $20,000 for a "PhD-level researcher" agent. That was a year ago. ChatGPT Pro at $200 per month, now in the catalog, is not the end game. It's the appetiser. Claude will follow. Gemini will follow. Anyone with a serious reasoning product will follow. A thousand, five thousand, ten thousand dollars a month for the tier that actually works. Not a question of if. Question of when. Maybe 2027. Maybe sooner.

Given that, three things.

One. Concentrate your budget on one primary tool. The Netflix fragmentation is coming to LLMs, and you know how that ended for your streaming stack. If you are paying for four subs right now to compare, you are about to be spending more for less. Pick your camp while picking is still cheap. Use the others free or API-metered when you absolutely need a second opinion.

Two. Stop choosing your model by brand. Choose by task. The "best all-arounder" is already dead, you just don't know it yet. Claude is currently the best I've used for long-context code and writing in voice. GPT is the best at multimodal and at reasoning when you give it enough room. Gemini is strongest on live search and on handling enormous inputs. Perplexity is for sourced research (contested right now, but the intent is still good). Move between them the way a cook moves between knives. Not one knife for everything. (There's a minimal routing sketch after this list.)

Three. Re-learn the fundamentals without AI while it's still cheap. If you're a dev who has been vibe-coding for two years without being able to read a stack trace unassisted, you'll be stuck when the decent tier costs four figures a month. If you're a writer who can't draft a paragraph without the assistant, same story. (I learned this the hard way rebuilding a $200 setup for $15 when the price moved last time.) The people who will survive the price reset are the ones who still know what the tool was doing for them. Everybody else becomes a very expensive customer.
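To make "one primary sub, API-metered second opinions" concrete, here's a minimal sketch of task-based routing, assuming you hold API keys for both providers. The task-to-model table and the model IDs are my own illustrative picks, not a recommendation that will age well; swap in whatever the current lineup is.

```python
import os
from openai import OpenAI   # pip install openai
import anthropic            # pip install anthropic

# Illustrative task -> (provider, model) table. Revisit it as models move.
TASK_MODEL = {
    "code":     ("anthropic", "claude-sonnet-4-5"),
    "research": ("openai",    "gpt-4o"),
    "writing":  ("anthropic", "claude-sonnet-4-5"),
}

def ask(task: str, prompt: str) -> str:
    """Route one prompt to whichever provider handles this task best."""
    provider, model = TASK_MODEL[task]
    if provider == "anthropic":
        client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
        msg = client.messages.create(
            model=model, max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Pay per call for the second opinion instead of a second subscription:
# print(ask("research", "Summarize the Sightline 2026 Data Center Outlook."))
```

The point isn't the ten lines of plumbing. It's that metered calls keep the second opinion on your terms while the subscription prices reset.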

The window is closing. Not slammed shut. Closing. Use it.

The Golden Age Didn't End With an Announcement

The golden age of cheap electricity did not end with a speech. The golden age of free broadcast TV did not end with a speech. The golden age of unlimited mobile data did not end with a speech. Each time, the bill started telling the truth before any executive did.

We're at that moment.

The golden age does not end with an announcement. It ends when the bill starts telling the truth. The bill is whispering in April. It'll be talking by September 💸

Sources

  • Sightline Climate, 2026 Data Center Outlook, reported by Bloomberg in late March 2026
  • The Information and Wall Street Journal on Sam Altman's "Code Red" memo (December 1, 2025)
  • SF Standard and TechRadar on Anthropic's April 2026 outage series

(*) The cover is AI-generated. Which, given the subject of this article, is probably its own small irony.
