<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Model Radar</title>
    <description>The latest articles on DEV Community by Model Radar (@modeldeprecation).</description>
    <link>https://dev.to/modeldeprecation</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3934417%2Ffe2eafa1-adca-4b6a-850e-43029fd6e6ed.png</url>
      <title>DEV Community: Model Radar</title>
      <link>https://dev.to/modeldeprecation</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/modeldeprecation"/>
    <language>en</language>
    <item>
      <title>LLM providers are retiring models faster than you can migrate</title>
      <dc:creator>Model Radar</dc:creator>
      <pubDate>Sat, 16 May 2026 07:54:14 +0000</pubDate>
      <link>https://dev.to/modeldeprecation/llm-providers-are-retiring-models-faster-than-you-can-migrate-4pj3</link>
      <guid>https://dev.to/modeldeprecation/llm-providers-are-retiring-models-faster-than-you-can-migrate-4pj3</guid>
      <description>&lt;p&gt;On May 15, 2026, xAI retired 8 Grok API models. The notice period was 9 days.&lt;/p&gt;

&lt;p&gt;If you had &lt;code&gt;grok-3&lt;/code&gt;, &lt;code&gt;grok-4-0709&lt;/code&gt;, or &lt;code&gt;grok-4-fast&lt;/code&gt; pinned in production, here's the part that actually bites: the retired slugs don't hard-error. They &lt;strong&gt;silently redirect to &lt;code&gt;grok-4.3&lt;/code&gt;&lt;/strong&gt; — reasoning models drop to &lt;code&gt;low&lt;/code&gt; effort, non-reasoning to &lt;code&gt;none&lt;/code&gt; — and you get billed at grok-4.3 pricing ($1.25 / $2.50 per 1M tokens). xAI's original retirement email said the requests would "no longer work"; a later docs update introduced the silent-redirect behavior. The two are contradictory, and either way your output quality and your bill changed without a single error in your logs.&lt;/p&gt;
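
&lt;p&gt;One cheap guard against this class of failure: OpenAI-style chat completion responses echo the model that actually served the request in a top-level &lt;code&gt;model&lt;/code&gt; field, so you can compare it against the slug you pinned. A minimal sketch (the field name follows the common chat-completions response shape; the slugs are just examples):&lt;/p&gt;

```python
# Guard against silent model redirects: compare the model echoed in the
# API response against the slug you actually requested.
def check_served_model(pinned, response):
    """Raise if the provider served a different model than the one pinned."""
    served = response.get("model", "")
    if served != pinned:
        raise RuntimeError(f"model redirect: requested {pinned!r}, got {served!r}")
    return served

# Example: a response body shaped like an OpenAI-style chat completion.
resp = {"id": "cmpl-1", "model": "grok-4.3", "choices": []}
try:
    check_served_model("grok-3", resp)
except RuntimeError as err:
    print(err)  # model redirect: requested 'grok-3', got 'grok-4.3'
```

&lt;p&gt;Run it on every response (or a sample of them) and a silent redirect becomes a loud one.&lt;/p&gt;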

&lt;p&gt;This is not an xAI problem. It's the whole industry right now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI&lt;/strong&gt; removed &lt;code&gt;chatgpt-4o-latest&lt;/code&gt; from the API on Feb 17, 2026. The Assistants API sunsets Aug 26, 2026.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic&lt;/strong&gt; ends Claude Opus 4 and Sonnet 4 on Jun 15, 2026 (Opus 3 already retired Jan 5; Haiku 3 on Apr 19).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google&lt;/strong&gt; can shut off Gemini 2.0 Flash / Flash-Lite as early as Jun 1, 2026; they've been restricted to existing customers since Mar 6.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Pinning model IDs is correct — and it's now a liability&lt;/h2&gt;

&lt;p&gt;The standard advice is to pin explicit model versions for reproducibility, and that advice is right. A floating alias means your behavior changes silently under you. But a pinned ID means that when the provider retires it, you break — or worse, get silently rerouted. Either way, the failure mode is the same: &lt;strong&gt;something you depend on changed, and nobody told you in a channel you actually watch.&lt;/strong&gt; Provider changelogs are scattered across docs pages, status pages, dashboard banners, and one-off emails to whatever address owns the billing account.&lt;/p&gt;
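
&lt;p&gt;At minimum, make the pin a single point of change. A sketch of routing every model ID through env-backed config (the env var names and task keys here are my own illustrative choices, not from any provider SDK):&lt;/p&gt;

```python
import os

# Single source of truth for model pins: env vars with explicit defaults.
# Replacing a retired model becomes one config change, not a codebase grep.
MODEL_PINS = {
    "chat": os.environ.get("CHAT_MODEL", "grok-4.3"),
    "code": os.environ.get("CODE_MODEL", "grok-4.3"),
}

def model_for(task):
    """Resolve the pinned model for a task, failing loudly on unknown tasks."""
    try:
        return MODEL_PINS[task]
    except KeyError:
        raise ValueError(f"no model pinned for task {task!r}")

print(model_for("chat"))
```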

&lt;h2&gt;What I did about it&lt;/h2&gt;

&lt;p&gt;I started keeping a normalized, cross-provider timeline of every deprecation / breaking change / pricing change I could verify against provider docs: provider, model, event type, announced date, effective date, recommended replacement, source link.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Timeline: &lt;a href="https://ai-model-change-radar-west0ngs-projects.vercel.app/" rel="noopener noreferrer"&gt;https://ai-model-change-radar-west0ngs-projects.vercel.app/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;RSS: &lt;a href="https://ai-model-change-radar-west0ngs-projects.vercel.app/rss.xml" rel="noopener noreferrer"&gt;https://ai-model-change-radar-west0ngs-projects.vercel.app/rss.xml&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
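
&lt;p&gt;For a sense of the shape, one normalized event record might look like the following (field names follow the list above; this is my paraphrase of the schema, not the site's published spec, and the source value is a placeholder):&lt;/p&gt;

```python
from datetime import date

# One normalized lifecycle event, using the fields described above.
event = {
    "provider": "xAI",
    "model": "grok-3",
    "event_type": "retirement",
    "announced": "2026-05-06",
    "effective": "2026-05-15",
    "replacement": "grok-4.3",
    "source": "PLACEHOLDER-PROVIDER-DOCS-URL",
}

# The notice window is just the gap between the two dates.
notice = date.fromisoformat(event["effective"]) - date.fromisoformat(event["announced"])
print(notice.days)  # 9
```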

&lt;p&gt;It currently covers OpenAI, Anthropic, Gemini, and xAI. It's manually curated and fact-checked right now — the data is the hard part, not the page. If you spot a missing or wrong event, or a provider you want covered, tell me and I'll add it.&lt;/p&gt;

&lt;p&gt;If you run anything against an LLM API in production, subscribe to the RSS feed or check it before your next deploy. The next 9-day notice is already on the calendar.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Your LLM provider will deprecate your model. xAI just gave 9 days' notice.</title>
      <dc:creator>Model Radar</dc:creator>
      <pubDate>Sat, 16 May 2026 07:40:02 +0000</pubDate>
      <link>https://dev.to/modeldeprecation/your-llm-provider-will-deprecate-your-model-xai-just-gave-9-days-notice-1mnn</link>
      <guid>https://dev.to/modeldeprecation/your-llm-provider-will-deprecate-your-model-xai-just-gave-9-days-notice-1mnn</guid>
      <description>&lt;p&gt;If your app has a hard-coded model ID, you are one provider email away from a production incident. Here is what just happened, and why it keeps happening.&lt;/p&gt;

&lt;h2&gt;xAI: 8 models, ~9 days&lt;/h2&gt;

&lt;p&gt;Around May 6, 2026, xAI emailed API customers announcing Grok 4.3 and the retirement of &lt;strong&gt;8 models effective May 15, 2026, 12:00 PM PT&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;grok-4-1-fast-reasoning&lt;/code&gt;, &lt;code&gt;grok-4-1-fast-non-reasoning&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;grok-4-fast-reasoning&lt;/code&gt;, &lt;code&gt;grok-4-fast-non-reasoning&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;grok-4-0709&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;grok-code-fast-1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;grok-3&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;grok-imagine-image-pro&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is roughly nine days of notice to find every hard-coded slug in your stack, test a replacement, and ship.&lt;/p&gt;

&lt;h2&gt;The part that actually bites: silent redirect + rebill&lt;/h2&gt;

&lt;p&gt;The announcement email said requests to the retired models "will no longer work." xAI's migration docs say something different: the old slugs &lt;strong&gt;silently redirect to &lt;code&gt;grok-4.3&lt;/code&gt;&lt;/strong&gt; — reasoning models at &lt;code&gt;low&lt;/code&gt; effort, non-reasoning at &lt;code&gt;none&lt;/code&gt; effort, &lt;code&gt;grok-imagine-image-pro&lt;/code&gt; to &lt;code&gt;grok-imagine-image-quality&lt;/code&gt; — billed at &lt;strong&gt;grok-4.3 pricing ($1.25 / $2.50 per 1M tokens)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Read that again. Depending on which source is current, your app either (a) starts throwing errors, or (b) keeps "working" while silently changing the underlying model, its reasoning behavior, and your per-token price — with no exception, no log line, nothing. Both outcomes are bad. The only safe move is to migrate off the slugs entirely and pin explicit model + &lt;code&gt;reasoning_effort&lt;/code&gt;.&lt;/p&gt;
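
&lt;p&gt;Concretely, "pin explicit model + &lt;code&gt;reasoning_effort&lt;/code&gt;" means the request body carries both, so a redirected slug can no longer silently change reasoning behavior. A hedged sketch of an OpenAI-style chat payload (check xAI's docs for the exact parameter name):&lt;/p&gt;

```python
import json

# Pin both the model and the reasoning effort explicitly in every request,
# instead of inheriting whatever a redirected slug defaults to.
payload = {
    "model": "grok-4.3",         # explicit model, never a retired slug
    "reasoning_effort": "high",  # explicit effort, never a redirect default
    "messages": [{"role": "user", "content": "Summarize this diff."}],
}

body = json.dumps(payload)
print(body)
```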

&lt;h2&gt;This is not an xAI problem. It is the new normal.&lt;/h2&gt;

&lt;p&gt;Same window, other providers, all with firm dates:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Effective&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;chatgpt-4o-latest&lt;/code&gt; snapshot removed from API&lt;/td&gt;
&lt;td&gt;2026-02-17&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Assistants / Threads / Runs API sunset&lt;/td&gt;
&lt;td&gt;2026-08-26&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Claude Opus 3 retired&lt;/td&gt;
&lt;td&gt;2026-01-05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Claude Haiku 3 retired&lt;/td&gt;
&lt;td&gt;2026-04-19&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Claude Opus 4 + Sonnet 4 retired&lt;/td&gt;
&lt;td&gt;2026-06-15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Gemini 2.0 Flash / Flash-Lite earliest shutdown&lt;/td&gt;
&lt;td&gt;2026-06-01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;xAI&lt;/td&gt;
&lt;td&gt;8 Grok models retired&lt;/td&gt;
&lt;td&gt;2026-05-15&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Model names are not stable infrastructure anymore. They are dated, expiring identifiers, and the notice window is shrinking (OpenAI's Feb retirements were ~2 weeks; xAI's was ~9 days).&lt;/p&gt;

&lt;h2&gt;What to actually do&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Grep your codebase for every hard-coded model string today.&lt;/li&gt;
&lt;li&gt;Route model IDs through config/env, never inline in business logic.&lt;/li&gt;
&lt;li&gt;Canary the recommended replacement before the cutoff — replacements differ on latency, reasoning depth, tool use, and price.&lt;/li&gt;
&lt;li&gt;Subscribe to &lt;em&gt;something&lt;/em&gt; that watches provider lifecycle events so the deadline finds you before prod does.&lt;/li&gt;
&lt;/ol&gt;
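
&lt;p&gt;Step 1 is a few lines to automate. A rough sketch that flags likely hard-coded model slugs in source text (the regex is a heuristic covering the providers above; tune it to your stack):&lt;/p&gt;

```python
import re

# Heuristic pattern for provider model slugs: grok-*, gpt-*, claude-*, gemini-*.
SLUG = re.compile(r"\b(grok|gpt|claude|gemini)-[\w.\-]+")

def find_model_slugs(text):
    """Return the distinct model-like strings found in a blob of source code."""
    return sorted(set(m.group(0) for m in SLUG.finditer(text)))

src = 'client.chat(model="grok-3")\nfallback = "claude-sonnet-4"'
print(find_model_slugs(src))  # ['claude-sonnet-4', 'grok-3']
```

&lt;p&gt;Point it at the files your repo's grep would cover and you have the inventory step 2 needs.&lt;/p&gt;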

&lt;p&gt;For #4 I got tired of finding out from a 500, so I keep a single normalized cross-provider timeline — provider, model, event type, announced date, effective date, recommended replacement, source link — with an RSS feed, free and no signup:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai-model-change-radar-west0ngs-projects.vercel.app/" rel="noopener noreferrer"&gt;https://ai-model-change-radar-west0ngs-projects.vercel.app/&lt;/a&gt;&lt;/strong&gt; — RSS: &lt;strong&gt;&lt;a href="https://ai-model-change-radar-west0ngs-projects.vercel.app/rss.xml" rel="noopener noreferrer"&gt;https://ai-model-change-radar-west0ngs-projects.vercel.app/rss.xml&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It deliberately does &lt;em&gt;not&lt;/em&gt; track your bill or benchmark model quality. It answers exactly one question: "did something change that will break or reprice the models I depend on?" If it is missing a provider or event you care about, that feedback is the most useful thing you can send.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>api</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
