Ted

Posted on Jun 14 • Originally published at tedagentic.com

My Redirects Worked in the Browser. Googlebot Saw Soft 404s.

#seo #vercel #react #webdev

Google Search Console flagged six URLs on a client site as Soft 404. Every one of them returned a clean 200 OK when I curled it. So how does a page that loads fine get reported as "not found"?

That contradiction is the whole story, and the answer turned out to be a category of bug I'd been shipping without realizing it.

What "Soft 404" actually means

A hard 404 is honest: the server returns a 404 status, Google drops the URL, everyone moves on. A soft 404 is when the server returns 200 OK but the content looks like an error or an empty page to Google. The status line says "here's your page," the body says "there's nothing here." Google trusts the body.

On a single-page app, there's a very common way to produce exactly that.

The setup

The site is a React/Vite SPA, deployed on Vercel, with a prerender layer that injects real HTML for SEO so the crawler doesn't have to run JavaScript. Routes that are prerendered get proper content. Routes that aren't fall through to Vercel's catch-all, which serves the app shell — effectively the homepage HTML — at whatever URL was requested.

Hold onto that last sentence. It's the trap.
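One rough way to spot the fallback from the outside is to compare the raw HTML served at a deep URL with the raw HTML of the homepage. The sketch below is a heuristic under stated assumptions: the function names are made up for illustration, and matching on <title> is my shortcut, not how Google decides anything.

```typescript
// Hypothetical helper: pull the <title> text out of a raw HTML string.
function extractTitle(html: string): string | null {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match?.[1]?.trim() ?? null;
}

// Heuristic (an assumption, not Google's logic): if a deep URL serves HTML
// with the same <title> as the homepage, it is probably the app shell
// handed out by the catch-all rather than real page content.
function looksLikeShellFallback(pageHtml: string, homeHtml: string): boolean {
  const pageTitle = extractTitle(pageHtml);
  const homeTitle = extractTitle(homeHtml);
  return pageTitle !== null && pageTitle === homeTitle;
}
```

Feed it the unrendered response bodies (what curl prints), not the DOM from a browser, since the point is to see what arrives before any JavaScript runs.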

The fingerprint

All six soft-404s shared a pattern: they were old, renamed blog slugs. A few were year-suffixed posts I'd renamed (think /blog/some-guide-2025 → /blog/some-guide-2026). Two were older duplicates of a post that now lives at a cleaner URL. One was the same path on the www host instead of the apex.

Old URLs that should redirect somewhere. And in the codebase, three of them did have redirects. That's what made this confusing — I had redirects, and Google was still calling the pages broken.

The trap: the redirects were client-side

The three "redirected" routes used the SPA router's redirect component: React Router's <Navigate> element. In a browser, it works perfectly: the app boots, the router matches the old path, and the user is bounced to the new URL before they notice.

But look at the order of operations from Googlebot's point of view:

1. Request /blog/some-guide-2025.
2. The route isn't prerendered, so Vercel serves the app shell (homepage-ish HTML, status 200).
3. Then the JavaScript would run and redirect, but the crawler has already been handed a 200 with the wrong content.

The redirect lives inside the JavaScript. The crawler's verdict is formed before the JavaScript runs. So Google sees a 200 response whose body is the homepage, served at a URL that's supposed to be a specific blog post — a page that resolves to nothing meaningful. Soft 404.

The other two had no redirect at all and went straight to the shell fallback — same outcome by a more direct route.

A client-side redirect is invisible to a crawler that judges the first response. If you want Google to treat a URL as moved, the move has to happen in the response itself, before any JavaScript: a real server-side 3xx.
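To check what the first response actually says, it helps to fetch the URL without following redirects and look only at the status and Location header. This is a minimal sketch: the verdict labels and function names are invented for illustration, and it assumes a runtime whose fetch exposes the real 3xx response when redirect is "manual" (Node 18+ does; browsers return an opaque response instead).

```typescript
type FirstResponseVerdict = "server-redirect" | "needs-body-check" | "other";

// Classify only what the first HTTP response says, before any JavaScript runs.
function classifyFirstResponse(
  status: number,
  location: string | null,
): FirstResponseVerdict {
  // A 3xx with a Location header is a redirect the crawler can see.
  if (status >= 300 && status < 400 && location !== null) {
    return "server-redirect";
  }
  // A 2xx proves nothing on its own: the body may still be the app shell.
  if (status >= 200 && status < 300) {
    return "needs-body-check";
  }
  return "other";
}

// Assumption: fetch returns the 3xx itself under redirect: "manual".
async function inspectFirstResponse(url: string): Promise<FirstResponseVerdict> {
  const res = await fetch(url, { redirect: "manual" });
  return classifyFirstResponse(res.status, res.headers.get("location"));
}
```

A route that only redirects through <Navigate> comes back as "needs-body-check" here, because from the outside it is an ordinary 200.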

The fix

The fix is to do the redirect at the edge, not in the app. On Vercel that's a redirects entry in vercel.json:

{
  "redirects": [
    {
      "source": "/blog/old-slug",
      "destination": "/blog/new-slug",
      "permanent": true
    }
  ]
}

"permanent": true emits a 308 (Vercel's permanent redirect), which Google treats the same as a 301. Now the first response Googlebot gets is "this moved, permanently, here" — no shell, no JavaScript, no ambiguity. The www variant was already handled by an apex-redirect rule, so once the apex path resolved correctly, the www one chained into it.

I added server-side redirects for the two that had none, pointed both old duplicates at the canonical post, and the client-side <Navigate> routes became harmless fallbacks behind the real ones.
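With more than a handful of renamed slugs, I'd rather generate the entries than hand-write them. Here is a sketch of that, assuming every post lives under /blog/ as in the examples above. The slug names in the usage note and the helper names are hypothetical; only the source/destination/permanent shape comes from vercel.json.

```typescript
interface VercelRedirect {
  source: string;
  destination: string;
  permanent: boolean;
}

// Follow old -> new hops so each source points straight at its final slug.
// The seen set stops the loop if the map accidentally contains a cycle.
function resolveFinalSlug(slug: string, renames: Map<string, string>): string {
  const seen = new Set<string>();
  let current = slug;
  let next = renames.get(current);
  while (next !== undefined && !seen.has(current)) {
    seen.add(current);
    current = next;
    next = renames.get(current);
  }
  return current;
}

// Build one redirect entry per retired slug, in the map's insertion order.
function buildRedirects(renames: Map<string, string>): VercelRedirect[] {
  return Array.from(renames.keys()).map((oldSlug) => ({
    source: `/blog/${oldSlug}`,
    destination: `/blog/${resolveFinalSlug(oldSlug, renames)}`,
    permanent: true, // Vercel answers with a 308
  }));
}
```

Calling buildRedirects(new Map([["some-guide-2025", "some-guide-2026"]])) and writing the result under the "redirects" key with JSON.stringify gives the same shape as the entry above. Flattening the hops means a post renamed twice still redirects in one step instead of chaining.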

The twist: most of them were already fixed

Here's the part that saved me a pile of unnecessary work — and that I almost skipped.

Before "fixing" the three that already had server-side redirects, I checked when Google last crawled them. The URL Inspection API returns lastCrawlTime. Every one of those dates was weeks before the server-side redirects had shipped. The pages weren't broken anymore. Google's report was a snapshot from the last time it looked, and it simply hadn't looked again.

GSC statuses are not live. They're the result of the most recent crawl, which can be a month stale on a low-traffic site. Before you re-fix something the report calls broken, check lastCrawlTime — you may be debugging a problem that no longer exists.
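The staleness check is easy to script. The sketch below compares lastCrawlTime against the time a fix shipped, and includes a bare-bones call to the URL Inspection API to fetch that timestamp. Treat it as an outline under assumptions: accessToken is taken to be a valid OAuth 2.0 token with Search Console access (getting one is not shown), and the function names and any timestamps you pass in are your own.

```typescript
// True when Google's last crawl predates the deploy that shipped the fix,
// which means the report still describes the old behavior.
// Unparseable timestamps yield NaN, so the comparison returns false.
function crawledBeforeFix(lastCrawlTime: string, fixDeployedAt: string): boolean {
  return Date.parse(lastCrawlTime) < Date.parse(fixDeployedAt);
}

// Sketch of a URL Inspection API request. siteUrl is the Search Console
// property and inspectionUrl is the page being checked.
async function fetchLastCrawlTime(
  siteUrl: string,
  inspectionUrl: string,
  accessToken: string,
): Promise<string | undefined> {
  const res = await fetch(
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inspectionUrl, siteUrl }),
    },
  );
  if (!res.ok) {
    throw new Error(`URL Inspection request failed: ${res.status}`);
  }
  const data = (await res.json()) as {
    inspectionResult?: { indexStatusResult?: { lastCrawlTime?: string } };
  };
  return data.inspectionResult?.indexStatusResult?.lastCrawlTime;
}
```

If crawledBeforeFix returns true for a flagged URL, the report is describing a response the site no longer serves, and the next step is a recrawl rather than another code change.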

Submission is not a fix

The instinct, once the redirects were in, was to "submit the pages to Google." But you don't submit a redirect source: it's not content, it's a signpost. The Indexing API is for telling Google a real page changed, and Google officially supports it only for job-posting and livestream pages. For redirected URLs, the correct lever is GSC → "Validate Fix" on the soft-404 report, which re-queues the crawl. The only thing worth submitting is the redirect targets, the live pages, so Google freshens those.

Resubmitting a URL Google already crawled doesn't change Google's mind. It just asks the same question again.

The other bucket: "crawled, not indexed"

Separately, a larger batch of pages on the same site had been sitting in "Crawled – currently not indexed" — and around two dozen of them cleared into the index over the same period. It's tempting to lump that together with the soft-404 fix, but it's a different problem with a different cause.

Soft 404 is a technical verdict: wrong response, fix the response. "Crawled, not indexed" is a quality verdict: Google fetched a real page and chose not to index it. You don't move that with redirects or resubmission — you move it with better content (the prerender layer putting real HTML in front of the crawler) and internal links that give the page a reason to matter. Those recoveries were the delayed payoff of content and linking work from weeks earlier, not anything I did that day.

Keeping the two buckets separate matters, because the fixes don't transfer. A 308 will never rescue a thin page, and a content rewrite will never fix a client-side redirect.

What I took from it

A 200 status doesn't mean Google sees a real page. Soft 404 is the body contradicting the status line.
On an SPA, any route that isn't prerendered falls through to the app shell — a 200 full of the wrong content. That's a soft-404 factory.
Client-side redirects don't exist as far as a crawler is concerned. Anything Google should treat as moved needs a server-side 3xx, before the JavaScript.
GSC reports are stale snapshots. Check lastCrawlTime before re-fixing.
Submission re-asks the question; it doesn't change the answer. Use "Validate Fix" for redirects; submit only live targets.

The redirects had been working the entire time — in the one place that couldn't see them.
