
Sathish

Next.js notFound() fixed my soft 404 mess

  • Soft 404s aren’t “SEO issues”. They’re routing bugs.
  • I replaced empty 200 pages with real 404s in Next.js.
  • I added a tiny HEAD check script to catch regressions.
  • I stopped shipping pages Google won’t index.

Context

Google Search Console kept yelling “Soft 404”. Brutal.

The pages existed. Sort of.
They returned 200 OK. They rendered a sad “no results” UI. Google treats that as junk and labels it a soft 404.

I build in Next.js. App Router.
And I’d been doing the lazy thing: render an empty state and keep the URL alive.

That works for humans. It doesn’t work for crawlers.
So I stopped treating it like “SEO”. I treated it like correctness.

1) I reproduce the soft 404 locally first

GSC is slow. Hours. Sometimes days.
I needed a tight loop.

My first check is boring: does the route return a real 404 status?
Not “UI that looks like a 404”. An actual 404.

Here’s the tiny script I run against a list of URLs. It uses HEAD so it’s fast and doesn’t download full HTML.

// scripts/check-status.mjs
// node scripts/check-status.mjs https://example.com/a https://example.com/b

const urls = process.argv.slice(2);
if (!urls.length) {
  console.error('Pass URLs: node scripts/check-status.mjs <url...>');
  process.exit(1);
}

for (const url of urls) {
  try {
    const res = await fetch(url, {
      method: 'HEAD',
      redirect: 'manual',
      // Per-request timeout; one shared AbortController would cap the whole run at 10s.
      signal: AbortSignal.timeout(10_000),
      headers: { 'user-agent': 'status-check/1.0' }
    });

    const loc = res.headers.get('location');
    const extra = loc ? ` -> ${loc}` : '';
    console.log(`${res.status} ${url}${extra}`);
  } catch (e) {
    console.log(`ERR ${url} ${e.name}`);
  }
}

Two notes.

Some CDNs don’t like HEAD and return weird stuff. If that happens, switch to GET and set cache: 'no-store'.
And don’t follow redirects silently. A 302 to / is a common “soft 404 factory”.
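If HEAD misbehaves, here's a sketch of the GET variant — same flags, plus `cache: 'no-store'` so nothing is served from a local fetch cache. The `formatResult` helper and the filename are mine, not part of the script above:

```javascript
// scripts/check-status-get.mjs (hypothetical fallback for HEAD-hostile CDNs)

// Pure helper so the output format is easy to eyeball and test.
export function formatResult(status, url, location) {
  return location ? `${status} ${url} -> ${location}` : `${status} ${url}`;
}

export async function checkWithGet(url) {
  const res = await fetch(url, {
    method: 'GET',
    redirect: 'manual',              // surface 301/302s instead of following them
    cache: 'no-store',               // bypass any local fetch cache
    signal: AbortSignal.timeout(10_000),
    headers: { 'user-agent': 'status-check/1.0' }
  });
  // GET pulls the body; cancel the stream once we have the status line.
  await res.body?.cancel();
  return formatResult(res.status, url, res.headers.get('location'));
}
```

A `302` result with a `-> /` suffix is exactly the redirect-to-home pattern to hunt down.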

2) I stop returning empty 200 pages

This was my bug.

I had dynamic routes like /state/[state]/city/[city].
When there was no data, I returned a page with an empty list and a friendly message.
Still 200.

The fix is notFound().
In App Router, that triggers Next’s 404 boundary and sends the correct status.

// app/state/[state]/city/[city]/page.tsx
import { notFound } from 'next/navigation';

type Props = {
  params: Promise<{ state: string; city: string }>;
};

async function getCityData(state: string, city: string) {
  const res = await fetch(
    `${process.env.API_BASE_URL}/api/cities/${state}/${city}`,
    { cache: 'no-store' }
  );

  if (res.status === 404) return null;
  if (!res.ok) throw new Error(`API failed: ${res.status}`);

  return res.json() as Promise<{ name: string; count: number }>;
}

export default async function CityPage({ params }: Props) {
  const { state, city } = await params;

  const data = await getCityData(state, city);
  if (!data || data.count === 0) notFound(); // real 404

  return (
    <main>
      <h1>{data.name}</h1>
      <p>{data.count} listings</p>
    </main>
  );
}

I learned this the hard way: “count is 0” often means “invalid URL” for SEO.
If you keep it as a valid page, Google tries to index it, sees thin content, then you get soft 404s or “crawled - currently not indexed”.

If there’s a legit reason to show an empty state (like a filter page), don’t use notFound().
But for location pages, category pages, and slug pages? A missing entity is a 404.
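That rule of thumb fits in one predicate. A sketch — the page "kinds" are my own labels, not anything Next.js defines:

```typescript
// Hypothetical helper: should this route return a real 404?
// 'entity' pages (slug/city/category) 404 when the thing is missing or empty;
// 'filter' pages legitimately render an empty state at 200.
type PageKind = 'entity' | 'filter';

export function shouldReturn404(kind: PageKind, count: number | null): boolean {
  if (kind === 'filter') return false;   // empty filter results are still a valid page
  return count === null || count === 0;  // missing or empty entity => 404
}
```

In the page component that becomes `if (shouldReturn404('entity', data?.count ?? null)) notFound();`.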

3) I make sure the 404 is consistent

Next.js lets you customize the not-found UI.
Do it.
But keep it simple.
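In App Router that's an `app/not-found.tsx` file, which renders whenever `notFound()` fires in its subtree. A minimal sketch — the copy is placeholder:

```tsx
// app/not-found.tsx
import Link from 'next/link';

export default function NotFound() {
  return (
    <main>
      <h1>Page not found</h1>
      <p>That listing doesn&apos;t exist (anymore).</p>
      <Link href="/">Back to home</Link>
    </main>
  );
}
```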

And don’t accidentally return 200 from an API route that powers the page.
That’s another soft 404 pattern: API says “ok” with an empty JSON, page renders nothing, status stays 200.

Here’s what I changed in one of my route handlers.
Real 404. No “success: false” with 200.

// app/api/cities/[state]/[city]/route.ts
import { NextResponse } from 'next/server';

type Params = { state: string; city: string };

export async function GET(
  _req: Request,
  { params }: { params: Promise<Params> }
) {
  const { state, city } = await params;

  // Replace this with your DB call.
  const record = await fakeDbLookup(state.toLowerCase(), city.toLowerCase());

  if (!record) {
    return NextResponse.json({ error: 'Not found' }, { status: 404 });
  }

  return NextResponse.json({
    name: record.name,
    count: record.count
  });
}

async function fakeDbLookup(_state: string, _city: string) {
  // Example only. Your real code hits Postgres/Supabase/etc.
  return null as null | { name: string; count: number };
}

Now the page can make a clean decision:
API 404 => notFound().

No more guessing.

4) I don’t let robots.txt hide my mistakes

I wasted time here.

My first instinct was “maybe Google can’t crawl it”. So I stared at robots.txt.
That was mostly wrong.

Robots rules don’t fix soft 404s. They just hide them.
Also: blocking a URL doesn’t remove it from the index if it’s already known.

Still, I did make one change that helped: I stopped disallowing paths I actually wanted indexed.
And I added a sitemap location because I kept forgetting.

# public/robots.txt
User-agent: *
Allow: /

# Block internal junk
Disallow: /api/
Disallow: /_next/
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml

If you’re using multiple sitemaps, list them all.
And if you’re returning 404 correctly, you don’t need robots.txt to “manage” bad URLs.
You just let them die.
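Next can also generate that sitemap for you from an `app/sitemap.ts` route, which it serves at `/sitemap.xml`. A sketch — `getAllCitySlugs` is a stand-in for however you enumerate valid pages, and I've skipped the `MetadataRoute.Sitemap` type annotation to keep it dependency-free:

```typescript
// app/sitemap.ts — Next serves the result at /sitemap.xml
const BASE = 'https://example.com';

// Stand-in for your real data source. Only pages that actually resolve
// (i.e. would NOT hit notFound()) belong in the sitemap.
async function getAllCitySlugs(): Promise<Array<{ state: string; city: string }>> {
  return [{ state: 'ca', city: 'san-jose' }, { state: 'tx', city: 'austin' }];
}

export default async function sitemap() {
  const cities = await getAllCitySlugs();
  return [
    { url: `${BASE}/`, lastModified: new Date() },
    ...cities.map((c) => ({
      url: `${BASE}/state/${c.state}/city/${c.city}`,
      lastModified: new Date(),
    })),
  ];
}
```

Driving the sitemap from the same data source as the pages means a deleted record falls out of both at once.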

5) I add a regression test that runs in CI

This is the part I wish I’d done first.

Once I fixed soft 404s, I broke them again. Twice.
One refactor removed notFound().
Another refactor changed the API to return 200 with { count: 0 }.

So I added a small Node test.
It hits a handful of known-bad URLs and asserts 404.
And it hits known-good URLs and asserts 200.

// scripts/assert-status.mjs
// Run in CI: node scripts/assert-status.mjs

const cases = [
  { url: 'https://example.com/state/ca/city/does-not-exist', want: 404 },
  { url: 'https://example.com/state/tx', want: 200 }
];

let failed = 0;

for (const t of cases) {
  const res = await fetch(t.url, { method: 'GET', redirect: 'manual' });
  if (res.status !== t.want) {
    failed++;
    console.error(`FAIL ${t.url} got ${res.status}, want ${t.want}`);
  } else {
    console.log(`OK   ${t.url} => ${res.status}`);
  }
}

process.exit(failed ? 1 : 0);

Yeah, it’s not a “unit test”.
It’s a cheap smoke test.
It saved me from shipping more soft 404s.

If you want it faster, run it against http://localhost:3000 in a Next build step.
But I like hitting production too. Real CDN. Real redirects.
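The localhost variant only needs the base URL swapped out. A sketch that reads it from an env var so one script covers both targets — `BASE_URL` is my own convention, not a Next.js one:

```javascript
// Parameterized cases so CI can point the same script at a local build:
//   BASE_URL=http://localhost:3000 node scripts/assert-status.mjs
const BASE = process.env.BASE_URL ?? 'https://example.com';

export function buildCases(base) {
  return [
    { url: `${base}/state/ca/city/does-not-exist`, want: 404 },
    { url: `${base}/state/tx`, want: 200 },
  ];
}

export const cases = buildCases(BASE);
// ...then run the same fetch/assert loop as above over `cases`.
```

In a GitHub Actions job that means `next build && next start &`, wait for the port, then run the script with `BASE_URL` set.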

Results

After switching empty pages to notFound(), my GSC soft 404 count dropped by 12 URLs over the next few crawls.
Not instantly. Google’s schedule is its own thing.

I also cleaned up indexing issues tied to thin pages.
“Discovered - currently not indexed” dropped by 50+ URLs after I stopped generating pages that had no content.

The bigger win was sanity.
Now when a slug doesn’t exist, the server says so. 404.
My scripts catch it. CI catches it.
I’m not guessing based on a chart in Search Console.

Key takeaways

  • If a dynamic route has no entity, return notFound(). Don’t render an empty list.
  • Make your API return 404 for missing records. No 200 + “error” JSON.
  • Verify status codes with a script. GSC is too slow for debugging.
  • Don’t use robots.txt to hide soft 404s. Fix the response.
  • Add a regression check in CI. You’ll break it again.

Closing

Soft 404s are sneaky because the UI looks fine.
The status code isn’t.

Do you test status codes in CI with a script like this, or do you rely on crawling tools + GSC to catch soft 404s later?
