Federico Sciuca

The Next.js SEO Bug That Made Google Ignore My Entire Site (And How I Found It)

I shipped a full-featured AI travel planner. Three languages. 230+ pages. Then I realised that Google couldn't find a single one.

This is the story of how I went from zero indexed pages to 176 in three weeks and the one Next.js configuration line that changed everything.

Some Context

I'm not a developer. I like building things and trying new tools. SEO was always that thing I'd "figure out later". Famous last words.
I know SEO only at a high level. Working in marketing performance, I understood the importance of a well-indexed website and of keywords, but I thought that was mostly it: research queries and build good content around them.

This time, I actually had to figure it out.

As I was saying, I build things to learn my way through new tools and technologies.
The app is called MonkeyTravel. It uses AI to generate personalised travel itineraries — day-by-day plans with activities, restaurants, hotels, and budget breakdowns. It works in English, Spanish, and Italian. I built it because planning group trips with friends was always chaos, and I wanted something smarter. As usually happens, I built it for myself, but this time I didn't want to send it to the massive Projects Graveyard.

The app itself worked great. People who found it loved it.

The problem? Nobody could find it.
I had to figure it out, and this was just the beginning.

Phase 1: "Why Isn't Google Showing My Site?"

I'll be honest: when I first checked Google Search Console, I expected to see... something. I'd been live for weeks. Instead: a flat line. Zero impressions. Zero clicks. Zero indexed pages.

My first instinct was to blame Google. "It takes time," I told myself. So I waited another week. Still zero.

That's when I actually looked at my setup:

  • ❌ Outdated sitemap
  • ❌ No canonical tags
  • ❌ No hreflang tags (despite 3 languages)
  • ❌ Little structured data
  • ❌ Default robots.txt from create-next-app
  • ❌ No meta descriptions on half the pages

Basically, I'd built a beautiful house and forgotten to put a number on the door. I did the same with my mailbox recently, and I was surprised the postman didn't deliver my fresh new American driving license! But let's focus on the SEO instead of my poor decisions :D.
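Of those items, the robots file is the quickest to fix: Next.js can generate robots.txt from an app/robots.ts route. Here's a sketch of what a fix could look like — the disallowed paths are illustrative, not MonkeyTravel's real routes, and in a real app you'd type the return value as MetadataRoute.Robots from "next" (a local type stands in here so the snippet is self-contained):

```typescript
// app/robots.ts (sketch). Next.js serves this as /robots.txt.
type Robots = {
  rules: { userAgent: string; allow?: string; disallow?: string[] }[];
  sitemap: string;
};

export default function robots(): Robots {
  return {
    rules: [
      {
        userAgent: "*",
        allow: "/",
        // Keep crawlers out of API routes and logged-in app pages
        // (hypothetical paths for illustration)
        disallow: ["/api/", "/account/"],
      },
    ],
    sitemap: "https://monkeytravel.app/sitemap.xml",
  };
}
```

Pointing crawlers at the sitemap from robots.txt gives Google a second discovery path besides the Search Console submission.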

Phase 2: The Foundations (Boring but Necessary)

I spent a weekend adding the basics. Nothing revolutionary, just what every site needs:

Sitemap: Next.js makes this easy with app/sitemap.ts. Mine generates URLs for all static pages, blog posts, and destination pages across all 3 locales. Dynamic content from Supabase gets included too.

// Simplified version of my sitemap (app/sitemap.ts)
import type { MetadataRoute } from "next";

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = "https://monkeytravel.app";
  const locales = ["en", "es", "it"];

  // Blog posts × 3 languages (getAllSlugs reads my local blog content)
  const blogSlugs = getAllSlugs();
  const blogPages = blogSlugs.flatMap(slug =>
    locales.map(locale => ({
      url: locale === "en"
        ? `${baseUrl}/blog/${slug}`
        : `${baseUrl}/${locale}/blog/${slug}`,
      changeFrequency: "monthly" as const,
      priority: 0.7,
    }))
  );

  // staticPages and destinationPages are built the same way
  return [...staticPages, ...blogPages, ...destinationPages];
}

Canonical + Hreflang: This is where multilingual sites get tricky. Every page needs to say "I'm the official URL" AND "here are my other language versions." I used generateMetadata() in each page:

alternates: {
  canonical: locale === "en"
    ? `${BASE_URL}/${slug}`
    : `${BASE_URL}/${locale}/${slug}`,
  languages: {
    en: `${BASE_URL}/${slug}`,
    es: `${BASE_URL}/es/${slug}`,
    it: `${BASE_URL}/it/${slug}`,
    "x-default": `${BASE_URL}/${slug}`,
  },
},

Structured Data: JSON-LD schemas for Organization, WebSite, SoftwareApplication, Article, and TouristDestination. I built a small utility for this:

export function jsonLdScriptProps(data: object) {
  return {
    type: "application/ld+json",
    dangerouslySetInnerHTML: {
      __html: JSON.stringify(data),
    },
  };
}
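Spreading those props into a script tag is all it takes; the Organization payload below is just an illustration, not my exact schema:

```typescript
// Same utility as above, repeated so the snippet is self-contained.
function jsonLdScriptProps(data: object) {
  return {
    type: "application/ld+json",
    dangerouslySetInnerHTML: { __html: JSON.stringify(data) },
  };
}

// Illustrative Organization schema; in a page component you'd render
// <script {...jsonLdScriptProps(orgSchema)} />
const orgSchema = {
  "@context": "https://schema.org",
  "@type": "Organization",
  name: "MonkeyTravel",
  url: "https://monkeytravel.app",
};

const props = jsonLdScriptProps(orgSchema);
console.log(props.type); // "application/ld+json"
```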

Result: After submitting the sitemap, Google discovered all my URLs within 48 hours. But "discovered" ≠ "indexed." Most pages sat in the "Discovered — currently not indexed" queue.

After a week: 12 pages indexed. Progress, but painfully slow.

The real issue? Google Search Console gives you very little feedback about why a page is being rejected!

Phase 3: Content That Google Actually Wants

I realised my site was mostly an app behind a login wall. Google had very little public content to index. So I went aggressive on content:

50 blog posts covering real travel topics: itinerary guides, destination comparisons, budget travel tips, seasonal recommendations. Each one in 3 languages = 150 blog pages. A tedious process, but facilitated A LOT by the engine the app itself is built on!

20 destination landing pages (Paris, Tokyo, Bali, Barcelona, etc.) with climate data, AI itinerary previews, and cross-links to blog posts. × 3 languages = 60 pages.

5 SEO landing pages targeting specific search intents: /free-ai-trip-planner, /group-trip-planner, /budget-trip-planner, etc.

But here's the thing that surprised me: internal linking mattered more than the content itself. Pages that were cross-linked from multiple other pages got indexed WAY faster than orphan pages. I added:

  • "From the Blog" sections on every landing page
  • "Related destinations" on every destination page
  • Blog → destination links and destination → blog links
  • A region filter on the blog index (Europe, Asia, Americas, Africa)

After two weeks: 78 pages indexed. The curve was accelerating.

Phase 4: The Bug That Almost Ruined Everything

Then Google Search Console showed a new error on my homepage:

"Duplicate without user-selected canonical"

Google was rejecting my homepage. It was choosing www.monkeytravel.app as canonical instead of monkeytravel.app. Despite having:

  • 301 redirects from www → non-www (in both middleware AND Vercel config)
  • Correct canonical tags in the HTML
  • All URLs in the sitemap using non-www

I checked everything twice. The redirects worked. The HTML had the right tags. I verified with curl:

$ curl -s https://monkeytravel.app/ | grep canonical
<link rel="canonical" href="https://monkeytravel.app"/>

The tag was right there. So why was Google saying "User-declared canonical: None"?

The Discovery

I stared at this for hours before it clicked. The key was in how I verified it.

curl waits for the complete response. Googlebot doesn't.

In Next.js 15.2+, generateMetadata() streams metadata asynchronously. The <head> tags aren't in the initial HTML payload; they're injected via the stream after the body starts rendering. When Googlebot parses the initial response, the canonical tag literally doesn't exist yet. Or at least, that's what I think I pieced together jumping between AI, the documentation, etc.

I confirmed by looking at the raw initial HTML before streaming completes: no <link rel="canonical"> anywhere.

The Fix: One Config Option

// next.config.ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  htmlLimitedBots: /Googlebot|Google-InspectionTool|Bingbot|Yandex/i,
  trailingSlash: false,
};

export default nextConfig;

htmlLimitedBots tells Next.js: "When a crawler visits, disable streaming. Send the full HTML with all metadata synchronously."

That's it. One regex. Fixed the entire problem.
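To see which crawlers the regex catches, you can test it against user-agent strings directly — the UA strings below are abbreviated examples, not exact bot signatures:

```typescript
// The same regex from next.config.ts; Next.js matches the incoming
// User-Agent against it to decide whether to disable streaming.
const htmlLimitedBots = /Googlebot|Google-InspectionTool|Bingbot|Yandex/i;

const googlebotUA =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
const chromeUA =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0";

console.log(htmlLimitedBots.test(googlebotUA)); // true: streaming disabled, full <head> up front
console.log(htmlLimitedBots.test(chromeUA));    // false: regular users still get streamed HTML
```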

I also changed my root layout canonical from "/" to "./" so every page gets a self-referencing canonical instead of all pages pointing to the homepage (a subtle but important distinction).
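In the root layout, that change looks roughly like this (a sketch; it assumes metadataBase is set so the relative "./" canonical resolves against each page's own path):

```typescript
// app/layout.tsx metadata (sketch)
export const metadata = {
  metadataBase: new URL("https://monkeytravel.app"),
  alternates: {
    // "./" resolves relative to the current route, so every page gets a
    // self-referencing canonical; "/" would point them all at the homepage.
    canonical: "./",
  },
};
```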

Deployed. Requested re-indexing. Within days: 176 pages indexed.
Still not all 230+ pages, but we're getting there!

The Numbers

| Metric | Week 0 | Week 1 | Week 2 | Week 3 |
| --- | --- | --- | --- | --- |
| Pages indexed | 0 | 12 | 78 | 176 |
| Total pages in sitemap | 0 | ~50 | ~225 | ~230 |
| Blog posts (per language) | 0 | 1 | 15 | 50 |
| Structured data schemas | 0 | 3 | 5 | 5 |

What I Got Wrong (So You Don't Have To)

1. I didn't set up htmlLimitedBots from day one. This should be in every Next.js project that cares about SEO. The metadata streaming issue is completely silent: everything looks fine when you check manually, and only crawlers are affected. I thought it was a "content volume" issue... not really!

2. I treated SEO as a "later" problem. Every week I delayed the sitemap and canonical tags was a week of potential crawling wasted. Google's queue doesn't move faster just because you're impatient.

3. I underestimated internal linking. Cross-linked pages got indexed 3-4x faster than isolated pages. If you have related content, link it. Google follows links.

4. I built multilingual support but forgot hreflang. Having 3 language versions without hreflang means Google might treat them as duplicate content instead of translations. Costly mistake.

AI Tricks That Helped

A few things that saved me time during this sprint:

  • AI-assisted blog content: I used AI to draft blog post structures, then edited and localized them. For 50 posts × 3 languages, doing everything manually would have taken months, plus learning an extra language or finding more international collaborators.

  • Automated cross-linking: I wrote a script that analyzed blog post topics and destination pages, then generated internal link suggestions. Much better than trying to mentally map 200+ pages.

  • Prompt engineering for i18n: Instead of translating English content, I had the AI generate locale-native content. "Write about Paris for an Italian audience" produces much better content than "translate this Paris article to Italian."
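The cross-linking script mentioned above can be sketched like this — the Page shape and tag-overlap heuristic are my illustration here, not the app's real data model:

```typescript
// Suggest internal links between pages that share at least `minShared` tags.
type Page = { url: string; tags: string[] };

export function suggestLinks(pages: Page[], minShared = 1): [string, string][] {
  const suggestions: [string, string][] = [];
  for (const a of pages) {
    for (const b of pages) {
      if (a.url === b.url) continue;
      const shared = a.tags.filter((t) => b.tags.includes(t));
      if (shared.length >= minShared) suggestions.push([a.url, b.url]);
    }
  }
  return suggestions;
}

// Example: a Paris blog post and the Paris destination page link both ways.
const examplePages: Page[] = [
  { url: "/blog/3-days-in-paris", tags: ["paris", "itinerary"] },
  { url: "/destinations/paris", tags: ["paris"] },
  { url: "/destinations/tokyo", tags: ["tokyo"] },
];
console.log(suggestLinks(examplePages));
// → [["/blog/3-days-in-paris", "/destinations/paris"],
//    ["/destinations/paris", "/blog/3-days-in-paris"]]
```

A pass like this over 200+ pages surfaces the orphan pages immediately, which is exactly what made the indexing curve accelerate.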

What I'd Tell Past Me

Start with these on day one, before you write a single feature:

  1. htmlLimitedBots in next.config.ts
  2. Sitemap generation
  3. Canonical tags on every page
  4. Submit to Google Search Console

Everything else (blog posts, structured data, internal linking) matters, but these four things are the foundation. Skip them and nothing else works.

129 pages are still in Google's queue. Based on the trajectory, they'll be indexed within a couple of weeks (hopefully). Then the real game starts: actually ranking for competitive keywords.
That will be FUN


MonkeyTravel is free to use — drop a destination, get a personalised AI itinerary in seconds. Built with Next.js, Supabase, and hosted on Vercel. Any feedback is more than welcome! Let's learn something new together.


Top comments (3)

Made Büro

Great writeup! The htmlLimitedBots discovery is really valuable: that streaming metadata issue is completely silent, and I bet a lot of Next.js projects are affected without knowing it.

One thing worth considering as a next step: you've nailed the Google side, but AI search engines (ChatGPT, Perplexity, Claude, Gemini, Grok) are becoming another discovery channel and they use different crawlers and signals than Google.

Your structured data setup is already a strong foundation for it. Things like llms.txt (a structured index of your site for AI crawlers) and managing AI-specific bots (GPTBot, ClaudeBot, PerplexityBot, etc.) can help your travel content surface when people ask AI for trip recommendations.

I've been building open source tools for this including a Next.js middleware that handles it automatically.

Either way bookmarking this post, the htmlLimitedBots tip alone is worth sharing

Federico Sciuca

Thank you! Really valuable addition, AI crawler management is one of those gaps I've been aware of but haven't tackled systematically yet. The llms.txt spec in particular is on my list. Curious about your middleware, is it published somewhere? I've also been exploring the OpenAI Apps SDK as a complementary discovery channel, so there's clearly a whole layer of AI-native SEO that deserves its own deep dive. Might be the next article!

Federico Sciuca

I almost forgot: to make sure I don't get financially killed by my own curiosity, I added a beta-testers-only wall that limits usage for standard users. A few beta access spots become available from time to time.