Federico Sciuca

The Next.js SEO Bug That Made Google Ignore My Entire Site (And How I Found It)

I shipped a full-featured AI travel planner. Three languages. 230+ pages. Then I realised that Google couldn't find a single one.

This is the story of how I went from zero indexed pages to 176 in three weeks and the one Next.js configuration line that changed everything.

Some Context

I'm not a developer. I like building things and trying new tools. SEO was always that thing I'd "figure out later". Famous last words.
I know SEO only at a high level. Working in marketing performance, I understood the importance of a well-indexed website and of keywords, but I thought that was mostly it: research queries and build good content around them.

This time, I actually had to figure it out.

As I was saying, I build things to learn my way through new tools and technologies.
The app is called MonkeyTravel. It uses AI to generate personalised travel itineraries — day-by-day plans with activities, restaurants, hotels, and budget breakdowns. It works in English, Spanish, and Italian. I built it because planning group trips with friends was always chaos, and I wanted something smarter. As usually happens, I built it for myself, but this time I didn't want to send it to the massive Projects Graveyard.

The app itself worked great. People who found it loved it.

The problem? Nobody could find it.
I had to figure it out, and this was just the beginning.

Phase 1: "Why Isn't Google Showing My Site?"

I'll be honest: when I first checked Google Search Console, I expected to see... something. I'd been live for weeks. Instead: a flat line. Zero impressions. Zero clicks. Zero indexed pages.

My first instinct was to blame Google. "It takes time," I told myself. So I waited another week. Still zero.

That's when I actually looked at my setup:

  • ❌ Outdated sitemap
  • ❌ No canonical tags
  • ❌ No hreflang tags (despite 3 languages)
  • ❌ Little structured data
  • ❌ Default robots.txt from create-next-app
  • ❌ No meta descriptions on half the pages

Basically, I'd built a beautiful house and forgotten to put a number on the door. I did the same with my mailbox recently, and I was surprised the postman didn't deliver my fresh new American driving license! But let's focus on the SEO instead of my poor decisions :D.
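Of those items, the robots file is the quickest to fix: Next.js can generate robots.txt from an app/robots.ts route. Here's a sketch of what a fix could look like — the disallowed paths are illustrative, not MonkeyTravel's real routes, and in a real app you'd type the return value as MetadataRoute.Robots from "next" (a local type stands in here so the snippet is self-contained):

```typescript
// app/robots.ts (sketch). Next.js serves this as /robots.txt.
type Robots = {
  rules: { userAgent: string; allow?: string; disallow?: string[] }[];
  sitemap: string;
};

export default function robots(): Robots {
  return {
    rules: [
      {
        userAgent: "*",
        allow: "/",
        // Keep crawlers out of API routes and logged-in app pages
        // (hypothetical paths for illustration)
        disallow: ["/api/", "/account/"],
      },
    ],
    sitemap: "https://monkeytravel.app/sitemap.xml",
  };
}
```

Pointing crawlers at the sitemap from robots.txt gives Google a second discovery path besides the Search Console submission.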

Phase 2: The Foundations (Boring but Necessary)

I spent a weekend adding the basics. Nothing revolutionary, just what every site needs:

Sitemap: Next.js makes this easy with app/sitemap.ts. Mine generates URLs for all static pages, blog posts, and destination pages across all 3 locales. Dynamic content from Supabase gets included too.

// Simplified version of my sitemap (app/sitemap.ts)
import type { MetadataRoute } from "next";

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = "https://monkeytravel.app";
  const locales = ["en", "es", "it"];

  // Blog posts × 3 languages (getAllSlugs reads my local blog content)
  const blogSlugs = getAllSlugs();
  const blogPages = blogSlugs.flatMap(slug =>
    locales.map(locale => ({
      url: locale === "en"
        ? `${baseUrl}/blog/${slug}`
        : `${baseUrl}/${locale}/blog/${slug}`,
      changeFrequency: "monthly" as const,
      priority: 0.7,
    }))
  );

  // staticPages and destinationPages are built the same way
  return [...staticPages, ...blogPages, ...destinationPages];
}

Canonical + Hreflang: This is where multilingual sites get tricky. Every page needs to say "I'm the official URL" AND "here are my other language versions." I used generateMetadata() in each page:

alternates: {
  canonical: locale === "en"
    ? `${BASE_URL}/${slug}`
    : `${BASE_URL}/${locale}/${slug}`,
  languages: {
    en: `${BASE_URL}/${slug}`,
    es: `${BASE_URL}/es/${slug}`,
    it: `${BASE_URL}/it/${slug}`,
    "x-default": `${BASE_URL}/${slug}`,
  },
},

Structured Data: JSON-LD schemas for Organization, WebSite, SoftwareApplication, Article, and TouristDestination. I built a small utility for this:

export function jsonLdScriptProps(data: object) {
  return {
    type: "application/ld+json",
    dangerouslySetInnerHTML: {
      __html: JSON.stringify(data),
    },
  };
}
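Spreading those props into a script tag is all it takes; the Organization payload below is just an illustration, not my exact schema:

```typescript
// Same utility as above, repeated so the snippet is self-contained.
function jsonLdScriptProps(data: object) {
  return {
    type: "application/ld+json",
    dangerouslySetInnerHTML: { __html: JSON.stringify(data) },
  };
}

// Illustrative Organization schema; in a page component you'd render
// <script {...jsonLdScriptProps(orgSchema)} />
const orgSchema = {
  "@context": "https://schema.org",
  "@type": "Organization",
  name: "MonkeyTravel",
  url: "https://monkeytravel.app",
};

const props = jsonLdScriptProps(orgSchema);
console.log(props.type); // "application/ld+json"
```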

Result: After submitting the sitemap, Google discovered all my URLs within 48 hours. But "discovered" ≠ "indexed." Most pages sat in the "Discovered — currently not indexed" queue.

After a week: 12 pages indexed. Progress, but painfully slow.

The real issue? Google Search Console gives you very little feedback about why a page is being rejected!

Phase 3: Content That Google Actually Wants

I realised my site was mostly an app behind a login wall. Google had very little public content to index. So I went aggressive on content:

50 blog posts covering real travel topics: itinerary guides, destination comparisons, budget travel tips, seasonal recommendations. Each one in 3 languages = 150 blog pages. A tedious process, but facilitated A LOT by the engine the app itself is built on!

20 destination landing pages (Paris, Tokyo, Bali, Barcelona, etc.) with climate data, AI itinerary previews, and cross-links to blog posts. × 3 languages = 60 pages.

5 SEO landing pages targeting specific search intents: /free-ai-trip-planner, /group-trip-planner, /budget-trip-planner, etc.

But here's the thing that surprised me: internal linking mattered more than the content itself. Pages that were cross-linked from multiple other pages got indexed WAY faster than orphan pages. I added:

  • "From the Blog" sections on every landing page
  • "Related destinations" on every destination page
  • Blog → destination links and destination → blog links
  • A region filter on the blog index (Europe, Asia, Americas, Africa)

After two weeks: 78 pages indexed. The curve was accelerating.

Phase 4: The Bug That Almost Ruined Everything

Then Google Search Console showed a new error on my homepage:

"Duplicate without user-selected canonical"

Google was rejecting my homepage. It was choosing www.monkeytravel.app as canonical instead of monkeytravel.app. Despite having:

  • 301 redirects from www → non-www (in both middleware AND Vercel config)
  • Correct canonical tags in the HTML
  • All URLs in the sitemap using non-www

I checked everything twice. The redirects worked. The HTML had the right tags. I verified with curl:

$ curl -s https://monkeytravel.app/ | grep canonical
<link rel="canonical" href="https://monkeytravel.app"/>

The tag was right there. So why was Google saying "User-declared canonical: None"?

The Discovery

I stared at this for hours before it clicked. The key was in how I verified it.

curl waits for the complete response. Googlebot doesn't.

In Next.js 15.2+, generateMetadata() streams metadata asynchronously. The <head> tags aren't in the initial HTML payload; they're injected via the stream after the body starts rendering. When Googlebot parses the initial response, the canonical tag literally doesn't exist yet. Or at least, that's what I think I pieced together jumping between AI, the documentation, etc.

I confirmed by looking at the raw initial HTML before streaming completes: no <link rel="canonical"> anywhere.

The Fix: One Config Option

// next.config.ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  htmlLimitedBots: /Googlebot|Google-InspectionTool|Bingbot|Yandex/i,
  trailingSlash: false,
};

export default nextConfig;

htmlLimitedBots tells Next.js: "When a crawler visits, disable streaming. Send the full HTML with all metadata synchronously."

That's it. One regex. Fixed the entire problem.
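To see which crawlers the regex catches, you can test it against user-agent strings directly — the UA strings below are abbreviated examples, not exact bot signatures:

```typescript
// The same regex from next.config.ts; Next.js matches the incoming
// User-Agent against it to decide whether to disable streaming.
const htmlLimitedBots = /Googlebot|Google-InspectionTool|Bingbot|Yandex/i;

const googlebotUA =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
const chromeUA =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0";

console.log(htmlLimitedBots.test(googlebotUA)); // true: streaming disabled, full <head> up front
console.log(htmlLimitedBots.test(chromeUA));    // false: regular users still get streamed HTML
```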

I also changed my root layout canonical from "/" to "./" so every page gets a self-referencing canonical instead of all pages pointing to the homepage (a subtle but important distinction).
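In the root layout, that change looks roughly like this (a sketch; it assumes metadataBase is set so the relative "./" canonical resolves against each page's own path):

```typescript
// app/layout.tsx metadata (sketch)
export const metadata = {
  metadataBase: new URL("https://monkeytravel.app"),
  alternates: {
    // "./" resolves relative to the current route, so every page gets a
    // self-referencing canonical; "/" would point them all at the homepage.
    canonical: "./",
  },
};
```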

Deployed. Requested re-indexing. Within days: 176 pages indexed.
Still not all 230+ pages, but we're getting there!

The Numbers

| Metric | Week 0 | Week 1 | Week 2 | Week 3 |
| --- | --- | --- | --- | --- |
| Pages indexed | 0 | 12 | 78 | 176 |
| Total pages in sitemap | 0 | ~50 | ~225 | ~230 |
| Blog posts (per language) | 0 | 1 | 15 | 50 |
| Structured data schemas | 0 | 3 | 5 | 5 |

What I Got Wrong (So You Don't Have To)

1. I didn't set up htmlLimitedBots from day one. This should be in every Next.js project that cares about SEO. The metadata streaming issue is completely silent: everything looks fine when you check manually, and only crawlers are affected. I thought it was a "content volume" issue... not really!

2. I treated SEO as a "later" problem. Every week I delayed the sitemap and canonical tags was a week of potential crawling wasted. Google's queue doesn't move faster just because you're impatient.

3. I underestimated internal linking. Cross-linked pages got indexed 3-4x faster than isolated pages. If you have related content, link it. Google follows links.

4. I built multilingual support but forgot hreflang. Having 3 language versions without hreflang means Google might treat them as duplicate content instead of translations. Costly mistake.

AI Tricks That Helped

A few things that saved me time during this sprint:

  • AI-assisted blog content: I used AI to draft blog post structures, then edited and localized them. For 50 posts × 3 languages, doing everything manually would have taken months, plus learning an extra language or finding more international collaborators.

  • Automated cross-linking: I wrote a script that analyzed blog post topics and destination pages, then generated internal link suggestions. Much better than trying to mentally map 200+ pages.

  • Prompt engineering for i18n: Instead of translating English content, I had the AI generate locale-native content. "Write about Paris for an Italian audience" produces much better content than "translate this Paris article to Italian."
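The cross-linking script mentioned above can be sketched like this — the Page shape and tag-overlap heuristic are my illustration here, not the app's real data model:

```typescript
// Suggest internal links between pages that share at least `minShared` tags.
type Page = { url: string; tags: string[] };

export function suggestLinks(pages: Page[], minShared = 1): [string, string][] {
  const suggestions: [string, string][] = [];
  for (const a of pages) {
    for (const b of pages) {
      if (a.url === b.url) continue;
      const shared = a.tags.filter((t) => b.tags.includes(t));
      if (shared.length >= minShared) suggestions.push([a.url, b.url]);
    }
  }
  return suggestions;
}

// Example: a Paris blog post and the Paris destination page link both ways.
const examplePages: Page[] = [
  { url: "/blog/3-days-in-paris", tags: ["paris", "itinerary"] },
  { url: "/destinations/paris", tags: ["paris"] },
  { url: "/destinations/tokyo", tags: ["tokyo"] },
];
console.log(suggestLinks(examplePages));
// → [["/blog/3-days-in-paris", "/destinations/paris"],
//    ["/destinations/paris", "/blog/3-days-in-paris"]]
```

A pass like this over 200+ pages surfaces the orphan pages immediately, which is exactly what made the indexing curve accelerate.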

What I'd Tell Past Me

Start with these on day one, before you write a single feature:

  1. htmlLimitedBots in next.config.ts
  2. Sitemap generation
  3. Canonical tags on every page
  4. Submit to Google Search Console

Everything else (blog posts, structured data, internal linking) matters, but these four things are the foundation. Skip them and nothing else works.

129 pages are still in Google's queue. Based on the trajectory, they'll be indexed within a couple of weeks (hopefully). Then the real game starts: actually ranking for competitive keywords.
That will be FUN


MonkeyTravel is free to use — drop a destination, get a personalised AI itinerary in seconds. Built with Next.js, Supabase, and hosted on Vercel. Any feedback is more than welcome! Let's learn something new together.


Top comments (3)

Made Büro

Great writeup! The htmlLimitedBots discovery is really valuable: that streaming metadata issue is completely silent, and I bet a lot of Next.js projects are affected without knowing it.

One thing worth considering as a next step: you've nailed the Google side, but AI search engines (ChatGPT, Perplexity, Claude, Gemini, Grok) are becoming another discovery channel and they use different crawlers and signals than Google.

Your structured data setup is already a strong foundation for it. Things like llms.txt (a structured index of your site for AI crawlers) and managing AI-specific bots (GPTBot, ClaudeBot, PerplexityBot, etc.) can help your travel content surface when people ask AI for trip recommendations.

I've been building open source tools for this including a Next.js middleware that handles it automatically.

Either way bookmarking this post, the htmlLimitedBots tip alone is worth sharing

Federico Sciuca

Thank you! Really valuable addition, AI crawler management is one of those gaps I've been aware of but haven't tackled systematically yet. The llms.txt spec in particular is on my list. Curious about your middleware, is it published somewhere? I've also been exploring the OpenAI Apps SDK as a complementary discovery channel, so there's clearly a whole layer of AI-native SEO that deserves its own deep dive. Might be the next article!

Federico Sciuca

I almost forgot: to make sure I don't get financially killed by my own curiosity, I added a beta-testers-only wall that limits usage for standard users. A few beta access spots become available from time to time.