<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Noctias</title>
    <description>The latest articles on DEV Community by Noctias (@noctias).</description>
    <link>https://dev.to/noctias</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4002098%2Ff2fc9e56-c96a-407e-9f61-57cdd6bae24e.png</url>
      <title>DEV Community: Noctias</title>
      <link>https://dev.to/noctias</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/noctias"/>
    <language>en</language>
    <item>
      <title>I lost months of Google indexing to a single missing UA pattern</title>
      <dc:creator>Noctias</dc:creator>
      <pubDate>Fri, 26 Jun 2026 02:19:53 +0000</pubDate>
      <link>https://dev.to/noctias/i-lost-months-of-google-indexing-to-a-single-missing-ua-pattern-9a9</link>
      <guid>https://dev.to/noctias/i-lost-months-of-google-indexing-to-a-single-missing-ua-pattern-9a9</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;tl;dr: If your site has any kind of geo-gate, age verification, or country-specific wall, and you wrote a "let Googlebot through" rule in your middleware, &lt;strong&gt;your rule is probably wrong&lt;/strong&gt;. Google's URL Inspector does not send a &lt;code&gt;Googlebot&lt;/code&gt; UA. It sends &lt;code&gt;Google-InspectionTool&lt;/code&gt;. Match that explicitly, or you can lose months of indexing without ever seeing the real cause in Search Console.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I'm shipping &lt;a href="https://noctias.tv" rel="noopener noreferrer"&gt;noctias.tv&lt;/a&gt; — a multi-language portal — alongside our older domain noctias.com. While verifying the new domain in Search Console, every single URL inspection failed with the same message:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"ページをインデックスに登録できません: noindex タグによって除外されました"&lt;br&gt;
("Cannot be indexed: excluded by noindex tag")&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But the page wasn't &lt;code&gt;noindex&lt;/code&gt;. A &lt;code&gt;curl&lt;/code&gt; request with a Googlebot UA returned &lt;code&gt;&amp;lt;meta name="robots" content="index, follow"/&amp;gt;&lt;/code&gt;. The page rendered correctly for humans. The sitemap was healthy and showed 252 URLs discovered. Bing indexed it. Search Console said: nope.&lt;/p&gt;

&lt;p&gt;This took me hours to track down. Sharing it because I'm sure other sites are silently bleeding the same way.&lt;/p&gt;




&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;noctias.tv runs Next.js 15 in standalone mode behind Cloudflare Tunnel on an OVHcloud VPS. Which gate a visitor sees depends on their jurisdiction (country and, for the US, state), not on the page language:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/geo.ts (simplified)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ALLOW&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_GATE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_VERIFICATION&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BLOCK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;country&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;region&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;country&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_VERIFICATION&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;            &lt;span class="c1"&gt;// fail closed&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;BLOCKED_COUNTRIES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;country&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BLOCK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;US&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;STRICT_US_STATES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_VERIFICATION&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;UK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_VERIFICATION&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;JP&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_GATE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ALLOW&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;AGE_VERIFICATION&lt;/code&gt; means: rewrite to &lt;code&gt;/age-verification&lt;/code&gt;, which is a &lt;code&gt;noindex,nofollow&lt;/code&gt; page. After the visitor passes the wall, a signed cookie unlocks the real content.&lt;/p&gt;

&lt;p&gt;To avoid destroying SEO, I'd added what I thought was a standard bot bypass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/middleware.ts (the WRONG version)&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AGE_VERIFICATION&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;verified&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;verifyAvToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;AV_COOKIE&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isBot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/Googlebot/i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user-agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;verified&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isBot&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;isAvPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nextUrl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pathname&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rewrite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/age-verification`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks right. Plenty of "how to allow Googlebot through" Stack Overflow answers use the same pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But the live test from Search Console still failed.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Finding the actual UA
&lt;/h2&gt;

&lt;p&gt;The "Test live URL" panel in Search Console shows you the rendered HTML it fetched. I dug into the rendered output and found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;Noctias.tv&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"robots"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"noindex, nofollow"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the &lt;code&gt;/age-verification&lt;/code&gt; page. Google was being walled. So the bypass wasn't matching.&lt;/p&gt;

&lt;p&gt;I went looking for the real UA. The names I knew from Google's crawler documentation were the &lt;em&gt;crawlers&lt;/em&gt;: Googlebot, AdsBot-Google, and so on. But the URL Inspector live test doesn't identify itself as Googlebot. It's a separate fetcher with its own UA:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/Googlebot/i&lt;/code&gt; does NOT match &lt;code&gt;Google-InspectionTool&lt;/code&gt;. The substring isn't there.&lt;/p&gt;

&lt;p&gt;This is the bug. My bypass let real Googlebot crawl, but blocked the URL Inspector. So when I (or anyone) tried to &lt;em&gt;manually&lt;/em&gt; request indexing in Search Console, the live test saw the noindex wall and refused.&lt;/p&gt;
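&lt;p&gt;The mismatch is easy to reproduce outside the middleware. This is a minimal sketch: the inspection UA is the one quoted above, while the Googlebot UA string is a typical published form that I'm assuming here for comparison. The variable names are made up for illustration.&lt;/p&gt;

```typescript
// UA string quoted above for the URL Inspector live test.
const inspectionToolUa = "Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)";
// Assumed example of a Googlebot desktop UA, used only for comparison.
const googlebotUa = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

// The pattern from the WRONG middleware version.
const narrowPattern = /Googlebot/i;

const narrowMatches = {
  googlebot: narrowPattern.test(googlebotUa),
  inspectionTool: narrowPattern.test(inspectionToolUa),
};

console.log(narrowMatches); // logs { googlebot: true, inspectionTool: false }
```

&lt;p&gt;The case-insensitive flag doesn't help here: "Google-InspectionTool" never contains the letters "bot" after "Google", so no casing of &lt;code&gt;Googlebot&lt;/code&gt; can match it.&lt;/p&gt;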

&lt;p&gt;I suspect the crawler-only path was also failing silently for any URL that wasn't already in the index, on the theory that a first URL Inspector check is part of how Search Console decides to crawl new URLs. That part is a guess.&lt;/p&gt;




&lt;h2&gt;
  
  
  The actual fix
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isSearchEngineBot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sr"&gt;/Googlebot|Google-InspectionTool|Google-Read-Aloud|AdsBot-Google|Google-Site-Verification|Bingbot|DuckDuckBot|YandexBot|Baiduspider|Applebot|GPTBot|ClaudeBot|PerplexityBot|facebookexternalhit|Twitterbot|LinkedInBot/i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not cloaking in the spam sense: the bot gets exactly the content a verified human would see, not a separate page written for search engines. Google allows this when an interstitial would otherwise block crawling. See &lt;a href="https://developers.google.com/search/docs/advanced/mobile/mobile-intrusive-interstitials" rel="noopener noreferrer"&gt;Google's "intrusive interstitials" guidance&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Some others worth matching while you're at it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;facebookexternalhit&lt;/code&gt;, &lt;code&gt;Twitterbot&lt;/code&gt;, &lt;code&gt;LinkedInBot&lt;/code&gt;: Open Graph card scrapers. Without them, every Facebook / Twitter / LinkedIn unfurl of your URL shows the age-verification placeholder, which kills click-through. (Discord unfurls use a separate &lt;code&gt;Discordbot&lt;/code&gt; UA, which the pattern above does not cover.)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GPTBot&lt;/code&gt;, &lt;code&gt;ClaudeBot&lt;/code&gt;, &lt;code&gt;PerplexityBot&lt;/code&gt; — AI search crawlers. Your call whether to let these through; we do.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How to check your own site
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"User-Agent: Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  https://your-site.com/some-article &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-oE&lt;/span&gt; &lt;span class="s1"&gt;'name="robots"[^&amp;gt;]*'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If that returns &lt;code&gt;noindex&lt;/code&gt;, your indexing pipeline is broken for any URL behind whatever gate you have — age verification, geo-blocking, paywall preview, "press Enter to continue", anything.&lt;/p&gt;
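&lt;p&gt;If you'd rather run the same check from a script, the grep step can be sketched in TypeScript. This assumes you already have the response body as a string, fetched by any HTTP client with the UA above. The function names &lt;code&gt;robotsDirectives&lt;/code&gt; and &lt;code&gt;looksNoindexed&lt;/code&gt; are hypothetical, and the regex mirrors the grep pattern, so it shares its limits: it only sees double-quoted &lt;code&gt;name="robots"&lt;/code&gt; attributes and knows nothing about an &lt;code&gt;X-Robots-Tag&lt;/code&gt; response header.&lt;/p&gt;

```typescript
// Mirrors the grep above: collect every run of text that starts at
// name="robots" and continues up to the end of that tag.
function robotsDirectives(html: string): string[] {
  return html.match(/name="robots"[^>]*/gi) ?? [];
}

// True when any robots meta tag in the body carries a noindex directive.
function looksNoindexed(html: string): boolean {
  return robotsDirectives(html).some((tag) => /noindex/i.test(tag));
}
```

&lt;p&gt;Calling &lt;code&gt;looksNoindexed(body)&lt;/code&gt; on the body fetched with the inspection UA gives a yes/no answer you can assert on in CI.&lt;/p&gt;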

&lt;p&gt;You'll find the same issue on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Any site that age-gates visitors from US states with age-verification laws (TX / UT / AR / LA / MS / NC / OK / etc.)&lt;/li&gt;
&lt;li&gt;Any site that gates UK visitors under the Online Safety Act (since Jan 2025)&lt;/li&gt;
&lt;li&gt;Any site with a country-blocked list (&lt;code&gt;BLOCK&lt;/code&gt; policy in our case)&lt;/li&gt;
&lt;li&gt;Some EU cookie wall implementations that fully rewrite the response&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Test the inspector path before deploy.&lt;/strong&gt; I had unit tests for the geo router but never tested that a fresh URL would survive a live URL Inspector run. That's the actual integration test.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch the rendered HTML in Search Console's live test panel&lt;/strong&gt; — not just the verdict. The verdict tells you something is wrong; the rendered HTML tells you &lt;em&gt;what&lt;/em&gt;. I'd assumed the verdict was Search Console misreading my pages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Match crawlers broadly.&lt;/strong&gt; Half the Stack Overflow answers about "let Googlebot through" only match &lt;code&gt;Googlebot&lt;/code&gt;. Several Google crawler UAs don't contain that string. Mine now matches a long allowlist.&lt;/li&gt;
&lt;/ol&gt;
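&lt;p&gt;Points 1 and 3 can be partly covered by a small table-driven check that runs before deploy. This is a sketch, not my production test: the pattern is copied from the fix above so the file runs on its own, only the inspection UA comes from this article, and the Googlebot, bingbot, and Chrome UA strings are assumed examples of typical published forms. It does not replace a real live test in Search Console.&lt;/p&gt;

```typescript
// Same pattern as isSearchEngineBot in the fix, copied so this file is self-contained.
const BOT_PATTERN =
  /Googlebot|Google-InspectionTool|Google-Read-Aloud|AdsBot-Google|Google-Site-Verification|Bingbot|DuckDuckBot|YandexBot|Baiduspider|Applebot|GPTBot|ClaudeBot|PerplexityBot|facebookexternalhit|Twitterbot|LinkedInBot/i;

function isSearchEngineBot(userAgent: string | null): boolean {
  if (!userAgent) return false;
  return BOT_PATTERN.test(userAgent);
}

// UAs that must skip the wall. Only the first is quoted in the article;
// the other two are assumed examples.
const mustBypass: string[] = [
  "Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)",
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
];

// Ordinary browser traffic and missing headers must still hit the wall.
const mustNotBypass: (string | null)[] = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "",
  null,
];

// Collect every failure instead of stopping at the first one.
const failures: string[] = [];
for (const ua of mustBypass) {
  if (!isSearchEngineBot(ua)) failures.push("should bypass: " + ua);
}
for (const ua of mustNotBypass) {
  if (isSearchEngineBot(ua)) failures.push("should be gated: " + String(ua));
}

if (failures.length > 0) {
  throw new Error(failures.join("\n"));
}
```

&lt;p&gt;Had a check like this existed with the narrow &lt;code&gt;/Googlebot/i&lt;/code&gt; pattern, the first entry in &lt;code&gt;mustBypass&lt;/code&gt; would have failed it.&lt;/p&gt;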




&lt;h2&gt;
  
  
  What this means for adult-content sites specifically
&lt;/h2&gt;

&lt;p&gt;Adult sites get hit by this more than other categories because almost all of them have an age-verification wall. The wall is mandatory in multiple jurisdictions (Texas and Utah in the US, the UK, parts of the EU). If your wall ate your indexing, that's potentially months of search traffic lost, and you wouldn't see it as a single error in Search Console, because each missed URL is silently never crawled.&lt;/p&gt;

&lt;p&gt;After the fix, I re-submitted my five most recent articles via URL Inspector. All five went through immediately and entered the priority crawl queue. The earlier rejections were 100% the UA-match bug.&lt;/p&gt;




&lt;h2&gt;
  
  
  Repo
&lt;/h2&gt;

&lt;p&gt;The full architectural notes (multi-domain Next.js, Cloudflare Tunnel setup, geo-policy table, sanitized middleware snippets) are open on &lt;a href="https://github.com/noctias/noctias-stack" rel="noopener noreferrer"&gt;github.com/noctias/noctias-stack&lt;/a&gt;. It's not deployable code — the live source is private — but the patterns are documented enough to reproduce.&lt;/p&gt;

&lt;p&gt;Live sites:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://noctias.com" rel="noopener noreferrer"&gt;noctias.com&lt;/a&gt; — Japan-first portal&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://noctias.tv" rel="noopener noreferrer"&gt;noctias.tv&lt;/a&gt; — global cam/dating portal, host-routed from the same container&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If you've spent days arguing with Search Console about why your URLs don't index, try the curl above. There's a decent chance this is your bug.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>devops</category>
      <category>security</category>
    </item>
  </channel>
</rss>
