<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gregory</title>
    <description>The latest articles on DEV Community by Gregory (@monsieurho).</description>
    <link>https://dev.to/monsieurho</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3903710%2F4121f882-2cd0-40c9-94d5-a799b73f7878.jpg</url>
      <title>DEV Community: Gregory</title>
      <link>https://dev.to/monsieurho</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/monsieurho"/>
    <language>en</language>
    <item>
      <title>How to Fix "Couldn't Fetch" in Google Search Console for a Next.js Sitemap</title>
      <dc:creator>Gregory</dc:creator>
      <pubDate>Wed, 29 Apr 2026 07:04:59 +0000</pubDate>
      <link>https://dev.to/monsieurho/how-to-fix-couldnt-fetch-in-google-search-console-for-a-nextjs-sitemap-j37</link>
      <guid>https://dev.to/monsieurho/how-to-fix-couldnt-fetch-in-google-search-console-for-a-nextjs-sitemap-j37</guid>
      <description>&lt;p&gt;For about a week, Google Search Console showed "Couldn't fetch" on my sitemap. The file loaded fine in a browser, returned 200 OK to curl, validated as XML, and Googlebot's user agent saw the same response. Nothing was wrong. Except something was clearly wrong, because pages weren't being discovered.&lt;/p&gt;

&lt;p&gt;The actual problem turned out to have nothing to do with my server, my hosting, or my XML structure. It was a single category of characters inside the URLs themselves. Here's the full debugging path, because I wish someone had written this post a week ago.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;I'm running &lt;a href="https://www.yumcha.fun" rel="noopener noreferrer"&gt;YumCha&lt;/a&gt;, a Cantonese learning site. The web side is Next.js 16 with PayloadCMS (Postgres) and a 124,000-entry Cantonese dictionary. Each dictionary entry is its own page at &lt;code&gt;/cantonese-dictionary/[slug]&lt;/code&gt;, so the sitemap is genuinely large.&lt;/p&gt;

&lt;p&gt;Initial sitemap implementation used the Next.js metadata route convention:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/app/sitemap.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;getDictionaryEntries&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/dictionary&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;sitemap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getDictionaryEntries&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`https://www.yumcha.fun/cantonese-dictionary/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;changeFrequency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;monthly&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That returned ~50,000 URLs in a single file. Google's per-file limit is 50,000 URLs, so I was right at the edge. Time to split.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempt 1: &lt;code&gt;generateSitemaps()&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Next.js supports splitting a sitemap natively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateSitemaps&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getDictionaryCount&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;40000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;sitemap&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;coreSitemap&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;dictionaryChunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This generates &lt;code&gt;/sitemap/0.xml&lt;/code&gt;, &lt;code&gt;/sitemap/1.xml&lt;/code&gt;, etc. — and according to the docs, an index at &lt;code&gt;/sitemap.xml&lt;/code&gt; that lists them all.&lt;/p&gt;

&lt;p&gt;Except the index didn't exist. Sub-sitemaps returned 200 fine. The index returned 404. Verified locally with &lt;code&gt;next start&lt;/code&gt; after a clean build: &lt;code&gt;/sitemap.xml&lt;/code&gt; is genuinely missing when you use &lt;code&gt;generateSitemaps()&lt;/code&gt; together with &lt;code&gt;force-dynamic&lt;/code&gt;. (The behavior may also affect static mode in Turbopack — I observed 404 in both cases on Next.js 16.2.3.)&lt;/p&gt;

&lt;p&gt;I tried adding a route handler at &lt;code&gt;app/sitemap.xml/route.ts&lt;/code&gt; to manually serve the index. Build error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Conflicting route and metadata at /sitemap.xml: route at /sitemap.xml/route and metadata at /sitemap.xml/route
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next.js reserves the path even though it doesn't actually serve it. The metadata route claims the URL but only fills sub-sitemap slots, leaving the index empty.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempt 2: full manual control
&lt;/h2&gt;

&lt;p&gt;I deleted &lt;code&gt;app/sitemap.ts&lt;/code&gt; and replaced it with two route handlers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;sitemap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;xml&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;          &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="nx"&gt;sitemap&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;sitemap&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="nx"&gt;individual&lt;/span&gt; &lt;span class="nx"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;sitemaps&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Plus a small XML helper library so both routes share generation logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/sitemap-xml.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;urlsetXml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;UrlEntry&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;urls&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`    &amp;lt;loc&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;xmlEscape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/loc&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lastmod&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`    &amp;lt;lastmod&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lastmod&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/lastmod&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;priority&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`    &amp;lt;priority&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/priority&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`  &amp;lt;url&amp;gt;\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;\n  &amp;lt;/url&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"&amp;gt;
&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
&amp;lt;/urlset&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dynamic route accepts both &lt;code&gt;/sitemap/0&lt;/code&gt; and &lt;code&gt;/sitemap/0.xml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;idStr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;xml$/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idStr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build succeeded. &lt;code&gt;/sitemap.xml&lt;/code&gt; returned 200 with a proper sitemap index. Sub-sitemaps returned 200 with valid urlset XML. xmllint validated everything. Googlebot's user agent saw the same. I deployed.&lt;/p&gt;

&lt;p&gt;GSC still said "Couldn't fetch."&lt;/p&gt;

&lt;h2&gt;
  
  
  The real culprit
&lt;/h2&gt;

&lt;p&gt;Out of frustration I started looking at the URLs themselves. Out of 71,574 dictionary URLs, most looked clean — &lt;code&gt;/cantonese-dictionary/jat1-one-or-two-10&lt;/code&gt; style slugs. But running the file through &lt;code&gt;grep -P '[^\x00-\x7F]'&lt;/code&gt; surfaced something:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/cantonese-dictionary/凝:jing4-focus-one-s-eyes-upon-2429
/cantonese-dictionary/ngaam1ｆｅｅｌ-to-feel-like-someone-2025
/cantonese-dictionary/aa3ｓｉｒ-the-appellation-for-a-police-19004
/cantonese-dictionary/noi6…m4-cannot-intervene-23166
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dictionary's seed script had preserved Chinese characters, full-width Latin (those are real Unicode codepoints, not styled letters), and the Unicode horizontal ellipsis from CC-Canto data. The slugs went into Postgres unmodified, and the sitemap dumped them into &lt;code&gt;&amp;lt;loc&amp;gt;&lt;/code&gt; elements as raw UTF-8.&lt;/p&gt;

&lt;p&gt;Per the &lt;a href="https://www.sitemaps.org/protocol.html#location" rel="noopener noreferrer"&gt;sitemap protocol spec&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The URL [in the loc element] must be URL-encoded.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Google fetches the file, parses the XML, encounters non-URL-encoded characters in &lt;code&gt;&amp;lt;loc&amp;gt;&lt;/code&gt;, and treats the entire sitemap as malformed. The error surfaces in Search Console as "Couldn't fetch" — which sounds like a network error but is actually a parse rejection. xmllint, Chrome, and curl all accept the file fine because they're more permissive than Google's sitemap parser.&lt;/p&gt;

&lt;p&gt;There were only 22 of these out of 124k entries. But 22 is enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  The two-prong fix
&lt;/h2&gt;

&lt;p&gt;First, exclude the malformed slugs at the database query level. They actually 404 in Next.js routing too (Chinese characters in dynamic segments don't match cleanly), so they were dead URLs anyway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/dictionary.ts&lt;/span&gt;
&lt;span class="nx"&gt;sql&lt;/span&gt;&lt;span class="s2"&gt;`SELECT slug FROM dictionary_entries
    WHERE ((example_traditional IS NOT NULL AND example_traditional &amp;lt;&amp;gt; '')
       OR LENGTH(english) &amp;gt; 20)
      AND slug ~ '^[a-z0-9-]+$'
    ORDER BY id
    LIMIT &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; OFFSET &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The regex filter (POSIX, with &lt;code&gt;-&lt;/code&gt; at the end of the character class to avoid escaping) keeps only slugs that are pure ASCII lowercase + digits + hyphens.&lt;/p&gt;

&lt;p&gt;Second, URL-encode the value in the XML helper as a defensive measure for any future fields. Order matters here — &lt;code&gt;encodeURI&lt;/code&gt; keeps &lt;code&gt;&amp;amp;&lt;/code&gt; raw, then &lt;code&gt;xmlEscape&lt;/code&gt; turns it into &lt;code&gt;&amp;amp;amp;&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;encodeUrlForXml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;xmlEscape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;encodeURI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After deploy: 71,574 URLs (vs. 71,588 before — 14 dropped from the kept set, plus 8 already excluded by other filters), zero non-ASCII URLs in the output, valid XML. Now waiting on Google to re-fetch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;"Couldn't fetch" in GSC is not always a network error.&lt;/strong&gt; It can be a parser rejection that the rest of the world accepts. If your sitemap loads everywhere except in Google, suspect content validation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;generateSitemaps()&lt;/code&gt; + &lt;code&gt;force-dynamic&lt;/code&gt; doesn't generate the index in Next.js 16.&lt;/strong&gt; Sub-sitemaps work, but &lt;code&gt;/sitemap.xml&lt;/code&gt; is silently missing. Verify with curl after deploy. If you need dynamic sitemaps with an index, use manual route handlers instead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;&amp;lt;loc&amp;gt;&lt;/code&gt; values must be URL-encoded.&lt;/strong&gt; Even if your slugs look fine in the database, audit them. A grep for &lt;code&gt;[^\x00-\x7F]&lt;/code&gt; against your live sitemap takes seconds and would have saved me a week.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Filter dead URLs at query time.&lt;/strong&gt; If a slug 404s, don't put it in your sitemap. Saves crawl budget and avoids exactly this class of parser rejection.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sitemap saga over. Now I just have to wait for Google to refetch. Time will tell.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm building &lt;a href="https://www.yumcha.fun" rel="noopener noreferrer"&gt;YumCha&lt;/a&gt;, a Cantonese learning app. Web side is Next.js 16 + PayloadCMS, mobile is Expo, backend is NestJS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
    </item>
  </channel>
</rss>
