A "robots.txt unreachable" error is not usually a content problem.
It is usually a fetch, routing, DNS, CDN, middleware, firewall, redirect, or cache problem.
That distinction matters because teams often waste time editing pages when Google is really saying:
"I could not reliably fetch the file that tells me what I am allowed to crawl."
Here is the debugging order I use.
1. Confirm the file exists at the root
Open:
https://example.com/robots.txt
It should return a plain-text response from the same public host Google crawls.
2. Check the HTTP status
Use:
curl -I https://example.com/robots.txt
You want a stable 200 OK.
Watch for:
- 403 from bot protection
- 404 from routing
- 5xx from hosting or edge functions
- long redirect chains
- HTML being returned instead of plain text
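The check above can be sketched as a small shell helper. The classification simply mirrors the failure modes listed (403 bot protection, 404 routing, 5xx hosting); `verdict` is a hypothetical name, and the commented curl lines show one way to feed it, assuming curl's standard `-w` write-out variables.

```shell
#!/bin/sh
# Hypothetical helper: classify a (status, content-type) pair the way the
# checklist above does.
verdict() {
  status="$1"; ctype="$2"
  case "$status" in
    200)
      case "$ctype" in
        text/plain*) echo "ok" ;;
        *)           echo "warn: 200 but not text/plain (is it HTML?)" ;;
      esac ;;
    403) echo "403: likely bot protection" ;;
    404) echo "404: likely routing" ;;
    5??) echo "5xx: hosting or edge failure" ;;
    *)   echo "unexpected status: $status" ;;
  esac
}

# Real check (requires network):
#   status=$(curl -s -o /dev/null -w '%{http_code}' https://example.com/robots.txt)
#   ctype=$(curl -s -o /dev/null -w '%{content_type}' https://example.com/robots.txt)
#   verdict "$status" "$ctype"
```

Keeping the classification separate from the fetch makes it easy to rerun against staging, production, and the CDN edge.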
3. Check middleware and auth rules
This is especially easy to miss in modern app routers.
Make sure these paths are not behind auth, redirects, or app-level rewrites:
/robots.txt
/sitemap.xml
/llms.txt
If your middleware protects everything by default, explicitly bypass these files.
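One way to verify the bypass actually works is to loop over the three paths from outside the app, with no cookies or auth headers. This is a rough sketch, not framework-specific: `check_public_paths` is a hypothetical helper, and the fetch command is injected so the loop can be exercised without a network.

```shell
#!/bin/sh
# Sketch: confirm the crawl-control files respond publicly.
# check_public_paths is a hypothetical name, not part of any framework API.
check_public_paths() {
  base="$1"   # e.g. https://example.com
  fetch="$2"  # a command that prints an HTTP status code for a URL
  for path in /robots.txt /sitemap.xml /llms.txt; do
    printf '%s %s\n' "$path" "$($fetch "$base$path")"
  done
}

# Real use (requires network):
#   check_public_paths https://example.com 'curl -s -o /dev/null -w %{http_code}'
```

Anything other than three 200s means middleware, auth, or a rewrite is still in the way.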
4. Check CDN and bot rules
A site can work perfectly in your browser and still fail for Googlebot-like requests.
Look for:
- managed challenge pages
- country-level blocking
- user-agent blocks
- rate-limit rules
- WAF rules applied to static text files
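A quick way to surface these rules is to fetch robots.txt twice, once with a default client string and once with a Googlebot-like user agent, then compare the status codes. `compare_status` is a hypothetical helper; the UA string below is Googlebot's published one, and the commented curl lines are one possible way to feed it.

```shell
#!/bin/sh
# Googlebot's published desktop user-agent string.
UA_BOT="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Sketch: a status mismatch between the two fetches usually means a
# CDN/WAF rule treats bot traffic differently.
compare_status() {
  if [ "$1" = "$2" ]; then
    echo "same"
  else
    echo "bot-specific handling detected ($1 vs $2)"
  fi
}

# Real use (requires network):
#   plain=$(curl -s -o /dev/null -w '%{http_code}' -A 'curl' https://example.com/robots.txt)
#   bot=$(curl -s -o /dev/null -w '%{http_code}' -A "$UA_BOT" https://example.com/robots.txt)
#   compare_status "$plain" "$bot"
```

Note that a matching status does not fully clear the CDN: some challenges verify Googlebot by reverse DNS, not user agent, so this is a first-pass signal only.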
5. Do not overcomplicate robots.txt
For many public sites, simple is safer:
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
Complex rules create more places for accidental blocking.
6. Retest after the fix
After deployment, retest the live file and then use Google Search Console's robots.txt report or URL Inspection again.
If the issue was temporary, Search Console may need time to refresh its cached state.
Why this matters
If Google cannot fetch robots.txt, it may pause crawling because it cannot confirm crawl permissions.
That can make indexing problems look like content problems, even when the real issue is infrastructure.
I wrote a fuller breakdown here:
https://visrank.org/blog/why-google-search-console-says-robots-txt-unreachable
And if you want a broader launch checklist for crawlability, canonicals, schema, speed, security, and mobile basics:
https://visrank.org/blog/technical-seo-checklist-2026
The short version:
Before rewriting content, prove Google can fetch the boring files.
Those boring files decide whether the rest of the site can even enter the conversation.