Over the past few weeks, devs experimenting with llms.txt have noticed inconsistent behavior. In some cases, ChatGPT or Gemini report that the file is inaccessible, empty, or returning a 500 error, even when the file loads fine in a browser.
If you’ve run into this, here’s what’s going on.
The Problem
Why is my llms.txt not indexed by AI crawlers?
When testing llms.txt across multiple deployments, I saw:
- Browsers loading the file without issues.
- AI crawlers (ChatGPT browsing tool, Gemini) claiming the file was unavailable or empty.
- Server logs showing 200 responses, not errors.
At first, this looked like a crawler-side bug. But after digging deeper into logs, headers, and responses, the real cause became clear.
The Cause: Content-Type Mismatch
The difference was in the Content-Type header.
- Deployments serving llms.txt as text/plain; charset=utf-8 worked fine.
- Deployments serving it as text/markdown; charset=utf-8 failed in ChatGPT and Gemini.
These crawlers are optimized for plain text ingestion. If the header suggests the file is markdown, they either skip it or return an error.
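To see which of the two cases a deployment falls into, you can inspect the header directly. Below is a minimal sketch, assuming a HEAD request returns the same Content-Type as a GET (not every server guarantees this). The function names are my own, made up for illustration.

```python
import urllib.request


def media_type(content_type: str) -> str:
    """Return the bare media type, lowercased, without parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()


def is_plain_text(content_type: str) -> bool:
    """True when the header declares text/plain, whatever the charset parameter."""
    return media_type(content_type) == "text/plain"


def fetch_content_type(url: str) -> str:
    """Send a HEAD request and return the raw Content-Type header ('' if absent)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")
```

For example, is_plain_text(fetch_content_type("https://example.com/llms.txt")) should be True for a deployment in the working group above. The example.com URL is a placeholder, so swap in your own domain.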
The Solution
To ensure llms.txt is read correctly by AI crawlers, serve the file with the correct header:
Content-Type: text/plain; charset=utf-8
Avoid serving it as text/markdown. Even if the file is plain text, the wrong header confuses crawlers.
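How you set the header depends on your hosting stack, which this post does not cover. As a rough illustration of the idea, here is a sketch using only Python's standard library. The file location, port, and names (LLMS_TXT_FILE, llms_txt_headers, LlmsTxtHandler, serve) are assumptions for the example, not part of any real deployment.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

LLMS_TXT_FILE = Path("llms.txt")  # hypothetical location of the file on disk
LLMS_TXT_CONTENT_TYPE = "text/plain; charset=utf-8"


def llms_txt_headers(body: bytes) -> dict:
    """Response headers for llms.txt, with the Content-Type set explicitly."""
    return {
        "Content-Type": LLMS_TXT_CONTENT_TYPE,
        "Content-Length": str(len(body)),
    }


class LlmsTxtHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /llms.txt is handled in this sketch; everything else is a 404.
        if self.path != "/llms.txt":
            self.send_error(404)
            return
        body = LLMS_TXT_FILE.read_bytes()
        self.send_response(200)
        for name, value in llms_txt_headers(body).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the sketch server locally; it is not started automatically."""
    HTTPServer(("127.0.0.1", port), LlmsTxtHandler).serve_forever()
```

After calling serve(), running curl -i http://127.0.0.1:8000/llms.txt should show the text/plain header in the response. The handler only implements GET, so use curl -i rather than curl -I, which sends a HEAD request.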
Confirm by checking logs grouped by HTTP status: if you see 200s in your infra but errors in crawlers, suspect headers first.
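Grouping by status can be done with a few lines of script. This sketch assumes your access logs use the common "combined" format (as in default nginx or Apache setups). That is an assumption on my part, so adjust the regex if your log format differs. The status_counts name is made up for the example.

```python
import re
from collections import Counter

# Matches the request and status fields of a combined-format log line,
# e.g. "GET /llms.txt HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})')


def status_counts(lines, path="/llms.txt"):
    """Count responses per HTTP status for requests to `path`."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like request entries
        requested = match.group("path").split("?", 1)[0]  # ignore query string
        if requested == path:
            counts[match.group("status")] += 1
    return counts
```

Pass it any iterable of log lines, for instance an open file handle for a hypothetical access.log. If every entry for /llms.txt is a 200 while the crawler still reports an error, that points at the headers rather than the server.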
Godspeed
https://x.com/dom_sipowicz
https://www.linkedin.com/in/dominiksipowicz/