The first time I checked my server logs and saw GPTBot, PerplexityBot, ClaudeBot and Google-Extended all crawling pages I'd never optimized for them, it clicked: a growing slice of my "search traffic" is now an AI model reading my site and deciding whether to quote it. And those models do not read like a browser. They have a budget (a context window), and most web pages spend that budget on nav bars, cookie banners, and <script> tags before they ever reach the actual content.
llms.txt is the small, boring fix for that. Here is what it is, why it is worth 20 minutes, and a free generator I built so you do not have to hand-write the Markdown.
What llms.txt actually is
It is a plain Markdown file you put at the root of your domain:
https://yoursite.com/llms.txt
Think of it as the mirror image of robots.txt. Where robots.txt tells crawlers what to avoid, llms.txt (proposed at llmstxt.org) hands a model a short, curated map of the pages you most want it to read and cite. The format is deliberately tiny:
# Your Project Name
> One-line summary of what this site is and who it is for.
Optional paragraph of extra context the model should know.
## Docs
- [Getting Started](https://yoursite.com/start): The 5-minute setup walkthrough
- [API Reference](https://yoursite.com/api): Every endpoint with examples
## Optional
- [Changelog](https://yoursite.com/changelog): Release history
That is the whole spec worth knowing: an H1 title, a blockquote summary, then ## sections holding Markdown links with short descriptions. A section literally named Optional signals pages a model can skip when its context is tight.
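If you want to sanity-check a file against that shape, here is a minimal Python sketch. The names `parse_llms_txt`, `LINK_RE`, and `SAMPLE` are my own for illustration and are not part of the llms.txt proposal. The regex only handles the simple `- [title](url): description` link form shown above.

```python
import re

# Matches "- [Title](https://example.com/page): optional description"
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)

SAMPLE = """# Your Project Name
> One-line summary of what this site is and who it is for.

Optional paragraph of extra context the model should know.

## Docs
- [Getting Started](https://yoursite.com/start): The 5-minute setup walkthrough
- [API Reference](https://yoursite.com/api): Every endpoint with examples

## Optional
- [Changelog](https://yoursite.com/changelog): Release history
"""


def parse_llms_txt(text):
    """Parse an llms.txt string into title, summary, and sections of links."""
    title = None
    summary = None
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    if title is None:
        raise ValueError("llms.txt needs an H1 title")
    return {"title": title, "summary": summary, "sections": sections}


parsed = parse_llms_txt(SAMPLE)
for heading, links in parsed["sections"].items():
    print(heading, len(links))
```

Free-form paragraphs before the first section are skipped on purpose here, since the sketch only cares about the parts a tool would index: the title, the summary, and the links.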
Why bother, honestly
I want to be straight about this, because there is a lot of hype: llms.txt is a young, voluntary standard. No model is contractually obligated to read it, and publishing one will not magically get you cited. What it does do is real and worth it:
- It removes friction. A model that fetches a clean index of your 8 best pages, with one-line descriptions, spends its limited attention on your substance instead of parsing 200KB of HTML to find it.
- It is a curation signal. You are telling the engine "these are the canonical pages, not the 400 tag-archive URLs." That is genuinely useful when a model is deciding what represents your site.
- It costs nothing and breaks nothing. It is a static text file. There is no downside, no render cost, no risk to your existing SEO.
So I treat it the way I treat a sitemap: not a ranking lever, just basic hygiene for how machines read me. The difference is the machines are now LLMs.
How to ship one in 10 minutes
- List your 5 to 15 most important pages. Not everything, the best things: your docs, your top guides, your product pages. If a page would embarrass you as your site's representative to ChatGPT, leave it out.
- Write a one-line summary of the site and a one-line description per link. Concrete beats clever: the description is what the model reads to decide relevance.
- Group them into named sections (Docs, Guides, Products, About) and put low-priority links under Optional.
- Save it as llms.txt, upload to your domain root, and confirm it loads at yoursite.com/llms.txt. Use absolute https:// URLs throughout.
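The steps above can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the code behind any particular tool: `build_llms_txt` and the example site data are hypothetical, and the only rule it enforces is the "absolute https:// URLs" one.

```python
from urllib.parse import urlparse


def build_llms_txt(name, summary, sections):
    """Render a name, summary, and {section: [(title, url, description)]} as llms.txt."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            parts = urlparse(url)
            # Reject relative or non-https links before they reach the file.
            if parts.scheme != "https" or not parts.netloc:
                raise ValueError(f"not an absolute https URL: {url!r}")
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data; put low-priority links under "Optional".
content = build_llms_txt(
    "Example Site",
    "Docs for a hypothetical project.",
    {
        "Docs": [
            ("Getting Started", "https://yoursite.com/start", "The 5-minute setup walkthrough"),
        ],
        "Optional": [
            ("Changelog", "https://yoursite.com/changelog", "Release history"),
        ],
    },
)
print(content)
```

Write `content` to a file named llms.txt and upload it to your domain root. Section order follows the order of the dictionary, so list Optional last.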
That is it. If you want maximum coverage later, you can add an llms-full.txt that inlines the full text of key pages into one file, but start with the curated index.
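For the llms-full.txt idea, a rough sketch might look like the following. The layout is my own assumption (one H2 per page followed by a source line), since the only requirement stated here is that the full text of key pages ends up in one file. `build_llms_full` and the page data are hypothetical, and the sketch assumes you already have each page's text as Markdown.

```python
def build_llms_full(name, pages):
    """Join (title, url, body) tuples into one Markdown document, one H2 per page."""
    chunks = [f"# {name}"]
    for title, url, body in pages:
        # Assumed layout: heading, source URL, then the page text.
        chunks.append(f"## {title}\n\nSource: {url}\n\n{body.strip()}")
    return "\n\n".join(chunks) + "\n"


full = build_llms_full(
    "Example Site",
    [
        ("Getting Started", "https://yoursite.com/start", "Install it.\n"),
        ("API Reference", "https://yoursite.com/api", "  GET /items  "),
    ],
)
print(full)
```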
The free generator
Writing the Markdown by hand is fine for one site. Doing it across a few projects, keeping the links absolute, and not fat-fingering the format gets old. So I built a free, no-signup generator that runs entirely in your browser.
You fill in your site name, a summary, and your sections and links; it outputs a spec-compliant llms.txt you can copy or download as a file. Nothing is uploaded: it is all client-side JavaScript, so you can safely draft one for an unpublished site. It seeds with a sensible starter template so you are editing, not staring at a blank box.
If you have already shipped an llms.txt, I would genuinely like to hear what you put in your Optional section. That is the part most people get wrong: they either dump everything in it or skip it entirely. Drop your URL in the comments and I will take a look.