Add llms.txt to your site in 5 minutes and get cited by ChatGPT, Perplexity & Co.

#ai #llm #tutorial #webdev

Most AI search tools (ChatGPT Search, Perplexity, Microsoft Copilot) crawl the web to find pages they can cite. The problem: they can't always tell which parts of your site are worth reading, or what your site is actually about. llms.txt is a proposed convention for telling them.

It's a plain-text Markdown file you put at https://yoursite.com/llms.txt. Three lines are enough to get started.

The format

# Your Site Name

> One-sentence summary of what your site is and who it's for.

## Important pages

- [Page Title](https://yoursite.com/page): What this page covers.
- [Another Page](https://yoursite.com/other): Short description.

The first # heading is your site name. The > blockquote is a short summary that a crawler or model can use as context when judging whether your site is relevant to a query. Each ## section lists the pages you want prioritized, with an optional short description after each link.

That's the entire format. No build step, no framework, no schema validation — just a file at the right path.
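
To make the structure concrete, here is a minimal sketch of a reader for that shape. The function name parseLlmsTxt and the returned object layout are illustrative assumptions, not part of any official tooling, and it only handles the heading, blockquote, section and link lines described above rather than full Markdown.

```javascript
// Minimal llms.txt reader: handles only the "# title", "> summary",
// "## section" and "- [Title](url): description" lines. Not a Markdown parser.
function parseLlmsTxt(text) {
  const result = { title: null, summary: null, sections: [] };
  let current = null; // section currently being filled

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue;

    if (line.startsWith('## ')) {
      current = { name: line.slice(3).trim(), links: [] };
      result.sections.push(current);
    } else if (line.startsWith('# ') && result.title === null) {
      result.title = line.slice(2).trim();
    } else if (line.startsWith('> ') && result.summary === null) {
      result.summary = line.slice(2).trim();
    } else if (current !== null) {
      // Matches "- [Title](url)" with an optional ": description" suffix.
      const m = line.match(/^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$/);
      if (m) {
        current.links.push({ title: m[1], url: m[2], description: m[3] ?? '' });
      }
    }
  }
  return result;
}

const sample = [
  '# Your Site Name',
  '',
  "> One-sentence summary of what your site is and who it's for.",
  '',
  '## Important pages',
  '',
  '- [Page Title](https://yoursite.com/page): What this page covers.',
  '- [Another Page](https://yoursite.com/other)',
].join('\n');

const parsed = parseLlmsTxt(sample);
console.log(parsed.title, parsed.sections[0].links.length);
// prints: Your Site Name 2
```

If a file of yours comes back with a null title or an empty links array, that is usually a sign a line deviates from the shape above.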

Why it matters

Without llms.txt, an AI crawler has to infer what your site is about from the content of individual pages. It might index your blog posts but miss your documentation. It might understand what your tool does but not know which page to cite for a specific question.

With llms.txt, you give it a map. You're saying: this site is for developers who need X, and the pages most worth reading are these.

llms.txt is still a proposal rather than a ratified standard, and AI search vendors say little publicly about whether their crawlers read it. Treat it as a cheap signal, not a guarantee: it's not mandatory, but it costs almost nothing to add.

A real example

Here's a llms.txt for a developer API tool:

# Sprytools

> Developer APIs for common backend tasks — email validation, IP geolocation, IBAN parsing, currency conversion — accessible behind one key.

## Key pages

- [API Hub & Pricing](https://sprytools.com/api): All APIs, pricing tiers, self-serve key management.
- [Email Validation](https://sprytools.com/tools/email-validation): Syntax + MX + disposable detection.
- [IP Geolocation](https://sprytools.com/tools/ip-geolocation): IPv4/IPv6, country, city, ASN.
- [IBAN Validator](https://sprytools.com/tools/iban-validator): MOD-97, 80+ countries, parsed output.

## About

All APIs share one key. No per-service sign-up. Free tier on each endpoint.

The description is factual, not aspirational. "Developer APIs for common backend tasks" tells a model what this site is in one phrase. "Empowering developers to build better APIs faster" does not.

Where to put it

The file goes at the root: https://yourdomain.com/llms.txt. Subdirectory paths won't be auto-discovered.

Static site / Next.js: drop it in public/:

public/
  llms.txt

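Express or Hono: serve your public/ directory as static files, or add a route. The handler below uses Hono's context object; in Express the equivalent handler is (req, res) => res.type('text/plain').send(...):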

app.get('/llms.txt', (c) => {
  return c.text(`# My Site\n\n> Description.\n\n## Key pages\n\n- [Home](https://example.com/)\n`);
});

Astro: same as Next.js — put it in public/llms.txt.
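
If you aren't using a framework at all, the same route can be sketched with Node's built-in http module. This is a minimal example under assumptions: the inline LLMS_TXT string stands in for a file you would normally read from disk, and handleRequest is an illustrative name.

```javascript
const http = require('node:http');

// Placeholder content; in practice you would read this from a file.
const LLMS_TXT =
  '# My Site\n\n> Description.\n\n## Key pages\n\n- [Home](https://example.com/)\n';

// Answers GET and HEAD requests for /llms.txt, 404 for everything else.
function handleRequest(req, res) {
  const { pathname } = new URL(req.url, 'http://localhost');
  const isRead = req.method === 'GET' || req.method === 'HEAD';

  if (pathname === '/llms.txt' && isRead) {
    res.writeHead(200, {
      'Content-Type': 'text/plain; charset=utf-8',
      'Content-Length': Buffer.byteLength(LLMS_TXT),
    });
    // HEAD responses carry headers only, no body.
    res.end(req.method === 'HEAD' ? undefined : LLMS_TXT);
    return;
  }

  res.writeHead(404, { 'Content-Type': 'text/plain; charset=utf-8' });
  res.end('Not found\n');
}

const server = http.createServer(handleRequest);
// Call server.listen(3000) to start serving. It is left out here so the
// sketch can be loaded without opening a port.
```

Handling HEAD as well as GET matters because a header-only check such as curl -I sends a HEAD request.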

After deploying, confirm it's reachable:

curl -I https://yoursite.com/llms.txt
# Should return HTTP/2 200, content-type: text/plain
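
The same check can be scripted, for example as a post-deploy step. The sketch below assumes Node 18 or later for the global fetch; checkLlmsTxtResponse and verifyLlmsTxt are illustrative names, and the text/plain expectation simply mirrors the curl comment above.

```javascript
// Pure check on a status code and a Content-Type header value.
// Returns a list of problems; an empty list means the response looks fine.
function checkLlmsTxtResponse(status, contentType) {
  const problems = [];
  if (status !== 200) {
    problems.push(`expected status 200, got ${status}`);
  }
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mime = (contentType ?? '').split(';')[0].trim().toLowerCase();
  if (mime !== 'text/plain') {
    problems.push(`expected text/plain, got "${mime || 'none'}"`);
  }
  return problems;
}

// Network wrapper around the check. fetch follows redirects by default,
// so the status examined here is that of the final response.
async function verifyLlmsTxt(siteUrl) {
  const target = new URL('/llms.txt', siteUrl);
  const res = await fetch(target, { method: 'HEAD' });
  return checkLlmsTxtResponse(res.status, res.headers.get('content-type'));
}
```

Calling verifyLlmsTxt('https://yoursite.com').then(console.log) prints an empty array when both checks pass.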

Point to it from robots.txt

Add a pointer in your robots.txt file alongside the Sitemap directive:

Sitemap: https://yoursite.com/sitemap.xml
LLMs: https://yoursite.com/llms.txt

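The LLMs: line isn't part of the robots.txt standard or of the llms.txt proposal; it only mirrors the established Sitemap: convention. Parsers that follow the standard ignore lines they don't recognize, so it is harmless, and it documents the file's location for anyone reading robots.txt by hand. Don't count on crawlers acting on it.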

Common mistakes

Wrong path: the file must be /llms.txt at the root, not /docs/llms.txt or /blog/llms.txt. Tools that look for it check the root, so subdirectory paths won't be found automatically.

Blocking AI crawlers in robots.txt while adding llms.txt: crawlers don't need an explicit Allow, since anything not disallowed is fetchable by default, but a Disallow: / rule that applies to them (including one under User-agent: *) blocks /llms.txt along with every other path.

Generic description: the > blockquote is the most important line. "Platform for developers" tells a model nothing. "REST API for email validation, DNS lookup, and IBAN parsing" tells it exactly what kind of queries your site can answer.

Missing UTF-8: save the file as UTF-8 without a BOM. In other encodings, non-ASCII characters in site names or descriptions can cause parse errors in some crawlers.
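
The robots.txt mistake in particular is easy to check mechanically. Below is a simplified sketch, not a complete robots.txt implementation: it groups rules by User-agent, uses plain prefix matching (no * or $ wildcards inside paths), and lets the longest matching rule win. The function name isBlocked is illustrative.

```javascript
// Does this robots.txt block `path` for `userAgent`? First-pass check only.
function isBlocked(robotsTxt, userAgent, path = '/llms.txt') {
  const groups = []; // each: { agents: [...], rules: [{ allow, prefix }] }
  let group = null;
  let lastWasAgent = false;

  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.split('#')[0].trim(); // strip comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group.
      if (!lastWasAgent) {
        group = { agents: [], rules: [] };
        groups.push(group);
      }
      if (value !== '') group.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === 'allow' || field === 'disallow') && group) {
      // An empty Disallow value means "nothing is disallowed".
      if (value !== '') group.rules.push({ allow: field === 'allow', prefix: value });
      lastWasAgent = false;
    }
  }

  // Prefer groups that name this agent; otherwise fall back to the * groups.
  const ua = userAgent.toLowerCase();
  let matching = groups.filter((g) => g.agents.some((a) => a !== '*' && ua.includes(a)));
  if (matching.length === 0) matching = groups.filter((g) => g.agents.includes('*'));

  // Longest matching prefix wins; Allow wins a tie.
  let best = null;
  for (const rule of matching.flatMap((g) => g.rules)) {
    if (!path.startsWith(rule.prefix)) continue;
    const longer = best === null || rule.prefix.length > best.prefix.length;
    const tieAllow = best !== null && rule.prefix.length === best.prefix.length && rule.allow;
    if (longer || tieAllow) best = rule;
  }
  return best !== null && !best.allow;
}
```

For example, isBlocked('User-agent: *\nDisallow: /\n', 'GPTBot') returns true, while adding an Allow: /llms.txt line to the same group makes it return false because the longer rule takes precedence.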

Verify with CiteReady

After deploying, you can verify the full picture — llms.txt plus the four other AI readiness signals — in one call:

curl -X POST https://citeready-api.sprytools.com/v1/audit \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://yoursite.com"}'

The llms_txt check in the response tells you whether the file was found and parseable:

"llms_txt": {
  "status": "pass",
  "score": 20,
  "details": "llms.txt found at /llms.txt"
}

A missing or unreachable file returns fail. The most common causes are a wrong content-type or a 404 after deploy, so run curl -I against the file before assuming it is live.
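
In a script, the same call and check might look like the sketch below. It assumes Node 18 or later for the global fetch, and it assumes the llms_txt object sits at the top level of the JSON response, which the fragment above does not confirm. The function names are illustrative.

```javascript
// Assumption: the audit response is a JSON object with an `llms_txt` entry
// shaped like the fragment above. Other fields are unknown and ignored.
function summarizeLlmsTxtCheck(auditResponse) {
  const check = auditResponse?.llms_txt;
  if (!check || typeof check.status !== 'string') {
    return { ok: false, message: 'no llms_txt check in response' };
  }
  return {
    ok: check.status === 'pass',
    message: check.details ?? `status: ${check.status}`,
  };
}

// Same request as the curl call: POST a JSON body containing the site URL.
async function auditSite(url) {
  const res = await fetch('https://citeready-api.sprytools.com/v1/audit', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url }),
  });
  if (!res.ok) throw new Error(`audit request failed: HTTP ${res.status}`);
  return summarizeLlmsTxtCheck(await res.json());
}
```

Keeping the summary step separate from the network call makes it easy to adjust if the real response nests the check differently.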

Free tier at https://citeready.sprytools.com — 3 checks/day, no signup.

Have you added llms.txt to any of your projects? Curious whether anyone has noticed a measurable change in Perplexity or ChatGPT citations after adding it.