DEV Community

Hiroshi Toyama

Using llms.txt with Cursor and Claude Code: a concrete playbook

llms.txt is a small text file on a documentation site. It typically states what the product is and links to the important Markdown pages. For coding agents, treat it as the canonical URL to open first when upstream behavior is unclear. This post is mostly setup and workflow, not theory.

What goes where

| Location | Put this there |
| --- | --- |
| Official doc server | `https://example.com/llms.txt` (maintained by the library/vendor) |
| Your repo (agent rules) | URLs only (and short protocols), not a copy of their docs |
| Your repo (`.cursor/rules/`) | Project map, conventions, your architecture; not Next.js's full manual |

If you paste thousands of tokens of upstream docs into rules, every chat pays for them. Keeping pointers in rules and loading docs on demand avoids that.

One-time setup: a dedicated rules file

Create something like .cursor/rules/external-llms-docs.md (name does not matter; keep it scoped). Paste a stable list of llms.txt URLs your stack actually uses, grouped so humans and agents scan quickly.

```markdown
# External docs — fetch on demand

Use web fetch / browser / search tools to load these when implementing or debugging
third-party behavior. Do not paste full upstream docs into the chat.

## Index URLs (read these first)

| Area | llms.txt |
| --- | --- |
| Next.js | https://nextjs.org/llms.txt |
| Tailwind | https://tailwindcss.com/llms.txt |
| Lucide | https://lucide.dev/llms.txt |
| Google ADK | https://adk.dev/llms.txt |

## Read order

1. Fetch the **llms.txt** for the dependency that owns the question.
2. Follow **only** links from that file (or obvious `/docs/*.md` siblings) for depth.
3. Prefer Markdown sources over scraping marketing HTML.
4. If types exist locally (`node_modules`, stubs), use them **after** you know which API surface applies (avoids guessing wrong symbols).

## Scope

- Questions about **our** repo layout → use `repo-map` rule / codebase search, not llms.txt.
- Questions about **their** API/version/docs → use the table above.
```
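The read order can be sketched as a small helper that pulls linked doc URLs out of a fetched llms.txt and prefers Markdown sources. This is a sketch only: real indexes vary in layout, and the regexes here assume Markdown-style links or bare bullet URLs.

```python
import re

def markdown_links(llms_txt: str) -> list[str]:
    """Extract doc URLs from an llms.txt index, preferring Markdown sources."""
    # [title](url) links plus bare URLs on bullet lines
    urls = re.findall(r"\]\((https?://[^)\s]+)\)", llms_txt)
    urls += re.findall(r"(?m)^\s*[-*]\s*(https?://\S+)$", llms_txt)
    seen: dict[str, None] = {}          # keep first-seen order, drop duplicates
    for u in urls:
        seen.setdefault(u, None)
    md = [u for u in seen if u.endswith(".md")]
    return md or list(seen)             # fall back to everything if no .md links
```

An agent tool wrapper would fetch the index over HTTPS, run something like this, and then fetch only the links relevant to the question at hand.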

Why a separate file? Cursor injects rules based on context, and a fat global rules file makes every unrelated edit heavier. Split internal pointers from external ones.

Agent protocol (copy into the same file or AGENTS.md)

Make the sequence explicit so the model does not default to “grep node_modules for an hour.”

```markdown
## External SDK protocol

When the user asks for behavior that depends on an external library version or API:

1. Identify which dependency owns the feature (package.json / imports).
2. If this file lists an llms.txt for that dependency, **fetch it before** writing code.
3. Summarize in ≤10 lines: version assumptions, file names, and APIs you will use—then implement.
4. Do not quote entire upstream pages back to the user; cite chapter/section or URL path only.
```

Concrete workflows

Implement a feature (e.g. App Router auth middleware).

  1. User: “Add middleware-based auth with Next.js App Router.”
  2. Agent: fetch https://nextjs.org/llms.txt, open the linked page that describes middleware.ts / matcher patterns.
  3. Implement using current filenames and signatures from that fetch—not memory.

Debug “works on my machine” / deprecation.

  1. User: “Tailwind v4 class names stopped working after upgrade.”
  2. Agent: fetch Tailwind’s llms.txt first; confirm breaking-change notes and config file names, then open repo tailwind.config.* / CSS entry.

SDK with tiered dumps (example pattern).

Some sites expose a short index and a long bundle (names vary). Rule of thumb: start short, upgrade to full only if the stub did not answer.

```text
# hypothetical layout on a docs host
/llms.txt          → links + overview
/llms-small.txt    → minimal surface (cheap)
/llms-full.txt     → everything (expensive)
```

Point your rules at the entry (llms.txt); let the fetched content tell the agent whether *-full exists.
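The shallow-first logic looks roughly like this. `load_index` and the tier names are hypothetical, and the fetcher is injected so it can be a web tool, an MCP fetch, or a stub:

```python
from typing import Callable, Optional

def load_index(base: str,
               fetch: Callable[[str], Optional[str]],
               answers_question: Callable[[str], bool]) -> Optional[str]:
    """Start with the entry file; escalate to the full dump only if the
    index did not answer and it advertises an llms-full.txt variant."""
    index = fetch(base + "/llms.txt")
    if index is None:
        return None
    if answers_question(index):
        return index
    if "llms-full.txt" in index:    # the index itself says a full dump exists
        return fetch(base + "/llms-full.txt") or index
    return index
```

With a stub fetcher such as `site.get` over a dict of URL → text, you can see that the full dump is fetched only when the cheap index fails to answer.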

Prompts that reinforce good habits

You can nudge behavior per task without editing rules:

  • “Before editing: fetch Next.js llms.txt and confirm middleware filename and export shape.”
  • “Use ADK llms.txt; don’t rely on training cutoff for API names.”
  • “After fetching Tailwind llms.txt, list which doc URLs you used (paths only).”

Minimal internal llms.txt (optional)

If you ship an internal library or architecture handbook on HTTPS, you can publish your own index at https://internal-docs.example.com/llms.txt:

```markdown
# Internal platform — LLM index

## Auth
- Overview: https://internal-docs.example.com/auth/overview.md
- Breaking changes 2026: https://internal-docs.example.com/auth/changelog.md

## Data layer
- API conventions: https://internal-docs.example.com/db/conventions.md
```

Then add one row to `.cursor/rules/external-llms-docs.md`: `| Internal platform | https://internal-docs.example.com/llms.txt |`. Same mechanics as vendor docs.
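If the internal index is generated rather than hand-written, a small build step over the docs tree keeps it current. This is a sketch under assumptions: one H2 per top-level folder, one bullet per Markdown file, titles derived from filenames.

```python
from pathlib import Path

def build_llms_txt(docs_root: Path, base_url: str, title: str) -> str:
    """Emit a minimal llms.txt index: one H2 per top-level folder,
    one bullet per Markdown file beneath it."""
    lines = [f"# {title}", ""]
    for section in sorted(p for p in docs_root.iterdir() if p.is_dir()):
        lines += [f"## {section.name.capitalize()}", ""]
        for md in sorted(section.rglob("*.md")):
            rel = md.relative_to(docs_root).as_posix()
            label = md.stem.replace("-", " ").capitalize()
            lines.append(f"- {label}: {base_url}/{rel}")
        lines.append("")
    return "\n".join(lines)
```

Run it in CI whenever the docs change and publish the output at the site root, so the index never drifts from the actual pages.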

Tooling reality check

This pattern assumes the agent can retrieve HTTPS text (built-in fetch, browser tool, MCP fetch, etc.). Air-gapped machines need a fallback: mirrored snippets in rules, a local static server, or a vendor tarball, though you then accept the resident token cost.

Do not put authenticated URLs with secrets in rules; use public docs or internal SSO-aware tooling outside plain markdown.

Anti-patterns

  • Dumping full upstream Markdown into .cursorrules “so the agent always knows.”
  • Skipping llms.txt and crawling random marketing pages (noisy HTML, wasted tokens).
  • Duplicating vendor docs under docs/vendor/ and indexing everything, unless you truly need offline access.

SEO note (short)

Search-engine teams have questioned llms.txt as an SEO lever; that is largely orthogonal. For coding agents, the win is predictable Markdown entrypoints and smaller always-on context—not rankings.

Summary

  1. Add .cursor/rules/external-llms-docs.md with a table of llms.txt URLs plus read order and scope (external vs internal repo map).
  2. Teach agents: fetch index → follow linked Markdown → then local types.
  3. Use tiered files shallow-first when the provider offers them.
  4. Optionally host your own llms.txt for internal platforms; still keep rules as pointers only.
