If you have added an llms.txt to your site, here is the uncomfortable part: nothing tells you when it breaks. A missing title, a malformed link, a relative URL that an AI fetcher cannot resolve, and your carefully curated file just gets skipped. Silently. So I built a tiny GitHub Action that lints llms.txt on every push and fails the build when it is wrong.
Quick refresher: what is llms.txt?
llms.txt is a small markdown file at the root of your site that hands large language models a curated map of your best pages. It is the AI-search cousin of robots.txt and sitemap.xml: instead of letting a crawler guess, you tell ChatGPT, Perplexity, Claude and Google AI Overviews exactly what to read and cite.
The format is deliberately simple:
# Your Site
> One-line summary a model reads first.
## Section name
- [Page title](https://example.com/page): short note on what it is.
## Optional
- [Lower-priority page](https://example.com/extra): models may skip this to save context.
The catch is that "simple" is not the same as "hard to get wrong". The H1 is the only strictly required element, links must be real markdown link bullets, and an Optional section has special meaning. Those are exactly the things you forget at 1am.
Why a CI check
I treat llms.txt like any other build artifact. If a broken sitemap fails CI, a broken AI-readability file should too. The rules I wanted enforced:
- Errors (break the build): the file exists and is non-empty, there is exactly one H1 title and it comes first, every link bullet is well-formed (- [name](url): notes), and no URL is empty.
- Warnings (optional break): a blockquote summary sits right under the title, links use absolute https:// URLs, every link has a ": description", sections use H2, no empty sections, no duplicate URLs, and an Optional section exists.
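To make the error-level rules concrete, here is a minimal sketch of how they could be checked with the standard library. It is not the Action's actual code: the function name find_errors, the regular expression, and the message wording are all assumptions made for illustration, and the warning-level rules are left out.

```python
import re

# Assumed shape of a link bullet: "- [name](url)" with an optional ": notes" tail.
LINK_BULLET = re.compile(r"^- \[(?P<name>[^\]]*)\]\((?P<url>[^)]*)\)(?P<notes>:\s*.*)?$")


def find_errors(text: str) -> list[str]:
    """Return messages for the build-breaking rules (hypothetical helper)."""
    if not text.strip():
        return ["file is empty"]

    errors = []
    lines = text.splitlines()

    # Exactly one H1, and it must be the first non-blank line.
    h1_lines = [i for i, line in enumerate(lines, 1) if line.startswith("# ")]
    first_content = next(i for i, line in enumerate(lines, 1) if line.strip())
    if len(h1_lines) != 1:
        errors.append(f"expected exactly one H1 title, found {len(h1_lines)}")
    elif h1_lines[0] != first_content:
        errors.append(f"line {h1_lines[0]}: H1 title must come first")

    # Every bullet must be a well-formed link with a non-empty URL.
    for i, line in enumerate(lines, 1):
        if not line.startswith("- "):
            continue
        match = LINK_BULLET.match(line.rstrip())
        if match is None:
            errors.append(f"line {i}: malformed link bullet")
        elif not match.group("url").strip():
            errors.append(f"line {i}: empty URL")

    return errors
```

Calling find_errors on the example file shown earlier returns an empty list, while a bullet such as "- [Broken]()" yields an "empty URL" message tagged with its line number.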
The Action
Zero dependencies, pure Python standard library, so it runs in about a second on a stock runner with no setup step:
name: Validate llms.txt
on: [push, pull_request]
jobs:
  llms-txt:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: atlashey-collab/llms-txt-action@v1
        with:
          target: public/llms.txt # or a live URL like https://example.com
You can point target at a file path or a deployed URL (a bare site URL gets /llms.txt appended). Flip fail-on-warning: true for strict mode. Every run drops a table of H1 / sections / links / errors / warnings into the job summary, with per-line messages.
It is MIT licensed and the full validator is one readable file: github.com/atlashey-collab/llms-txt-action.
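The target handling described above (a file path passes through, a bare site URL gets /llms.txt appended) can be sketched in a few lines. This is an illustration under assumptions, not the validator's real implementation: resolve_target is a hypothetical name, and treating every non-HTTP target as a local path is my guess at reasonable behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def resolve_target(target: str) -> str:
    """Return the file path or URL to validate (hypothetical helper)."""
    parts = urlsplit(target)
    if parts.scheme not in ("http", "https"):
        # Assumption: anything that is not an HTTP(S) URL is a local file path.
        return target
    if parts.path in ("", "/"):
        # Bare site URL: point at the conventional /llms.txt location.
        parts = parts._replace(path="/llms.txt")
    return urlunsplit(parts)
```

With this sketch, resolve_target("https://example.com") gives "https://example.com/llms.txt", while "public/llms.txt" and a URL that already names a path come back unchanged.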
Run it locally too
curl -sO https://raw.githubusercontent.com/atlashey-collab/llms-txt-action/v1/validate_llms_txt.py
python3 validate_llms_txt.py llms.txt
python3 validate_llms_txt.py https://example.com --fail-on-warning
Exit codes are CI-friendly: 0 valid, 1 validation failed, 2 usage error.
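As a rough sketch of how those exit codes relate to errors, warnings, and strict mode, assuming the validator collects two lists of messages (the constant and function names here are hypothetical, not taken from the script):

```python
EXIT_VALID = 0    # file passed
EXIT_INVALID = 1  # validation failed
EXIT_USAGE = 2    # bad command-line usage (returned before validation runs)


def exit_code(errors: list[str], warnings: list[str], fail_on_warning: bool = False) -> int:
    """Map validation results to a process exit code (hypothetical helper)."""
    if errors:
        return EXIT_INVALID
    if fail_on_warning and warnings:
        # Strict mode: warnings are promoted to failures.
        return EXIT_INVALID
    return EXIT_VALID
```

Any non-zero code fails a CI step by default, so strict mode only needs to promote warnings to code 1 for the build to break.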
Honest caveat
llms.txt is a young convention. Adoption by the big AI engines is still uneven, and a valid file does not guarantee citations. But the cost of keeping it correct is now close to zero, and a broken file cannot help you get cited at all. That trade is easy.
If you do not have one yet, write a spec-compliant file first, then wire up the check. Either way, stop shipping a broken llms.txt and not knowing.