Liran Tal

Originally published at lirantal.com

What is an LLMs.txt File?

Large language models have moved into the mainstream, beyond code generation and documentation and, quite significantly, into many kinds of human-computer interaction. How do we give these LLMs more accurate context so that they do not hallucinate, and so that these models, whether GPT-4o, Claude 3.7 Sonnet, or any other, can be more reliable, trustworthy sources of information? Meet the llms.txt file format.

What is the LLMs.txt file?

The llms.txt file is a newly proposed standard intended to provide large language models with relevant context and metadata in the form of a simple text file, which may be formatted as plain-text Markdown.
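To make the format more concrete, here is a minimal sketch in Python. It assumes the layout described in the public llms.txt proposal: an H1 title, a blockquote summary, and H2 sections that list Markdown links with optional notes. The sample content, the example.com URLs, and the `parse_llms_txt` helper are all hypothetical and only illustrate the idea.

```python
import re

# Hypothetical llms.txt content; the project name and URLs are made up.
SAMPLE = """\
# Example Project

> A short summary of what the project does.

## Docs

- [Quick start](https://example.com/docs/quickstart.md): Install and run
- [API reference](https://example.com/docs/api.md)
"""

# Matches "- [name](url)" with an optional ": notes" suffix.
LINK = re.compile(
    r"^- \[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<notes>.*))?$"
)


def parse_llms_txt(text):
    """Parse the title, summary, and per-section link lists from llms.txt text."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK.match(line)
            if match:
                result["sections"][current].append(
                    {
                        "name": match["name"],
                        "url": match["url"],
                        "notes": match["notes"] or "",
                    }
                )
    return result


if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE)
    print(parsed["title"])
    for section, links in parsed["sections"].items():
        print(section, [link["url"] for link in links])
```

Because the file is plain Markdown, a few string checks and one regular expression are enough here; a real consumer would likely want a proper Markdown parser and more forgiving handling of malformed lines.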

What are the use cases for the LLMs.txt file?

Originally, the llms.txt file was intended to let AI-based agents scrape data from websites more easily, so that these self-learning, autonomous agents do not need to deal with HTML parsing, JavaScript loading, and other web scraping struggles. Instead, a website can provide a simple llms.txt file that contains the relevant context for each page, and LLMs can digest it quickly without spending further compute on parsing.
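As a rough illustration of that workflow, the sketch below shows how an agent might look for a site's llms.txt instead of scraping its HTML. It assumes the file is served at the root path `/llms.txt`; the helper names `llms_txt_url` and `fetch_llms_txt`, the user-agent string, and the example.com address are hypothetical.

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def llms_txt_url(page_url):
    """Return the root-level llms.txt URL for the site that hosts page_url."""
    # Joining with an absolute path keeps the scheme and host, drops the rest.
    return urljoin(page_url, "/llms.txt")


def fetch_llms_txt(page_url, timeout=10.0):
    """Download the site's llms.txt as text, or return None if unavailable."""
    request = Request(
        llms_txt_url(page_url),
        headers={"User-Agent": "llms-txt-sketch/0.1"},  # hypothetical agent name
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except OSError:
        # Covers network failures and HTTP errors such as 404.
        return None


if __name__ == "__main__":
    text = fetch_llms_txt("https://example.com/docs/getting-started")
    print(text if text is not None else "no llms.txt found")
```

An agent could fall back to regular HTML scraping when `fetch_llms_txt` returns `None`, so the text file acts as a fast path when it is available.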

LLMs.txt for Websites

Because websites are organized as directories, you can generate llms.txt files and place them in the root directory of your website. They can also be placed on a documentation subdomain, so that GenAI code assistants can better embed the content and build context for code snippets and suggested code examples. Some examples of these websites include:

We’re already seeing LLM-context tooling emerge, such as the llmstxt Python project, which compresses files into a single, LLM-friendly text file designed to get codebases ready for analysis by large language models.
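The general idea of bundling a codebase into one LLM-friendly file can be sketched in a few lines of Python. This is not the llmstxt project's actual implementation or interface; the `bundle_text_files` function, the `# FILE:` header convention, and the default suffixes are assumptions made up for this example.

```python
from pathlib import Path


def bundle_text_files(root, suffixes=(".py", ".md"), out_name="bundle.txt"):
    """Concatenate matching files under root into one text file.

    Each file is preceded by a "# FILE: <relative path>" header so a model
    can tell where one file ends and the next begins.
    """
    root = Path(root)
    out_path = root / out_name
    parts = []
    # Sort for a stable, reproducible order across runs.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes and path != out_path:
            rel = path.relative_to(root).as_posix()
            body = path.read_text(encoding="utf-8", errors="replace")
            parts.append(f"# FILE: {rel}\n{body.rstrip()}\n")
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return out_path


if __name__ == "__main__":
    print(bundle_text_files("."))
```

Filtering by suffix keeps binary assets out of the bundle, which matters because every byte in the file counts against the model's context window.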

What’s next for LLMs.txt?

Given that contextual information is at the core of LLM integrations and agentic frameworks, are we going to see llms.txt take different shapes and forms and reach beyond websites?

I personally think so. Some ideas that come to mind are to put llms.txt files in the following hubs as a starting point:

  • GitHub repositories
  • DockerHub images
  • The npm registry

LLMs.txt Directory

With the newly proposed llms.txt standard, new directories have emerged that index llms.txt files and let you search for and discover websites that have adopted the format. Some of these are:

Next up

LLMs are more ubiquitous than ever, but if you don’t want to risk your privacy or spend money, learn how to run a local LLM for inference with an offline-first approach.
