AI crawlers are becoming more common, and controlling how they access your site is important. I've been experimenting with LLM.txt files to set access rules for AI bots. Here's a Python script that uses the SERPSpur LLM.txt Generator to create one of these files:
```python
import requests

API_KEY = "your_api_key_here"  # placeholder; replace with your own key


def generate_llm_txt(domain, rules):
    """Request an LLM.txt file for `domain` and save it as llm.txt."""
    url = "https://serpspur.com/tool/llms-txt-generator-tool/"
    payload = {
        "domain": domain,
        "rules": rules,  # e.g., {"allow": ["/blog/"], "disallow": ["/admin/"]}
        "api_key": API_KEY,
    }
    # A timeout keeps the script from hanging if the service does not respond.
    response = requests.post(url, json=payload, timeout=30)
    if response.status_code == 200:
        with open("llm.txt", "w", encoding="utf-8") as f:
            f.write(response.text)
        print("LLM.txt generated!")
    else:
        print(f"Error: {response.status_code}")


# Example
rules = {"allow": ["/public/"], "disallow": ["/private/", "/api/"]}
generate_llm_txt("mysite.com", rules)
```
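This lets me define which sections AI crawlers may access. It's a neat way to mark sensitive paths as off limits while keeping public pages open, though it only works for crawlers that choose to honor the file, so truly private content still needs real access control. Have you set up LLM.txt for your site yet?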
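To sanity-check a rule set without calling any service, here is a minimal sketch that renders the same rules dictionary as robots.txt directives and tests paths against them with Python's standard `urllib.robotparser`. This is a swapped-in technique, not how the SERPSpur generator works. The helper names `rules_to_robots` and `is_allowed` are hypothetical, and `GPTBot` is used only as an example user-agent token.

```python
from urllib.robotparser import RobotFileParser


def rules_to_robots(rules, user_agents=("*",)):
    """Render an allow/disallow dict as robots.txt text (hypothetical helper)."""
    lines = []
    for agent in user_agents:
        lines.append(f"User-agent: {agent}")
        lines.extend(f"Allow: {path}" for path in rules.get("allow", []))
        lines.extend(f"Disallow: {path}" for path in rules.get("disallow", []))
        lines.append("")  # a blank line separates user-agent groups
    return "\n".join(lines)


def is_allowed(robots_text, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


rules = {"allow": ["/public/"], "disallow": ["/private/", "/api/"]}
robots_text = rules_to_robots(rules, user_agents=["GPTBot"])  # example token
print(robots_text)
print(is_allowed(robots_text, "GPTBot", "https://mysite.com/public/page"))  # True
print(is_allowed(robots_text, "GPTBot", "https://mysite.com/private/data"))  # False
```

Python's parser applies the first matching rule in file order, so if allow and disallow prefixes overlap, the order in which they are written decides the result.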
Top comments (1)
Nice approach! I've been manually editing robots.txt for AI bots, but LLM.txt sounds like a more structured way to define access rules. Do you find that most major crawlers respect this file, or is it still early days for adoption?