I recently needed to control how AI crawlers access my documentation site, so I built a small tool to generate LLM.txt files. Here's how you can do it with the SERPSpur API:
python
import requests

API_KEY = "your_api_key_here"


def generate_llm_txt(rules):
    """Send the rules to the SERPSpur API and return the generated file as text."""
    response = requests.post(
        "https://api.serpspur.com/v1/llm-txt/generate",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"rules": rules},
        timeout=30,  # avoid hanging forever if the API is unreachable
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


# Example rules to allow only certain AI agents
rules = {
    "user_agent": ["GPTBot", "Google-Extended"],
    "disallow": ["/private/", "/api/"],
    "allow": ["/public/", "/docs/"],
}

llm_content = generate_llm_txt(rules)
print(llm_content)
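If you want a rough idea of what rules like these express before calling the API, here is a minimal local sketch. It assumes the generated file uses robots.txt-style directives, which is only a guess about the output format. `render_rules` is a hypothetical helper written for illustration and is not part of the SERPSpur API. The check at the end uses Python's standard `urllib.robotparser` to see how a robots.txt-style parser would read the result.

```python
from urllib.robotparser import RobotFileParser


def render_rules(rules):
    """Render a rules dict as robots.txt-style text.

    Hypothetical helper: the directive layout is an assumption,
    not the documented output of any API.
    """
    lines = []
    for agent in rules.get("user_agent", []):
        lines.append(f"User-agent: {agent}")
        lines.extend(f"Allow: {path}" for path in rules.get("allow", []))
        lines.extend(f"Disallow: {path}" for path in rules.get("disallow", []))
        lines.append("")  # blank line separates one agent group from the next
    return "\n".join(lines)


rules = {
    "user_agent": ["GPTBot", "Google-Extended"],
    "disallow": ["/private/", "/api/"],
    "allow": ["/public/", "/docs/"],
}

preview = render_rules(rules)
print(preview)

# Feed the preview to the standard-library parser and query two paths.
parser = RobotFileParser()
parser.parse(preview.splitlines())
docs_allowed = parser.can_fetch("GPTBot", "/docs/intro")
private_allowed = parser.can_fetch("GPTBot", "/private/notes")
print(docs_allowed, private_allowed)
```

Previewing locally like this makes it easier to spot overlapping or contradictory paths before you spend an API call on them.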
This lets you spell out, per user agent, which paths AI crawlers may and may not fetch. Keep in mind that, like robots.txt, the file is advisory: it only affects crawlers that choose to honor it. Have you implemented any LLM.txt configurations for your projects?
Top comments (3)
Interesting approach! I've been manually crafting robots.txt variations for different crawlers, but this seems much more maintainable at scale. Do you handle rate-limiting or caching for repeated generation calls?
That's a smart way to handle AI crawlers. I've set up LLM.txt for my blog, but had trouble configuring it for multiple user agents with overlapping rules. Does your tool handle priority conflicts well?
Interesting use case. I've been relying on robots.txt for crawler control, but LLM.txt seems like a good supplement for AI-specific instructions. Are there any gotchas when both files are present?