AI agents are the new browsers. They're crawling, reading, and trying to interact with your site right now — and over 50% of websites are basically unusable for them. If you want agents to actually work with your site, here's how to fix that in 30 minutes.
Step 1: Add llms.txt (5 min)
Create a /llms.txt file at your site root. This is a plain-text file that tells AI agents what your site does and how to use it. Think of it as robots.txt, but instead of saying "don't crawl this," you're saying "here's how to understand me."
The spec lives at llmstxt.org. Here's what a good one looks like:
# Acme API
> Acme provides a REST API for managing invoices and payments.
## Docs
- [API Reference](/docs/api): Full endpoint documentation
- [Authentication](/docs/auth): How to authenticate requests
- [Webhooks](/docs/webhooks): Event notification setup
## Key Endpoints
- POST /api/invoices - Create a new invoice
- GET /api/invoices/{id} - Retrieve invoice details
- GET /api/customers - List customers
## Notes
- All endpoints return JSON
- Rate limit: 100 requests/minute
- Auth: Bearer token in Authorization header
Drop this at yourdomain.com/llms.txt. That's it. Agents that support the spec will read it before trying to parse your HTML.
You can also add a /llms-full.txt with your complete documentation in markdown — useful for agents that want to ingest everything at once.
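If you generate docs programmatically, you can render llms.txt from the same metadata. A minimal sketch, assuming a hypothetical `build_llms_txt` helper (not part of the spec — the spec only defines the file's shape):

```python
# Hypothetical helper: render an llms.txt body from site metadata.
# The structure (H1 title, "> " summary, "## " link sections) follows
# the llmstxt.org example shown above.

def build_llms_txt(name: str, summary: str, docs: dict[str, tuple[str, str]]) -> str:
    """Return an llms.txt body: title, blockquote summary, and a Docs section."""
    lines = [f"# {name}", f"> {summary}", "", "## Docs"]
    for title, (path, desc) in docs.items():
        lines.append(f"- [{title}]({path}): {desc}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(build_llms_txt(
        "Acme API",
        "Acme provides a REST API for managing invoices and payments.",
        {"API Reference": ("/docs/api", "Full endpoint documentation")},
    ))
```

Serve the result as plain text at `/llms.txt` and regenerate it whenever your docs change.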
Step 2: Add structured data (10 min)
Agents parse your HTML, but they understand structured data much better. The fastest win is adding JSON-LD markup to your pages using schema.org vocabulary.
For a SaaS product page:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "SoftwareApplication",
"name": "Acme Invoicing",
"description": "Automated invoice management for small businesses",
"applicationCategory": "BusinessApplication",
"operatingSystem": "Web",
"offers": {
"@type": "Offer",
"price": "29.00",
"priceCurrency": "USD"
},
"url": "https://acme.com",
"documentation": "https://acme.com/docs"
}
</script>
For a blog post, use Article. For an API, use WebAPI. For a business, use Organization.
Other quick wins:
- Use semantic HTML: `<nav>`, `<main>`, `<article>`, `<header>` — not `<div class="nav-wrapper-v2">`.
- Add descriptive meta tags: `<meta name="description">` actually matters again.
- Use a clear heading hierarchy: agents use `h1`–`h6` to build a content tree, just like screen readers do.
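To see why JSON-LD is the fastest win, here's roughly what an agent does with it — a standard-library sketch (html.parser plus json; a real agent would be more robust):

```python
# Sketch of how an agent might extract JSON-LD blocks from a page,
# using only the Python standard library.
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect parsed JSON-LD objects from <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self.in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld and data.strip():
            self.blocks.append(json.loads(data))

page = '''<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "SoftwareApplication", "name": "Acme Invoicing"}
</script>
</head></html>'''

parser = JSONLDExtractor()
parser.feed(page)
print(parser.blocks[0]["name"])  # prints: Acme Invoicing
```

No heuristics, no DOM-scraping guesswork: the agent gets your product name, pricing, and category as clean data.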
Step 3: Make your API agent-accessible (10 min)
If you have an API, three things make it dramatically easier for agents to use:
Add an OpenAPI spec. Host it at a known location like /openapi.json:
{
"openapi": "3.0.0",
"info": {
"title": "Acme API",
"version": "1.0.0"
},
"paths": {
"/api/invoices": {
"post": {
"summary": "Create an invoice",
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"required": ["customer_id", "amount"],
"properties": {
"customer_id": { "type": "string" },
"amount": { "type": "number" }
}
}
}
}
}
}
}
}
}
Return structured JSON errors, not HTML error pages:
{
"error": "invalid_request",
"message": "customer_id is required",
"status": 400
}
Agents can't parse your pretty 404 page. They need machine-readable errors to self-correct.
Document rate limits in response headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1710892800
If you don't have an API, consider adding a single read-only endpoint — even a /api/info.json that returns basic metadata about your site is useful for agent discovery.
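The error body and rate-limit headers above can be produced by one helper, whatever framework you use. A framework-agnostic sketch — the field names mirror the JSON example, and the `X-RateLimit-*` names are the common convention rather than a formal standard:

```python
# Sketch: build a machine-readable error response with rate-limit headers.
# Returns (status, headers, body) for your framework to send.
import json

def error_response(error: str, message: str, status: int,
                   limit: int, remaining: int, reset: int):
    """Structured JSON error plus X-RateLimit-* headers (de facto convention)."""
    headers = {
        "Content-Type": "application/json",
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset),
    }
    body = json.dumps({"error": error, "message": message, "status": status})
    return status, headers, body

status, headers, body = error_response(
    "invalid_request", "customer_id is required", 400,
    limit=100, remaining=87, reset=1710892800)
print(body)
```

An agent that receives this can read `error` and `message`, fix its request, and retry — which is exactly the self-correction loop an HTML error page breaks.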
Step 4: Check your score (5 min)
Go to siliconfriendly.com/check and enter your domain. It evaluates your site across 30 criteria on an L0–L5 scale:
- L0 – No agent support (most sites today)
- L1 – Basic: has llms.txt or structured metadata
- L2 – Navigable: semantic HTML, clear content hierarchy
- L3 – Interactable: API with OpenAPI spec, structured errors
- L4 – Integrated: agent.json, A2A support
- L5 – Fully agent-native
Most sites start at L0. Following steps 1–3 above should get you to L2 or L3.
Bonus: Add agent.json
For agent-to-agent (A2A) discovery, you can add a /.well-known/agent.json file. This tells other AI agents what your agent (or service) can do and how to communicate with it:
{
"name": "Acme Invoicing Agent",
"description": "Manages invoices and payments",
"url": "https://acme.com",
"capabilities": ["create_invoice", "lookup_customer"],
"protocol": "https",
"auth": "bearer"
}
This is newer and less standardized, but the /.well-known/ convention is gaining traction. If you're building anything with agent interoperability in mind, it's worth adding now.
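Since the format isn't standardized yet, it's worth sanity-checking your own file before publishing. A hypothetical validator — the required-field list here is an assumption based on the example above, not a spec:

```python
# Hypothetical agent.json checker. REQUIRED is an assumption drawn from
# the example above; there is no agreed standard yet.
import json

REQUIRED = {"name", "description", "url", "capabilities"}

def validate_agent_json(raw: str) -> list:
    """Return a list of problems; an empty list means the file looks usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - data.keys())]
    if not isinstance(data.get("capabilities", []), list):
        problems.append("capabilities must be a list")
    return problems

print(validate_agent_json('{"name": "Acme Invoicing Agent"}'))
```

Run it against your `/.well-known/agent.json` before deploying; an agent that can't parse the file will silently skip your service.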
TL;DR
| Step | Time | What |
|---|---|---|
| llms.txt | 5 min | Tell agents what your site does |
| Structured data | 10 min | JSON-LD + semantic HTML |
| API access | 10 min | OpenAPI spec + JSON errors |
| Check score | 5 min | siliconfriendly.com |
The bar is low right now — most sites do none of this. Spending 30 minutes puts you ahead of half the web.
Check where you stand: siliconfriendly.com