DEV Community

Ashley Pfeiffer

We Scored 100 Sites on Agent Readiness. Every One Failed.

AI agents are the next consumer of the web. Claude, ChatGPT, Siri, Copilot — they're all trying to interact with websites on behalf of users. But the web has no front door for them.

I built a scoring framework and ran it against 100 of the most visited sites. Four dimensions, 25 points each, 100 points possible.

Average score: 40. Highest: 48. Sites that passed: zero.

What I Measured

Discovery (0-25): Can an agent find out what your site does? Does it have a machine-readable capability manifest? An MCP endpoint? An llms.txt file?

Structure (0-25): Is the content semantically organized? Schema.org markup, clean HTML, sitemaps, meta tags.

Actions (0-25): Can an agent actually do things? Public API, formal spec (OpenAPI/GraphQL), documented authentication.

Policies (0-25): Does the site set rules for agents? Rate limits, brand voice guidelines, escalation paths, data handling policies.
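To make the arithmetic concrete, here is a minimal TypeScript sketch of how the four dimension scores could roll up into a total. The type and function names are illustrative, the sample numbers are made up, and this is not the scorer's actual implementation.

```typescript
// The four dimensions from the post, each scored 0-25.
type Dimension = "discovery" | "structure" | "actions" | "policies";

type ScoreBreakdown = Record<Dimension, number>;

const MAX_PER_DIMENSION = 25;

// Clamp one dimension's score into the 0-25 range.
function clampScore(value: number): number {
  return Math.min(MAX_PER_DIMENSION, Math.max(0, value));
}

// Sum the four clamped dimension scores into a 0-100 total.
function totalScore(breakdown: ScoreBreakdown): number {
  const dimensions: Dimension[] = ["discovery", "structure", "actions", "policies"];
  return dimensions.reduce((sum, d) => sum + clampScore(breakdown[d]), 0);
}

// Hypothetical site: strong on Structure, weak on Discovery and Policies.
const example: ScoreBreakdown = { discovery: 2, structure: 22, actions: 12, policies: 4 };
console.log(totalScore(example)); // 40
```

Equal weighting is the simplest reading of "25 points each"; a weighted variant would only change the reducer.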

The Surprising Finding

Most sites score well on Structure. Schema.org, clean HTML, HTTPS, sitemaps — the semantic foundation is solid.

The gap is entirely in Discovery and Policies. No site tells agents what capabilities are available. No site defines brand voice guidelines for agents. No site defines when an agent should stop and get a human.

The web invested heavily in being readable by search engine crawlers. It invested nothing in being discoverable by AI agents.

The Fix: agent.json

I built agent.json — an open spec that sits at your domain root (like robots.txt) and tells agents everything they need to know.

{
  "name": "Acme Fashion",
  "spec_version": "1.0",
  "description": "Premium fashion retailer",
  "capabilities": [
    { "name": "search_products", "type": "query" },
    { "name": "place_order", "type": "action" }
  ],
  "brand_voice": {
    "tone": "warm, knowledgeable, never pushy",
    "prohibited": ["competitor comparisons"]
  },
  "policies": {
    "rate_limit": "100/minute",
    "data_handling": "no_training_on_interactions"
  },
  "humans": {
    "triggers": ["complaint", "legal_question"],
    "channels": [{ "type": "email", "url": "support@acme.com" }]
  }
}

Four required fields. Two minutes to create.
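For a sense of how an agent might consume this file, here is a rough TypeScript sketch. The field names are copied from the example above. Everything else is an assumption for illustration: the helper names, treating every field as optional, and the `<count>/<unit>` reading of `rate_limit`. None of it is part of the spec.

```typescript
// Shape mirrored from the example agent.json. Which fields the spec requires
// is not modeled here, so every field is optional in this sketch (assumption).
interface AgentManifest {
  name?: string;
  spec_version?: string;
  description?: string;
  capabilities?: { name: string; type: string }[];
  brand_voice?: { tone?: string; prohibited?: string[] };
  policies?: { rate_limit?: string; data_handling?: string };
  humans?: {
    triggers?: string[];
    channels?: { type: string; url: string }[];
  };
}

// The file sits at the domain root, like robots.txt.
function agentJsonUrl(origin: string): string {
  return origin.replace(/\/+$/, "") + "/agent.json";
}

// Parse a "<count>/<unit>" rate limit such as "100/minute".
// The format is inferred from the single example value (assumption).
function parseRateLimit(value: string): { count: number; unit: string } | null {
  const match = /^(\d+)\/([a-z]+)$/.exec(value.trim());
  if (match === null) return null;
  return { count: Number(match[1]), unit: match[2] };
}

// True when the topic matches one of the manifest's human-escalation triggers.
function shouldEscalate(manifest: AgentManifest, topic: string): boolean {
  const triggers = manifest.humans?.triggers ?? [];
  return triggers.indexOf(topic) !== -1;
}

const manifest: AgentManifest = {
  name: "Acme Fashion",
  spec_version: "1.0",
  policies: { rate_limit: "100/minute" },
  humans: { triggers: ["complaint", "legal_question"] },
};

console.log(agentJsonUrl("https://yoursite.com/")); // https://yoursite.com/agent.json
console.log(parseRateLimit(manifest.policies?.rate_limit ?? "")); // count 100, unit "minute"
console.log(shouldEscalate(manifest, "complaint")); // true
console.log(shouldEscalate(manifest, "order_status")); // false
```

Exact string matching on triggers is the simplest possible policy. A real agent would likely classify the conversation first and then compare the resulting label against the trigger list.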

Try It Yourself

Score any site:

npx agentweb score https://yoursite.com

You get a breakdown across all four dimensions with a letter grade and specific recommendations.

Generate a starter agent.json:

npx agentweb init --name "My Site" --industry retail

Pre-fills capabilities, brand voice, policies, and escalation triggers for your industry.

Proxy any site as an MCP server (no code changes on the origin):

npx @agentweb-dev/middleware --origin https://yoursite.com

This creates an MCP server in front of any website with tools for browsing, search, structured data extraction, policy lookup, and human escalation.
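For context on what an agent sends to a server like this: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The TypeScript sketch below builds one. The tool name `search` and its `query` argument are hypothetical stand-ins, since the middleware's actual tool names and argument schemas are not listed here.

```typescript
// JSON-RPC 2.0 request envelope that MCP uses to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a tools/call request for the given tool name and arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "search" and "query" are hypothetical; check the server's tools/list
// response for the real tool names and input schemas.
const request = buildToolCall(1, "search", { query: "linen shirts" });
console.log(JSON.stringify(request));
```

In practice an MCP client library handles this envelope for you; the sketch only shows what travels over the wire.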

The Stack

Everything is open source and published to npm:

| Package | What It Does |
| --- | --- |
| agentweb | Unified CLI: score, init, validate, generate |
| @agentweb-dev/middleware | MCP proxy for any website |
| @agentweb-dev/commerce | Structured catalog + agent negotiation |
| @agentweb-dev/seo | Agent visibility analytics |

Why I Built This

I work in data infrastructure (Director of Client Delivery at Cherre, a real estate data platform). My day job is about unifying fragmented data into a single source of truth so organizations can make better decisions.

The same problem exists on the open web for agents. The data is there (Schema.org, APIs, HTML), but there's no unified layer that tells agents what a site can do and how to interact with it. agent.json is that layer.

What's Next

The spec is early. The scoring methodology probably has gaps. I'd love feedback on what dimensions are missing, what the scoring weights should be, and what industries need specific attention.

Repo: github.com/ashpfeif12/agentweb

Full scoring report: LAUNCH_POST.md

If you score your own site, drop the result in the comments. I'm curious what patterns emerge across different industries.
