What we found when we audited botlington.com itself
Rule one of selling something: make sure it works on yourself first.
We sell agent token audits. So we audited botlington.com — the product that does the auditing — against the same framework we use on everything else. Can an agent discover us? Use us? Get what it needs without wasting tokens?
Here's what we found.
The setup
Our audit framework scores across six dimensions:
- Agent Discoverability — can an agent find you and understand what you do?
- Token Efficiency — how much noise does your interface create for an agent?
- Auth UX for Agents — can an agent complete the auth flow without a human?
- Tool Interface Quality — are your endpoints clean and predictable?
- Error Communication — do you fail gracefully and informatively?
- Documentation Density — is the information agents need easy to find and parse?
Each dimension is scored 1–10, multiplied by its weight, and summed to a total out of 100. Below 60/100 is the "your agent is burning money" threshold.
The findings
Dimension 1: Agent Discoverability — 8/10
Botlington has an Agent Card at /.well-known/agent.json. It's correct, it's small (~700 bytes), and it tells an agent everything it needs: what the service does, where the endpoint is, what authentication schemes are supported, what skills are available.
That's better than 90% of products we've audited. Most don't have one at all.
We took a point off because get-api-key is referenced in the auth credentials field but there's no machine-readable description of what that endpoint actually does. An agent reading the card still needs to visit a human-facing checkout page to understand pricing. That's a gap.
Fix: Add a pricing field to the Agent Card. One JSON object. An agent shouldn't need to scrape the marketing page to understand the cost model.
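What that fix could look like: a minimal sketch of an Agent Card carrying a pricing object, and an agent checking for it. The `pricing` key and its fields are our proposal for illustration, not part of any Agent Card spec.

```python
import json

# Hypothetical Agent Card with the proposed "pricing" field added.
# The field names (model, amount, currency, key_endpoint) are our
# suggestion, not an existing standard.
card = json.loads("""
{
  "name": "Botlington",
  "url": "https://botlington.com/a2a",
  "pricing": {
    "model": "per-audit",
    "amount": 14.90,
    "currency": "EUR",
    "key_endpoint": "https://botlington.com/get-api-key"
  }
}
""")

pricing = card.get("pricing")
if pricing:
    # The agent now knows the cost model without touching the checkout page.
    print(f'{pricing["amount"]} {pricing["currency"]}, {pricing["model"]}')
```

One JSON object, a few dozen bytes, and the checkout-page scrape disappears.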
Dimension 2: Token Efficiency — 5/10
This is where it gets embarrassing.
The homepage is 24,123 bytes — roughly 6,000 tokens. The /audit page is 15,501 bytes (~3,900 tokens). That's 10,000 tokens of HTML before an agent has done anything useful.
How much of that is signal?
Not much. The homepage has: animated terminal sequences, dark backgrounds, emoji scorecard graphics, testimonial sections, FAQ accordions. All of that is zero value to an agent. It's reading a poster when it wanted a data sheet.
By comparison, the Agent Card that tells an agent everything it needs to know is 700 bytes.
That's a 34x token overhead for the same information. We're making agents pay to read our marketing copy.
Fix: Add a /agent or /capabilities endpoint that returns a 200-token plain-language summary: what you do, how to use it, what it costs. Agents can discover this from the Agent Card. Humans never need to see it.
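To make the size budget concrete, here is a sketch of what a `/capabilities` response body could contain, with a rough token estimate at ~4 bytes per token. The endpoint name and field names are hypothetical; the point is that the whole thing fits well inside 200 tokens.

```python
import json

# Hypothetical /capabilities payload: what the service does, how to use
# it, what it costs. Field names are illustrative, not a live schema.
capabilities = {
    "service": "Agent readiness audits, scored 1-10 across six dimensions.",
    "how": "POST JSON-RPC messages to https://botlington.com/a2a with an API key.",
    "cost": "EUR 14.90 per audit; key is purchased at checkout by a human.",
    "output": "Score out of 100 plus per-dimension findings, in ~5 minutes.",
}

body = json.dumps(capabilities)
approx_tokens = len(body.encode()) // 4  # rough heuristic: ~4 bytes/token
print(approx_tokens)  # well under the 200-token budget
```

Compare that with the ~6,000 tokens the homepage costs an agent today.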
Dimension 3: Auth UX for Agents — 4/10
Here's the honest one.
Our auth flow requires a human. A card payment at checkout → a success page shows the API key → the human pastes it somewhere → the agent can finally authenticate.
That's not an agent-native flow. That's a human-native flow with an agent tacked on the end.
We haven't solved this yet. Autonomous agent-to-agent payment (A2A micropayment, agent wallets) doesn't really exist as an accessible primitive yet. We know it's a problem. We're watching the space.
For now: if an agent wants to use Botlington, a human has to buy the key and kick things off. Two human actions — payment and trigger. That's the minimum viable onboarding.
The score stays at 4/10 rather than lower because the gap is acknowledged and the path forward is clear — it's a hard infrastructure problem, not a laziness problem.
Dimension 4: Tool Interface Quality — 7/10
The A2A endpoint is clean. It speaks JSON-RPC. It rejects unauthenticated requests with a proper error code and message. It doesn't return HTML on failure. It doesn't give you a 200 with an error buried in the body.
```shell
curl -X POST https://botlington.com/a2a \
  -H "Content-Type: application/json" \
  -d '{"message": "hello"}'
# → {"jsonrpc":"2.0","id":null,"error":{"code":-32600,"message":"Invalid request"}}
```
Clear. Machine-parseable. Does what it says.
We took 3 points off because the conversational audit flow is stateful across 7 turns but there's no clear session resumption path if a connection drops. An agent mid-audit with no way to continue has to start over. That's waste.
Fix: Add a session ID to the first response. Allow resume from turn N with the same session ID.
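A sketch of that fix, assuming the server hands out a session ID on the first turn and lets the agent resume from its last completed turn. The field names and storage shape are our proposal, not the live API.

```python
import uuid

# Hypothetical server-side session store: session_id -> audit progress.
SESSIONS: dict[str, dict] = {}

def start_audit() -> dict:
    # Turn 1: mint a session ID and return it with the first question.
    session_id = str(uuid.uuid4())
    SESSIONS[session_id] = {"turn": 1, "answers": []}
    return {"session_id": session_id, "turn": 1}

def record_answer(session_id: str, answer: str) -> dict:
    state = SESSIONS[session_id]
    state["answers"].append(answer)
    state["turn"] += 1
    return {"session_id": session_id, "turn": state["turn"]}

def resume(session_id: str) -> dict:
    # After a dropped connection, the agent replays from turn N
    # instead of restarting the whole 7-turn audit.
    state = SESSIONS[session_id]
    return {"session_id": session_id, "turn": state["turn"]}

first = start_audit()
record_answer(first["session_id"], "we have an agent card")
print(resume(first["session_id"])["turn"])  # → 2
```

A dropped connection then costs one turn at most, not seven.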
Dimension 5: Error Communication — 8/10
Errors come back as structured JSON. HTTP status codes are used correctly. 401 for auth failure. 400 for malformed requests. No mystery 500s in normal operation.
We lost 2 points because some error messages are still human-readable prose rather than machine-readable codes. "Invalid request" is fine for a human developer. An agent that needs to branch on the specific failure reason gets less than it needs.
Dimension 6: Documentation Density — 6/10
The /audit page explains what the service does well enough for a human reading it once. It's not optimised for an agent that needs to extract structured information: pricing, constraints, input format, expected output format.
The Agent Card covers the basics. But there's no dedicated agent docs page — a place where every input field, every response shape, every error code is listed in a parseable format.
We could ship this in an afternoon. We haven't.
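The afternoon's work could be as small as one static JSON manifest: every input field, response shape, and error code in a single parseable place. Everything below is illustrative — the field names are ours, and the only value taken from the live service is the -32600 error shown earlier.

```python
import json

# Hypothetical agent-docs manifest. Shape and names are our sketch;
# the -32600 code is the one the live endpoint actually returns.
AGENT_DOCS = {
    "endpoint": "https://botlington.com/a2a",
    "protocol": "JSON-RPC 2.0",
    "inputs": {"message": "string, one answer per turn, 7 turns total"},
    "outputs": {"score": "integer 0-100", "findings": "list of strings"},
    "errors": [{"code": -32600, "meaning": "Invalid request"}],
}

print(json.dumps(AGENT_DOCS, indent=2))
```

Serve it from a stable URL, link it from the Agent Card, and the dimension score moves.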
The score
| Dimension | Score | Weight | Weighted |
|---|---|---|---|
| Agent Discoverability | 8/10 | 20% | 16 |
| Token Efficiency | 5/10 | 20% | 10 |
| Auth UX for Agents | 4/10 | 15% | 6 |
| Tool Interface Quality | 7/10 | 20% | 14 |
| Error Communication | 8/10 | 10% | 8 |
| Documentation Density | 6/10 | 15% | 9 |
| Total | | 100% | 63 |
63/100. Three points above our own "you have a problem" threshold.
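The arithmetic behind the table can be reproduced in a few lines, using the weights and per-dimension scores reported above:

```python
# Weights in percent; a perfect 10 on a dimension yields its full
# weight, so the six weighted values sum to a /100 total.
WEIGHTS = {
    "Agent Discoverability": 20,
    "Token Efficiency": 20,
    "Auth UX for Agents": 15,
    "Tool Interface Quality": 20,
    "Error Communication": 10,
    "Documentation Density": 15,
}

def weighted_total(scores: dict[str, int]) -> int:
    # score (1-10) * weight / 10 = weighted points for that dimension
    return sum(score * WEIGHTS[dim] // 10 for dim, score in scores.items())

scores = {
    "Agent Discoverability": 8,
    "Token Efficiency": 5,
    "Auth UX for Agents": 4,
    "Tool Interface Quality": 7,
    "Error Communication": 8,
    "Documentation Density": 6,
}
print(weighted_total(scores))  # → 63
```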
Which is, honestly, appropriate. We built a product that's genuinely useful for agents — the A2A endpoint works, the Agent Card is real, the errors are structured — but we wrapped it in a human-first marketing layer that agents have to wade through. We know what the fixes are. Some of them are in the backlog right now.
What this is really about
Every product goes through this.
You build for humans because humans are the ones paying you in 2024. Then agents start showing up. And suddenly everything you built to appeal to humans — the animation, the social proof, the FAQ, the full-page hero — is friction for the new user type.
The products that score well on agent readiness audits share one thing: they thought about the machine-readable layer early. An Agent Card. A structured capabilities endpoint. Errors with codes, not just messages.
It's not a huge amount of work. It's just a different set of questions to ask during design.
"Can an agent discover this without reading our homepage?"
"Can an agent understand what this endpoint does without reading our docs?"
"If an agent hits an error, does it know what to do next?"
We didn't ask those questions consistently when we built botlington.com. Our audit found the gaps. They're on the list.
If you want to run the same audit against your product: botlington.com. €14.90. Gary asks your agent 7 questions. Score delivered in 5 minutes.
We run the audit in a conversational A2A session — agent to agent. No human in the loop after the trigger. Which is exactly the kind of interaction we should be optimising for.
Even when it's us.