How We Cut Browser Agent Costs 7,000x with Collective Intelligence

#opensource #ai #webdev #automation

Every browser agent does the same expensive thing: dump the entire DOM into an LLM, ask "what should I click?", and repeat for every step. A 10-step workflow costs ~$4 in LLM tokens and ~50 seconds in reasoning time. Multiply that by every agent, every session, every day — and you're burning cash on knowledge that already exists.

The core problem: amnesia at scale

When Agent A figures out how to search flights on WebsiteA, that knowledge evaporates when the session ends. Agent B starts from scratch. So does Agent C. Every agent pays full price to re-learn what hundreds of agents have already discovered.

This is the browser automation equivalent of every developer rewriting left-pad from scratch, every time, in every project.

What if agents could share what they learn?

We built AIR SDK — an open-source MCP server that maintains a shared capability graph across all connected agents. Three API calls:

browse_capabilities(domain)   → What actions can be performed here?
execute_capability(action, params) → How do I do it? (CSS selectors, API paths, macros)
report_outcome(steps, success)     → Here's what actually worked.

The third call is the key. When your agent reports which CSS selectors worked on a given site, every future agent on that domain benefits instantly. No DOM parsing, no LLM reasoning — just a lookup.
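
To make the browse, execute, report loop concrete, here is a minimal in-memory sketch. It is not the AIR SDK's actual API: the class name, the method signatures (which take an explicit domain argument), and the Capability shape are all assumptions made for illustration. It only shows why a reported outcome turns the next agent's request into a lookup.

```typescript
// Hypothetical shape of one learned capability; the real SDK's types may differ.
interface Capability {
  action: string;
  selectors: string[]; // CSS selectors last reported as working
  successes: number;
  failures: number;
}

// Illustrative stand-in for the shared capability graph (assumption, not the SDK).
class InMemoryCapabilityGraph {
  private byDomain = new Map<string, Map<string, Capability>>();

  // Call 1: which actions are already known for this domain?
  browseCapabilities(domain: string): string[] {
    const actions = this.byDomain.get(domain);
    return actions ? Array.from(actions.keys()) : [];
  }

  // Call 2: how is the action performed? undefined means nothing is known yet.
  executeCapability(domain: string, action: string): Capability | undefined {
    const actions = this.byDomain.get(domain);
    return actions ? actions.get(action) : undefined;
  }

  // Call 3: record what happened so later agents can skip the DOM-to-LLM step.
  reportOutcome(domain: string, action: string, selectors: string[], success: boolean): void {
    const actions = this.byDomain.get(domain) ?? new Map<string, Capability>();
    const cap = actions.get(action) ?? { action, selectors: [], successes: 0, failures: 0 };
    if (success) {
      cap.selectors = selectors; // keep the selectors that worked
      cap.successes += 1;
    } else {
      cap.failures += 1; // a failed attempt does not overwrite known-good selectors
    }
    actions.set(action, cap);
    this.byDomain.set(domain, actions);
  }
}
```

In this sketch, the first agent to visit a domain gets an empty list from browseCapabilities and has to fall back to ordinary LLM reasoning. Once it calls reportOutcome with the selectors that worked, every later executeCapability call for that domain and action returns them directly.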

The numbers

We benchmarked AIR SDK against raw DOM-to-LLM browser automation on identical tasks:

Scenario	Raw DOM + LLM	AIR SDK	Reduction
1 action (e.g., click search)	$0.24	$0.0006	400x
10-step workflow	$4.13	$0.0006	~7,000x
Time per action	~5s	<100ms	>50x

The savings compound. The more agents use AIR, the more capabilities get verified, and the cheaper and faster every subsequent interaction becomes.
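
The reduction column can be checked with plain division. The snippet below only redoes the arithmetic on the benchmark figures quoted in the table. The helper name reductionFactor is ours, and the time row treats "<100ms" as exactly 100 ms, so its result is a lower bound.

```typescript
// Ratio of baseline cost (or latency) to the AIR SDK figure.
function reductionFactor(baseline: number, withAir: number): number {
  return baseline / withAir;
}

// Dollar figures and timings are copied from the benchmark table.
const singleAction = reductionFactor(0.24, 0.0006); // about 400
const tenStepWorkflow = reductionFactor(4.13, 0.0006); // about 6,883, rounded to 7,000x in the headline
const timePerAction = reductionFactor(5000, 100); // 50, using 5 s versus a 100 ms upper bound
```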

agent.json: robots.txt for AI agents

We also built agent.json — a standard for websites to declare what AI agents can do. Think robots.txt, but instead of "don't crawl this," it says "here's how to search, here's how to add to cart, here's the API shortcut."

More than 2,225 domains are already indexed. Sites can publish their own file at /.well-known/agent.json, or AIR can learn a site's capabilities automatically from agent reports.
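
As a rough illustration of how a client might locate and validate such a file, here is a sketch. The /.well-known/agent.json path comes from the text above. Everything else is an assumption: this post does not describe the agent.json schema, so the field names (capabilities, name, description, apiPath) and both helper functions are hypothetical.

```typescript
// Hypothetical manifest shape; the real agent.json schema may look different.
interface DeclaredCapability {
  name: string;
  description: string;
  apiPath?: string; // optional API shortcut, if the site declares one
}

interface AgentJson {
  capabilities: DeclaredCapability[];
}

// Build the well-known URL for a bare domain or an origin such as "https://example.com/".
function agentJsonUrl(domain: string): string {
  const host = domain.replace(/^https?:\/\//, "").replace(/\/+$/, "");
  return `https://${host}/.well-known/agent.json`;
}

// Parse and minimally validate a manifest under the assumed shape above.
function parseAgentJson(raw: string): AgentJson {
  const data = JSON.parse(raw) as { capabilities?: unknown } | null;
  if (data === null || !Array.isArray(data.capabilities)) {
    throw new Error("expected a top-level capabilities array");
  }
  const capabilities = data.capabilities.map((entry: unknown): DeclaredCapability => {
    const c = (entry ?? {}) as Partial<DeclaredCapability>;
    if (typeof c.name !== "string" || typeof c.description !== "string") {
      throw new Error("each capability needs a name and a description");
    }
    const cap: DeclaredCapability = { name: c.name, description: c.description };
    if (typeof c.apiPath === "string") cap.apiPath = c.apiPath;
    return cap;
  });
  return { capabilities };
}
```

Validating before use matters here because the file is published by third parties: a malformed manifest should be rejected rather than passed to an agent as trusted instructions.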

How it works under the hood

AIR resolves capabilities through a tiered system:

1. API fast-path: if the site has a known API endpoint, skip the browser entirely
2. Verified macro: pre-verified CSS selector sequences from prior agent reports
3. Selector hints: partial knowledge that reduces LLM reasoning cost
4. Heuristic plan: for unknown sites, AIR generates a best-guess execution plan

Each tier is progressively more expensive but handles progressively less-explored territory. Most production traffic hits tiers 1-2 at near-zero cost.
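
The tier order above amounts to a first-match fallthrough. The sketch below shows that control flow only. The Knowledge and Plan types and the resolve function are hypothetical names for this illustration, not AIR's internals, and the real system presumably weighs more signals than "is this field present".

```typescript
type Tier = "api" | "macro" | "hints" | "heuristic";

// What is known about one action on one site (assumed shape).
interface Knowledge {
  apiEndpoint?: string;
  verifiedSelectors?: string[];
  selectorHints?: string[];
}

interface Plan {
  tier: Tier;
  detail: string[];
}

// Try the cheapest tier first and fall through to more expensive ones.
function resolve(action: string, known: Knowledge = {}): Plan {
  if (known.apiEndpoint) {
    return { tier: "api", detail: [known.apiEndpoint] }; // no browser needed
  }
  if (known.verifiedSelectors && known.verifiedSelectors.length > 0) {
    return { tier: "macro", detail: known.verifiedSelectors }; // replay known-good steps
  }
  if (known.selectorHints && known.selectorHints.length > 0) {
    return { tier: "hints", detail: known.selectorHints }; // LLM still needed, with less context
  }
  // Nothing known: the caller falls back to a best-guess plan for this action.
  return { tier: "heuristic", detail: [`best-guess plan for "${action}"`] };
}
```

Ordering the checks from cheapest to most expensive means a site only reaches the heuristic tier when the graph knows nothing about it, which matches the claim that most traffic resolves in the first two tiers.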

Get started

npm install @arcede/air-sdk

Or add it as an MCP server in Claude Desktop, Cursor, or any MCP-compatible client.

Free tier: 1,000 capability executions/month. No credit card required.

MIT licensed. GitHub repo — stars appreciated.

We're building AIR at Arcede. If you're working on browser agents and want to integrate, reach out at info@arcede.com or open an issue on GitHub.
