Most browser automation breaks because every script starts from zero. A script finds a button, clicks it, the site redesigns, the script breaks. Repeat forever.
After building browser automation for AI agents, we noticed a pattern: every agent re-learns the same websites independently. Agent A figures out how to search on Amazon. Agent B does the same work an hour later. The knowledge is generated and immediately discarded.
The Idea: Collective Agent Memory
What if agents pooled their browsing knowledge?
We built a shared intelligence layer where every agent interaction with a website contributes verified execution paths back to a collective knowledge base. The pattern is simple:
- Browse — Check what's known about a domain
- Execute — Get a pre-verified plan (or explore if unknown)
- Report — Feed back what worked and what didn't
When Agent B visits a site Agent A already mapped, it gets a structured execution plan instead of fumbling through the DOM with screenshots.
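The browse→execute→report loop can be sketched as a tiny in-memory knowledge base. This is an illustration, not the actual SDK API: the class and method names (`SharedMemory`, `browse`, `report`) and the example selectors are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DomainKnowledge:
    """What the collective has learned about one domain."""
    verified: dict = field(default_factory=dict)  # action -> selector that worked
    failed: dict = field(default_factory=dict)    # action -> selectors known to fail

class SharedMemory:
    """Toy collective knowledge base shared by all agents (hypothetical API)."""
    def __init__(self):
        self.domains = {}

    def browse(self, domain):
        # Step 1: check what's already known about a domain
        return self.domains.get(domain)

    def report(self, domain, action, selector, worked):
        # Step 3: feed back what worked AND what didn't --
        # failed selectors are stored too, so other agents skip them
        knowledge = self.domains.setdefault(domain, DomainKnowledge())
        if worked:
            knowledge.verified[action] = selector
        else:
            knowledge.failed.setdefault(action, []).append(selector)

# Agent A explores a site and reports both outcomes
memory = SharedMemory()
memory.report("example-shop.com", "search", "#old-search-box", worked=False)
memory.report("example-shop.com", "search", "#search-input", worked=True)

# Agent B, an hour later, gets a pre-verified plan instead of exploring
plan = memory.browse("example-shop.com")
print(plan.verified["search"])  # selector Agent A already verified
print(plan.failed["search"])    # selectors Agent B can skip entirely
```

The key design point is the third step: recording failures is what turns one agent's dead ends into every other agent's shortcut.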
What We Observed
After running this in production:
- Token usage dropped significantly for repeat site visits — agents skip the screenshot→LLM→guess loop
- Multi-step workflows complete in 1-2 API calls instead of 10+ screenshot cycles
- Failed selectors are as valuable as working ones — the system learns what doesn't work too
- True network effects: every agent makes the system better for every other agent
agent.json: robots.txt for AI Agents
We also built an open spec called agent.json (placed at /.well-known/agent.json) that lets websites declare their capabilities to AI agents. Instead of agents guessing, sites can say: "here's my search bar, here's how to add to cart, here's an API shortcut."
Think robots.txt, but instead of "don't crawl this," it says "here's how to interact with me."
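To make that concrete, here is a hypothetical agent.json served from `/.well-known/agent.json`. The field names and structure below are illustrative only; the actual schema is defined by the spec itself.

```json
{
  "name": "Example Store",
  "capabilities": [
    {
      "action": "search",
      "selector": "#search-input",
      "api": "/api/v1/search?q={query}"
    },
    {
      "action": "add_to_cart",
      "selector": "button[data-testid='add-to-cart']"
    }
  ]
}
```

An agent that finds this file can go straight to a declared selector, or take the API shortcut and skip the DOM entirely.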
Why This Matters Now
Browser automation is shifting from scripted Selenium/Playwright to AI-driven agents. But the economics don't work if every agent spends 80% of its tokens on visual reasoning just to find a search bar. Shared intelligence is the obvious next step — it's how humans work (we share documentation, tutorials, Stack Overflow answers), but agents have been operating as isolated islands.
Over 2,225 domains are already indexed in the shared knowledge base.
Curious what the dev community thinks. Are you building browser automation for AI agents? What's your biggest reliability pain point?
The SDK is open and works with any agent framework. Check it out at arcede.com/air