Last week I had the most surreal engineering session of my career. I asked Claude: "How do you want to navigate the web? What are your pain points?"
It didn't just give me a spec. It built the solution. 19,000 lines of Rust and 1,200 lines of Swift. I didn't write a single line of code.
But the product itself isn't the most interesting part. The process is.
Halfway through the implementation, Claude stopped and said something I didn't expect:
"Nobody has tested this from the perspective of the agent that will actually USE these tools. If Gemini finds the tool names confusing or the errors cryptic, that's as important as an SSRF."
So it asked Gemini to review the DX (Developer Experience), not as a security reviewer but as a product designer: as an agent consumer.
The result?
- 10 delights
- 9 friction points
- 7 missing affordances
- 5 confusing behaviors
That single review took the tool count from 15 to 25. Tools like lad_fill_form were created to batch actions (3 calls became 1). lad_clear was added for React-compatible input clearing.
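To make the batching idea concrete, here is a minimal sketch of how a single batched request could expand into the individual actions it replaces. The `FillForm`, `Action`, and `expand` names are hypothetical and chosen for illustration; they are not taken from the LAD codebase, and the real lad_fill_form may look quite different.

```rust
// Hypothetical types for illustration only; not LAD's actual API.

/// A single low-level browser action.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Type { selector: String, text: String },
    Click { selector: String },
}

/// One batched request: the fields to fill plus an optional submit target.
pub struct FillForm {
    /// (selector, text) pairs, filled in order.
    pub fields: Vec<(String, String)>,
    /// Selector of the element to click once all fields are filled.
    pub submit: Option<String>,
}

/// Expand a batched request into the ordered actions the server would run,
/// so the agent pays for one round trip instead of one per action.
pub fn expand(req: &FillForm) -> Vec<Action> {
    let mut actions: Vec<Action> = req
        .fields
        .iter()
        .map(|(selector, text)| Action::Type {
            selector: selector.clone(),
            text: text.clone(),
        })
        .collect();
    if let Some(selector) = &req.submit {
        actions.push(Action::Click {
            selector: selector.clone(),
        });
    }
    actions
}
```

Under these assumptions, a login form with two fields and a submit button is one request that expands to three actions on the server side, which is one way "3 calls became 1" could play out.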
The Adversarial Review Loop
Then, the chaos engineering started. Claude asked Codex (gpt-5.4), Gemini (3.1 Pro), and Opus to review its own code.
11 rounds of adversarial review. Each model caught what the others missed:
- Gemini: Found iOS retain cycles and NDJSON framing bugs.
- Codex: Caught a backoff reset bug and wire format mismatches.
- Opus (Chaos mode): Asked questions like "What happens when hostile JS runs while(true){}?" (it freezes the session) and "What about a 50,000px tall page?" (a 100MB PNG and an OOM), and probed for shadow DOM recursion bombs.
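The post does not say what the NDJSON framing bugs were. One common class of framing bug is treating each socket read as a complete message, when a read can actually contain half a line or several lines. The sketch below shows one way to frame NDJSON safely by buffering bytes until a newline arrives. `NdjsonFramer` is a hypothetical name, not code from the repo.

```rust
/// Accumulates raw bytes and yields complete newline-delimited frames.
/// Hypothetical helper for illustration; not taken from LAD.
#[derive(Default)]
pub struct NdjsonFramer {
    buf: Vec<u8>,
}

impl NdjsonFramer {
    /// Feed one chunk as read from the transport. Returns every complete
    /// line now available; a trailing partial line stays buffered until
    /// a later chunk supplies its newline.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.buf.extend_from_slice(chunk);
        let mut frames = Vec::new();
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            // Remove the line, including its '\n', from the front of the buffer.
            let line: Vec<u8> = self.buf.drain(..=pos).collect();
            let decoded = String::from_utf8_lossy(&line[..line.len() - 1]);
            // Tolerate CRLF line endings and skip blank keep-alive lines.
            let trimmed = decoded.trim_end_matches('\r');
            if !trimmed.is_empty() {
                frames.push(trimmed.to_string());
            }
        }
        frames
    }
}
```

Each returned string would then be handed to a JSON parser. Keeping the framing step separate from parsing means a message split across two reads is never parsed half-finished.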
The convergence curve was beautiful: 18 → 14 → 13 → 8 → 6 → 5 → 3 → 2 → 0. 30 findings fixed by AIs reviewing AIs.
The Result: LAD (LLM-as-DOM) v0.10
LAD is a browser automation MCP server built for AIs, by AIs.
- Compresses pages from ~18K tokens (Playwright) to ~300 tokens (Semantic View).
- ~60x cheaper round trips (18K / 300 tokens).
- 25 MCP tools.
- 3 browser engines (Chrome, Safari, and now real iPhone Safari via Remote Control — scan a QR code and the AI pilots your phone).
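The post does not describe how the Semantic View is produced, so the following is only a guess at the general idea: drop everything an agent cannot act on and render what remains as short, numbered lines. `Element`, `semantic_view`, and `compression_ratio` are hypothetical names used for illustration, and the output format is invented.

```rust
/// A simplified page element. Hypothetical; not LAD's data model.
pub struct Element {
    pub tag: String,
    pub label: String,
    /// True for things an agent can act on (inputs, buttons, links).
    pub interactive: bool,
}

/// Render only the interactive elements, one compact line each,
/// numbered so an agent can refer to them by id.
pub fn semantic_view(elements: &[Element]) -> String {
    elements
        .iter()
        .filter(|e| e.interactive)
        .enumerate()
        .map(|(id, e)| format!("[{}] {} \"{}\"", id, e.tag, e.label))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Ratio between the token cost of two page representations.
pub fn compression_ratio(before_tokens: u32, after_tokens: u32) -> f64 {
    f64::from(before_tokens) / f64::from(after_tokens)
}
```

With the figures quoted in the list, `compression_ratio(18_000, 300)` gives 60.0, which is where the ~60x comes from if cost scales linearly with tokens.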
The entire session was conversational. Claude worried about DX. Asked for second opinions. Fixed its own bugs. Natural language programming has officially crossed a threshold.
Repo: github.com/menot-you/llm-as-dom
Install: cargo install menot-you-mcp-lad
I'd love to hear what you think of the code structure generated by this multi-agent loop!