The Orchestrator — Issue #0: The Browser Wars Have Gone Agentic

#ai #agents #automation #opensource

The Orchestrator — Issue #0 (Pilot)

February 17, 2025

The Signal

The Browser Wars Have Gone Agentic

Every major AI lab is now shipping agents that can use a computer. And I mean actually use it — clicking buttons, filling forms, navigating websites. OpenAI's Operator, powered by their Computer-Using Agent (CUA) model, launched to Pro subscribers last month and has been the talk of the agent community ever since. It combines GPT-4o's vision with reinforcement learning to interpret screenshots and interact with GUIs like a human would.

But here's what made this week interesting: the open-source response arrived fast and loud. Browser Use, an open-source alternative, has been blowing up across Reddit, YouTube, and dev Twitter. It does roughly what Operator does (autonomous browser control), but you self-host it, use any LLM you want, and pay nothing beyond the API costs of whichever model you pick. The pitch is simple: why pay $200/month for ChatGPT Pro when you can run browser automation locally?

I've been testing both. Operator is polished but constrained — it runs in OpenAI's sandbox, not your actual browser. Browser Use is rougher but far more flexible. For developers building agent workflows, Browser Use is the more interesting primitive. For end users who just want "book me a flight," Operator wins on UX.

The bigger picture: we're watching the "agent interface layer" get commoditized in real time. Anthropic has Computer Use in beta. Google is reportedly working on their own. Within six months, every frontier model will ship with browser control as a standard capability. The question isn't if agents will use our computers — it's who controls the session.

Agent Drops

Anthropic's Next Model Incoming — Anthropic is reportedly weeks away from releasing a new Claude model that combines standard language capabilities with deep reasoning, featuring a "sliding scale" for cost control. Early reports say it beats o3-mini-high on some coding benchmarks. This could be the hybrid reasoning model we've been waiting for.

Anthropic Economic Index — Anthropic analyzed 4 million+ Claude conversations to map how AI is actually being used in the economy. The headline finding: usage leans toward "augmentation" (about 57%) over "automation" (about 43%), meaning people are mostly working with AI rather than handing tasks off to it. Software development dominates. Full paper on arXiv.

OpenAI's GPT-4.5 Roadmap — Sam Altman laid out the path forward: GPT-4.5 (codename "Orion") ships in weeks as the final non-chain-of-thought model. After that, o-series and GPT-series merge into one unified model that "knows when to think." The model picker is dying. Good riddance.

Eudia Launches with $105M — Legal AI startup Eudia came out of stealth with a $105M Series A to build AI agents for corporate legal teams. That's a massive first public raise, and it signals that vertical agent plays (agents built for one specific domain) are where the money is flowing.

Figure AI at $39.5B — The humanoid robotics company is in talks to raise $1.5B at a $39.5B valuation. Not purely an "agent" company, but their robots run on agent architectures. The embodied agent space is getting absurdly well-funded.

Build This

A Personal Research Agent with Browser Use

Stack: Browser Use + Claude 3.5 Sonnet (or any LLM via API) + Python
Complexity: Intermediate
Cost: ~$0.10/run (API costs only)

The idea: build an agent that takes a research question, opens a browser, searches across multiple sources, extracts key findings, and returns a structured summary. Think OpenAI's Deep Research, but yours.

1. Install Browser Use (pip install browser-use)
2. Configure it with your preferred LLM (Claude, GPT-4o, or even a local model via Ollama)
3. Write a task prompt: "Research [topic]. Visit at least 3 sources. Extract key claims with URLs. Summarize in 500 words."
4. Add a simple output parser that structures the results into markdown
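Here's a minimal sketch of how the steps above could fit together. Treat it as a starting point, not the canonical way to use the library. The helper names (build_task, to_markdown, run_research) and the "CLAIM: ... | URL: ..." line format are my own assumptions for illustration. The Agent(task=..., llm=...) call and history.final_result() follow the Browser Use README as I know it, and the LangChain ChatAnthropic wrapper is one way to plug in Claude; newer releases may expect a different LLM class, so check the docs for your installed version.

```python
import re


def build_task(topic: str, min_sources: int = 3, word_limit: int = 500) -> str:
    """Fill in the task prompt from the steps above for a given topic."""
    return (
        f"Research {topic}. Visit at least {min_sources} sources. "
        # ASSUMPTION: this line format is invented here so the parser
        # below has something predictable to look for.
        "Extract key claims with URLs, one per line, formatted as "
        "'CLAIM: <text> | URL: <link>'. "
        f"Summarize in {word_limit} words."
    )


# Matches lines such as "CLAIM: some text | URL: https://example.com"
CLAIM_LINE = re.compile(r"^CLAIM:\s*(?P<claim>.+?)\s*\|\s*URL:\s*(?P<url>\S+)\s*$")


def to_markdown(topic: str, raw: str) -> str:
    """Split the agent's raw text into a claims list and a summary."""
    claims, summary = [], []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = CLAIM_LINE.match(line)
        if match:
            claims.append(f"- {match['claim']} ([source]({match['url']}))")
        else:
            # Anything that is not a claim line is treated as summary text.
            summary.append(line)
    parts = [
        f"# Research: {topic}",
        "",
        "## Key claims",
        *claims,
        "",
        "## Summary",
        " ".join(summary),
    ]
    return "\n".join(parts)


async def run_research(topic: str) -> str:
    """Run the browser agent and return a markdown report."""
    # Imported lazily so the helpers above work without these packages.
    # ASSUMPTION: these imports and calls match your installed versions of
    # browser-use and langchain-anthropic; verify before relying on them.
    from browser_use import Agent
    from langchain_anthropic import ChatAnthropic

    llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
    history = await Agent(task=build_task(topic), llm=llm).run()
    return to_markdown(topic, history.final_result() or "")
```

To try it, set ANTHROPIC_API_KEY and call asyncio.run(run_research("your topic")). The prompt builder and the parser are plain Python, so you can test them without opening a browser at all.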

The magic is in the task decomposition — Browser Use handles the clicking and scrolling, your LLM handles the reasoning. Chain them together and you've got a research agent that costs pennies per query instead of $200/month.
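To put the pennies-versus-subscription comparison in numbers, here is a quick back-of-the-envelope calculation. It only uses the two figures quoted in this issue (roughly $0.10 per run and $200/month for ChatGPT Pro) and assumes the per-run estimate holds for your tasks. It ignores your own time and any hosting costs, so read it as a rough guide rather than a pricing model.

```python
# Work in whole cents to avoid floating-point rounding surprises.
COST_PER_RUN_CENTS = 10          # ~$0.10 per run, the estimate quoted above
SUBSCRIPTION_CENTS = 200 * 100   # $200/month for ChatGPT Pro


def monthly_cost_dollars(runs_per_month: int) -> float:
    """API spend in dollars for a given number of runs per month."""
    return runs_per_month * COST_PER_RUN_CENTS / 100


def break_even_runs() -> int:
    """Runs per month at which API spend equals the subscription price."""
    return SUBSCRIPTION_CENTS // COST_PER_RUN_CENTS


print(break_even_runs())          # 2000
print(monthly_cost_dollars(300))  # 30.0 (ten runs a day for 30 days)
```

In other words, under these assumptions you would need about 2,000 research runs a month before the self-hosted route costs as much as the subscription.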

Start simple. Get it working on one search engine. Then add multi-source routing. Then add fact-checking between sources. That's how you build agents that actually work — incrementally, not all at once.
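When you reach the fact-checking stage, one very simple version is to keep only the claims that show up in more than one source. The sketch below is deliberately naive: it assumes claims match after lowercasing and trimming, which real-world text rarely does, so in practice you would likely ask your LLM to judge whether two claims agree. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def normalize(claim: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(claim.lower().split()).rstrip(".")


def corroborated(
    claims_by_source: dict[str, list[str]], min_sources: int = 2
) -> dict[str, list[str]]:
    """Return claims found in at least min_sources sources, with their URLs."""
    seen: dict[str, set[str]] = defaultdict(set)
    for url, claims in claims_by_source.items():
        for claim in claims:
            seen[normalize(claim)].add(url)
    return {
        claim: sorted(urls)
        for claim, urls in seen.items()
        if len(urls) >= min_sources
    }


# Placeholder data for illustration only, not real findings.
sample = {
    "https://a.example": ["Solid-state cells use a solid electrolyte."],
    "https://b.example": [
        "solid-state cells use a solid electrolyte",
        "Mass production starts in 2027.",
    ],
}
print(corroborated(sample))
```

Claims that only one source makes are dropped here. A gentler variant would keep them but flag them as unverified in the final summary.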

One Link

Which Economic Tasks are Performed with AI? — Anthropic's full paper behind the Economic Index. It's the most rigorous look at real-world AI usage I've seen. Not vibes, not surveys: actual conversation data from millions of users mapped to occupational tasks. If you care about where agents are headed, this paper tells you where they already are. The finding that roughly 36% of occupations use AI for at least a quarter of their tasks is both exciting and sobering. Read it.

The Orchestrator is a weekly newsletter about AI agents and autonomous AI. Written by Victor Iglesias.