A developer recently documented burning through 180 million tokens per month — $3,600 — running AI browser agents. That's not a typo.
The browser-use community (78K GitHub stars) is full of users asking the same question:
"I have a recurring task meant for webscraping to be done every 5 min. I do not want to use too many tokens. Is it possible to repeat the tasks?" — browser-use #494
"My business scenario requires solidifying the agent's execution process into a tool. I noticed `save_as_playwright_script` is commented out." — browser-use #4519
"Running the default task took 12 minutes on M3 Max, 36GB RAM" — browser-use #957
The problem is architectural: every run uses AI tokens, even when you're doing the exact same thing for the 1,000th time.
The Interpreter vs. Compiler Model
Today's browser agents work like interpreters — AI reasons about every click, every scroll, every form fill, every single time:
Interpreter (browser-use, Stagehand, Operator):

```
Run 1:    AI reads page → decides action → executes  ($0.01)
Run 2:    AI reads page → decides action → executes  ($0.01)
Run 100:  AI reads page → decides action → executes  ($0.01)
Run 1000: AI reads page → decides action → executes  ($0.01)
Total:    $10.00 (and growing)
```
But what if AI could compile the workflow once, then replay it forever?
Compiler approach:

```
Run 1:    AI inspects page → generates program  ($0.04, one-time)
Run 2:    Program runs deterministically        ($0.00)
Run 100:  Program runs deterministically        ($0.00)
Run 1000: Program runs deterministically        ($0.00)
Total:    $0.04 (forever)
```
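The break-even point is easy to sanity-check. A minimal sketch using the figures above (~$0.01 per interpreted run, ~$0.04 one-time compile); the `cumulativeCost` helper is illustrative, not part of any tool:

```javascript
// Compare cumulative cost of the two strategies after N runs.
// Figures come from the illustration above: ~$0.01 per interpreted
// run, ~$0.04 one-time compile. Helper name is illustrative.
function cumulativeCost(runs, perRun = 0.01, compileOnce = 0.04) {
  return {
    interpreter: +(runs * perRun).toFixed(2), // paid on every run
    compiler: compileOnce,                    // paid exactly once
  };
}

// The compiler approach breaks even after just 4 runs:
// cumulativeCost(4)    → { interpreter: 0.04, compiler: 0.04 }
// cumulativeCost(1000) → { interpreter: 10,   compiler: 0.04 }
```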
This isn't hypothetical. Tap implements this exact pattern:
- `forge inspect` — Analyzes the page (framework, SSR state, APIs, DOM structure). Zero AI tokens.
- AI generates a `.tap.js` program — One-time cost (~$0.04).
- `tap run` — Executes the program forever. $0.00 per run.
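What a compiled program might contain is easy to imagine even without Tap's exact format. A hypothetical sketch — the `run` function, its shape, and the injectable `fetchFn` parameter are illustrative assumptions, not Tap's real output; only the Hacker News endpoint is real:

```javascript
// Hypothetical sketch of a compiled scraping program.
// No AI at run time: just a fixed, deterministic fetch.
// `fetchFn` is injectable so the program can run against a stub in tests.
async function run(fetchFn = fetch) {
  const res = await fetchFn(
    'https://hacker-news.firebaseio.com/v0/topstories.json'
  );
  const ids = await res.json();
  return ids.slice(0, 5); // top five story IDs, zero tokens spent
}
```

Every invocation costs $0.00 in tokens; regenerating the program is only needed when the site's API actually changes.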
Why API-First Beats DOM Replay
Most record-and-replay tools (including browser-use's workflow-use) capture DOM interactions — clicks, typing, scrolling. This breaks when the UI changes.
The better approach: extract via API when possible, DOM only as fallback.
Most modern websites have internal APIs (Next.js `__NEXT_DATA__`, Nuxt SSR state, REST endpoints). Calling the API directly is:
- 100x more reliable than simulating clicks
- Immune to UI redesigns
- Faster (no rendering needed)
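For Next.js sites specifically, the server-rendered state ships inside a `<script id="__NEXT_DATA__">` tag in the raw HTML, so no clicking or rendering is needed at all. A minimal sketch — the tag is a real Next.js convention, but the sample page and its field names are invented for illustration:

```javascript
// Pull Next.js server-side state straight out of raw HTML.
function extractNextData(html) {
  const m = html.match(/<script id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
  return m ? JSON.parse(m[1]) : null;
}

// Toy page standing in for a real Next.js response:
const sampleHtml = `
  <html><body>
    <script id="__NEXT_DATA__" type="application/json">
      {"props":{"pageProps":{"items":[{"title":"Hello"}]}}}
    </script>
  </body></html>`;

const data = extractNextData(sampleHtml);
// data.props.pageProps.items[0].title === "Hello"
```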
For example, fetching the Hacker News front page:

```javascript
// DOM approach (fragile): breaks whenever the markup changes
const titles = [];
document.querySelectorAll('.athing').forEach(row => titles.push(row.innerText));

// API approach (robust): stable, documented endpoint
const res = await fetch('https://hacker-news.firebaseio.com/v0/topstories.json');
const topStoryIds = await res.json(); // array of story IDs
```
Real Numbers
| Metric | AI Agent (per run) | Compiled Program (per run) |
|---|---|---|
| Cost | $0.003–0.01 | $0.00 |
| Speed | 12 min (reported) | 5 seconds |
| Reliability | Varies (AI hallucinations) | Deterministic |
| Tokens | 1K–10K per action | 0 |
At 100 runs/day (~3,000 runs/month):
- AI agent: $9–30/month (at $0.003–0.01 per run)
- Compiled program: $0.04 total (one-time forge cost)
The Takeaway
If you're running the same browser task more than once, you're overpaying by 100–1000x. The future isn't smarter agents — it's agents that are smart once and produce deterministic programs.
Token prices are falling 10x/year. But $0 will always beat any price.
Tap is open source. 208 pre-built programs across 77 sites. One binary, zero dependencies.
Try it: taprun.dev | GitHub