DEV Community

Jon Retting

vscreen 0.2.0: the MCP tools got smarter, and now they build websites

vscreen gives AI agents a real Chromium browser, streamed live over WebRTC. The first release had 63 tools. They worked, but agents kept chaining them inefficiently: it took three round-trips to click a button and see what happened.

0.2.0 consolidates the 63 tools into 47 with a two-layer architecture, adds a live advisor that catches mistakes in real time, and introduces a synthesis system that builds websites from scraped data.


Two layers, one fast path

| Layer | Tools | Purpose |
|---|---|---|
| Workflow | browse, observe, interact, extract, solve_challenge | Entire workflows in a single call |
| Precision | click, type, find, wait, scroll, etc. | Exact control when needed |

vscreen_browse navigates, waits, dismisses cookie banners, takes a screenshot, and returns page info in one call instead of four. vscreen_interact clicks by visible text and returns a screenshot afterward. vscreen_extract pulls structured data in six modes: articles, table, kv, stats, links, or auto.

Fast path for 80% of tasks. Drop to precision tools when needed.


The advisor

The MCP server tracks every tool call in a sliding window and returns inline hints when it detects anti-patterns:

| Pattern detected | Hint |
|---|---|
| click → wait → screenshot | vscreen_interact does this in one call |
| scroll → screenshot loop | Use full_page=true instead |
| Repeated fixed waits | Use condition="text" or "selector" |
| JS for built-in operations | Use vscreen_get_page_info instead |
| 5+ calls without Layer 1 | Try vscreen_browse or vscreen_interact |

Each hint is one-shot and contextual. There is also vscreen_plan("fill out the form") for step-by-step tool recipes and vscreen_help(topic=...) for built-in docs, so the server is self-documenting.
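
To make the sliding-window idea concrete, here is a minimal sketch of how an advisor could spot the click → wait → screenshot pattern and emit its hint only once. This is not vscreen's actual implementation: the Advisor type, its fields, the window capacity, and the short tool names ("click", "wait", "screenshot") are all assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical advisor: remembers the most recent tool names.
pub struct Advisor {
    window: VecDeque<String>,
    capacity: usize,
    hinted: bool, // true once the hint has been given (one-shot)
}

impl Advisor {
    pub fn new(capacity: usize) -> Self {
        let capacity = capacity.max(1); // keep at least one slot
        Advisor {
            window: VecDeque::with_capacity(capacity),
            capacity,
            hinted: false,
        }
    }

    /// Record a call; return a hint the first time the window
    /// ends with click -> wait -> screenshot.
    pub fn record(&mut self, tool: &str) -> Option<&'static str> {
        if self.window.len() == self.capacity {
            self.window.pop_front(); // slide: drop the oldest call
        }
        self.window.push_back(tool.to_string());

        // Assumed tool names, taken from the table above.
        let pattern = ["click", "wait", "screenshot"];
        let n = self.window.len();
        let matched = n >= pattern.len()
            && self
                .window
                .iter()
                .skip(n - pattern.len())
                .map(String::as_str)
                .eq(pattern.iter().copied());

        if matched && !self.hinted {
            self.hinted = true;
            return Some("vscreen_interact does this in one call");
        }
        None
    }
}
```

A real advisor would presumably track one flag per pattern and look at arguments as well as tool names; the sketch keeps a single pattern so the window logic stays visible.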


Synthesis Bubble

AI agents build live web pages from scraped data. One call does everything:

vscreen_synthesis_scrape_and_create({
  "instance_id": "dev",
  "title": "Tech News Roundup",
  "urls": [
    { "url": "https://arstechnica.com", "limit": 8, "source_label": "Ars" },
    { "url": "https://techcrunch.com", "limit": 8, "source_label": "TC" },
    { "url": "https://theverge.com", "limit": 8, "source_label": "Verge" }
  ]
})

Three ephemeral tabs open in parallel. The page builds live via SSE as each source finishes. Component type auto-selected: 1–3 → hero, 4–12 → card grid, 13+ → content list.
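
The auto-selection rule can be sketched as a single match on a count. The sketch assumes the thresholds refer to the number of scraped items, which the post does not state outright; the Component enum and select_component function are hypothetical names, and the zero-item case is left as None because the post does not say what happens there.

```rust
/// Hypothetical component kinds, named after the three listed above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Component {
    Hero,
    CardGrid,
    ContentList,
}

/// Pick a component from the item count (assumed to be the input).
pub fn select_component(item_count: usize) -> Option<Component> {
    match item_count {
        0 => None, // behavior for empty results is not documented
        1..=3 => Some(Component::Hero),
        4..=12 => Some(Component::CardGrid),
        _ => Some(Component::ContentList), // 13 or more
    }
}
```

Under this reading, a source that yields 8 items (the limit used in the example call) would fall in the card grid range.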

31 Svelte 5 components — card grids, sortable tables, bar/line/pie charts (raw SVG), timelines, accordions, code blocks, image galleries, and more. The scraper runs 5 strategies (JSON-LD, <article>, heading+link, card heuristics, OpenGraph) with ad filtering, quality scoring, and timeout budgets.


Key fixes

| Fix | Detail |
|---|---|
| Zombie processes | Aggressive process group kills + lock file cleanup |
| CAPTCHA solver | 2-phase vision: crop header → identify target, then tiles in batches of 3 |
| Orphaned tasks | AbortOnDrop RAII guards on all spawned tokio tasks |
| Grid math | saturating_sub prevents u32 underflow panics |
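
For readers unfamiliar with the grid math fix, here is a small illustration of why saturating_sub helps. The remaining_columns helper is hypothetical and is not taken from the vscreen source; only u32::saturating_sub itself is standard Rust.

```rust
/// Hypothetical helper: columns still free in a grid row.
pub fn remaining_columns(total: u32, used: u32) -> u32 {
    // Plain `total - used` panics in debug builds when used > total
    // (and wraps around in default release builds).
    // saturating_sub clamps the result at 0 instead.
    total.saturating_sub(used)
}
```

With this helper, remaining_columns(3, 5) yields 0 rather than panicking, while the normal case remaining_columns(5, 3) still yields 2.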

Try it

./target/release/vscreen --dev --synthesis --mcp-sse 0.0.0.0:8451

Pre-built Linux binaries on the releases page. Full changelog in the repo.

GitHub: github.com/jameswebb68/vscreen
