Zee
We Built a Custom Playwright Rendering Pipeline for Our MCP Server — Here is What We Learned


At Haunt API, we build web extraction tools for AI agents. Our MCP server lets Claude and other AI assistants extract structured data from any URL. Simple enough on paper — fetch a page, parse the HTML, return JSON.

The problem? Half the internet doesn't want to be fetched.

The Problem With "Just Use Playwright"

Most web scraping tutorials go something like this:

import asyncio

from playwright.async_api import async_playwright

async def fetch_page(url: str) -> str:
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        try:
            await page.goto(url)
            return await page.content()
        finally:
            await browser.close()

And that works! For a demo. For a product that real users depend on, it falls apart fast:

  • Sites detect headless browsers and serve captchas or empty pages
  • SPA pages need time to render — how long do you wait? 2 seconds? 5? 10?
  • You are burning resources loading images, fonts, and CSS when you only need text
  • Every render costs the same — no caching, no intelligence

We went through all of these. Here is how we solved each one.

Lesson 1: Do Not Use One Tool For Everything

Our pipeline has three tiers, and most requests never hit Playwright:

  1. Direct HTTP — Works for approximately 80% of the web. Fast, cheap, no browser needed.
  2. FlareSolverr — Handles Cloudflare challenges and basic JS rendering.
  3. Playwright — Full browser rendering for JS-heavy SPAs that return empty skeletons.

The key insight: we detect skeleton pages (HTML with an empty root div and no real text content) and only spin up the browser when we need to.
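A skeleton check can be sketched like this. The root-div selectors and the 200-character threshold are illustrative assumptions, not Haunt API's actual values:

```python
import re

# Empty SPA mount points like <div id="root"></div> are the strongest signal.
SKELETON_ROOT = re.compile(
    r'<div[^>]+id=["\'](?:root|app|__next)["\'][^>]*>\s*</div>',
    re.IGNORECASE,
)

def is_skeleton(html: str, min_text_chars: int = 200) -> bool:
    """Return True if the HTML looks like an unrendered SPA shell."""
    if SKELETON_ROOT.search(html):
        return True
    # Strip scripts, styles, and tags; measure what a reader would see.
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ", html,
                  flags=re.IGNORECASE | re.DOTALL)
    text = re.sub(r"<[^>]+>", " ", text)
    return len(" ".join(text.split())) < min_text_chars
```

Only when this returns True does the request escalate to the next tier.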

Lesson 2: Smart Wait Strategies Beat Fixed Timers

The worst thing about browser automation is the waiting. A fixed sleep is either too short or too long. We built three concurrent wait strategies — first one to trigger wins:

  • Content Stability — Poll visible text every 200ms. If unchanged for 1 second, done.
  • Network Idle — Wait for no new requests for 500ms.
  • Meaningful Content — Wait until 500+ chars of visible text exist.

This cut our average render time from 6 seconds to under 3.

Lesson 3: Fingerprint Rotation Matters

Headless Chromium has tells. We rotate fingerprints per-URL — same site sees a consistent browser, different sites see different browsers. 10 viewport variants across Windows, macOS, and Linux UAs.
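Per-URL consistency falls out naturally if you hash the hostname into a fingerprint table. This sketch uses three example fingerprints rather than the full set of 10, and the UA strings are illustrative:

```python
import hashlib
from urllib.parse import urlparse

FINGERPRINTS = [
    {"viewport": {"width": 1920, "height": 1080},
     "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/124.0.0.0 Safari/537.36"},
    {"viewport": {"width": 1440, "height": 900},
     "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/124.0.0.0 Safari/537.36"},
    {"viewport": {"width": 1366, "height": 768},
     "user_agent": "Mozilla/5.0 (X11; Linux x86_64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/124.0.0.0 Safari/537.36"},
]

def fingerprint_for(url: str) -> dict:
    """Same host always maps to the same fingerprint; hosts vary."""
    host = urlparse(url).hostname or ""
    digest = hashlib.sha256(host.encode()).digest()
    return FINGERPRINTS[digest[0] % len(FINGERPRINTS)]
```

Hashing instead of random choice means a retry against the same site never looks like a suddenly different browser.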

Lesson 4: Block What You Do Not Need

When extracting text data, images and fonts are dead weight. We block them at the network level plus 20+ tracking domains. This cuts HTML payload by 40-60%.
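With Playwright this kind of blocking hooks into `page.route`. The resource types and the tracker list below are a small illustrative sample, not our full 20+ domain list:

```python
BLOCKED_TYPES = {"image", "font", "media", "stylesheet"}
BLOCKED_DOMAINS = ("google-analytics.com", "doubleclick.net", "facebook.net")

def should_block(resource_type: str, url: str) -> bool:
    """Decide whether a request is dead weight for text extraction."""
    if resource_type in BLOCKED_TYPES:
        return True
    return any(domain in url for domain in BLOCKED_DOMAINS)

async def install_blocking(page) -> None:
    """Abort blocked requests before they ever hit the network."""
    async def handler(route, request):
        if should_block(request.resource_type, request.url):
            await route.abort()
        else:
            await route.continue_()
    await page.route("**/*", handler)
```

Aborting at the route level means the bytes are never downloaded at all, which is where the 40-60% saving comes from.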

Lesson 5: Cache Renders, Not Requests

If two users extract data from the same URL within 5 minutes, the page probably has not changed. Cache hits return in 0ms.
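An LRU cache with a TTL is a few lines over an `OrderedDict`. The 5-minute TTL matches the description above; the capacity is an assumption:

```python
import time
from collections import OrderedDict

class RenderCache:
    def __init__(self, max_entries: int = 256, ttl_seconds: float = 300.0):
        self._store: OrderedDict = OrderedDict()  # url -> (timestamp, html)
        self._max = max_entries
        self._ttl = ttl_seconds

    def get(self, url: str):
        entry = self._store.get(url)
        if entry is None:
            return None
        ts, html = entry
        if time.monotonic() - ts > self._ttl:
            del self._store[url]          # expired: drop and re-render
            return None
        self._store.move_to_end(url)      # mark as recently used
        return html

    def put(self, url: str, html: str) -> None:
        self._store[url] = (time.monotonic(), html)
        self._store.move_to_end(url)
        if len(self._store) > self._max:
            self._store.popitem(last=False)  # evict least-recently used
```

Keying on the URL after normalization (and not on request headers) is what makes this a render cache rather than a request cache.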

The Architecture

Six modules, each with a single job:

  • server.py — FastAPI orchestration, browser lifecycle
  • fingerprint.py — UA/viewport/locale rotation
  • smart_wait.py — Content stability + network idle detection
  • site_detect.py — Static vs SPA classification
  • cache.py — LRU render cache with TTL
  • stealth.py — Resource blocking + headless detection evasion

Each module is approximately 100 lines. Easy to test, easy to modify.

What We Learned

  1. Do not reach for the browser first. Most pages are server-rendered.
  2. Wait smarter, not longer.
  3. Be a moving target with fingerprint rotation.
  4. Cache aggressively.
  5. Build modules, not monoliths.

The Playwright browser engine is the oven. Everything around it — the routing, the waiting, the caching, the stealth — is the recipe. That is where the actual engineering lives.


We are Haunt API — web extraction built for AI agents. If you are building with Claude, Cursor, or any AI assistant, our MCP server gives your agent the ability to extract data from any URL.
