DEV Community

Shanewas Ahmed

Open-source Playwright wrapper that passes bot.sannysoft.com, pixelscan, and CreepJS in headless mode

Been scraping for a while and got tired of getting blocked the moment a page loads. Standard Playwright leaks everywhere — TLS fingerprint, navigator.webdriver, WebGL renderer differences.

Built a library that handles all of that before the page even loads:

  1. Region-spoofed TLS handshakes (Japan, US, EU, Korea) instead of Python's defaults
  2. JS patches that strip automation flags before first paint
  3. Human-like mouse movements and typing speed
  4. Auto proxy/session rotation on block detection
  5. Built-in MCP server for agent integration
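
To make item 2 concrete, here is a minimal sketch of registering a patch that runs before any page script, using plain Playwright. `add_init_script` is a real Playwright method on browser contexts; the `STEALTH_JS` contents and the `apply_stealth` helper are illustrative assumptions, not this library's actual patches.

```python
# Illustrative only: STEALTH_JS and apply_stealth are hypothetical names,
# not the API of agentic-stealth-browser.
STEALTH_JS = """
Object.defineProperty(Navigator.prototype, 'webdriver', {
  get: () => false,   // a regular, non-automated Chrome reports false
  configurable: true,
});
"""


def apply_stealth(context):
    """Register the patch so it runs in every new document before page scripts.

    `context` is expected to be a Playwright BrowserContext, or any object
    exposing a compatible add_init_script(script=...) method.
    """
    context.add_init_script(script=STEALTH_JS)
    return context
```

With Playwright installed, this would be called as `apply_stealth(browser.new_context())` before opening the first page. A single property override like this is nowhere near enough on its own; it only shows where such patches hook in.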

It's on PyPI, MIT licensed:

pip install agentic-stealth-browser

https://github.com/shanewas/agentic-stealth-browser

Tested against bot.sannysoft.com, pixelscan.net, and CreepJS — clean passes in headless mode.

Would love feedback from people who actually hit these problems daily. What's the hardest site you've had to scrape?

Top comments (2)

Double CHEN

The region-spoofed TLS handshake part is the interesting bit here. A lot of wrappers stop at navigator.webdriver and JS patches, then still fail on TLS/JA3 or WebGL mismatches. In my own browser-agent runs, I ended up separating fingerprint handling from the action layer: stealth browser first, then a state/receipt loop so the agent can prove it actually finished the task.

Shanewas Ahmed

Thanks! Yeah, the TLS part was honestly the most painful to get right. The first version I had just imported curl_cffi's fingerprints into Playwright's connect params and called it a day — but the profile mapping (which JA3 for which region, which cipher order matches a real Japanese Chrome 134, etc.) took actual trial and error with packet captures. The browser vendors don't exactly document their TLS stacks.
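
For anyone wondering why cipher order matters so much: a JA3 fingerprint is the MD5 of five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), with each list joined by `-` and the fields joined by `,`. A rough sketch, using placeholder values rather than a captured Chrome profile, showing that reordering two ciphers is enough to change the hash:

```python
import hashlib
from typing import Sequence


def ja3_string(version: int, ciphers: Sequence[int], extensions: Sequence[int],
               curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """Build the JA3 input string: fields joined by ',', values by '-'."""
    fields = [[version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 hex digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Placeholder values for illustration; NOT a real Chrome ClientHello.
profile_a = (771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
# Same values, with the first two ciphers swapped.
profile_b = (771, [4866, 4865, 4867], [0, 23, 65281], [29, 23, 24], [0])

print(ja3_string(*profile_a))
print(ja3_hash(*profile_a) == ja3_hash(*profile_b))
```

Because the hash covers ordering as well as membership, a profile that offers the right ciphers in the wrong order still produces a different fingerprint.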

The separation you're describing — fingerprint layer vs action layer — that's exactly where I landed too, just from the opposite direction. I started with stealth patches and human behavior bundled together, then kept pulling them apart as I realized each layer has a totally different failure mode. TLS fails silently (page loads but you're flagged), JS patches fail obviously (Cloudflare challenge), human behavior fails somewhere in the middle (you get through but engagement drops). Mixing them into one class made debugging a nightmare.
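
One way to picture that split, as a rough sketch: each layer owns its own check against what was observed, so a failure gets attributed to a layer instead of surfacing as a generic "blocked". The `Observation` fields and layer names are hypothetical and only mirror the three failure modes described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Observation:
    challenge_shown: bool      # an interstitial challenge page appeared
    flagged_after_load: bool   # page loaded, but the session was scored as a bot
    engagement_dropped: bool   # got through, but downstream results degraded


# Each layer owns exactly one check; names are illustrative.
LAYER_CHECKS: Dict[str, Callable[[Observation], bool]] = {
    "js_patches": lambda o: o.challenge_shown,
    "tls": lambda o: o.flagged_after_load and not o.challenge_shown,
    "human_behavior": lambda o: o.engagement_dropped,
}


def failing_layers(obs: Observation) -> List[str]:
    """Return the names of the layers whose check reports a failure."""
    return [name for name, failed in LAYER_CHECKS.items() if failed(obs)]
```

Keeping the checks in separate entries means a new failure mode is added in one place without touching the other layers.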

The state/receipt loop for proving task completion is clever — I've been thinking about something similar for the MCP server. Right now it just returns {"status": "ok"} and trusts the agent, which feels fragile for anything multi-step. Did you end up checking receipts at the page level (screenshots, DOM snapshots) or at the network level (request/response pairs)?
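
A page-level version could be as small as this sketch: instead of a bare {"status": "ok"}, the tool returns the final URL plus a hash of the DOM snapshot, and the caller verifies that evidence against what the step was supposed to produce. The `Receipt` shape and the `make_receipt`, `to_payload`, and `verify` helpers are hypothetical, not part of the current MCP server.

```python
import hashlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Receipt:
    step: str
    url: str
    dom_sha256: str


def _digest(dom_html: str) -> str:
    return hashlib.sha256(dom_html.encode("utf-8")).hexdigest()


def make_receipt(step: str, url: str, dom_html: str) -> Receipt:
    """Record what the browser actually ended up with after a step."""
    return Receipt(step=step, url=url, dom_sha256=_digest(dom_html))


def to_payload(receipt: Receipt) -> dict:
    """JSON-serializable response that carries evidence, not just a status."""
    return {"status": "ok", "receipt": asdict(receipt)}


def verify(receipt: Receipt, expected_url_prefix: str,
           dom_html: str, must_contain: str) -> bool:
    """Check the evidence instead of trusting the status flag."""
    return (
        receipt.url.startswith(expected_url_prefix)
        and receipt.dom_sha256 == _digest(dom_html)
        and must_contain in dom_html
    )
```

A network-level receipt would follow the same pattern, hashing request/response pairs instead of the DOM snapshot.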