Discussion on: 🚀 Permi v0.3.0 – Major Improvements to JS Scanning, AI Accuracy, and Speed

How does Playwright improve your JS crawling compared to other tools like Puppeteer or Selenium? I'm curious if you've noticed any significant differences in terms of performance or capability, especially when dealing with dynamic content.

Peter Nasarah Dashe • May 12

Great question! I chose Playwright for Permi's JS crawling because of three concrete advantages I observed while testing against modern SPAs and anti‑bot protections.

Auto‑waiting & reliability
Playwright waits for an element to be actionable (attached, visible, stable, enabled) before it clicks or types. With Puppeteer, I often had to add manual waitForSelector calls or fixed timeouts, which broke whenever the network lagged. Selenium was even worse in my tests; flaky runs were the norm. Playwright’s built‑in auto‑wait reduced my crawl failures by about 40%.
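
To make the auto‑wait point concrete, here is a minimal sketch using Playwright's Python sync API. It is not Permi's actual code: the `button#menu` selector and the function names `collect_links` and `normalize_links` are made up for illustration, and Playwright is imported lazily so the URL helper can be used on its own.

```python
from urllib.parse import urldefrag, urljoin


def normalize_links(base_url: str, hrefs: list[str]) -> list[str]:
    """Resolve hrefs against base_url, drop fragments, de-duplicate in order."""
    seen: dict[str, None] = {}
    for href in hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        # Keep only web links; mailto:, javascript:, etc. are skipped.
        if absolute.startswith(("http://", "https://")):
            seen.setdefault(absolute, None)
    return list(seen)


def collect_links(url: str, menu_selector: str = "button#menu") -> list[str]:
    """Open url, click a (hypothetical) menu button, return the links found."""
    # Lazy import: normalize_links above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        # No explicit wait_for_selector or sleep here: click() waits until the
        # element is visible, stable, and enabled, or raises on timeout.
        page.locator(menu_selector).click()
        # evaluate_all runs once against the anchors present at this moment.
        hrefs = page.locator("a[href]").evaluate_all(
            "els => els.map(e => e.getAttribute('href'))"
        )
        browser.close()
    return normalize_links(url, hrefs)
```

The same flow in Puppeteer would typically need an explicit wait before the click, which is the part that tends to break under network lag.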

Network interception & stealth
Playwright’s route API makes it trivial to block images, fonts, and trackers, which cuts page load time during scans. More importantly, I integrated playwright‑stealth to mimic real browser fingerprints. In my tests this got past Cloudflare and other anti‑bot checks that Selenium (and even vanilla Puppeteer) struggle with. Without stealth, many SPAs returned 403 or empty pages.
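
A rough sketch of the blocking part, again with Playwright's Python sync API. The block lists and the names `should_block` and `crawl_without_heavy_assets` are assumptions for illustration, not Permi's real configuration, and the stealth layer is left out here.

```python
from urllib.parse import urlparse

# Assumed block lists for illustration; tune them for the target site.
BLOCKED_RESOURCE_TYPES = {"image", "font", "media"}
BLOCKED_HOST_SUFFIXES = ("google-analytics.com", "doubleclick.net")


def should_block(resource_type: str, url: str) -> bool:
    """Decide whether a request is safe to drop during a crawl."""
    if resource_type in BLOCKED_RESOURCE_TYPES:
        return True
    host = urlparse(url).hostname or ""
    # Match the domain itself or any subdomain, but not lookalike hosts.
    return any(host == s or host.endswith("." + s) for s in BLOCKED_HOST_SUFFIXES)


def crawl_without_heavy_assets(url: str) -> str:
    """Load url with images, fonts, media, and listed trackers aborted."""
    # Lazy import: should_block above works without Playwright installed.
    from playwright.sync_api import Route, sync_playwright

    def handle(route: Route) -> None:
        request = route.request
        if should_block(request.resource_type, request.url):
            route.abort()
        else:
            route.continue_()

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # "**/*" sends every request through the handler.
        page.route("**/*", handle)
        page.goto(url)
        html = page.content()
        browser.close()
    return html
```

Keeping the decision in a plain function makes it easy to unit test the block list without launching a browser.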

Multi‑browser & debugging
Puppeteer is Chromium‑first (Firefox support arrived later and there is no WebKit), but I’ve seen sites behave differently in Firefox or WebKit. Playwright supports all three with the same API, which is critical for comprehensive crawling. And its trace viewer saved me hours debugging why a page hung; I found nothing comparable built into Selenium.
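
As a sketch of what "same API, three engines" can look like in Playwright's Python sync API, with a trace recorded per engine. The function names and the `traces` directory are hypothetical, and this is not taken from Permi's source.

```python
from pathlib import Path


def links_not_seen_everywhere(
    links_by_browser: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Per browser, return links it found that at least one other browser missed."""
    if not links_by_browser:
        return {}
    common = set.intersection(*links_by_browser.values())
    return {name: links - common for name, links in links_by_browser.items()}


def crawl_all_engines(url: str, trace_dir: str = "traces") -> dict[str, set[str]]:
    """Collect links from url in Chromium, Firefox, and WebKit, tracing each run."""
    # Lazy import: the comparison helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(trace_dir).mkdir(parents=True, exist_ok=True)
    results: dict[str, set[str]] = {}
    with sync_playwright() as p:
        for browser_type in (p.chromium, p.firefox, p.webkit):
            browser = browser_type.launch()
            context = browser.new_context()
            # Record screenshots and DOM snapshots for the trace viewer.
            context.tracing.start(screenshots=True, snapshots=True)
            page = context.new_page()
            try:
                page.goto(url)
                hrefs = page.locator("a[href]").evaluate_all(
                    "els => els.map(e => e.href)"
                )
                results[browser_type.name] = set(hrefs)
            finally:
                # Save the trace even if the page hung or navigation failed.
                context.tracing.stop(path=f"{trace_dir}/{browser_type.name}.zip")
                browser.close()
    return results
```

A saved trace can be opened with `playwright show-trace traces/chromium.zip`, and `links_not_seen_everywhere(crawl_all_engines(url))` highlights links that only some engines surfaced.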

Performance numbers (real crawl on a React site):

Selenium: 48s, 3 timeouts, 2 false‑negatives (missed links)

Puppeteer (stealth added manually): 32s, 1 timeout

Playwright + stealth: 22s, 0 timeouts, all dynamic content captured

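So for Permi, Playwright wasn’t just a marginal improvement: in that run it took about 54% less time than Selenium and about 31% less than Puppeteer, with no timeouts, and that made JS scanning viable at scale. It is still experimental in the community edition, but the foundation is solid.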

Hope that helps! Happy to share code snippets if you're implementing something similar.

— Nasarah (Permi)

Rasmus Ros • May 13

That tracks. It usually pays off most on auth-heavy SPAs with service workers and delayed API hydration, where brittle waits turn crawling into a coin flip. The auto-waiting point matters more in practice than most benchmark comparisons do.
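
One way to picture the alternative to a brittle fixed wait on that kind of app, as a sketch in Playwright's Python sync API: wait for the specific API response that hydrates the page instead of sleeping. The `/api/session` path and both function names are assumptions for illustration only.

```python
from urllib.parse import urlparse


def is_hydration_response(url: str, status: int, api_path: str = "/api/session") -> bool:
    """True for a successful response from the (assumed) hydration endpoint."""
    return status == 200 and urlparse(url).path.startswith(api_path)


def load_after_hydration(url: str, api_path: str = "/api/session") -> str:
    """Navigate to url and return the HTML once the hydration call has answered."""
    # Lazy import: the predicate above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # Block service worker registration so a worker cannot answer
        # requests where Playwright's network events would not see them.
        context = browser.new_context(service_workers="block")
        page = context.new_page()
        # Start listening before navigating, then wait for the matching response.
        with page.expect_response(
            lambda r: is_hydration_response(r.url, r.status, api_path)
        ):
            page.goto(url)
        html = page.content()
        browser.close()
    return html
```

If the endpoint never answers, `expect_response` raises a timeout error instead of silently returning a half-rendered page, which is easier to handle than a fixed sleep that sometimes passes.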