How to scrape websites that Selenium or Playwright can't

#javascript #automation #webdev #tutorial

Short answer: write your own browser plugin/extension. This is not overly difficult, thanks to today's abundance of documentation and samples.

The following is a quick guide on how to do this.

Follow this Mozilla tutorial to create a simple plugin. (Note: Firefox and Chrome extensions follow the same standard, so your code needs only small changes to work on both.)
Download the jQuery code and paste it into your content_scripts so you can take advantage of jQuery's DOM functions.
Then paste the code below into your content_scripts too. See the tutorial above for how to test your plugin. Open the Walmart fresh fruit page and look at the console: you should see the name and price of up to 10 products.

function getProducts() {
    /*
        Product Info
            Whole block: .mb0 a
            Name: .mb0.mt1.lh-title
            Price: .mr1.mr2-xl.lh-copy
            Image: .mb0.ph0-xl.pt0-xl .relative.overflow-hidden
    */
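    let $ = jQuery;
    let names = $(".mb0.mt1.lh-title");
    let prices = $(".mr1.mr2-xl.lh-copy");

    // Log at most 10 products, or fewer if the page rendered fewer.
    let count = Math.min(10, names.length, prices.length);
    for (let i = 0; i < count; i++) {
        let name = $(names[i]).text().trim();
        let price = $(prices[i]).text().trim();

        console.log(`Product ${i + 1}: ${name} - ${price}`);
    }
}

// Run the scraper once the DOM is ready.
$(document).ready(getProducts);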
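
If you would rather not bundle jQuery, the same extraction can be sketched with the standard `querySelectorAll` DOM method. This is only a minimal sketch: `extractProducts`, `NAME_SELECTOR`, and `PRICE_SELECTOR` are hypothetical names introduced here, the selectors are the ones listed above and may no longer match the live page, and pairing names with prices by index assumes both lists come back in the same order.

```javascript
// Selectors copied from the post; the site's markup may have changed since.
const NAME_SELECTOR = ".mb0.mt1.lh-title";
const PRICE_SELECTOR = ".mr1.mr2-xl.lh-copy";

// Hypothetical helper: pair the i-th name with the i-th price found under `root`.
// `root` only needs a querySelectorAll method (document, or any element).
function extractProducts(root, limit = 10) {
    const names = Array.from(root.querySelectorAll(NAME_SELECTOR));
    const prices = Array.from(root.querySelectorAll(PRICE_SELECTOR));
    // Never read past the end of either list.
    const count = Math.min(limit, names.length, prices.length);

    const products = [];
    for (let i = 0; i < count; i++) {
        products.push({
            name: names[i].textContent.trim(),
            price: prices[i].textContent.trim(),
        });
    }
    return products;
}

// Runs only where a DOM exists (for example, inside a content script).
if (typeof document !== "undefined") {
    console.table(extractProducts(document));
}
```

Returning an array of objects instead of logging strings makes it easier to send the results somewhere else later, for example to save them as JSON.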

Happy coding!

Top comments (1)

OnlineProxy • Nov 18

Lots of sites sniff out Selenium/Playwright via automation fingerprints, while extensions ride in a real-deal browser profile and dodge many of those tells. Grab an extension when you’re scraping logged-in, super-dynamic pages and want low ops. If there’s an official API, use it: it’s steadier, cleaner, and keeps the lawyers chill. For flaky UIs, lean on MutationObserver/IntersectionObserver, sturdy selectors, and shadow DOM/iframe awareness.
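
The MutationObserver idea could look roughly like the sketch below, which waits until elements matching a selector show up before doing anything with them. `whenPresent` is a hypothetical helper, not part of any library. The `ObserverCtor` parameter is an assumption added only so the function can be exercised outside a browser, and the selector in the usage lines is the product-name selector from the post.

```javascript
// Hypothetical helper: call onFound once elements matching `selector` exist under `root`.
// ObserverCtor defaults to the browser's MutationObserver; it is a parameter only so the
// function can be tried outside a browser with a stand-in class.
// Returns a function that stops watching.
function whenPresent(root, selector, onFound, ObserverCtor = MutationObserver) {
    const check = () => {
        const found = root.querySelectorAll(selector);
        if (found.length === 0) return false;
        onFound(found);
        return true;
    };

    // Already rendered: nothing to observe.
    if (check()) return () => {};

    // Otherwise re-check whenever nodes are added or removed under root.
    const observer = new ObserverCtor(() => {
        if (check()) observer.disconnect();
    });
    observer.observe(root, { childList: true, subtree: true });
    return () => observer.disconnect();
}

// Runs only in a page context (for example, inside a content script).
if (typeof document !== "undefined" && typeof MutationObserver !== "undefined") {
    whenPresent(document.body, ".mb0.mt1.lh-title", (nodes) => {
        console.log(`Found ${nodes.length} product names`);
    });
}
```

Disconnecting after the first match keeps the observer from firing on every later DOM change. For pages that keep loading more items as you scroll, you would leave it connected and track which nodes were already handled.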