Le Vuong

How to scrape websites that Selenium or Playwright can't

Short answer: write your own browser plugin/extension. This is not overly difficult, thanks to today's abundance of documentation and samples.

The following is a quick guide on how to do this.

  • Follow this Mozilla tutorial to create a simple plugin. (Note: Firefox and Chrome extensions follow the same standard, so your code needs only small changes to work on both.)

  • Download the jQuery library and include it in the content_scripts section of manifest.json, so you can take advantage of jQuery's DOM functions.

  • Then add the code below as a content script too. See the tutorial above for how to test your plugin. Open the Walmart fresh fruit page and look in the console: you should see the name and price of the first 10 products.

function getProducts() {
    /*
        Product Info
            Whole block: .mb0 a
            Name: .mb0.mt1.lh-title
            Price: .mr1.mr2-xl.lh-copy
            Image: .mb0.ph0-xl.pt0-xl .relative.overflow-hidden
    */
    const $ = jQuery;
    const names = $(".mb0.mt1.lh-title");
    const prices = $(".mr1.mr2-xl.lh-copy");

    // Log at most 10 products, without reading past the end of either list.
    const count = Math.min(10, names.length, prices.length);
    for (let i = 0; i < count; i++) {
        const name = $(names[i]).text().trim();
        const price = $(prices[i]).text().trim();

        console.log(`Product ${i + 1}: ${name} - ${price}`);
    }
}

// Run the scraper once the DOM is ready.
jQuery(document).ready(getProducts);
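
For reference, the manifest.json that wires the two scripts together might look roughly like the sketch below. The file names jquery.min.js and scraper.js, the extension name, and the match pattern are assumptions made for illustration, so adjust them to your own files and target site. manifest_version is set to 2 on the assumption that you are testing in Firefox; recent Chrome versions expect 3, and the content_scripts entry has the same shape in both.

```json
{
  "manifest_version": 2,
  "name": "Product Scraper (example)",
  "version": "1.0",
  "description": "Logs product names and prices to the console.",
  "content_scripts": [
    {
      "matches": ["*://*.walmart.com/*"],
      "js": ["jquery.min.js", "scraper.js"]
    }
  ]
}
```

Scripts in the js array are injected in the order listed, so jQuery has to come before the script that uses it.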

Happy coding!
