Dmytro Krasun

Posted on • Originally published at screenshotone.com

How to block requests with Puppeteer

If you want to speed up scraping or take screenshots faster, you can block the requests that have no meaningful effect on the result.

Puppeteer allows blocking any outgoing requests while loading the page. Whether you want to block ads, tracking scripts, or different types of resources, it is relatively easy to do with Puppeteer.

A fully working example of blocking requests

Let's start with a fully working example of how to intercept and block requests in Puppeteer. It uses the wildcard-match package to test request URLs against patterns:

const puppeteer = require('puppeteer');
const wildcardMatch = require('wildcard-match');

const blockRequest = wildcardMatch(['*.css', '*.js'], { separator: false });

(async () => {
    const browser = await puppeteer.launch({});
    try {

        const page = await browser.newPage();
        // setRequestInterception returns a promise, so wait for it before navigating
        await page.setRequestInterception(true);

        page.on('request', (request) => {
            if (blockRequest(request.url())) {
                const u = request.url();
                console.log(`request to ${u.substring(0, 50)}...${u.substring(u.length - 5)} is aborted`);

                request.abort();

                return;
            }

            request.continue();
        });

        await page.goto('https://screenshotone.com/');
    } catch (e) {
        console.log(e)
    } finally {
        await browser.close();
    }
})();

The result is:

request to https://screenshotone.com/main.7a76b580aa30ffecb0b...f.css is aborted
request to https://screenshotone.com/js/bootstrap.min.592b9fa...ab.js is aborted
request to https://screenshotone.com/js/highlight.min.e13cfba...5f.js is aborted
request to https://screenshotone.com/main.min.dabf7f45921a731...45.js is aborted

Sorry, but I won't show you the resulting screenshot of the site because it looks awful without CSS and JS.
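
One caveat with extension patterns: a pattern such as *.css only matches when the URL ends with that extension, so a URL that carries a query string (for example, a hypothetical style.css?v=2) would not be blocked. Below is a minimal sketch of one workaround, assuming it is acceptable to test only the path part of the URL. The name hasBlockedExtension and its default extension list are illustrative and not part of Puppeteer or wildcard-match.

```javascript
// Hypothetical helper: checks the URL path only, so query strings and
// fragments do not hide the file extension.
function hasBlockedExtension(requestUrl, extensions = ['.css', '.js']) {
    let pathname;
    try {
        pathname = new URL(requestUrl).pathname.toLowerCase();
    } catch (e) {
        // Unparseable URL: do not block it here.
        return false;
    }

    return extensions.some((ext) => pathname.endsWith(ext));
}
```

Inside the request handler, the call to blockRequest(request.url()) could then be swapped for hasBlockedExtension(request.url()).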

A step-by-step explanation

The crucial step is to enable request interception before the page sends any request:

// ... 
const page = await browser.newPage();
await page.setRequestInterception(true);
// ... 

Otherwise, the trick won't work.

Once request interception is enabled, you can listen for every outgoing request while the page is loading and decide, request by request, whether to block it.

If you want to block all requests to www.google-analytics.com to speed up the site loading and to avoid tracking, then just filter requests based on the domain substring:

page.on('request', (request) => {
    if (request.url().includes('www.google-analytics.com')) {
        request.abort();

        return;
    }

    request.continue();
});

A better option is to parse the URL, extract the hostname, and compare it exactly:

page.on('request', (request) => {
    // The global URL class parses the request URL without extra imports
    const domain = new URL(request.url()).hostname;
    if (domain === 'www.google-analytics.com') {
        request.abort();

        return;
    }

    request.continue();
});

Comparing the hostname avoids blocking an unrelated URL that merely happens to contain www.google-analytics.com somewhere else, for example in a query parameter.
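
If you need to block several domains, the same hostname check can be generalized. Here is a minimal sketch, assuming that subdomains of a blocked domain should be blocked too. The names isBlockedHost and BLOCKED_HOSTS, as well as the second domain in the list, are illustrative and not part of Puppeteer.

```javascript
// Hypothetical block list: adjust it to the domains you want to block.
const BLOCKED_HOSTS = ['google-analytics.com', 'doubleclick.net'];

// Returns true when the URL's hostname equals a blocked domain
// or is a subdomain of one (e.g. www.google-analytics.com).
function isBlockedHost(requestUrl, blockedHosts = BLOCKED_HOSTS) {
    let hostname;
    try {
        hostname = new URL(requestUrl).hostname;
    } catch (e) {
        // Unparseable URL: do not block it here.
        return false;
    }

    return blockedHosts.some(
        (host) => hostname === host || hostname.endsWith('.' + host)
    );
}
```

In the request handler, the condition would then become isBlockedHost(request.url()).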

Blocking requests by resource type

If you need to block requests by resource type, such as images or stylesheets, regardless of the extension or URL pattern, you can check the value returned by the request.resourceType() method:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({});
    try {
        const page = await browser.newPage();
        await page.setRequestInterception(true);

        page.on('request', (request) => {
            if (request.resourceType() === 'stylesheet' || request.resourceType() === 'script') {
                const u = request.url();
                console.log(`request to ${u.substring(0, 50)}...${u.substring(u.length - 5)} is aborted`);

                request.abort();

                return;
            }

            request.continue();
        });

        await page.goto('https://screenshotone.com/');
    } catch (e) {
        console.log(e)
    } finally {
        await browser.close();
    }
})();

The result is the same as for the initial example:

request to https://screenshotone.com/main.7a76b580aa30ffecb0b...f.css is aborted
request to https://screenshotone.com/js/bootstrap.min.592b9fa...ab.js is aborted
request to https://screenshotone.com/js/highlight.min.e13cfba...5f.js is aborted
request to https://screenshotone.com/main.min.dabf7f45921a731...45.js is aborted

Puppeteer reports the following resource types, and you can block any of them:

  • document
  • stylesheet
  • image
  • media
  • font
  • script
  • texttrack
  • xhr
  • fetch
  • eventsource
  • websocket
  • manifest
  • other
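
When the set of blocked types grows, chained comparisons become hard to read. A minimal sketch of one alternative keeps the types in a Set and builds the handler from it. The names createBlockingHandler and BLOCKED_RESOURCE_TYPES are illustrative, and the chosen types are only an example.

```javascript
// Hypothetical default: block heavy assets that rarely matter for scraping.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'media', 'font']);

// Builds a 'request' event handler that aborts requests whose resource type
// is in the given set and lets every other request continue.
function createBlockingHandler(blockedTypes = BLOCKED_RESOURCE_TYPES) {
    return (request) => {
        if (blockedTypes.has(request.resourceType())) {
            return request.abort();
        }

        return request.continue();
    };
}

// Usage, after request interception has been enabled on the page:
// page.on('request', createBlockingHandler());
```

Passing a different Set, for example new Set(['stylesheet', 'script']), would reproduce the behavior of the example above.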

As you see, it is pretty straightforward.

Have a nice day 👋

I hope I have helped you tackle request blocking in Puppeteer, and I honestly wish you a nice day!
