DEV Community

Sony AK

User Agent string difference in Puppeteer headless and headful

Today I will talk about how the User Agent string differs when running Puppeteer in headless and headful mode.

For people not familiar with Puppeteer: it is a Node library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. It runs headless by default. You can go to https://pptr.dev/ for more details.

Running Puppeteer in headless mode means you control the Chrome or Chromium browser without displaying the browser UI. In contrast, headful mode displays the browser UI, which is useful for debugging.

As described at https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent, the User Agent string is a characteristic string that lets servers and network peers identify the application, operating system, vendor, and/or version of the requesting user agent.

A web browser sends the User-Agent request header when we browse web pages on the internet. Here is a sample of my User Agent.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36
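To make the structure of that string concrete, here is a small sketch that pulls the browser product token and version out of a User Agent string. The function name parseChromeToken is hypothetical, and the regular expression only covers the Chrome and HeadlessChrome tokens discussed in this post, not User Agent strings in general.

```javascript
// Hypothetical helper: extract the Chrome product token and its version.
// Only "Chrome/<version>" and "HeadlessChrome/<version>" are recognized.
function parseChromeToken(userAgent) {
  const match = /\b(HeadlessChrome|Chrome)\/([\d.]+)/.exec(userAgent);
  return match ? { product: match[1], version: match[2] } : null;
}

const sample =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36';

const token = parseChromeToken(sample);
console.log(token.product, token.version);
```

A string without either token makes the function return null, so callers should check the result before reading its fields.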

Preparation

Install Puppeteer with this command.

npm i puppeteer

The code

OK, now let's write a script that shows the User Agent string when running Puppeteer in headless mode.

File puppeteer_headless.js

const puppeteer = require('puppeteer');

(async () => {
        const browser = await puppeteer.launch();

        console.log(await browser.userAgent());

        await browser.close();
})();

Run it.

node puppeteer_headless.js

On my machine it prints the following.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/79.0.3945.0 Safari/537.36

Notice the HeadlessChrome substring in it.

OK, now let's write a script that shows the User Agent string when running Puppeteer in headful mode.

File puppeteer_headful.js

const puppeteer = require('puppeteer');

(async () => {
        const browser = await puppeteer.launch({ headless: false });

        console.log(await browser.userAgent());

        await browser.close();
})();

Run it.

node puppeteer_headful.js

On my machine it prints the following.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.0 Safari/537.36

Now we can see that this User Agent string looks like a normal web browser's User Agent string.

Why is this interesting? Suppose you want to scrape a website using Puppeteer in headless mode, and the target website protects itself by inspecting the User Agent string (blocking HeadlessChrome). In that case your scraping activity might be blocked.
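As an illustration of the kind of check a website could run on the incoming User-Agent header, here is a minimal sketch. The function name isHeadlessChrome is hypothetical, and real sites may use very different or additional signals; this only shows the substring test described above.

```javascript
// Hypothetical server-side check: true when the User-Agent value
// identifies itself with the HeadlessChrome product token.
function isHeadlessChrome(userAgent) {
  return /\bHeadlessChrome\//.test(userAgent || '');
}

// The two strings printed by the headless and headful scripts in this post.
const headless =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/79.0.3945.0 Safari/537.36';
const headful =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.0 Safari/537.36';

console.log(isHeadlessChrome(headless)); // true
console.log(isHeadlessChrome(headful)); // false
```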

How to set User Agent on headless Chrome

We can still set the User Agent string in Puppeteer headless mode. It overrides the default headless Chrome User Agent string.

Here is the code sample.

File puppeteer_set_user_agent.js

const puppeteer = require('puppeteer');

(async () => {
        // prepare for headless chrome
        const browser = await puppeteer.launch();
        const page = await browser.newPage();

        // set user agent (override the default headless User Agent)
        await page.setUserAgent('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36');

        // go to Google home page
        await page.goto('https://google.com');

        // get the User Agent as seen by the page
        const userAgent = await page.evaluate(() => navigator.userAgent );

        // if everything is correct, there is no 'HeadlessChrome' substring in userAgent
        console.log(userAgent);

        await browser.close();
})();

It displays the User Agent that we set before browsing to the Google web page.
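The hardcoded string above names Chrome 78 while the bundled browser reported Chrome 79. One possible variation, sketched here under that observation, is to derive the override from the browser's own default User Agent so the version stays in sync. The helper names toHeadfulUserAgent and newPageWithHeadfulUserAgent are hypothetical; browser.userAgent(), browser.newPage() and page.setUserAgent() are the same Puppeteer calls used earlier in this post.

```javascript
// Rename the HeadlessChrome product token to Chrome, leaving the
// version and the rest of the string untouched.
function toHeadfulUserAgent(userAgent) {
  return userAgent.replace('HeadlessChrome/', 'Chrome/');
}

// Hypothetical helper: open a page whose User Agent has no HeadlessChrome token.
// `browser` is assumed to be the object returned by puppeteer.launch().
async function newPageWithHeadfulUserAgent(browser) {
  const page = await browser.newPage();
  const defaultUserAgent = await browser.userAgent();
  await page.setUserAgent(toHeadfulUserAgent(defaultUserAgent));
  return page;
}
```

In the script above, this would replace the browser.newPage() and page.setUserAgent() lines with a single call: const page = await newPageWithHeadfulUserAgent(browser);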

Thank you and I hope you enjoy it.

Top comments (12)

Muhammad Nasir Ayub

Hi everyone,
I am using the Puppeteer library in Node.js for runtime PDF file generation. It works fine on my local system, but when I deploy my app on a cPanel-based CentOS server, it throws an error. Any solution would be appreciated.

Omicron

There must be other altered behaviours too. Some tests were not working in headless mode after developing them with the browser displayed.

Sony AK

ic ic, thanks for the info

Rudra

Hi I wanted to know how to change the cdc variable to go undetected from the message of "chrome is controlled by an automation software". No idea if the site detected...

Sony AK

Hi Rudra, thanks for the question. Actually I still have no idea about it either. But what is your use case for hiding it?

I found this link help.applitools.com/hc/en-us/artic... that may be related to it.

Rudra

Thank you.

Shakil Sultan Ali

That's perfect dude! Thank you!

M.C.D

Hello sir,
could you help me please?
I would like to load a random User Agent for each page launch.
How do I do that?
Example:
page.setUserAgent('/utils/referers.txt');

thank you

Sam

I would have a variable called userAgents that is an array of User Agent strings, then do something like:

await page.setUserAgent(userAgents[Math.floor(Math.random()*userAgents.length)]);
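A minimal sketch of that idea, assuming the User Agent strings are stored one per line in a text file. The file name user_agents.txt and the helper names parseUserAgents, loadUserAgents and pickRandom are hypothetical.

```javascript
const fs = require('fs');

// Split file contents into User Agent strings, one per line, skipping blank lines.
function parseUserAgents(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Read and parse a User Agent list from disk (the path is supplied by the caller).
function loadUserAgents(filePath) {
  return parseUserAgents(fs.readFileSync(filePath, 'utf8'));
}

// Pick one entry at random. `random` is injectable so the choice can be tested.
function pickRandom(items, random = Math.random) {
  if (items.length === 0) {
    throw new Error('no user agents to choose from');
  }
  return items[Math.floor(random() * items.length)];
}
```

Inside the async function that drives Puppeteer, the call would then look like: await page.setUserAgent(pickRandom(loadUserAgents('user_agents.txt')));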

Mouhannad

I logged in just to add a like to this comment.

Sony AK

thank you :)

Martin Ratinaud

Nice article, thanks