DEV Community

Sandip Shrestha


Master Web Automation: Automate Browsing Tasks with Puppeteer

Introduction

In the world of web automation, Puppeteer has emerged as one of the most popular tools for controlling headless browsers. With the ability to automate repetitive tasks, scrape data from websites, and generate screenshots and PDFs, Puppeteer is a valuable tool for developers, testers, and anyone looking to interact with the web programmatically. In this post, we’ll dive into the key features of Puppeteer and show you how to use it to automate web browsing tasks effectively.

About Puppeteer

As a JavaScript developer, you might find yourself wanting to do something crazy—like extracting data from a website that doesn’t offer a free API. But let’s be real—why would they give you free access to their data? That’s where Puppeteer comes in.

Maybe you're a content creator who’s too busy to post regularly on social media, and hiring someone just to do that feels like a waste. Or perhaps you need to automate a tedious web task that eats up your time. Instead of relying on others, you can let Puppeteer handle it for you.

Puppeteer is a Node.js library that provides a high-level API for controlling Chrome or Chromium, which it runs headless (without a visible window) by default. It lets you perform the same actions a real user would: clicking buttons, filling out forms, taking screenshots, and generating PDFs. It’s widely used for web scraping, UI testing, performance monitoring, and automating repetitive browser tasks.

Imagine being able to log in to multiple accounts automatically, extract stock market data, monitor price drops, or schedule social media posts—all without lifting a finger. That’s the power of Puppeteer!

Whether you want to scrape valuable data, automate social media posts, or streamline repetitive tasks, Puppeteer has got your back. In the next sections, we'll explore how to set it up and use it effectively.

Setting up Puppeteer with Node.js and TypeScript

First, let's create a project directory and initialize a Node.js project:

# Create a new folder for the project
mkdir puppeteer-demo  

# Navigate into the folder
cd puppeteer-demo  

# Initialize a Node.js project (you can press Enter for default settings)
npm init -y  

This will generate a package.json file, which will manage your project dependencies.
Next, install TypeScript and the other development dependencies:

# Install TypeScript as a dev dependency
npm install -D typescript

# Install ts-node for running TypeScript files directly
npm install -D ts-node

# Install @types/node for Node.js type definitions
npm install -D @types/node


Run this command to generate tsconfig.json

npx tsc --init

Replace the generated options in tsconfig.json with a minimal configuration that compiles the files in src to dist as CommonJS modules:

{
  "compilerOptions": {
    "target": "ES6",             
    "module": "CommonJS",        
    "outDir": "./dist",          
    "rootDir": "./src",         
    "strict": true              
  }
}


Create a src folder with an index.ts file inside it.
Because ts-node is installed locally, you can run your development script manually through npx:

npx ts-node src/index.ts

Or you can modify the scripts in package.json

"scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "dev": "ts-node src/index.ts"
  }

With this setup, you can run your development script using:

npm run dev

You might be wondering why I’m using TypeScript instead of JavaScript. You can definitely use JavaScript, but TypeScript offers type safety, better autocompletion, and improved code readability, which makes development more efficient and less error-prone. If you're still using JavaScript, it's a great time to consider switching to TypeScript!

Next, you'll need to install Puppeteer:

npm install puppeteer

During installation, Puppeteer automatically downloads a compatible browser build (Chrome for Testing in recent versions, Chromium in older ones). This ensures that Puppeteer runs against a browser version it is known to work with.

However, if you don't want Puppeteer to download a browser and would rather use a system-installed one, you can skip the download by setting the PUPPETEER_SKIP_DOWNLOAD environment variable before installation:

PUPPETEER_SKIP_DOWNLOAD=true npm install puppeteer

You can later tell Puppeteer to use your system-installed Chrome by specifying its path:

const browser = await puppeteer.launch({ executablePath: '/path/to/chrome' });
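
If the browser path differs between machines, one option is to read it from the environment instead of hard-coding it. The following is a minimal sketch, not part of Puppeteer itself: the helper launchOptionsFromEnv and the variable names CHROME_PATH and HEADLESS are assumptions made for this example.

```typescript
// Minimal shape of the launch options used in this sketch. Puppeteer's own
// launch options accept both of these fields.
interface BasicLaunchOptions {
  headless: boolean;
  executablePath?: string;
}

// CHROME_PATH and HEADLESS are hypothetical variable names chosen for this example.
function launchOptionsFromEnv(
  env: Record<string, string | undefined>,
): BasicLaunchOptions {
  const options: BasicLaunchOptions = {
    // Run headless unless HEADLESS is explicitly set to "false"
    headless: env["HEADLESS"] !== "false",
  };

  const chromePath = env["CHROME_PATH"]?.trim();
  if (chromePath) {
    // Only set executablePath when a path was provided, so Puppeteer
    // falls back to its downloaded browser otherwise
    options.executablePath = chromePath;
  }

  return options;
}

// Usage with Puppeteer (not executed here):
//   const browser = await puppeteer.launch(launchOptionsFromEnv(process.env));
```

With this in place, the same script can run against the downloaded browser on one machine and a system-installed Chrome on another, without code changes.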

Now let's set up a basic script in src/index.ts.

import puppeteer from 'puppeteer';

(async () => {
  // Launch a headless browser
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Navigate to a website
  await page.goto('https://example.com');

  // Take a screenshot
  await page.screenshot({ path: 'example.png' });

  console.log('Screenshot saved as example.png');

  // Close the browser
  await browser.close();
})();


Congrats! You've just automated your first browser task with Puppeteer.
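
One thing the script above does not handle: if any step throws (for example, a navigation timeout), browser.close() is never reached and the browser process may keep running. A small wrapper with try/finally is one way to guard against that. This is a sketch under my own assumptions: withBrowser and the Closable interface are hypothetical names, not Puppeteer APIs. The wrapper only relies on the browser object having an async close() method, which Puppeteer's Browser does.

```typescript
// Anything with an async close() method; Puppeteer's Browser fits this shape.
interface Closable {
  close(): Promise<void>;
}

// Launch a browser, run the task, and always close the browser afterwards,
// whether the task resolves or throws.
async function withBrowser<B extends Closable, T>(
  launch: () => Promise<B>,
  task: (browser: B) => Promise<T>,
): Promise<T> {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Usage with Puppeteer (not executed here):
//   await withBrowser(() => puppeteer.launch(), async (browser) => {
//     const page = await browser.newPage();
//     await page.goto('https://example.com');
//     await page.screenshot({ path: 'example.png' });
//   });
```

Passing the launch function in as a parameter keeps the helper independent of how the browser is configured.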

Scrape data using Puppeteer

Let's scrape quotes from https://quotes.toscrape.com. Here’s the code:

import puppeteer from "puppeteer";

(async () => {
  // Headless mode is enabled by default. Set it to false to see the browser in action.
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

  const VIEWPORT = { width: 1920, height: 1080 };
  await page.setViewport(VIEWPORT); // set viewport

  await page.goto("https://quotes.toscrape.com/", {
    waitUntil: "domcontentloaded",
  });

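  // Extract all quotes from the page
  const quotes = await page.evaluate(() => {
    const quoteList = document.querySelectorAll(".quote");

    return Array.from(quoteList).map((quote) => {
      // querySelector is typed as returning Element, which has textContent but
      // not innerText, so textContent keeps the code compiling under strict mode
      const text = quote.querySelector(".text")?.textContent?.trim() ?? "";
      const author = quote.querySelector(".author")?.textContent?.trim() ?? "";

      return { text, author };
    });
  });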

  // Display the quotes
  console.log(quotes);

  await browser.close();
})();

You’ll see the list of quotes printed in your console.
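
Scraped text usually needs a little cleanup before it is useful. On this site the quote text appears to come wrapped in curly quotation marks, so here is a minimal sketch of a post-processing step that strips them and groups the quotes by author. The Quote interface and the helpers cleanText and groupByAuthor are names I made up for this example. They work on the array returned by page.evaluate and do not touch the browser.

```typescript
interface Quote {
  text: string;
  author: string;
}

// Strip surrounding straight or curly quotation marks and extra whitespace.
function cleanText(raw: string): string {
  return raw.trim().replace(/^["“”]+|["“”]+$/g, "").trim();
}

// Normalize scraped quotes, drop empty entries, and group the texts by author.
function groupByAuthor(quotes: Quote[]): Map<string, string[]> {
  const grouped = new Map<string, string[]>();

  for (const { text, author } of quotes) {
    const cleaned = cleanText(text);
    const name = author.trim();

    // Skip entries where a selector matched nothing and produced ""
    if (!cleaned || !name) continue;

    const existing = grouped.get(name) ?? [];
    existing.push(cleaned);
    grouped.set(name, existing);
  }

  return grouped;
}

// Usage with the scraping script (not executed here):
//   const byAuthor = groupByAuthor(quotes);
//   console.log(Object.fromEntries(byAuthor));
```

Keeping the cleanup outside page.evaluate means it runs in Node.js, where it is easier to test and debug than code running inside the page.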

Log in to Instagram using Puppeteer

Here’s how to automate Instagram login:

import puppeteer from "puppeteer";

(async () => {
  // Launch the browser
  const browser = await puppeteer.launch({ headless: false });

  // Create a new page
  const page = await browser.newPage();

  const VIEWPORT = { width: 1920, height: 1080 };
  await page.setViewport(VIEWPORT); // set viewport

  // Go to the page
  await page.goto("https://www.instagram.com/", {
    waitUntil: "networkidle2",
    timeout: 60000,
  });

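  // Wait for the login form, then enter the username and password
  await page.waitForSelector('input[name="username"]');
  await page.type('input[name="username"]', 'username');
  await page.type('input[name="password"]', 'password');

  // Click the login button and wait for the resulting navigation.
  // Start waiting before the click so a fast navigation is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click('button[type="submit"]'),
  ]);

  // Take a screenshot
  await page.screenshot({ path: "instagram-login.png" });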

  await browser.close();
})();

Now you're automating Instagram login!
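
The example types placeholder strings for the username and password. In a real script you would probably not want credentials in the source file, so here is a minimal sketch that reads them from environment variables. The helper readCredentials and the variable names IG_USERNAME and IG_PASSWORD are assumptions for this example, not something Puppeteer or Instagram defines.

```typescript
interface Credentials {
  username: string;
  password: string;
}

// IG_USERNAME and IG_PASSWORD are hypothetical variable names chosen for this example.
function readCredentials(
  env: Record<string, string | undefined>,
): Credentials {
  const username = env["IG_USERNAME"];
  const password = env["IG_PASSWORD"];

  if (!username || !password) {
    // Fail early, before a browser is launched, and name the missing settings
    throw new Error("Set IG_USERNAME and IG_PASSWORD before running the script");
  }

  return { username, password };
}

// Usage with the login script (not executed here):
//   const { username, password } = readCredentials(process.env);
//   await page.type('input[name="username"]', username);
//   await page.type('input[name="password"]', password);
```

You could then start the script with something like IG_USERNAME=... IG_PASSWORD=... npm run dev, keeping the values out of version control.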

Conclusion

In this post, we've explored how Puppeteer can be a game-changer for automating tasks, from web scraping to interacting with web pages like a real user. Whether you're looking to extract valuable data, automate social media posts, or streamline repetitive web tasks, Puppeteer offers a powerful, flexible solution.

We covered the essential steps for setting up Puppeteer with Node.js and TypeScript, walked through basic examples like taking screenshots and scraping data, and even demonstrated how to automate Instagram logins. By incorporating Puppeteer into your workflow, you can save time, reduce manual effort, and create more efficient processes for various web-related tasks.

So, the next time you're faced with a tedious web task, why not automate it with Puppeteer? With its simplicity and versatility, it’s the perfect tool to level up your web automation game.

Try Puppeteer and automate your tasks. No more manual work!
Happy coding! 🎉
