Introduction
In the world of web automation, Puppeteer has emerged as one of the most powerful tools for controlling headless browsers. With the ability to automate repetitive tasks, scrape data from websites, and even generate screenshots and PDFs, Puppeteer is a must-have tool for developers, testers, and anyone looking to interact with the web programmatically. In this blog, we’ll dive into the key features of Puppeteer and show you how to leverage it to automate web browsing tasks effectively.
About Puppeteer
As a JavaScript developer, you might find yourself wanting to do something crazy—like extracting data from a website that doesn’t offer a free API. But let’s be real—why would they give you free access to their data? That’s where Puppeteer comes in.
Maybe you're a content creator who’s too busy to post regularly on social media, and hiring someone just to do that feels like a waste. Or perhaps you need to automate a tedious web task that eats up your time. Instead of relying on others, you can let Puppeteer handle it for you.
Puppeteer is a Node.js library that provides a high-level API to control headless Chrome (or Chromium). It allows you to perform actions just like a real user—clicking buttons, filling out forms, taking screenshots, and even generating PDFs. It’s widely used for web scraping, UI testing, performance monitoring, and automating repetitive browser tasks.
Imagine being able to log in to multiple accounts automatically, extract stock market data, monitor price drops, or schedule social media posts—all without lifting a finger. That’s the power of Puppeteer!
Whether you want to scrape valuable data, automate social media posts, or streamline repetitive tasks, Puppeteer has got your back. In the next sections, we'll explore how to set it up and use it effectively.
Setting up Puppeteer with Node.js and TypeScript
First, let's create a project directory and initialize a Node.js project:
# Create a new folder for the project
mkdir puppeteer-demo
# Navigate into the folder
cd puppeteer-demo
# Initialize a Node.js project (you can press Enter for default settings)
npm init -y
This will generate a package.json file, which will manage your project dependencies.
Next, install TypeScript and the necessary dependencies:
# Install TypeScript as a dev dependency
npm install -D typescript
# Install ts-node for running TypeScript files directly
npm install -D ts-node
# Install @types/node for Node.js type definitions
npm install -D @types/node
Run this command to generate tsconfig.json
npx tsc --init
Modify your tsconfig.json for better compatibility:
{
  "compilerOptions": {
    "target": "ES6",
    "module": "CommonJS",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true
  }
}
Create a src folder and an index.ts file inside it.
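If you want to confirm the toolchain works before installing Puppeteer, a throwaway index.ts along these lines should be enough. The greet function is purely illustrative and is not part of the project that follows.

```typescript
// src/index.ts: temporary smoke test for the TypeScript setup.
// `greet` is a made-up helper used only to prove that type annotations compile.
function greet(name: string): string {
  return `Hello, ${name}!`;
}

console.log(greet("Puppeteer")); // prints "Hello, Puppeteer!"
```

Once this prints the greeting, you can replace the file's contents with the Puppeteer examples later in this post.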
You can manually run your development script using:
npx ts-node src/index.ts
Or you can add a dev entry to the scripts section of package.json:
"scripts": {
  "test": "echo \"Error: no test specified\" && exit 1",
  "dev": "ts-node src/index.ts"
}
With this setup, you can run your development script using:
npm run dev
"You might be wondering why I’m using TypeScript instead of JavaScript. While you can definitely use JavaScript, TypeScript offers type safety, better autocompletion, and improved code readability—making development more efficient and less error-prone. If you're still using JavaScript, it's a great time to consider switching to TypeScript!"
Next, you'll need to install Puppeteer:
npm install puppeteer
Installing it automatically downloads a compatible browser build (Chromium in older Puppeteer releases, Chrome for Testing in recent ones). This ensures that Puppeteer works reliably without any browser compatibility issues.
However, if you don't want Puppeteer to download a browser and would rather use a system-installed one, you can skip the download by setting the PUPPETEER_SKIP_DOWNLOAD environment variable before installation:
PUPPETEER_SKIP_DOWNLOAD=true npm install puppeteer
You can later tell Puppeteer to use your system-installed Chrome by specifying its path:
const browser = await puppeteer.launch({ executablePath: '/path/to/chrome' });
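Rather than hard-coding the path, one option is to read it from an environment variable and fall back to Puppeteer's own browser when it is unset. The sketch below is only one way to do that: CHROME_PATH is a hypothetical variable name, and LaunchOptionsSketch is a local stand-in for the single launch option used here, not Puppeteer's real options type.

```typescript
// Minimal local shape for the one launch option used in this sketch.
// Puppeteer's real launch options have many more fields.
interface LaunchOptionsSketch {
  executablePath?: string;
}

// CHROME_PATH is a hypothetical variable name chosen for this example.
function resolveLaunchOptions(
  env: Record<string, string | undefined>
): LaunchOptionsSketch {
  const path = env["CHROME_PATH"]?.trim();
  // An empty object leaves the choice of browser to Puppeteer.
  return path ? { executablePath: path } : {};
}

console.log(resolveLaunchOptions({ CHROME_PATH: "/usr/bin/google-chrome" }));
console.log(resolveLaunchOptions({}));
```

In a real script you would pass process.env to resolveLaunchOptions and hand the result to puppeteer.launch().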
Now let's set up a basic project.
import puppeteer from 'puppeteer';

(async () => {
  // Launch a headless browser
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Navigate to a website
  await page.goto('https://example.com');
  // Take a screenshot
  await page.screenshot({ path: 'example.png' });
  console.log('Screenshot saved as example.png');
  // Close the browser
  await browser.close();
})();
Congrats! You've just automated your first browser task with Puppeteer.
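One thing a script like this does not handle is an error between launching and closing: if navigation or the screenshot throws, browser.close() never runs and the browser process can linger. A minimal sketch of a try/finally wrapper is shown here. The withResource helper and the Closable interface are names invented for this example, and the demo uses a stand-in object instead of a real browser so it runs without Puppeteer.

```typescript
// Anything with an async close(), which is the only part of the browser
// object this helper relies on.
interface Closable {
  close(): Promise<void>;
}

// Opens a resource, runs the task, and always closes the resource afterwards,
// whether the task succeeds or throws.
async function withResource<R extends Closable, T>(
  open: () => Promise<R>,
  task: (resource: R) => Promise<T>
): Promise<T> {
  const resource = await open();
  try {
    return await task(resource);
  } finally {
    await resource.close();
  }
}

// Demonstration with a fake "browser" whose task fails on purpose.
async function demo(): Promise<string[]> {
  const events: string[] = [];
  const fakeBrowser = {
    close: async (): Promise<void> => {
      events.push("closed");
    },
  };
  try {
    await withResource(
      async () => fakeBrowser,
      async () => {
        throw new Error("page crashed");
      }
    );
  } catch {
    events.push("error caught");
  }
  return events;
}

demo().then((events) => console.log(events)); // [ 'closed', 'error caught' ]
```

With Puppeteer installed, the same helper could be called as withResource(() => puppeteer.launch(), async (browser) => { ... }), since the browser object exposes an async close().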
Scrape data using Puppeteer
Let's scrape quotes from https://quotes.toscrape.com. Here’s the code:
import puppeteer from "puppeteer";

(async () => {
  // Headless mode is enabled by default. Set it to false to see the browser in action.
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  const VIEWPORT = { width: 1920, height: 1080 };
  await page.setViewport(VIEWPORT); // set viewport
  await page.goto("https://quotes.toscrape.com/", {
    waitUntil: "domcontentloaded",
  });
  // Extract all quotes from the page
  const quotes = await page.evaluate(() => {
    const quoteList = document.querySelectorAll(".quote");
    return Array.from(quoteList).map((quote) => {
      // querySelector returns Element by default, which has no innerText in
      // TypeScript's DOM types, so ask for HTMLElement explicitly
      const text = quote.querySelector<HTMLElement>(".text")?.innerText || "";
      const author = quote.querySelector<HTMLElement>(".author")?.innerText || "";
      return { text, author };
    });
  });
  // Display the quotes
  console.log(quotes);
  await browser.close();
})();
You’ll see the list of quotes printed in your console.
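Scraped text often needs a little cleanup before it is useful, for example when each quote comes wrapped in quotation marks or padded with whitespace. The following is a rough sketch of a post-processing step, assuming the { text, author } shape used above. The cleanQuotes name and the sample data are made up for illustration.

```typescript
interface Quote {
  text: string;
  author: string;
}

// Strips surrounding straight or curly quotation marks and extra whitespace,
// and drops entries whose text ends up empty.
function cleanQuotes(raw: Quote[]): Quote[] {
  return raw
    .map(({ text, author }) => ({
      text: text.trim().replace(/^["\u201C]+|["\u201D]+$/g, "").trim(),
      author: author.trim(),
    }))
    .filter((quote) => quote.text.length > 0);
}

// Made-up sample input, not real output from the site.
const sample: Quote[] = [
  { text: "\u201CBe yourself.\u201D", author: " Example Author " },
  { text: "", author: "Nobody" },
];

console.log(cleanQuotes(sample));
// [ { text: 'Be yourself.', author: 'Example Author' } ]
```

Keeping this step outside page.evaluate() means it runs in Node rather than in the browser, so it can be tested without launching Puppeteer.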
Log in to Instagram using Puppeteer
Here’s how to automate Instagram login:
import puppeteer from "puppeteer";

(async () => {
  // Launch the browser (headless: false opens a visible window)
  const browser = await puppeteer.launch({ headless: false });
  // Create a new page
  const page = await browser.newPage();
  const VIEWPORT = { width: 1920, height: 1080 };
  await page.setViewport(VIEWPORT); // set viewport
  // Go to the page
  await page.goto("https://www.instagram.com/", {
    waitUntil: "networkidle2",
    timeout: 60000,
  });
  // Wait for the login form, then enter the username and password
  // ("username" and "password" are placeholders for your own credentials)
  await page.waitForSelector('input[name="username"]');
  await page.type('input[name="username"]', "username");
  await page.type('input[name="password"]', "password");
  // Click the login button and wait for the resulting navigation together,
  // so the navigation cannot finish before we start waiting for it
  await Promise.all([
    page.waitForNavigation(),
    page.click('button[type="submit"]'),
  ]);
  // Take a screenshot
  await page.screenshot({ path: "instagram-login.png" });
  await browser.close();
})();
Now you're automating Instagram login!
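For anything beyond a quick experiment, it is safer to keep real credentials out of the source file. Here is a minimal sketch that reads them from environment variables and fails early when they are missing. IG_USERNAME and IG_PASSWORD are hypothetical variable names picked for this example, and readCredentials is an invented helper.

```typescript
interface Credentials {
  username: string;
  password: string;
}

// IG_USERNAME and IG_PASSWORD are hypothetical variable names for this sketch.
function readCredentials(env: Record<string, string | undefined>): Credentials {
  const username = env["IG_USERNAME"];
  const password = env["IG_PASSWORD"];
  if (!username || !password) {
    throw new Error("Set IG_USERNAME and IG_PASSWORD before running the script");
  }
  return { username, password };
}

// Demo with made-up values; a real script would pass process.env instead.
const creds = readCredentials({
  IG_USERNAME: "demo_user",
  IG_PASSWORD: "demo_pass",
});
console.log(creds.username); // prints "demo_user"
```

The returned values can then be passed to page.type() in place of literal strings.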
Conclusion
In this blog, we've explored how Puppeteer can be a game-changer for automating tasks, from web scraping to interacting with web pages like a real user. Whether you're looking to extract valuable data, automate social media posts, or streamline repetitive web tasks, Puppeteer offers a powerful, flexible solution.
We covered the essential steps for setting up Puppeteer with Node.js and TypeScript, walked through basic examples like taking screenshots and scraping data, and even demonstrated how to automate Instagram logins. By incorporating Puppeteer into your workflow, you can save time, reduce manual effort, and create more efficient processes for various web-related tasks.
So, the next time you're faced with a tedious web task, why not automate it with Puppeteer? With its simplicity and versatility, it’s the perfect tool to level up your web automation game.
Try Puppeteer and automate your tasks. No more manual work!
Happy coding! 🎉