
Alex Spinov

Puppeteer Has a Free API — Browser Automation by Google

Puppeteer is Google's free, open-source Node.js library for controlling Chrome and Chromium, and a common foundation for web scraping and browser automation projects.

Why Puppeteer?

  • Official Google project — maintained by the Chrome DevTools team
  • Headless and headed modes — run without visible browser for speed
  • Full Chrome DevTools Protocol — access everything Chrome can do
  • PDF generation — render any page to pixel-perfect PDF
  • Screenshot API — full page or element-level captures
  • Network interception — modify requests and responses on the fly

Quick Start

# Install (downloads a compatible Chrome build automatically)
npm install puppeteer

# Or without bundled browser (use your own Chrome)
npm install puppeteer-core
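
With puppeteer-core nothing is downloaded, so the launch call has to be told where an existing browser lives. The snippet below is a minimal sketch of that step. `executablePath` is a real launch option, while the helper name `buildLaunchOptions` and the `CHROME_PATH` environment variable are assumptions made for illustration.

```javascript
// Hypothetical helper: builds launch options for puppeteer-core.
// CHROME_PATH is an assumed variable name, not something Puppeteer defines.
function buildLaunchOptions(env = process.env) {
  const executablePath = env.CHROME_PATH;
  if (!executablePath) {
    throw new Error("Set CHROME_PATH to a local Chrome or Chromium binary");
  }
  return { executablePath, headless: true };
}

// Usage (requires `npm install puppeteer-core`):
//   import puppeteer from "puppeteer-core";
//   const browser = await puppeteer.launch(buildLaunchOptions());
```

Failing early with a clear message tends to be easier to debug than the launch error you get when no browser can be found.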

Web Scraping Example

import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

await page.goto("https://news.ycombinator.com");

// Extract all story titles and links
const stories = await page.$$eval(".titleline > a", (links) =>
  links.map((a) => ({
    title: a.textContent,
    url: a.href,
  }))
);

console.log(`Found ${stories.length} stories`);
stories.forEach((s) => console.log(`${s.title}: ${s.url}`));

await browser.close();
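
Scraped results usually need light cleanup before they are stored. As a sketch only, the hypothetical `normalizeStories` helper below trims titles, drops entries without a title or an http(s) URL, and removes duplicate URLs. It is plain JavaScript and does not depend on Puppeteer.

```javascript
// Hypothetical post-processing step for the { title, url } objects
// returned by page.$$eval in the example above.
function normalizeStories(stories) {
  const seen = new Set();
  const result = [];
  for (const { title, url } of stories) {
    const cleanTitle = (title ?? "").trim();
    // Skip entries with no title or a non-http(s) link
    if (!cleanTitle || !/^https?:\/\//.test(url ?? "")) continue;
    // Skip URLs we have already kept
    if (seen.has(url)) continue;
    seen.add(url);
    result.push({ title: cleanTitle, url });
  }
  return result;
}

// Usage: const cleaned = normalizeStories(stories);
```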

PDF Generation (Free Alternative to Paid APIs)

import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

// Render any HTML to PDF
await page.setContent(`
  <html>
    <style>
      body { font-family: Arial; padding: 40px; }
      h1 { color: #2563eb; }
      .invoice-table { width: 100%; border-collapse: collapse; }
      .invoice-table td { padding: 8px; border-bottom: 1px solid #eee; }
    </style>
    <h1>Invoice #1234</h1>
    <table class="invoice-table">
      <tr><td>Web Scraping Service</td><td>$500</td></tr>
      <tr><td>Data Cleaning</td><td>$200</td></tr>
      <tr><td><strong>Total</strong></td><td><strong>$700</strong></td></tr>
    </table>
  </html>
`);

await page.pdf({
  path: "invoice.pdf",
  format: "A4",
  printBackground: true,
  margin: { top: "20mm", bottom: "20mm", left: "15mm", right: "15mm" },
});

await browser.close();
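
In practice the invoice rows come from data rather than a hard-coded string. The sketch below builds similar markup from a list of line items and escapes the text so that stray `<` or `&` characters cannot break the HTML. The helper names `escapeHtml` and `buildInvoiceHtml` and the `{ label, amount }` item shape are assumptions for illustration.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical builder: items is an array of { label, amount } objects.
function buildInvoiceHtml(invoiceNumber, items) {
  const total = items.reduce((sum, item) => sum + item.amount, 0);
  const rows = items
    .map((item) => `<tr><td>${escapeHtml(item.label)}</td><td>$${item.amount}</td></tr>`)
    .join("\n");
  return `<html>
<h1>Invoice #${escapeHtml(invoiceNumber)}</h1>
<table class="invoice-table">
${rows}
<tr><td><strong>Total</strong></td><td><strong>$${total}</strong></td></tr>
</table>
</html>`;
}

// Usage: await page.setContent(buildInvoiceHtml("1234", items));
```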

Screenshot Automation

import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

// Set viewport for consistent screenshots
await page.setViewport({ width: 1920, height: 1080 });
await page.goto("https://example.com", { waitUntil: "networkidle2" });

// Full page screenshot
await page.screenshot({ path: "full-page.png", fullPage: true });

// Element screenshot (page.$ resolves to null when nothing matches)
const element = await page.$("header");
if (element) {
  await element.screenshot({ path: "header.png" });
}

// Specific area
await page.screenshot({
  path: "hero-section.png",
  clip: { x: 0, y: 0, width: 1920, height: 600 },
});

await browser.close();
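
A clip region does not have to be hard-coded. One option is to derive it from an element's bounding box plus some padding. The sketch below assumes the element is already visible in the viewport without scrolling, and `paddedClip` is a hypothetical helper name. The box has the `{ x, y, width, height }` shape that `elementHandle.boundingBox()` returns.

```javascript
// Hypothetical helper: expands a bounding box by `padding` pixels on each
// side and clamps the result to the given viewport size.
function paddedClip(box, padding, viewport) {
  const x = Math.max(0, box.x - padding);
  const y = Math.max(0, box.y - padding);
  const right = Math.min(viewport.width, box.x + box.width + padding);
  const bottom = Math.min(viewport.height, box.y + box.height + padding);
  return { x, y, width: right - x, height: bottom - y };
}

// Usage:
//   const box = await element.boundingBox(); // null if the element is not rendered
//   if (box) {
//     const clip = paddedClip(box, 20, { width: 1920, height: 1080 });
//     await page.screenshot({ path: "header-padded.png", clip });
//   }
```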

Form Automation

import puppeteer from "puppeteer";

const browser = await puppeteer.launch({ headless: false }); // Visible for demo
const page = await browser.newPage();

await page.goto("https://example.com/signup");

// Type with realistic delays
await page.type("#name", "John Doe", { delay: 50 });
await page.type("#email", "john@example.com", { delay: 50 });
await page.type("#password", "SecureP@ss123", { delay: 50 });

// Select dropdown
await page.select("#country", "US");

// Check checkbox
await page.click("#terms");

// Click submit and wait for navigation
await Promise.all([
  page.waitForNavigation(),
  page.click("#submit-btn"),
]);

console.log("Current URL:", page.url());
await browser.close();

Network Interception

import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

// Enable request interception
await page.setRequestInterception(true);

page.on("request", (req) => {
  // Block images, stylesheets, and fonts for faster scraping
  if (["image", "stylesheet", "font"].includes(req.resourceType())) {
    req.abort();
  } else {
    req.continue();
  }
});

// Track API responses
page.on("response", async (res) => {
  if (res.url().includes("/api/")) {
    const data = await res.json().catch(() => null);
    if (data) console.log("API Response:", data);
  }
});

await page.goto("https://example.com");
await browser.close();
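
The blocking rule can also be pulled out into a standalone handler, which makes it testable without a browser. This is a sketch rather than the only way to structure it. `BLOCKED_TYPES` and `handleRequest` are illustrative names. The `isInterceptResolutionHandled()` check guards against resolving a request twice when more than one handler is registered.

```javascript
// Resource types to drop (the same set as in the example above).
const BLOCKED_TYPES = new Set(["image", "stylesheet", "font"]);

// Hypothetical standalone handler for page.on("request", ...).
function handleRequest(req) {
  // Another handler may already have aborted or continued this request
  if (req.isInterceptResolutionHandled()) return;
  if (BLOCKED_TYPES.has(req.resourceType())) {
    req.abort();
  } else {
    req.continue();
  }
}

// Usage:
//   await page.setRequestInterception(true);
//   page.on("request", handleRequest);
```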

Stealth Mode (Avoid Bot Detection)

import puppeteer from "puppeteer-extra";
import StealthPlugin from "puppeteer-extra-plugin-stealth";

puppeteer.use(StealthPlugin());

// headless: true works across versions; the "new" value was only needed on Puppeteer v19-v21
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();

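// Set a complete, realistic user agent (example value; keep the Chrome version current)
await page.setUserAgent(
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
);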

// Set realistic viewport
await page.setViewport({ width: 1366, height: 768 });

await page.goto("https://example.com");

await browser.close();

Puppeteer vs Playwright vs Selenium

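| Feature         | Puppeteer                | Playwright                | Selenium            |
|-----------------|--------------------------|---------------------------|---------------------|
| Browsers        | Chrome/Chromium, Firefox | Chromium, Firefox, WebKit | All                 |
| Language        | Node.js                  | Node/Python/Java/C#       | All major           |
| Speed           | Fast                     | Fastest                   | Slowest             |
| Auto-wait       | Partial                  | Full                      | None                |
| Stealth plugins | Yes                      | Limited                   | Limited             |
| PDF generation  | Excellent                | Good                      | Basic (Selenium 4+) |
| Maintained by   | Google                   | Microsoft                 | Community           |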

Need to scrape data from any website and get it in structured JSON? Check out my web scraping tools on Apify — no coding required, results in minutes.

Have a custom data extraction project? Email me at spinov001@gmail.com — I build tailored scraping solutions for businesses.
