Muhammad Ikramullah Khan

Posted on Feb 6

Lightpanda: The Beginner's Guide to the Fastest Headless Browser

#programming #node #webdev #javascript

You're scraping a few hundred product pages every hour. Your server keeps dying.

Chrome eats more and more memory. 200MB per browser instance. You spin up 10 instances and watch your RAM disappear. The server slows down. Then crashes. You restart everything and the cycle repeats.

Then you check your AWS bill. $300 per month just to keep Chrome alive. For what? Scraping a few hundred pages?

You try everything. Lower concurrency. Kill instances faster. More RAM. Nothing solves the core problem. Chrome was built for humans browsing the web, not machines scraping at scale.

That's when you discover Lightpanda. A browser built from scratch for automation. Not Chrome with the UI removed. An actual ground-up rebuild designed for machines.

24MB of memory instead of 207MB. 2 seconds to scrape 100 pages instead of 25 seconds. Your AWS bill drops to $80.

Let me show you how to use it.

What is Lightpanda?

Lightpanda is a headless browser written in Zig (a systems programming language) from the ground up. It's not based on Chrome, Firefox, or any existing browser. It was built specifically for web automation, scraping, and AI agents.

The key differences:

Chrome headless is a full desktop browser with the display turned off. All the code for rendering graphics, managing tabs, handling extensions, and displaying a UI is still there. You're paying (in memory and CPU) for features you never use.

Lightpanda only builds what automation needs. DOM tree, JavaScript execution, network requests. Nothing else. No rendering engine. No UI code. Just the essentials.

The results:

11x faster than Chrome (real benchmark on AWS)
9x less memory (24MB vs 207MB)
Instant startup (no 3-second wait)
Compatible with Puppeteer and Playwright (drop-in replacement)

When to use Lightpanda:

Web scraping at any scale
Testing websites with JavaScript
Building AI agents that browse the web
Any automation that doesn't need visual rendering

When to still use Chrome:

Taking screenshots
Generating PDFs
Testing actual visual rendering
Complex debugging with DevTools

Installing Lightpanda

Let's get Lightpanda running on your machine. I'll show you the easiest way for each platform.

Option 1: Using npm (Recommended for Beginners)

This is the simplest way. The npm package handles everything for you.

Requirements:

Node.js installed (version 14 or higher)
That's it!

Step 1: Create a new project

mkdir lightpanda-test
cd lightpanda-test
npm init -y

Step 2: Install Lightpanda

npm install @lightpanda/browser

The package automatically downloads the right binary for your operating system. No manual setup needed.

Step 3: Test it works

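Create a file called test.js. The examples in this guide use ES module import syntax, so also add "type": "module" to your package.json (or name the file test.mjs):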

import { lightpanda } from '@lightpanda/browser';

const options = {
  host: '127.0.0.1',
  port: 9222,
};

(async () => {
  // Start Lightpanda
  const proc = await lightpanda.serve(options);

  console.log('Lightpanda is running!');
  console.log('Process ID:', proc.pid);

  // Stop Lightpanda
  proc.stdout.destroy();
  proc.stderr.destroy();
  proc.kill();

  console.log('Lightpanda stopped');
})();

Step 4: Run it

node test.js

You should see:

🐼 Running Lightpanda's CDP server...
{ pid: 12345 }
Lightpanda is running!
Process ID: 12345
Lightpanda stopped

If you see that, it works! Lightpanda starts and stops almost instantly.

Option 2: Download Binary Directly

If you don't want to use Node.js, you can download the binary.

For Linux (x86_64):

curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-x86_64-linux
chmod +x ./lightpanda
./lightpanda -h

For Mac (Apple Silicon):

curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-aarch64-macos
chmod +x ./lightpanda
./lightpanda -h

Note: There's no Intel Mac build yet. Intel Mac users should use Docker or the npm package.

For Windows:

Windows users need to use WSL2 (Windows Subsystem for Linux). Once WSL2 is installed, follow the Linux instructions above from inside your WSL terminal.

Move to PATH (optional):

To run lightpanda from anywhere:

sudo mv lightpanda /usr/local/bin/

Now lightpanda works from any directory.

Option 3: Docker (Works Everywhere)

Docker works on all platforms.

docker run -d --name lightpanda -p 9222:9222 lightpanda/browser:nightly

This starts Lightpanda in a container and exposes port 9222 for connections.
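With Docker, Lightpanda is already running inside the container, so your script only connects. There's no lightpanda.serve() call and no process to kill. Here's a minimal sketch, assuming the container from the command above is up. The names wsEndpoint and connectToRunning are made up for this example, and puppeteer is the puppeteer-core module passed in by the caller.

```javascript
// Build the WebSocket endpoint for a Lightpanda CDP server.
// Defaults match the Docker command above (port 9222 published on localhost).
function wsEndpoint({ host = '127.0.0.1', port = 9222 } = {}) {
  return `ws://${host}:${port}`;
}

// Connect to a server that is already running (Docker, or a binary you
// started yourself). `puppeteer` is the puppeteer-core module (assumption:
// the caller has imported it).
async function connectToRunning(puppeteer, options) {
  return puppeteer.connect({ browserWSEndpoint: wsEndpoint(options) });
}

// Usage, in a script that has imported puppeteer-core:
//   const browser = await connectToRunning(puppeteer);
//   ... scrape ...
//   await browser.disconnect(); // the container keeps running
```

Stop the container with docker stop lightpanda when you're done.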

Your First Lightpanda Script

Let's scrape Wikipedia and extract all the links. This will show you how Lightpanda works with Puppeteer.

Step 1: Install Puppeteer

We need puppeteer-core (not regular puppeteer). The regular version downloads Chrome, which we don't want.

npm install puppeteer-core

Step 2: Create the Script

Create scrape.js:

import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';

const lpdOptions = {
  host: '127.0.0.1',
  port: 9222,
};

const puppeteerOptions = {
  browserWSEndpoint: `ws://${lpdOptions.host}:${lpdOptions.port}`,
};

(async () => {
  // Start Lightpanda
  console.log('Starting Lightpanda...');
  const proc = await lightpanda.serve(lpdOptions);

  // Connect Puppeteer to Lightpanda
  console.log('Connecting Puppeteer...');
  const browser = await puppeteer.connect(puppeteerOptions);
  const context = await browser.createBrowserContext();
  const page = await context.newPage();

  // Navigate to Wikipedia
  console.log('Loading Wikipedia...');
  await page.goto('https://en.wikipedia.org/wiki/Web_scraping');

  // Extract all links from the references section
  const links = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('.reflist a.external'))
      .map(link => link.getAttribute('href'))
      .filter(href => href !== null);
  });

  console.log('Found', links.length, 'reference links:');
  links.forEach(link => console.log(link));

  // Clean up
  await page.close();
  await context.close();
  await browser.disconnect();

  proc.stdout.destroy();
  proc.stderr.destroy();
  proc.kill();

  console.log('Done!');
})();

Step 3: Run It

node scrape.js

Output (your link count and URLs will differ):

Starting Lightpanda...
🐼 Running Lightpanda's CDP server...
Connecting Puppeteer...
Loading Wikipedia...
Found 47 reference links:
https://example.com/article1
https://example.com/article2
...
Done!

What just happened:

Lightpanda started (instant, no delay)
Puppeteer connected to Lightpanda via WebSocket
Lightpanda loaded the Wikipedia page
JavaScript executed, references loaded
We extracted all external links from references
Everything cleaned up

The entire process took about 2 seconds. With Chrome, this would take 5-7 seconds.
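Timings like these depend on your network and your machine, so measure your own runs. Here's a small sketch of a timing wrapper. The name timed is just for illustration, and it assumes Node 16 or newer, where performance is available as a global.

```javascript
// Run an async function and report how long it took, in milliseconds.
async function timed(label, fn) {
  const start = performance.now();
  const result = await fn();
  const elapsedMs = performance.now() - start;
  console.log(`${label}: ${elapsedMs.toFixed(0)} ms`);
  return { result, elapsedMs };
}

// Usage inside the script above:
//   await timed('load page', () =>
//     page.goto('https://en.wikipedia.org/wiki/Web_scraping'));
```

Wrap the same step in your Chrome script and you get a like-for-like comparison.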

Understanding How Lightpanda Works

Let's break down what's happening.

The CDP Server

When you start Lightpanda, it runs a Chrome DevTools Protocol (CDP) server on port 9222.

const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });

CDP is the standard protocol that tools like Puppeteer and Playwright use to control browsers. Lightpanda implements this protocol, which is why your existing Puppeteer/Playwright scripts work with minimal changes.

Connecting Puppeteer

Instead of launching a browser, you connect to an existing one:

// Old way (launches Chrome), shown commented out for comparison
// const browser = await puppeteer.launch();

// New way (connects to Lightpanda)
const browser = await puppeteer.connect({
  browserWSEndpoint: 'ws://127.0.0.1:9222'
});

Everything else in your Puppeteer script stays the same. That's the beauty of Lightpanda.

The Browser Lifecycle

1. lightpanda.serve() → Starts the browser process
2. puppeteer.connect() → Connects via WebSocket
3. browser.createBrowserContext() → Isolated session
4. context.newPage() → New page/tab
5. page.goto() → Load website
6. page.evaluate() → Run JavaScript, extract data
7. page.close() → Close page
8. context.close() → Close context
9. browser.disconnect() → Disconnect Puppeteer
10. proc.kill() → Stop Lightpanda

Each step is fast. No waiting. No delays.
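Steps 3 to 8 repeat for every page you scrape, so you can wrap them in one helper. This is a sketch, not an official API: withPage is a hypothetical name, and browser is assumed to be the object returned by puppeteer.connect(). The try/finally blocks make sure the page and context get closed even when your callback throws.

```javascript
// Open an isolated context and page, run `fn(page)`, then always clean up.
async function withPage(browser, fn) {
  const context = await browser.createBrowserContext();
  try {
    const page = await context.newPage();
    try {
      return await fn(page);
    } finally {
      await page.close();
    }
  } finally {
    await context.close();
  }
}

// Usage:
//   const title = await withPage(browser, async (page) => {
//     await page.goto('https://example.com');
//     return page.title();
//   });
```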

Practical Example: Scraping Product Prices

Let's scrape a real e-commerce demo site.

import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';

const lpdOptions = {
  host: '127.0.0.1',
  port: 9222,
};

(async () => {
  // Start Lightpanda
  const proc = await lightpanda.serve(lpdOptions);

  // Connect Puppeteer
  const browser = await puppeteer.connect({
    browserWSEndpoint: `ws://${lpdOptions.host}:${lpdOptions.port}`,
  });

  const context = await browser.createBrowserContext();
  const page = await context.newPage();

  // Go to demo e-commerce site
  await page.goto('https://demo-browser.lightpanda.io/campfire-commerce/');

  // Wait for products to load (JavaScript)
  await page.waitForSelector('.product');

  // Extract product data
  const products = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('.product')).map(product => {
      return {
        name: product.querySelector('.product-name')?.textContent?.trim(),
        price: product.querySelector('.product-price')?.textContent?.trim(),
        rating: product.querySelector('.product-rating')?.textContent?.trim(),
      };
    });
  });

  console.log('Products found:', products.length);
  products.forEach(product => {
    console.log(`${product.name}: ${product.price} (${product.rating} stars)`);
  });

  // Cleanup
  await page.close();
  await context.close();
  await browser.disconnect();
  proc.kill();
})();

Output:

Products found: 12
Camping Tent Pro: $199.99 (4.5 stars)
Sleeping Bag Ultra: $79.99 (4.8 stars)
Portable Stove: $45.99 (4.2 stars)
...

This example shows:

Loading a JavaScript-heavy page
Waiting for elements to appear
Extracting structured data
Handling multiple elements

All the things you'd do with Chrome, but faster and using less memory.
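The scraped values are strings like "$199.99". If you want to sort or compare them, convert them to numbers after extraction. A minimal sketch, assuming prices use a dot as the decimal separator and an optional comma for thousands. parsePrice and normalizeProduct are illustrative helpers, not part of any library.

```javascript
// Turn "$1,299.00" into 1299. Returns null when no number can be read.
function parsePrice(text) {
  if (typeof text !== 'string') return null;
  const cleaned = text.replace(/[^0-9.]/g, '');
  if (cleaned === '') return null;
  const value = Number.parseFloat(cleaned);
  return Number.isNaN(value) ? null : value;
}

// Convert one scraped product (all strings) into typed fields.
function normalizeProduct(product) {
  return {
    name: product.name ?? null,
    price: parsePrice(product.price),
    rating: parsePrice(product.rating),
  };
}

// Usage with the `products` array from the script above:
//   const cheapestFirst = products
//     .map(normalizeProduct)
//     .filter((p) => p.price !== null)
//     .sort((a, b) => a.price - b.price);
```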

Using Lightpanda with Playwright

Playwright works too. Here's the same Wikipedia scraper using Playwright:

import { lightpanda } from '@lightpanda/browser';
import { chromium } from 'playwright-core';

const lpdOptions = {
  host: '127.0.0.1',
  port: 9222,
};

(async () => {
  // Start Lightpanda
  const proc = await lightpanda.serve(lpdOptions);

  // Connect Playwright (using connectOverCDP)
  const browser = await chromium.connectOverCDP(
    `ws://${lpdOptions.host}:${lpdOptions.port}`
  );

  const context = await browser.newContext();
  const page = await context.newPage();

  // Navigate and extract
  await page.goto('https://en.wikipedia.org/wiki/Web_scraping');

  const links = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('.reflist a.external'))
      .map(link => link.href);
  });

  console.log('Found', links.length, 'links');

  // Cleanup
  await page.close();
  await context.close();
  await browser.close();
  proc.kill();
})();

Key differences from Puppeteer:

Import playwright-core instead of puppeteer-core
Use chromium.connectOverCDP() instead of puppeteer.connect()
Use browser.newContext() instead of browser.createBrowserContext()

Everything else works the same way.
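One small difference between the two scripts: the Puppeteer version reads the raw href attribute with getAttribute('href'), while the Playwright version reads link.href, which the browser resolves to an absolute URL. If you want the same output either way, you can resolve and de-duplicate the links yourself. A sketch using Node's built-in URL class (normalizeLinks is a name I made up):

```javascript
// Resolve hrefs against the page URL, drop empties and duplicates.
function normalizeLinks(hrefs, baseUrl) {
  const seen = new Set();
  for (const href of hrefs) {
    if (!href) continue;
    try {
      seen.add(new URL(href, baseUrl).href);
    } catch {
      // Skip values that can't be parsed as URLs.
    }
  }
  return [...seen];
}

// Usage:
//   const clean = normalizeLinks(links, page.url());
```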

Common Patterns

Pattern 1: Reusable Connection

Don't start/stop Lightpanda for every page. Keep it running:

import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';

class LightpandaBrowser {
  constructor() {
    this.proc = null;
    this.browser = null;
  }

  async start() {
    this.proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
    this.browser = await puppeteer.connect({
      browserWSEndpoint: 'ws://127.0.0.1:9222',
    });
  }

  async scrape(url) {
    const context = await this.browser.createBrowserContext();
    const page = await context.newPage();

    await page.goto(url);
    const title = await page.title();

    await page.close();
    await context.close();

    return title;
  }

  async stop() {
    if (this.browser) await this.browser.disconnect();
    if (this.proc) this.proc.kill();
  }
}

// Usage
const lpd = new LightpandaBrowser();
await lpd.start();

const title1 = await lpd.scrape('https://example.com');
const title2 = await lpd.scrape('https://another-site.com');
const title3 = await lpd.scrape('https://third-site.com');

await lpd.stop();

Start once, scrape multiple pages, stop once. Much more efficient.

Pattern 2: Error Handling

Always wrap in try/catch and ensure cleanup:

import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';

(async () => {
  let proc, browser, context, page;

  try {
    proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
    browser = await puppeteer.connect({
      browserWSEndpoint: 'ws://127.0.0.1:9222',
    });
    context = await browser.createBrowserContext();
    page = await context.newPage();

    await page.goto('https://example.com');
    const title = await page.title();
    console.log('Title:', title);

  } catch (error) {
    console.error('Error:', error.message);
  } finally {
    // Always cleanup
    if (page) await page.close();
    if (context) await context.close();
    if (browser) await browser.disconnect();
    if (proc) proc.kill();
  }
})();

This ensures Lightpanda stops even if something fails.
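There's one gap in the finally block above. If page.close() itself throws (say the connection already dropped), the lines after it never run and Lightpanda keeps running. Here's a sketch of a more defensive cleanup that attempts every step no matter what. cleanupAll is a hypothetical helper, not part of Puppeteer or Lightpanda.

```javascript
// Run every cleanup step in order; collect errors instead of stopping early.
async function cleanupAll(steps) {
  const errors = [];
  for (const step of steps) {
    try {
      await step();
    } catch (error) {
      errors.push(error);
    }
  }
  return errors;
}

// Usage in the finally block:
//   const errors = await cleanupAll([
//     () => page?.close(),
//     () => context?.close(),
//     () => browser?.disconnect(),
//     () => proc?.kill(),
//   ]);
//   if (errors.length) console.warn('Cleanup problems:', errors.length);
```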

Pattern 3: Waiting for Content

JavaScript sites need time to load:

// Wait for selector
await page.waitForSelector('.products', { timeout: 5000 });

// Wait for network to be idle
await page.goto('https://example.com', { waitUntil: 'networkidle0' });

// Wait for specific time
await new Promise(resolve => setTimeout(resolve, 2000));

Lightpanda executes JavaScript, but you still need to wait for async content.
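A fixed 2-second sleep either waits too long or not long enough. Puppeteer's waitForSelector and waitForFunction cover most cases. For custom conditions, a small polling helper works too. This is a sketch (waitUntil is my own name for it): it calls your check repeatedly until it returns something truthy or the timeout passes.

```javascript
// Poll `check` until it returns a truthy value; give up after `timeoutMs`.
// Returns the truthy value, or null on timeout.
async function waitUntil(check, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = await check();
    if (value) return value;
    if (Date.now() >= deadline) return null;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage: wait until at least one product has rendered.
//   const count = await waitUntil(() =>
//     page.evaluate(() => document.querySelectorAll('.product').length));
//   if (count === null) console.warn('No products appeared in time');
```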

Lightpanda vs Chrome: Side-by-Side

Here are real numbers from actual testing.

Test setup:

AWS EC2 m5.large instance
Scraping 100 pages from a local test site
Using Puppeteer
Both Chrome and Lightpanda running same script

Results:

Metric	Chrome Headless	Lightpanda	Improvement
Execution time	25.2 seconds	2.3 seconds	11x faster
Memory peak	207 MB	24 MB	9x less
Startup time	3-4 seconds	0.1 seconds	30x faster
Cold start (first page)	4.5 seconds	0.4 seconds	11x faster

What this means in practice:

If you're scraping 10,000 pages per day (scaling the 100-page benchmark linearly):

Chrome: ~42 minutes of execution time
Lightpanda: ~4 minutes of execution time

If you're running 10 concurrent browser instances:

Chrome: ~2GB RAM minimum
Lightpanda: ~240MB RAM

If you're on AWS with 4GB RAM:

Chrome: Max 8-10 instances
Lightpanda: Max 80-100 instances
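You can redo this kind of estimate for your own workload by scaling the table's numbers. A rough sketch: it assumes time per page stays constant and that every instance peaks at the benchmark's memory figure, which real sites won't match exactly. The function names are mine.

```javascript
// Benchmark figures from the results table (100 pages per run).
const BENCH_PAGES = 100;
const bench = {
  chrome: { seconds: 25.2, peakMB: 207 },
  lightpanda: { seconds: 2.3, peakMB: 24 },
};

// Linear extrapolation: assumes time per page stays constant.
function estimateSeconds(engine, pages) {
  return (bench[engine].seconds * pages) / BENCH_PAGES;
}

// Rough memory estimate: assumes each instance peaks like the benchmark run.
function estimateMemoryMB(engine, instances) {
  return bench[engine].peakMB * instances;
}

for (const engine of Object.keys(bench)) {
  const minutes = estimateSeconds(engine, 10000) / 60;
  console.log(
    `${engine}: ${minutes.toFixed(1)} min for 10,000 pages, ` +
      `${estimateMemoryMB(engine, 10)} MB for 10 instances`
  );
}
```

Swap in your own page counts and instance counts, then compare against what you actually measure.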

Cost impact:

Real AWS costs before and after:

Before (Chrome): $300/month for t3.xlarge (4 vCPU, 16GB RAM)
After (Lightpanda): $80/month for t3.large (2 vCPU, 8GB RAM)

Savings: $220/month, or $2,640 per year.

When Lightpanda Might Not Work

Let's be honest about limitations.

1. Not All Websites Work (Yet)

Lightpanda implements most Web APIs, but not all. Some complex sites might not work.

What usually works:

Simple HTML + CSS sites
React/Vue/Angular single-page apps
Most e-commerce sites
News sites
Documentation sites

What might not work:

Very complex web apps (Gmail, Google Docs)
Sites using obscure Web APIs
Sites with extremely heavy JavaScript

If a site doesn't work, try Chrome as a fallback.
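That fallback can live in code, so one failing site doesn't stop the whole run. A hedged sketch: scrapeWithFallback is a hypothetical helper, and the two scraper functions are ones you write yourself (one connected to Lightpanda, one launching Chrome).

```javascript
// Try Lightpanda first; if it throws, run the same job with Chrome.
// Both scraper arguments are async functions that take a URL.
async function scrapeWithFallback(url, scrapeWithLightpanda, scrapeWithChrome) {
  try {
    return { engine: 'lightpanda', data: await scrapeWithLightpanda(url) };
  } catch (error) {
    console.warn(`Lightpanda failed for ${url}: ${error.message}. Retrying with Chrome.`);
    return { engine: 'chrome', data: await scrapeWithChrome(url) };
  }
}

// Usage:
//   const { engine, data } = await scrapeWithFallback(url, viaLightpanda, viaChrome);
```

Logging which engine handled each URL also tells you which sites are worth reporting upstream.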

2. No Visual Rendering

You can't take screenshots or generate PDFs.

// This won't work with Lightpanda
await page.screenshot({ path: 'screenshot.png' });
await page.pdf({ path: 'page.pdf' });

For these, stick with Chrome.

3. Limited Debugging Tools

Chrome DevTools is incredibly powerful. Lightpanda doesn't have the same debugging experience (yet).

For complex debugging, use Chrome. For production scraping, use Lightpanda.

4. Playwright Compatibility Caveat

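Playwright chooses how to run a script based on the features it detects in the browser. As Lightpanda adds new Web APIs, Playwright may switch to a different code path that relies on features Lightpanda hasn't implemented yet, so a script that works today might need updates after a Lightpanda upgrade.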

This is rare, but worth knowing.

Migrating from Chrome to Lightpanda

Already have Puppeteer scripts? Migration is simple.

Before (Chrome):

import puppeteer from 'puppeteer';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');
  const title = await page.title();
  console.log(title);

  await browser.close();
})();

After (Lightpanda):

import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';

(async () => {
  const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });

  const browser = await puppeteer.connect({
    browserWSEndpoint: 'ws://127.0.0.1:9222',
  });
  const context = await browser.createBrowserContext();
  const page = await context.newPage();

  await page.goto('https://example.com');
  const title = await page.title();
  console.log(title);

  await page.close();
  await context.close();
  await browser.disconnect();
  proc.kill();
})();

What changed:

Added Lightpanda import
Changed puppeteer to puppeteer-core
Changed puppeteer.launch() to puppeteer.connect()
Added lightpanda.serve() and proc.kill()
Added createBrowserContext() layer

The page interaction code stays identical. All your selectors, page.goto(), page.evaluate(), everything works the same.

Using Lightpanda Cloud

Don't want to manage your own infrastructure? Lightpanda offers a cloud service.

import puppeteer from 'puppeteer-core';

(async () => {
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://euwest.cloud.lightpanda.io/ws?token=YOUR_TOKEN',
  });

  const context = await browser.createBrowserContext();
  const page = await context.newPage();

  // Your scraping code here
  await page.goto('https://example.com');
  const title = await page.title();
  console.log(title);

  await page.close();
  await context.close();
  await browser.disconnect();
})();

Benefits:

No server management
Scales automatically
Choose between Lightpanda or Chrome
Global regions available

Visit lightpanda.io/cloud to request API access.
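Once you have a token, keep it out of your source code. Here's a small sketch that builds the endpoint from an environment variable. The variable name LPD_TOKEN and the helper cloudEndpoint are my own choices, and the host is the one from the example above.

```javascript
// Build the cloud WebSocket URL from a token, escaping it for the query string.
function cloudEndpoint(token, host = 'euwest.cloud.lightpanda.io') {
  if (!token) throw new Error('Missing Lightpanda Cloud token');
  return `wss://${host}/ws?token=${encodeURIComponent(token)}`;
}

// Usage (LPD_TOKEN is a hypothetical variable name; pick your own):
//   const browser = await puppeteer.connect({
//     browserWSEndpoint: cloudEndpoint(process.env.LPD_TOKEN),
//   });
```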

Best Practices

1. Keep Lightpanda Running Between Pages

Don't start/stop for every page:

// Bad (slow): starts and stops the browser for every URL
for (const url of urls) {
  const proc = await lightpanda.serve(lpdOptions);
  // scrape
  proc.kill();
}

// Good (fast): one browser process for the whole run
const proc = await lightpanda.serve(lpdOptions);
for (const url of urls) {
  // scrape
}
proc.kill();

2. Use Browser Contexts for Isolation

Each context is isolated (separate cookies, storage):

const browser = await puppeteer.connect(puppeteerOptions);

// Context 1
const context1 = await browser.createBrowserContext();
const page1 = await context1.newPage();
// Scrape site A

// Context 2 (completely separate)
const context2 = await browser.createBrowserContext();
const page2 = await context2.newPage();
// Scrape site B

// Clean up both
await context1.close();
await context2.close();

3. Add Reasonable Delays

Even though Lightpanda is fast, be polite:

const urls = ['url1', 'url2', 'url3'];

for (const url of urls) {
  await page.goto(url);
  // Extract data

  // Wait 1 second before next page
  await new Promise(resolve => setTimeout(resolve, 1000));
}

4. Handle Errors Gracefully

async function scrapePage(page, url) {
  try {
    await page.goto(url, { timeout: 10000 });
    return await page.title();
  } catch (error) {
    console.error(`Failed to scrape ${url}:`, error.message);
    return null;
  }
}
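Practices 3 and 4 fit together in one loop: scrape each URL in turn, record failures instead of crashing, and pause between pages. A sketch under those assumptions. scrapeSequentially is a hypothetical helper, and scrapeOne is whatever function you use to scrape a single URL.

```javascript
// Scrape URLs one at a time with a polite delay; never throws for a single URL.
async function scrapeSequentially(urls, scrapeOne, { delayMs = 1000 } = {}) {
  const results = [];
  for (const [index, url] of urls.entries()) {
    try {
      results.push({ url, ok: true, value: await scrapeOne(url) });
    } catch (error) {
      results.push({ url, ok: false, error: error.message });
    }
    // Pause between pages, but not after the last one.
    if (index < urls.length - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return results;
}

// Usage:
//   const results = await scrapeSequentially(urls, (url) => scrapePage(page, url));
//   const failed = results.filter((r) => !r.ok);
```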

5. Monitor Memory Usage

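Even though Lightpanda uses less memory, still monitor it. Keep in mind that process.memoryUsage() reports the memory of your Node.js script only. Lightpanda runs as a separate process, so check its PID (proc.pid) with a tool like ps or top to see the browser's own usage: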

console.log('Memory usage:', process.memoryUsage());

Troubleshooting Common Issues

Issue 1: Connection Refused

Error:

Error: connect ECONNREFUSED 127.0.0.1:9222

Solution:
Lightpanda isn't running. Make sure lightpanda.serve() completes before connecting.

// Wait for Lightpanda to start
const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
await new Promise(resolve => setTimeout(resolve, 500));
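A fixed 500ms pause is a guess. On a busy machine it may be too short, and on a fast one it's wasted time. Retrying the connection a few times is more reliable. A minimal sketch (connectWithRetry is my own helper name, and the attempt count and delay are arbitrary defaults):

```javascript
// Call `connect` until it succeeds or the attempts run out.
async function connectWithRetry(connect, { attempts = 10, delayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await connect();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Usage:
//   const browser = await connectWithRetry(() =>
//     puppeteer.connect({ browserWSEndpoint: 'ws://127.0.0.1:9222' }));
```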

Issue 2: Page Doesn't Load

Error:
Page stays blank or navigation times out.

Solution:
The website might use Web APIs not yet supported. Try with Chrome to confirm:

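// Fallback to Chrome. puppeteer-core doesn't bundle a browser, so either use
// the full puppeteer package here or pass executablePath for a local Chrome.
const browser = await puppeteer.launch();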

If it works with Chrome but not Lightpanda, open an issue on GitHub.

Issue 3: Selector Returns Nothing

Problem:
Your selectors worked with Chrome but return nothing with Lightpanda.

Solution:
Add wait conditions:

// Wait for element to appear
await page.waitForSelector('.product', { timeout: 5000 });

// Then extract
const products = await page.$$('.product');

Issue 4: Module Import Errors

Error:

Cannot use import statement outside a module

Solution:
Add "type": "module" to your package.json:

{
  "name": "my-scraper",
  "version": "1.0.0",
  "type": "module"
}

What's Next?

You now know how to:

Install Lightpanda
Connect Puppeteer or Playwright
Scrape websites with JavaScript
Handle common patterns
Migrate from Chrome

Next steps:

Try it yourself - Run the examples in this guide
Migrate one script - Pick a simple Puppeteer script, convert it
Measure the difference - Compare execution time and memory
Scale up - Run multiple instances, see how far you can push it

Lightpanda is actively developed. New features and Web API support are added regularly. Star the GitHub repo to follow progress.

Quick Reference

Installation:

npm install @lightpanda/browser puppeteer-core

Start Lightpanda:

import { lightpanda } from '@lightpanda/browser';
const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });

Connect Puppeteer:

import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
  browserWSEndpoint: 'ws://127.0.0.1:9222',
});

Scrape a page:

const context = await browser.createBrowserContext();
const page = await context.newPage();
await page.goto('https://example.com');
const data = await page.evaluate(() => {
  return document.querySelector('h1').textContent;
});

Cleanup:

await page.close();
await context.close();
await browser.disconnect();
proc.kill();

Summary

Lightpanda is a headless browser built from scratch for automation and AI. It's 11x faster than Chrome and uses 9x less memory.

Key advantages:

Instant startup (no 3-second wait)
Tiny memory footprint (24MB vs 207MB)
Drop-in replacement for Chrome with Puppeteer/Playwright
Perfect for scraping, testing, and AI agents

When to use:

Any web scraping task
Building AI agents
Automated testing without visual rendering
Cost optimization (lower AWS bills)

When to use Chrome instead:

Taking screenshots
Generating PDFs
Complex debugging
Sites using very new Web APIs

Getting started is simple:

npm install @lightpanda/browser puppeteer-core
Replace puppeteer.launch() with puppeteer.connect()
Your existing Puppeteer code works with minimal changes

Try Lightpanda on your next scraping project. The speed and memory savings are real.

Happy scraping!

Resources:

GitHub: https://github.com/lightpanda-io/browser
Documentation: https://lightpanda.io/docs
Cloud: https://lightpanda.io (request API access)
Scrapy GitHub: https://github.com/IkramKhanNiazi/The-Scrapy-Handbook