You're scraping a few hundred product pages every hour. Your server keeps dying.
Chrome eats more and more memory. 200MB per browser instance. You spin up 10 instances and watch your RAM disappear. The server slows down. Then crashes. You restart everything and the cycle repeats.
Then you check your AWS bill. $300 per month just to keep Chrome alive. For what? Scraping a few hundred pages?
You try everything. Lower concurrency. Kill instances faster. More RAM. Nothing solves the core problem. Chrome was built for humans browsing the web, not machines scraping at scale.
That's when you discover Lightpanda. A browser built from scratch for automation. Not Chrome with the UI removed. An actual ground-up rebuild designed for machines.
24MB of memory instead of 207MB. 2 seconds to scrape 100 pages instead of 25 seconds. Your AWS bill drops to $80.
Let me show you how to use it.
What is Lightpanda?
Lightpanda is a headless browser written in Zig (a systems programming language) from the ground up. It's not based on Chrome, Firefox, or any existing browser. It was built specifically for web automation, scraping, and AI agents.
The key differences:
Chrome headless is a full desktop browser with the display turned off. All the code for rendering graphics, managing tabs, handling extensions, and displaying a UI is still there. You're paying (in memory and CPU) for features you never use.
Lightpanda only builds what automation needs. DOM tree, JavaScript execution, network requests. Nothing else. No rendering engine. No UI code. Just the essentials.
The results:
- 11x faster than Chrome (real benchmark on AWS)
- 9x less memory (24MB vs 207MB)
- Instant startup (no 3-second wait)
- Compatible with Puppeteer and Playwright (drop-in replacement)
When to use Lightpanda:
- Web scraping at any scale
- Testing websites with JavaScript
- Building AI agents that browse the web
- Any automation that doesn't need visual rendering
When to still use Chrome:
- Taking screenshots
- Generating PDFs
- Testing actual visual rendering
- Complex debugging with DevTools
Installing Lightpanda
Let's get Lightpanda running on your machine. I'll show you the easiest way for each platform.
Option 1: Using npm (Recommended for Beginners)
This is the simplest way. The npm package handles everything for you.
Requirements:
- Node.js installed (version 14 or higher)
- That's it!
Step 1: Create a new project
mkdir lightpanda-test
cd lightpanda-test
npm init -y
Step 2: Install Lightpanda
npm install @lightpanda/browser
The package automatically downloads the right binary for your operating system. No manual setup needed.
Step 3: Test it works
Create a file called test.js:
import { lightpanda } from '@lightpanda/browser';
const options = {
host: '127.0.0.1',
port: 9222,
};
(async () => {
// Start Lightpanda
const proc = await lightpanda.serve(options);
console.log('Lightpanda is running!');
console.log('Process ID:', proc.pid);
// Stop Lightpanda
proc.stdout.destroy();
proc.stderr.destroy();
proc.kill();
console.log('Lightpanda stopped');
})();
Step 4: Run it
node test.js
You should see:
🐼 Running Lightpanda's CDP server...
{ pid: 12345 }
Lightpanda is running!
Process ID: 12345
Lightpanda stopped
If you see that, it works! Lightpanda starts and stops almost instantly.
Option 2: Download Binary Directly
If you don't want to use Node.js, you can download the binary.
For Linux (x86_64):
curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-x86_64-linux
chmod +x ./lightpanda
./lightpanda -h
For Mac (Apple Silicon):
curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-aarch64-macos
chmod +x ./lightpanda
./lightpanda -h
Note: There's no Intel Mac build yet. Intel Mac users should use Docker or the npm package.
For Windows:
Windows users need to use WSL2 (Windows Subsystem for Linux). Once WSL2 is installed, follow the Linux instructions above from inside your WSL terminal.
Move to PATH (optional):
To run lightpanda from anywhere:
sudo mv lightpanda /usr/local/bin/
Now lightpanda works from any directory.
Option 3: Docker (Works Everywhere)
Docker works on all platforms.
docker run -d --name lightpanda -p 9222:9222 lightpanda/browser:nightly
This starts Lightpanda in a container and exposes port 9222 for connections.
Your First Lightpanda Script
Let's scrape Wikipedia and extract all the links. This will show you how Lightpanda works with Puppeteer.
Step 1: Install Puppeteer
We need puppeteer-core (not regular puppeteer). The regular version downloads Chrome, which we don't want.
npm install puppeteer-core
Step 2: Create the Script
Create scrape.js:
import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';
const lpdOptions = {
host: '127.0.0.1',
port: 9222,
};
const puppeteerOptions = {
browserWSEndpoint: `ws://${lpdOptions.host}:${lpdOptions.port}`,
};
(async () => {
// Start Lightpanda
console.log('Starting Lightpanda...');
const proc = await lightpanda.serve(lpdOptions);
// Connect Puppeteer to Lightpanda
console.log('Connecting Puppeteer...');
const browser = await puppeteer.connect(puppeteerOptions);
const context = await browser.createBrowserContext();
const page = await context.newPage();
// Navigate to Wikipedia
console.log('Loading Wikipedia...');
await page.goto('https://en.wikipedia.org/wiki/Web_scraping');
// Extract all links from the references section
const links = await page.evaluate(() => {
return Array.from(document.querySelectorAll('.reflist a.external'))
.map(link => link.getAttribute('href'))
.filter(href => href !== null);
});
console.log('Found', links.length, 'reference links:');
links.forEach(link => console.log(link));
// Clean up
await page.close();
await context.close();
await browser.disconnect();
proc.stdout.destroy();
proc.stderr.destroy();
proc.kill();
console.log('Done!');
})();
Step 3: Run It
node scrape.js
Output:
Starting Lightpanda...
🐼 Running Lightpanda's CDP server...
Connecting Puppeteer...
Loading Wikipedia...
Found 47 reference links:
https://example.com/article1
https://example.com/article2
...
Done!
What just happened:
- Lightpanda started (instant, no delay)
- Puppeteer connected to Lightpanda via WebSocket
- Lightpanda loaded the Wikipedia page
- JavaScript executed, references loaded
- We extracted all external links from references
- Everything cleaned up
The entire process took about 2 seconds. With Chrome, this would take 5-7 seconds.
Understanding How Lightpanda Works
Let's break down what's happening.
The CDP Server
When you start Lightpanda, it runs a Chrome DevTools Protocol (CDP) server on port 9222.
const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
CDP is the standard protocol that tools like Puppeteer and Playwright use to control browsers. Lightpanda implements this protocol, which is why your existing Puppeteer/Playwright scripts work with minimal changes.
Connecting Puppeteer
Instead of launching a browser, you connect to an existing one:
// Old way (launches Chrome)
const browser = await puppeteer.launch();
// New way (connects to Lightpanda)
const browser = await puppeteer.connect({
browserWSEndpoint: 'ws://127.0.0.1:9222'
});
Everything else in your Puppeteer script stays the same. That's the beauty of Lightpanda.
The Browser Lifecycle
1. lightpanda.serve() → Starts the browser process
2. puppeteer.connect() → Connects via WebSocket
3. browser.createBrowserContext() → Isolated session
4. context.newPage() → New page/tab
5. page.goto() → Load website
6. page.evaluate() → Run JavaScript, extract data
7. page.close() → Close page
8. context.close() → Close context
9. browser.disconnect() → Disconnect Puppeteer
10. proc.kill() → Stop Lightpanda
Each step is fast. No waiting. No delays.
Practical Example: Scraping Product Prices
Let's scrape a real e-commerce demo site.
import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';
const lpdOptions = {
host: '127.0.0.1',
port: 9222,
};
(async () => {
// Start Lightpanda
const proc = await lightpanda.serve(lpdOptions);
// Connect Puppeteer
const browser = await puppeteer.connect({
browserWSEndpoint: `ws://${lpdOptions.host}:${lpdOptions.port}`,
});
const context = await browser.createBrowserContext();
const page = await context.newPage();
// Go to demo e-commerce site
await page.goto('https://demo-browser.lightpanda.io/campfire-commerce/');
// Wait for products to load (JavaScript)
await page.waitForSelector('.product');
// Extract product data
const products = await page.evaluate(() => {
return Array.from(document.querySelectorAll('.product')).map(product => {
return {
name: product.querySelector('.product-name')?.textContent?.trim(),
price: product.querySelector('.product-price')?.textContent?.trim(),
rating: product.querySelector('.product-rating')?.textContent?.trim(),
};
});
});
console.log('Products found:', products.length);
products.forEach(product => {
console.log(`${product.name}: ${product.price} (${product.rating} stars)`);
});
// Cleanup
await page.close();
await context.close();
await browser.disconnect();
proc.kill();
})();
Output:
Products found: 12
Camping Tent Pro: $199.99 (4.5 stars)
Sleeping Bag Ultra: $79.99 (4.8 stars)
Portable Stove: $45.99 (4.2 stars)
...
This example shows:
- Loading a JavaScript-heavy page
- Waiting for elements to appear
- Extracting structured data
- Handling multiple elements
All the things you'd do with Chrome, but faster and using less memory.
Using Lightpanda with Playwright
Playwright works too. Here's the same Wikipedia scraper using Playwright:
import { lightpanda } from '@lightpanda/browser';
import { chromium } from 'playwright-core';
const lpdOptions = {
host: '127.0.0.1',
port: 9222,
};
(async () => {
// Start Lightpanda
const proc = await lightpanda.serve(lpdOptions);
// Connect Playwright (using connectOverCDP)
const browser = await chromium.connectOverCDP(
`ws://${lpdOptions.host}:${lpdOptions.port}`
);
const context = await browser.newContext();
const page = await context.newPage();
// Navigate and extract
await page.goto('https://en.wikipedia.org/wiki/Web_scraping');
const links = await page.evaluate(() => {
return Array.from(document.querySelectorAll('.reflist a.external'))
.map(link => link.href);
});
console.log('Found', links.length, 'links');
// Cleanup
await page.close();
await context.close();
await browser.close();
proc.kill();
})();
Key differences from Puppeteer:
- Import
playwright-coreinstead ofpuppeteer-core - Use
chromium.connectOverCDP()instead ofpuppeteer.connect() - Use
browser.newContext()instead ofbrowser.createBrowserContext()
Everything else works the same way.
Common Patterns
Pattern 1: Reusable Connection
Don't start/stop Lightpanda for every page. Keep it running:
import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';
class LightpandaBrowser {
constructor() {
this.proc = null;
this.browser = null;
}
async start() {
this.proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
this.browser = await puppeteer.connect({
browserWSEndpoint: 'ws://127.0.0.1:9222',
});
}
async scrape(url) {
const context = await this.browser.createBrowserContext();
const page = await context.newPage();
await page.goto(url);
const title = await page.title();
await page.close();
await context.close();
return title;
}
async stop() {
if (this.browser) await this.browser.disconnect();
if (this.proc) this.proc.kill();
}
}
// Usage
const lpd = new LightpandaBrowser();
await lpd.start();
const title1 = await lpd.scrape('https://example.com');
const title2 = await lpd.scrape('https://another-site.com');
const title3 = await lpd.scrape('https://third-site.com');
await lpd.stop();
Start once, scrape multiple pages, stop once. Much more efficient.
Pattern 2: Error Handling
Always wrap in try/catch and ensure cleanup:
import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';
(async () => {
let proc, browser, context, page;
try {
proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
browser = await puppeteer.connect({
browserWSEndpoint: 'ws://127.0.0.1:9222',
});
context = await browser.createBrowserContext();
page = await context.newPage();
await page.goto('https://example.com');
const title = await page.title();
console.log('Title:', title);
} catch (error) {
console.error('Error:', error.message);
} finally {
// Always cleanup
if (page) await page.close();
if (context) await context.close();
if (browser) await browser.disconnect();
if (proc) proc.kill();
}
})();
This ensures Lightpanda stops even if something fails.
Pattern 3: Waiting for Content
JavaScript sites need time to load:
// Wait for selector
await page.waitForSelector('.products', { timeout: 5000 });
// Wait for network to be idle
await page.goto('https://example.com', { waitUntil: 'networkidle0' });
// Wait for specific time
await new Promise(resolve => setTimeout(resolve, 2000));
Lightpanda executes JavaScript, but you still need to wait for async content.
Lightpanda vs Chrome: Side-by-Side
Here are real numbers from actual testing.
Test setup:
- AWS EC2 m5.large instance
- Scraping 100 pages from a local test site
- Using Puppeteer
- Both Chrome and Lightpanda running same script
Results:
| Metric | Chrome Headless | Lightpanda | Improvement |
|---|---|---|---|
| Execution time | 25.2 seconds | 2.3 seconds | 11x faster |
| Memory peak | 207 MB | 24 MB | 9x less |
| Startup time | 3-4 seconds | 0.1 seconds | 30x faster |
| Cold start (first page) | 4.5 seconds | 0.4 seconds | 11x faster |
What this means in practice:
If you're scraping 10,000 pages per day:
- Chrome: ~70 hours of execution time
- Lightpanda: ~6.4 hours of execution time
If you're running 10 concurrent browser instances:
- Chrome: ~2GB RAM minimum
- Lightpanda: ~240MB RAM
If you're on AWS with 4GB RAM:
- Chrome: Max 8-10 instances
- Lightpanda: Max 80-100 instances
Cost impact:
Real AWS costs before and after:
- Before (Chrome): $300/month for t3.xlarge (4 vCPU, 16GB RAM)
- After (Lightpanda): $80/month for t3.large (2 vCPU, 8GB RAM)
Savings: $220/month, or $2,640 per year.
When Lightpanda Might Not Work
Let's be honest about limitations.
1. Not All Websites Work (Yet)
Lightpanda implements most Web APIs, but not all. Some complex sites might not work.
What usually works:
- Simple HTML + CSS sites
- React/Vue/Angular single-page apps
- Most e-commerce sites
- News sites
- Documentation sites
What might not work:
- Very complex web apps (Gmail, Google Docs)
- Sites using obscure Web APIs
- Sites with extremely heavy JavaScript
If a site doesn't work, try Chrome as a fallback.
2. No Visual Rendering
You can't take screenshots or generate PDFs.
// This won't work with Lightpanda
await page.screenshot({ path: 'screenshot.png' });
await page.pdf({ path: 'page.pdf' });
For these, stick with Chrome.
3. Limited Debugging Tools
Chrome DevTools is incredibly powerful. Lightpanda doesn't have the same debugging experience (yet).
For complex debugging, use Chrome. For production scraping, use Lightpanda.
4. Playwright Compatibility Caveat
Playwright might choose different code paths as Lightpanda adds new features. A script that works today might need updates when Lightpanda adds new Web APIs.
This is rare, but worth knowing.
Migrating from Chrome to Lightpanda
Already have Puppeteer scripts? Migration is simple.
Before (Chrome):
import puppeteer from 'puppeteer';
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
const title = await page.title();
console.log(title);
await browser.close();
})();
After (Lightpanda):
import { lightpanda } from '@lightpanda/browser';
import puppeteer from 'puppeteer-core';
(async () => {
const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
const browser = await puppeteer.connect({
browserWSEndpoint: 'ws://127.0.0.1:9222',
});
const context = await browser.createBrowserContext();
const page = await context.newPage();
await page.goto('https://example.com');
const title = await page.title();
console.log(title);
await page.close();
await context.close();
await browser.disconnect();
proc.kill();
})();
What changed:
- Added Lightpanda import
- Changed
puppeteertopuppeteer-core - Changed
puppeteer.launch()topuppeteer.connect() - Added
lightpanda.serve()andproc.kill() - Added
createBrowserContext()layer
The page interaction code stays identical. All your selectors, page.goto(), page.evaluate(), everything works the same.
Using Lightpanda Cloud
Don't want to manage your own infrastructure? Lightpanda offers a cloud service.
import puppeteer from 'puppeteer-core';
(async () => {
const browser = await puppeteer.connect({
browserWSEndpoint: 'wss://euwest.cloud.lightpanda.io/ws?token=YOUR_TOKEN',
});
const context = await browser.createBrowserContext();
const page = await context.newPage();
// Your scraping code here
await page.goto('https://example.com');
const title = await page.title();
console.log(title);
await page.close();
await context.close();
await browser.disconnect();
})();
Benefits:
- No server management
- Scales automatically
- Choose between Lightpanda or Chrome
- Global regions available
Visit lightpanda.io/cloud to request API access.
Best Practices
1. Keep Lightpanda Running Between Pages
Don't start/stop for every page:
// Bad (slow)
for (const url of urls) {
const proc = await lightpanda.serve(...);
// scrape
proc.kill();
}
// Good (fast)
const proc = await lightpanda.serve(...);
for (const url of urls) {
// scrape
}
proc.kill();
2. Use Browser Contexts for Isolation
Each context is isolated (separate cookies, storage):
const browser = await puppeteer.connect(...);
// Context 1
const context1 = await browser.createBrowserContext();
const page1 = await context1.newPage();
// Scrape site A
// Context 2 (completely separate)
const context2 = await browser.createBrowserContext();
const page2 = await context2.newPage();
// Scrape site B
// Clean up both
await context1.close();
await context2.close();
3. Add Reasonable Delays
Even though Lightpanda is fast, be polite:
const urls = ['url1', 'url2', 'url3'];
for (const url of urls) {
await page.goto(url);
// Extract data
// Wait 1 second before next page
await new Promise(resolve => setTimeout(resolve, 1000));
}
4. Handle Errors Gracefully
async function scrapePage(page, url) {
try {
await page.goto(url, { timeout: 10000 });
return await page.title();
} catch (error) {
console.error(`Failed to scrape ${url}:`, error.message);
return null;
}
}
5. Monitor Memory Usage
Even though Lightpanda uses less memory, still monitor:
console.log('Memory usage:', process.memoryUsage());
Troubleshooting Common Issues
Issue 1: Connection Refused
Error:
Error: connect ECONNREFUSED 127.0.0.1:9222
Solution:
Lightpanda isn't running. Make sure lightpanda.serve() completes before connecting.
// Wait for Lightpanda to start
const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
await new Promise(resolve => setTimeout(resolve, 500));
Issue 2: Page Doesn't Load
Error:
Page stays blank or navigation times out.
Solution:
The website might use Web APIs not yet supported. Try with Chrome to confirm:
// Fallback to Chrome
const browser = await puppeteer.launch();
If it works with Chrome but not Lightpanda, open an issue on GitHub.
Issue 3: Selector Returns Nothing
Problem:
Your selectors worked with Chrome but return nothing with Lightpanda.
Solution:
Add wait conditions:
// Wait for element to appear
await page.waitForSelector('.product', { timeout: 5000 });
// Then extract
const products = await page.$$('.product');
Issue 4: Module Import Errors
Error:
Cannot use import statement outside a module
Solution:
Add "type": "module" to your package.json:
{
"name": "my-scraper",
"version": "1.0.0",
"type": "module"
}
What's Next?
You now know how to:
- Install Lightpanda
- Connect Puppeteer or Playwright
- Scrape websites with JavaScript
- Handle common patterns
- Migrate from Chrome
Next steps:
- Try it yourself - Run the examples in this guide
- Migrate one script - Pick a simple Puppeteer script, convert it
- Measure the difference - Compare execution time and memory
- Scale up - Run multiple instances, see how far you can push it
Lightpanda is actively developed. New features and Web API support are added regularly. Star the GitHub repo to follow progress.
Quick Reference
Installation:
npm install @lightpanda/browser puppeteer-core
Start Lightpanda:
import { lightpanda } from '@lightpanda/browser';
const proc = await lightpanda.serve({ host: '127.0.0.1', port: 9222 });
Connect Puppeteer:
import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
browserWSEndpoint: 'ws://127.0.0.1:9222',
});
Scrape a page:
const context = await browser.createBrowserContext();
const page = await context.newPage();
await page.goto('https://example.com');
const data = await page.evaluate(() => {
return document.querySelector('h1').textContent;
});
Cleanup:
await page.close();
await context.close();
await browser.disconnect();
proc.kill();
Summary
Lightpanda is a headless browser built from scratch for automation and AI. It's 11x faster than Chrome and uses 9x less memory.
Key advantages:
- Instant startup (no 3-second wait)
- Tiny memory footprint (24MB vs 207MB)
- Drop-in replacement for Chrome with Puppeteer/Playwright
- Perfect for scraping, testing, and AI agents
When to use:
- Any web scraping task
- Building AI agents
- Automated testing without visual rendering
- Cost optimization (lower AWS bills)
When to use Chrome instead:
- Taking screenshots
- Generating PDFs
- Complex debugging
- Sites using very new Web APIs
Getting started is simple:
npm install @lightpanda/browser puppeteer-core- Replace
puppeteer.launch()withpuppeteer.connect() - Your existing Puppeteer code works with minimal changes
Try Lightpanda on your next scraping project. The speed and memory savings are real.
Happy scraping!
Resources:
- GitHub: https://github.com/lightpanda-io/browser
- Documentation: https://lightpanda.io/docs
- Cloud: https://lightpanda.io (request API access)
- Scrapy Github: https://github.com/IkramKhanNiazi/The-Scrapy-Handbook
Top comments (0)