We were running a Laravel IoT application on an Azure D2s_v3 VM (2 vCPUs, 8GB RAM) and kept hitting a frustrating problem: randomly, Docker commands would hang or fail entirely. No obvious reason. Just… dead.
After digging, we found the culprit: a Puppeteer scraper silently leaking Chromium processes every time it ran multiple scrapes.
## Investigation
First thing we checked was memory:
```bash
ps aux --sort=-%mem | head -20
```
We spotted multiple orphaned Chromium processes, each eating ~90MB:
```
green  10374  0.1  1.1  /usr/lib/chromium/chrome --type=renderer --headless ...
green  10367  0.1  1.1  /usr/lib/chromium/chrome --type=renderer --headless ...
green  10347  0.9  1.0  /usr/lib/chromium/chrome --headless --enable-automation ...
```
Docker stats confirmed the problem:
```bash
sudo docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}"
```

```
scraper_1        320.5MiB / 7.772GiB   <-- the leak
app_1            143.8MiB / 7.772GiB
queue_worker_1   120.5MiB / 7.772GiB
```
The scraper container was using much more memory than expected, and it kept growing over time.
## The Root Cause

The scraper visits router admin pages using Puppeteer to collect bandwidth data. Originally, the code launched one browser per device:

- Sequential failures left Chromium processes orphaned.
- Memory usage grew with each failed scrape.
- Docker eventually ran out of memory or hung.
### Original Problematic Pattern

```javascript
// BAD: launches a new browser per URL
const browser = await puppeteer.launch();
const page = await browser.newPage();
try {
  await page.goto(url);
  // scrape logic
  await browser.close(); // only runs on success
} catch (err) {
  // may never reach browser.close()
}
```
Each failure left a Chromium instance alive.
With 15+ routers, these quickly stacked up.
## The Fix: Single Browser, Multiple Pages

Instead of launching a browser per device, we:

1. Launch one shared browser.
2. Open/close a page per URL.
3. Always close the browser in a finally block.
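These steps all hinge on one guarantee: the close call lives in a finally. That guarantee can be factored into a small reusable helper; here is a minimal sketch, where `withBrowser` is our illustrative name and not part of the original codebase:

```javascript
// Hypothetical helper (not from the original code): run `fn` against a
// freshly launched browser and guarantee cleanup whatever happens.
async function withBrowser(launch, fn) {
  const browser = await launch();
  try {
    return await fn(browser); // e.g. loop over all router URLs here
  } finally {
    await browser.close();    // runs on success AND on failure
  }
}
```

With Puppeteer this would be invoked as `withBrowser(() => puppeteer.launch(), async (browser) => { ... })`, keeping the launch/close pairing in exactly one place.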
### Runner (scraper logic)

```javascript
const runner = async ({ url, browser }) => {
  const page = await browser.newPage();
  try {
    // Go to the router URL
    await page.goto(url, { waitUntil: 'load', timeout: 15000 });

    // Fill in the login form
    await page.$eval('input[name="username"]', (el, v) => el.value = v, config.teltonika_username);
    await page.$eval('input[name="password"]', (el, v) => el.value = v, config.teltonika_password);
    await page.$eval('input[type="submit"]', btn => btn.click());

    // Wait for the page to load after login
    await page.waitForTimeout(15000);

    // Scrape the values
    const sent = await page.$eval("#lb_tx_sum", el => el.textContent.trim());
    const received = await page.$eval("#lb_rx_sum", el => el.textContent.trim());
    const total_today_usage = await page.$eval("#lb_all_sum", el => el.textContent.trim());

    return { fields: { sent, received, total_today_usage } };
  } catch (err) {
    throw err; // let the caller handle errors
  } finally {
    // Always close the page to prevent leaks
    await page.close();
  }
};
```
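A possible refinement to the runner above: the fixed 15-second waitForTimeout after login makes every scrape pay the full 15 seconds. Waiting for the element the scraper actually reads lets fast routers return early while dead ones still hit the timeout. A sketch assuming the same `#lb_tx_sum` selector; `waitForStats` is an illustrative name:

```javascript
// Sketch: wait for the stats element itself instead of sleeping a flat 15s.
async function waitForStats(page, timeout = 15000) {
  await page.waitForSelector('#lb_tx_sum', { timeout }); // resolves as soon as it renders
  return page.$eval('#lb_tx_sum', el => el.textContent.trim());
}
```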
### Main Scraper Loop

```javascript
const browser = await puppeteer.launch({
  executablePath: process.env.PUPPETEER_EXECUTABLE_PATH,
  args: ["--ignore-certificate-errors", "--no-sandbox", "--disable-dev-shm-usage"]
});

try {
  for (const dom of domains) {
    const results = await teltonika.exec({ id: dom.id, url: dom.url, browser });
    // send results to API
  }
} finally {
  await browser.close(); // guaranteed cleanup
}
```
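There is one more leak path specific to Docker: `docker stop` sends SIGTERM, and if the Node process exits without closing the browser, its Chromium children can be orphaned just like a failed scrape. A hedged sketch of a shutdown hook (`registerShutdown` is our name, not from the original code):

```javascript
// Sketch: close the shared browser before exiting when Docker stops the container.
function registerShutdown(browser) {
  process.once('SIGTERM', async () => {
    try {
      await browser.close(); // releases all Chromium child processes
    } finally {
      process.exit(0);
    }
  });
}
```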
## Results
| Metric | Before | After |
|---|---|---|
| Scraper container memory | 320.5 MiB | 34.77 MiB |
| Orphaned Chromium processes | 3+ (growing) | 0 |
| VM RAM usage | 7/8 GB | Stable |
| Docker commands failing | Frequently | Never |
That's roughly a 10x memory reduction, bought with one shared browser and a finally block.
## Key Takeaways

- Use try/catch/finally with Puppeteer: it guarantees browser cleanup even if an error occurs.
- Don't launch one browser per URL: reuse a single instance across multiple pages.
- Set timeouts for goto, waitForSelector, and waitForTimeout: this avoids hangs on unreachable hosts.
- Scrape sequentially when you have many devices: parallel Chromium launches explode memory usage.
- Monitor with `ps aux | grep chromium`: if processes accumulate, you have a leak.
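The last takeaway can be turned into a tiny check suitable for a cron job or container healthcheck; a sketch (the alerting threshold is up to you):

```bash
# Count running Chromium processes; a count that only ever grows means a leak.
count=$(pgrep -fc chromium || true)
echo "chromium processes: ${count:-0}"
```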
## Bonus: Quick Cleanup if You're Already Leaking
If you're in this situation right now and need to free memory immediately:
```bash
# Kill all orphaned Chromium processes
pkill -f chromium

# Drop the OS page cache (non-destructive, though it may briefly slow I/O)
sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

# Add swap as a safety net
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```
This approach ensures Puppeteer scrapers are robust, memory-efficient, and production-safe, even when scraping dozens of devices sequentially.