DEV Community

Antonios Thanasis

How We Fixed a Puppeteer Memory Leak in a Laravel IoT App

We were running a Laravel IoT application on an Azure D2s_v3 VM (2 vCPUs, 8GB RAM) and kept hitting a frustrating problem: randomly, Docker commands would hang or fail entirely. No obvious reason. Just… dead.

After digging, we found the culprit: a Puppeteer scraper silently leaking Chromium processes every time it ran multiple scrapes.


πŸ” Investigation

First thing we checked was memory:

ps aux --sort=-%mem | head -20

We spotted multiple orphaned Chromium processes, each eating ~90MB:

green  10374  0.1  1.1  /usr/lib/chromium/chrome --type=renderer --headless ...
green  10367  0.1  1.1  /usr/lib/chromium/chrome --type=renderer --headless ...
green  10347  0.9  1.0  /usr/lib/chromium/chrome --headless --enable-automation ...

Docker stats confirmed the problem:

sudo docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}"
scraper_1       320.5MiB / 7.772GiB  ← 🚨
app_1           143.8MiB / 7.772GiB
queue_worker_1  120.5MiB / 7.772GiB

The scraper container was using much more memory than expected, and it kept growing over time.


πŸ•΅οΈ The Root Cause

The scraper visits router admin pages using Puppeteer to collect bandwidth data. Originally, the code launched one browser per device:

  • Failures partway through a scrape left Chromium processes orphaned.

  • Memory usage grew with each failed scrape.

  • Docker eventually ran out of memory or hung.

Original Problematic Pattern

// ❌ Launches a browser per URL
const browser = await puppeteer.launch();
const page = await browser.newPage();

try {
    await page.goto(url);
    // scrape logic
    await browser.close(); // only runs on success
} catch (err) {
    // may never reach browser.close()
}
  • Each failure left a Chromium instance alive.

  • With 15+ routers, these quickly stacked up.


βœ… The Fix: Single Browser, Multiple Pages

Instead of launching a browser per device, we:

  1. Launch one shared browser.

  2. Open/close a page per URL.

  3. Always close the browser in a finally block.

Runner (scraper logic)

const runner = async ({ url, browser }) => {
    const page = await browser.newPage();

    try {
        // Go to the router URL
        await page.goto(url, { waitUntil: 'load', timeout: 15000 });

        // Fill in login form
        await page.$eval('input[name="username"]', (el, v) => el.value = v, config.teltonika_username);
        await page.$eval('input[name="password"]', (el, v) => el.value = v, config.teltonika_password);
        await page.$eval('input[type="submit"]', btn => btn.click());

        // Wait for page to load after login
        await page.waitForTimeout(15000);

        // Scrape the values
        const sent = await page.$eval("#lb_tx_sum", el => el.textContent.trim());
        const received = await page.$eval("#lb_rx_sum", el => el.textContent.trim());
        const total_today_usage = await page.$eval("#lb_all_sum", el => el.textContent.trim());

        return { fields: { sent, received, total_today_usage } };

    } catch (err) {
        throw err; // let the caller handle errors
    } finally {
        // Always close the page to prevent leaks
        await page.close();
    }
};

Main Scraper Loop

const browser = await puppeteer.launch({
    executablePath: process.env.PUPPETEER_EXECUTABLE_PATH,
    args: ["--ignore-certificate-errors", "--no-sandbox", "--disable-dev-shm-usage"]
});

try {
    for (const dom of domains) {
        const results = await teltonika.exec({ id: dom.id, url: dom.url, browser });
        // send results to API
    }
} finally {
    await browser.close(); // βœ… guaranteed cleanup
}
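With a shared browser, one unreachable router shouldn't abort the whole run. Here's a minimal sketch of a per-device error guard around a loop like the one above; `scrapeAll` and its return shape are illustrative names, not part of the original code:

```javascript
// Hypothetical helper (not from the original app): scrape every device,
// recording failures instead of letting one bad router abort the run.
// `runner` is any async scrape function like the one shown earlier.
const scrapeAll = async (browser, domains, runner) => {
    const results = [];
    const errors = [];
    for (const dom of domains) {
        try {
            const res = await runner({ url: dom.url, browser });
            results.push({ id: dom.id, ...res });
        } catch (err) {
            // Keep going with the next device; report failures at the end
            errors.push({ id: dom.id, message: err.message });
        }
    }
    return { results, errors };
};
```

Because each page is closed in the runner's finally block, a failed device costs nothing beyond its error entry.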

πŸ“Š Results

| Metric | Before | After |
| --- | --- | --- |
| Scraper container memory | 320.5 MiB | 34.77 MiB |
| Orphaned Chromium processes | 3+ (growing) | 0 ✅ |
| VM RAM usage | 7/8 GB | Stable |
| Docker commands failing | Frequently | Never |

That's a nearly 10× memory reduction (320.5 MiB → 34.77 MiB) from a single finally block.


🧠 Key Takeaways

  1. Use try/catch/finally with Puppeteer: it guarantees browser cleanup even when an error occurs.

  2. Do not launch one browser per URL: reuse a single instance across multiple pages.

  3. Set explicit timeouts on goto and waitForSelector: this avoids hanging forever on unreachable hosts.

  4. Scrape sequentially when you have many devices: parallel Chromium launches explode memory usage.

  5. Monitor with ps aux | grep chromium: if processes accumulate, you have a leak.
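On the timeout point: Puppeteer's goto and waitForSelector accept a timeout option, but some async steps have no built-in limit. A generic guard can cap any of them; this is a sketch, and withTimeout is an illustrative helper, not a Puppeteer API:

```javascript
// Hypothetical wrapper: reject any async step that takes longer than `ms`,
// so a hung router page surfaces as an error instead of blocking the loop.
const withTimeout = (promise, ms, label = 'step') => {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms}ms`)),
            ms
        );
    });
    // Whichever settles first wins; always clear the timer afterwards
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
};
```

Usage would look like `await withTimeout(page.evaluate(fn), 10000, 'evaluate')` for steps that lack their own timeout option.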

πŸ”§ Bonus: Quick Cleanup if You're Already Leaking

If you're in this situation right now and need to free memory immediately:

# Kill all Chromium processes (note: this also kills any scrape in progress)
pkill -f chromium

# Free OS cache (safe)
sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

# Add swap as a safety net
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

This approach ensures Puppeteer scrapers are robust, memory-efficient, and production-safe, even when scraping dozens of devices sequentially.

