We were running a Laravel IoT application on an Azure D2s_v3 VM (2 vCPUs, 8 GB RAM) and kept hitting a frustrating problem: randomly, Docker commands would hang or fail entirely. No obvious reason. Just… dead.
After digging, we found the culprit: a Puppeteer scraper silently leaking Chromium processes every time it ran multiple scrapes.
🔍 Investigation
The first thing we checked was memory:

```bash
ps aux --sort=-%mem | head -20
```

We spotted multiple orphaned Chromium processes, each eating ~90 MB:

```
green 10374 0.1 1.1 /usr/lib/chromium/chrome --type=renderer --headless ...
green 10367 0.1 1.1 /usr/lib/chromium/chrome --type=renderer --headless ...
green 10347 0.9 1.0 /usr/lib/chromium/chrome --headless --enable-automation ...
```
`docker stats` confirmed the problem:

```bash
sudo docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}"
```

```
scraper_1        320.5MiB / 7.772GiB   🚨
app_1            143.8MiB / 7.772GiB
queue_worker_1   120.5MiB / 7.772GiB
```
The scraper container was using much more memory than expected, and it kept growing over time.
🕵️ The Root Cause
The scraper visits router admin pages using Puppeteer to collect bandwidth data. Originally, the code launched one browser per device:
- Sequential failures left Chromium processes orphaned.
- Memory usage grew with each failed scrape.
- Docker eventually ran out of memory or hung.
Original Problematic Pattern
```js
// ❌ Launches a browser per URL
const browser = await puppeteer.launch();
const page = await browser.newPage();
try {
  await page.goto(url);
  // scrape logic
  await browser.close(); // only runs on success
} catch (err) {
  // may never reach browser.close()
}
```
Each failure left a Chromium instance alive. With 15+ routers, these quickly stacked up: at roughly 90 MB per leaked process, 15 of them is well over 1 GB of wasted RAM.
✅ The Fix: Single Browser, Multiple Pages
Instead of launching a browser per device, we:

1. Launch one shared browser.
2. Open and close a page per URL.
3. Always close the page (and, after the loop, the browser) in a finally block.
Runner (scraper logic)
```js
const runner = async ({ url, browser }) => {
  const page = await browser.newPage();
  try {
    // Go to the router URL
    await page.goto(url, { waitUntil: 'load', timeout: 15000 });

    // Fill in the login form
    await page.$eval('input[name="username"]', (el, v) => el.value = v, config.teltonika_username);
    await page.$eval('input[name="password"]', (el, v) => el.value = v, config.teltonika_password);
    await page.$eval('input[type="submit"]', btn => btn.click());

    // Wait for the dashboard to load after login
    // (a waitForSelector on a known element would be less fragile than a fixed delay)
    await page.waitForTimeout(15000);

    // Scrape the values
    const sent = await page.$eval('#lb_tx_sum', el => el.textContent.trim());
    const received = await page.$eval('#lb_rx_sum', el => el.textContent.trim());
    const total_today_usage = await page.$eval('#lb_all_sum', el => el.textContent.trim());

    return { fields: { sent, received, total_today_usage } };
  } catch (err) {
    throw err; // let the caller handle errors
  } finally {
    // Always close the page to prevent leaks
    await page.close();
  }
};
```
Main Scraper Loop
```js
const browser = await puppeteer.launch({
  executablePath: process.env.PUPPETEER_EXECUTABLE_PATH,
  args: ["--ignore-certificate-errors", "--no-sandbox", "--disable-dev-shm-usage"]
});

try {
  for (const dom of domains) {
    const results = await teltonika.exec({ id: dom.id, url: dom.url, browser });
    // send results to API
  }
} finally {
  await browser.close(); // ✅ guaranteed cleanup
}
```
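One refinement worth noting: in the loop above, a single unreachable router would throw and abort the remaining scrapes. A minimal sketch of per-device error isolation, assuming a `scrapeOne` callback and result shape that are illustrative here, not the actual `teltonika.exec` API:

```js
// Sketch: isolate failures per device so one bad router doesn't
// abort the run. `scrapeOne` stands in for the runner shown above.
async function scrapeAll(domains, scrapeOne) {
  const results = [];
  for (const dom of domains) {
    try {
      const data = await scrapeOne(dom);
      results.push({ id: dom.id, ok: true, data });
    } catch (err) {
      // Record the failure and move on; the shared browser stays alive.
      results.push({ id: dom.id, ok: false, error: err.message });
    }
  }
  return results;
}
```

Failures are reported alongside successes, and the shared browser is still closed exactly once by the outer finally.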
📊 Results
| Metric | Before | After |
|---|---|---|
| Scraper container memory | 320.5 MiB | 34.77 MiB |
| Orphaned Chromium processes | 3+ (growing) | 0 ✅ |
| VM RAM usage | 7/8 GB | Stable |
| Docker commands failing | Frequently | Never |
That's nearly a 10x memory reduction (320.5 MiB → 34.77 MiB) from a single finally block.
🧠 Key Takeaways
- Use try/catch/finally with Puppeteer: it ensures cleanup even if an error occurs.
- Do not launch one browser per URL; reuse a single instance across multiple pages.
- Set timeouts for goto, waitForSelector, and waitForTimeout to avoid hangs on unreachable hosts.
- Scrape sequentially when you have many devices; parallel Chromium launches explode memory usage.
- Monitor with `ps aux | grep chromium`: if processes accumulate, you have a leak.
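If strictly sequential scraping becomes too slow, a middle ground is a small fixed concurrency limit instead of launching everything at once. A hedged sketch (the `inBatches` helper and the batch size of 3 are illustrative assumptions, not part of the original code):

```js
// Sketch: run an async worker over items in batches of `limit`,
// so at most `limit` pages are open in the shared browser at once.
async function inBatches(items, limit, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += limit) {
    const batch = items.slice(i, i + limit);
    // Each batch fully settles before the next one starts.
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}

// Example: at most 3 concurrent pages in the shared browser.
// const results = await inBatches(domains, 3, dom => runner({ url: dom.url, browser }));
```

This keeps peak memory bounded by the batch size while still overlapping the slow network waits.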
🔧 Bonus: Quick Cleanup if You're Already Leaking
If you're in this situation right now and need to free memory immediately:
```bash
# Kill all orphaned Chromium processes
pkill -f chromium

# Free OS page cache (safe; it repopulates as needed)
sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

# Add a 4 GB swap file as a safety net
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```
This approach ensures Puppeteer scrapers are robust, memory-efficient, and production-safe, even when scraping dozens of devices sequentially.