IronSoftware

Puppeteer Zombie Processes and Browser Instances (Issue Fixed)

Developers running Puppeteer or PuppeteerSharp in production often discover their servers accumulating orphaned Chrome processes that consume memory but never terminate. These zombie browser processes escape disposal calls, pile up under load, and eventually exhaust system resources. The problem intensifies in containerized environments where memory limits trigger OOM kills, and debugging becomes nearly impossible when logs scatter across dozens of orphaned browser instances.

The Problem

When Puppeteer launches a browser, it spawns multiple operating system processes: a main browser process, GPU process, network service, and renderer processes for each tab. These processes communicate with the Node.js or .NET application through WebSocket connections using the Chrome DevTools Protocol. The parent application sends commands and receives responses, but it does not directly control the lifecycle of these child processes.

Under normal conditions, calling browser.close() in Node.js or Browser.CloseAsync() in PuppeteerSharp signals the browser to terminate gracefully. The browser process should then exit, taking its child processes with it. In practice, this signal frequently fails to reach all processes, or the processes fail to respond to it. The result is a zombie process that remains in memory, consuming resources but no longer responding to commands.
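
A common defensive pattern against this failure mode, sketched below for the Node.js API, is to race browser.close() against a timeout and fall back to killing the process directly. The closeOrKill helper name is illustrative, not a Puppeteer API:

```javascript
// Sketch of a defensive close: race browser.close() against a timeout and
// fall back to SIGKILL on the browser process. Assumes a Browser object
// exposing close() and process(), as Puppeteer's does.
async function closeOrKill(browser, timeoutMs = 5000) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('close timed out')), timeoutMs);
    });
    const closing = browser.close();
    closing.catch(() => {}); // swallow late rejections from a failed close
    try {
        await Promise.race([closing, timeout]);
        return 'closed';
    } catch (err) {
        // Graceful close failed or timed out: kill the process directly.
        const pid = browser.process && browser.process()?.pid;
        if (pid) {
            try { process.kill(pid, 'SIGKILL'); } catch (e) { /* already gone */ }
        }
        return 'killed';
    } finally {
        clearTimeout(timer);
    }
}
```

Note that SIGKILL on the main browser process does not guarantee its child processes die with it, which is exactly the gap the rest of this article explores.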

The zombie process problem compounds under high-traffic conditions. Each failed disposal leaves behind processes consuming 100-300MB of RAM. With hundreds of PDF generation requests per hour, memory consumption climbs steadily. Server monitoring shows RAM usage increasing even when the application itself reports successful disposal. Eventually, the operating system intervenes with OOM kills, or the server becomes unresponsive.

Error Messages and Symptoms

Applications experiencing zombie browser processes typically encounter these errors and behaviors:

Error: Protocol error (Target.createTarget): Target closed.

TimeoutError: Timed out after 30000 ms while trying to connect to the browser!
Most likely the browser was closed.

Error: Browser disconnected unexpectedly (closed?)

PuppeteerSharp.TargetClosedException: Protocol error (Target.activateTarget):
Target closed. (Session closed. Most likely the page has been closed.)

Container killed with exit code 137 (OOM killed by kernel)

Error: ECONNREFUSED - Connection refused

Observable symptoms include:

  • Memory usage climbing without corresponding application load
  • ps aux | grep chrome showing dozens of orphaned Chrome processes
  • Application logs reporting successful browser disposal while processes remain
  • Container restarts due to memory limits without application errors
  • Browser launch operations timing out due to resource exhaustion
  • Server becoming unresponsive during peak traffic periods
  • Logs appearing fragmented across multiple Chrome process outputs

Who Is Affected

The zombie process issue affects several categories of deployment:

Container Deployments: Docker and Kubernetes environments suffer most severely. Containers typically run with memory ceilings (512MB-2GB), and each orphaned Chrome process consumes a substantial portion of that allocation. The OOM killer terminates containers abruptly, causing request failures and potential data loss.

Long-Running Server Applications: Web servers and background services running Puppeteer for PDF generation accumulate zombie processes over days or weeks of operation. The gradual memory growth is often mistaken for a memory leak in application code rather than orphaned browser processes.

Serverless Functions: AWS Lambda, Azure Functions, and Google Cloud Functions face unique challenges. Each function invocation may leave behind processes that persist beyond the function's execution timeout, consuming resources from subsequent invocations in warm containers.

CI/CD Pipelines: Build servers running Puppeteer for visual testing or screenshot generation accumulate zombie processes across multiple builds. Shared build agents eventually require manual intervention or restarts.

Linux Environments: Linux deployments face additional complexity because of how the kernel handles terminated processes. A child process that exits remains in the process table as a zombie until its parent reaps it with wait(); if the parent dies first, the orphan is re-parented to PID 1. On a full Linux system, init reaps adopted children automatically, but in containers where the application itself runs as PID 1 and never calls wait(), defunct entries linger until explicitly reaped.

Evidence from the Developer Community

The zombie process problem is documented across GitHub issues, technical blogs, and developer forums spanning several years.

Timeline

| Date       | Event                                           | Source                            |
|------------|-------------------------------------------------|-----------------------------------|
| 2019-01-15 | Page.Dispose() never completing reported        | GitHub Issue #122                 |
| 2020-02-15 | Chrome zombie processes in Docker documented    | GitHub puppeteer Issue #5645      |
| 2021-08-17 | DisposeAsync hanging in Docker reported         | GitHub PuppeteerSharp Issue #1489 |
| 2022-06-01 | Zombie Chrome processes in Kubernetes clusters  | GitHub Issue #8695                |
| 2023-04-10 | Multiple browser instances not closing properly | GitHub puppeteer Issue #10030     |
| 2024-01-15 | Zombie Chrome processes in Docker/Kubernetes    | GitHub puppeteer Issue #12854     |
| 2024-06-20 | Browser version mismatch causing failures       | Multiple GitHub discussions       |

Community Reports

"We found 47 orphaned Chrome processes on our production server. Each was consuming 150-200MB of RAM. The application logs showed all browsers had been properly closed."
— DevOps Engineer, Reddit r/node, 2023

"When running PuppeteerSharp with Docker, I'm finding quite a lot of zombie Chrome processes that never get killed. Even using tini as an entry point didn't resolve the issue."
— Developer, GitHub Issue #1489

"The browser.close() call completes successfully according to our logs, but the Chrome processes remain. We've resorted to periodically running pkill -f chromium in a cron job."
— Developer, Stack Overflow, 2023

"After upgrading Chrome, our entire PDF generation pipeline broke. The library version no longer matched the browser version, and we had no way to know until production failed."
— Developer, GitHub Discussion, 2024

Teams have reported that a healthy Puppeteer deployment typically shows 2-3 Chrome processes, but zombie accumulation can push this to 30+ processes before intervention becomes necessary.

Root Cause Analysis

The zombie process problem stems from several interconnected factors in Puppeteer's architecture and its interaction with operating systems.

WebSocket Communication Failures

Puppeteer communicates with Chrome through WebSocket connections. When network conditions, system load, or timing issues cause the WebSocket connection to fail before the close command completes, the browser process receives no termination signal. The process continues running, believing it still has an active client.
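
One partial mitigation is to treat a dropped connection as a cleanup trigger. The sketch below uses Puppeteer's documented 'disconnected' Browser event; the helper name and the onOrphan callback are illustrative:

```javascript
// Sketch: detect a dropped DevTools WebSocket and reap the now-unreachable
// Chrome process. Once 'disconnected' fires, browser.close() can no longer
// reach Chrome, so the OS-level process must be killed directly.
function watchForDisconnect(browser, onOrphan) {
    browser.on('disconnected', () => {
        const proc = browser.process(); // null if connected via puppeteer.connect()
        if (proc && proc.exitCode === null) {
            // Process still running with no client attached: a zombie candidate.
            try { proc.kill('SIGKILL'); } catch (e) { /* already exited */ }
            if (onOrphan) onOrphan(proc.pid);
        }
    });
}
```

This catches the case where the transport dies while the process survives, but it cannot help when the Node.js process itself crashes before the handler runs.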

Process Tree Management

Chrome spawns multiple child processes that form a process tree. The main browser process is the parent of GPU processes, utility processes, and renderer processes. When the parent process terminates abnormally, its children may not receive termination signals, especially on Linux where SIGTERM propagation depends on process group configuration.

Docker and Container Isolation

Containers complicate process management significantly. The PID namespace isolation means Chrome processes see a different process hierarchy than the host system. Without an init process (like tini or Docker's --init flag), zombie processes cannot be reaped properly because there is no PID 1 process designed to adopt orphans.

Async Disposal Race Conditions

PuppeteerSharp's DisposeAsync() implementation has documented edge cases where the disposal task never completes. The method internally waits for the browser process to acknowledge shutdown, but if the acknowledgment is lost or delayed, the disposal hangs indefinitely. Developers have reported that DisposeAsync() calls simply never return in certain Docker configurations.

Version Management Complexity

Puppeteer downloads and manages its own Chromium version. When library updates change the expected Chromium version, existing deployments can break. Auto-update mechanisms in CI/CD environments can inadvertently upgrade Puppeteer without upgrading Chromium, or vice versa, causing version mismatches that manifest as silent failures or zombie processes.

Scattered Logging

Debugging zombie processes requires examining logs from multiple sources: the Node.js or .NET application, each Chrome process, and system logs. Chrome processes write to their own stdout/stderr streams, which may not be captured by application logging. This makes it difficult to correlate browser behavior with application state.
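
One mitigation is Puppeteer's dumpio launch option, which forwards the browser process's stdout and stderr into the Node.js process's own streams so Chrome output lands in the application logs. The sketch below assumes a standard puppeteer.launch(); the extra Chrome logging flags are optional:

```javascript
// Sketch: launch Chrome so its own stdout/stderr are piped into this
// process's output, keeping browser logs alongside application logs.
// `dumpio` is a documented Puppeteer launch option.
async function launchWithUnifiedLogs() {
    const puppeteer = require('puppeteer'); // required lazily for illustration
    return puppeteer.launch({
        headless: 'new',
        dumpio: true, // forward browser stdout/stderr to this process
        args: ['--enable-logging', '--v=1'],
    });
}
```

This does not fix disposal failures, but it makes the browser's side of the story visible in one log stream when they happen.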

Attempted Workarounds

The developer community has developed various strategies to manage zombie processes, each with significant trade-offs.

Workaround 1: Using Docker's Init Process

Approach: Run containers with an init process that properly reaps zombie children.

FROM node:18-slim

# Install tini for proper process supervision, plus Chromium and its
# dependencies, in a single layer so the apt cache is never stale
RUN apt-get update && apt-get install -y --no-install-recommends \
    tini \
    chromium \
    fonts-liberation \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libcups2 \
    && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "app.js"]

Or using Docker's built-in init:

docker run --init --memory=1g your-puppeteer-app

Limitations:

  • Only addresses zombie reaping, not the root cause of failed disposals
  • Requires container configuration changes that may conflict with existing infrastructure
  • Does not prevent memory consumption before zombies are reaped
  • Cannot prevent processes from becoming zombies in the first place

Workaround 2: Aggressive Process Killing

Approach: Periodically kill all Chrome processes and restart the browser pool.

const { execSync } = require('child_process');

// Kill all Chrome processes - nuclear option
function killAllChromeProcesses() {
    try {
        if (process.platform === 'linux') {
            execSync('pkill -9 -f chromium || true');
            execSync('pkill -9 -f chrome || true');
        } else if (process.platform === 'darwin') {
            execSync('pkill -9 -f "Google Chrome" || true');
        }
    } catch (error) {
        // Ignore errors - processes may not exist
    }
}

// Run periodically
setInterval(killAllChromeProcesses, 60000 * 30); // Every 30 minutes

Limitations:

  • Kills active browser sessions, causing in-flight requests to fail
  • Creates race conditions with ongoing PDF generation
  • Not suitable for high-availability deployments
  • Crude approach that indicates architectural problems

Workaround 3: Browser Instance Pooling with Timeout

Approach: Maintain a pool of browser instances with forced recycling after a time limit.

const puppeteer = require('puppeteer');

class BrowserPool {
    constructor(maxAge = 300000, maxInstances = 3) {
        this.maxAge = maxAge;
        this.maxInstances = maxInstances;
        this.instances = [];
    }

    async getBrowser() {
        // Remove expired instances
        const now = Date.now();
        for (const instance of this.instances) {
            // Only recycle idle instances; destroying an in-use browser
            // would kill a request in flight
            if (!instance.inUse && now - instance.createdAt > this.maxAge) {
                await this.destroyInstance(instance);
            }
        }
        }
        this.instances = this.instances.filter(i => !i.destroyed);

        // Find available instance or create new one
        let instance = this.instances.find(i => !i.inUse);
        if (!instance && this.instances.length < this.maxInstances) {
            instance = await this.createInstance();
            this.instances.push(instance);
        }

        if (instance) {
            instance.inUse = true;
            return instance.browser;
        }

        throw new Error('No available browser instances');
    }

    async createInstance() {
        const browser = await puppeteer.launch({
            headless: 'new',
            args: ['--no-sandbox', '--disable-dev-shm-usage']
        });
        return {
            browser,
            createdAt: Date.now(),
            inUse: false,
            destroyed: false
        };
    }

    async destroyInstance(instance) {
        instance.destroyed = true;
        try {
            await Promise.race([
                instance.browser.close(),
                new Promise((_, reject) =>
                    setTimeout(() => reject(new Error('Close timeout')), 5000)
                )
            ]);
        } catch (error) {
            // Force kill if close times out
            const pid = instance.browser.process()?.pid;
            if (pid) {
                try {
                    process.kill(pid, 'SIGKILL');
                } catch (e) {
                    // Process may already be dead
                }
            }
        }
    }

    release(browser) {
        const instance = this.instances.find(i => i.browser === browser);
        if (instance) {
            instance.inUse = false;
        }
    }
}

Limitations:

  • Complex implementation with potential for bugs
  • Forced destruction can still leave zombie child processes
  • Pool management adds latency and resource overhead
  • Does not solve the fundamental disposal problem

Workaround 4: Monitor and Alert on Process Count

Approach: Track Chrome process count and alert when it exceeds thresholds.

const { exec } = require('child_process');
const { promisify } = require('util');
const execAsync = promisify(exec);

async function countChromeProcesses() {
    const { stdout } = await execAsync('pgrep -c chromium || echo 0');
    return parseInt(stdout.trim(), 10);
}

async function monitorProcesses() {
    const count = await countChromeProcesses();
    const expectedMax = 6; // 2 browsers * 3 processes each

    if (count > expectedMax) {
        console.error(`WARNING: ${count} Chrome processes detected (expected max: ${expectedMax})`);
        // Send alert to monitoring system
        // Consider triggering cleanup or restart
    }
}

setInterval(monitorProcesses, 30000);

Limitations:

  • Reactive rather than preventive
  • Requires external monitoring infrastructure
  • Alert fatigue if thresholds are too sensitive
  • Does not fix the underlying issue

Workaround 5: Version Pinning

Approach: Pin both Puppeteer and Chromium versions to prevent auto-update breakage.

{
  "dependencies": {
    "puppeteer": "21.5.0"
  }
}

const puppeteer = require('puppeteer');

async function launchWithSpecificVersion() {
    const browser = await puppeteer.launch({
        executablePath: '/usr/bin/chromium-browser', // Use system Chromium
        headless: 'new'
    });
    return browser;
}

Limitations:

  • Security updates require manual intervention
  • System Chromium may have different behavior than bundled version
  • Coordination required across development, staging, and production
  • Does not prevent zombie processes, only reduces one cause

A Different Approach: IronPDF

For .NET applications where managing Chromium processes has become a maintenance burden, IronPDF offers an alternative that eliminates the process management problem entirely. Instead of spawning external browser processes that must be tracked and disposed, IronPDF embeds the Chrome rendering engine within the .NET process itself.

Why IronPDF Avoids Zombie Processes

IronPDF's architecture is fundamentally different from Puppeteer's external process model:

Single Process Architecture: The Chrome rendering engine runs as part of the .NET application process, not as a separate executable. There are no external Chrome processes to become zombies because there are no external processes at all.

Standard .NET Disposal: Memory and resources are managed through normal .NET garbage collection and IDisposable patterns. When a PdfDocument is disposed, its resources are released through the same mechanisms as any other .NET object.

No WebSocket Communication: Without external processes, there is no WebSocket protocol layer that can fail, time out, or lose messages. Commands execute directly within the application's memory space.

No Version Coordination: The rendering engine version is fixed to the IronPDF library version. Upgrading IronPDF upgrades the rendering engine automatically, eliminating version mismatch issues.

Unified Logging: All rendering activity occurs within the application process, meaning logs appear in the application's standard output without scattering across multiple process streams.

Code Example

using IronPdf;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// PDF generation service demonstrating process-stable operation
// No external Chrome processes to manage or monitor
public class StablePdfService
{
    private readonly ChromePdfRenderer _renderer;

    public StablePdfService()
    {
        // Initialize renderer once - no browser launch needed
        _renderer = new ChromePdfRenderer();

        // Configure rendering options
        _renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;
        _renderer.RenderingOptions.MarginTop = 25;
        _renderer.RenderingOptions.MarginBottom = 25;
    }

    public byte[] GeneratePdf(string htmlContent)
    {
        // Render HTML to PDF - no external process spawned
        // No zombie process possible because no process is created
        using (var pdf = _renderer.RenderHtmlAsPdf(htmlContent))
        {
            return pdf.BinaryData;
        }
        // Memory released through standard .NET disposal
    }

    public async Task<List<byte[]>> GenerateBatchAsync(IEnumerable<string> htmlDocuments)
    {
        var results = new List<byte[]>();

        // Process multiple PDFs without process accumulation
        foreach (var html in htmlDocuments)
        {
            using (var pdf = _renderer.RenderHtmlAsPdf(html))
            {
                results.Add(pdf.BinaryData);
            }
            // Each PDF is fully released before the next begins
            // No process monitoring required
        }

        return results;
    }

    public byte[] GenerateFromUrl(string url)
    {
        // Render external URL with JavaScript execution
        // Chrome engine runs internally, not as separate process
        using (var pdf = _renderer.RenderUrlAsPdf(url))
        {
            return pdf.BinaryData;
        }
    }
}

// Docker deployment - no init process or process supervision required
public class DockerCompatibleService
{
    public byte[] GenerateInContainer(string html)
    {
        // Works in standard Docker containers without --init flag
        // No zombie processes to reap because no external processes exist
        var renderer = new ChromePdfRenderer();

        using (var pdf = renderer.RenderHtmlAsPdf(html))
        {
            return pdf.BinaryData;
        }
    }
}

Key points about this code:

  • No browser launch, close, or disposal methods to call
  • No WebSocket connections that can fail or time out
  • Standard using statements handle all resource cleanup
  • Process count remains constant regardless of PDF volume
  • Container deployments require no special configuration
  • Logs appear in application output, not scattered across processes

Migration Considerations

Licensing

IronPDF is commercial software requiring a license for production deployment. Licenses are per-developer and include deployment to unlimited servers. A free trial is available for evaluation. Teams should weigh licensing costs against the engineering time currently spent managing Puppeteer processes and the infrastructure costs of running containers with extra headroom for zombie processes.

API Differences

Puppeteer exposes browser automation primitives (launch browser, create page, navigate, execute script, generate PDF). IronPDF provides a direct path from HTML to PDF without the browser automation layer. Migration involves:

  • Removing Puppeteer.launch() / BrowserFetcher code entirely
  • Replacing page creation and navigation with RenderHtmlAsPdf() or RenderUrlAsPdf()
  • Removing disposal code for browsers and pages
  • Removing process monitoring and cleanup infrastructure
  • Removing Docker init process configuration

For applications using Puppeteer purely for PDF generation, this simplifies the codebase significantly. For applications using Puppeteer for other browser automation (testing, scraping), IronPDF addresses only the PDF portion.

What You Gain

  • Zero external processes to manage, monitor, or clean up
  • No zombie process accumulation under any load conditions
  • Predictable memory usage bounded by application behavior
  • Simplified Docker deployments without init processes
  • Unified logging through application output
  • No version coordination between library and browser
  • Consistent behavior across Windows, Linux, and macOS

What to Consider

  • Commercial licensing versus open-source Puppeteer
  • IronPDF is specific to PDF generation; Puppeteer offers broader browser automation
  • Different rendering engine may produce slightly different output
  • Initial integration adds the IronPDF package (approximately 150MB)

Conclusion

Puppeteer's external process architecture creates inherent challenges when browser disposal fails. Zombie Chrome processes accumulate under load, exhaust container memory limits, and scatter debugging information across multiple process outputs. Version management between library and browser adds another failure mode. For .NET applications where PDF generation is the primary use case, IronPDF eliminates these issues by removing external processes from the architecture entirely.


Jacob Mellor is CTO at Iron Software and built the company's core codebase, pioneering C# PDF technology.


References

  1. Page.Dispose() never completing - GitHub Issue #122 - Early disposal issue documentation
  2. DisposeAsync hanging in Docker - GitHub Issue #1489 - PuppeteerSharp disposal hanging
  3. Chrome zombie processes in Docker - GitHub Issue #5645 - Container zombie process reports
  4. Docker container memory always increasing - GitHub Issue #8695 - Memory growth in containers
  5. Zombie Chrome processes in Docker/Kubernetes - GitHub Issue #12854 - Orchestration environment issues
  6. Browser instances not closing properly - GitHub Issue #10030 - Multiple browser disposal failures
  7. Puppeteer Zombie Process Solution - Medium - Community workaround documentation
  8. Puppeteer Troubleshooting Guide - Official troubleshooting documentation
  9. How to workaround RAM-leaking libraries like Puppeteer - DevForth - Memory management strategies
  10. The Hidden Cost of Headless Browsers - Medium - Production memory leak journey

For the latest IronPDF documentation and tutorials, visit ironpdf.com.
