IronSoftware

Puppeteer Zombie Processes and Browser Instances (Issue Fixed)

Developers running Puppeteer or PuppeteerSharp in production often discover their servers accumulating orphaned Chrome processes that consume memory but never terminate. These zombie browser processes escape disposal calls, pile up under load, and eventually exhaust system resources. The problem intensifies in containerized environments where memory limits trigger OOM kills, and debugging becomes nearly impossible when logs scatter across dozens of orphaned browser instances.

The Problem

When Puppeteer launches a browser, it spawns multiple operating system processes: a main browser process, GPU process, network service, and renderer processes for each tab. These processes communicate with the Node.js or .NET application through WebSocket connections using the Chrome DevTools Protocol. The parent application sends commands and receives responses, but it does not directly control the lifecycle of these child processes.

Under normal conditions, calling browser.close() in Node.js or Browser.CloseAsync() in PuppeteerSharp signals the browser to terminate gracefully. The browser process should then exit, taking its child processes with it. In practice, this signal frequently fails to reach all processes, or the processes fail to respond to it. The result is a zombie process that remains in memory, consuming resources but no longer responding to commands.
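
A common defensive pattern against this failure mode, sketched below for the Node.js API, is to race browser.close() against a timeout and fall back to killing the process directly. The closeOrKill helper name is illustrative, not a Puppeteer API:

```javascript
// Sketch of a defensive close: race browser.close() against a timeout and
// fall back to SIGKILL on the browser process. Assumes a Browser object
// exposing close() and process(), as Puppeteer's does.
async function closeOrKill(browser, timeoutMs = 5000) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('close timed out')), timeoutMs);
    });
    const closing = browser.close();
    closing.catch(() => {}); // swallow late rejections from a failed close
    try {
        await Promise.race([closing, timeout]);
        return 'closed';
    } catch (err) {
        // Graceful close failed or timed out: kill the process directly.
        const pid = browser.process && browser.process()?.pid;
        if (pid) {
            try { process.kill(pid, 'SIGKILL'); } catch (e) { /* already gone */ }
        }
        return 'killed';
    } finally {
        clearTimeout(timer);
    }
}
```

Note that SIGKILL on the main browser process does not guarantee its child processes die with it, which is exactly the gap the rest of this article explores.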

The zombie process problem compounds under high-traffic conditions. Each failed disposal leaves behind processes consuming 100-300MB of RAM. With hundreds of PDF generation requests per hour, memory consumption climbs steadily. Server monitoring shows RAM usage increasing even when the application itself reports successful disposal. Eventually, the operating system intervenes with OOM kills, or the server becomes unresponsive.

Error Messages and Symptoms

Applications experiencing zombie browser processes typically encounter these errors and behaviors:

Error: Protocol error (Target.createTarget): Target closed.

TimeoutError: Timed out after 30000 ms while trying to connect to the browser!
Most likely the browser was closed.

Error: Browser disconnected unexpectedly (closed?)

PuppeteerSharp.TargetClosedException: Protocol error (Target.activateTarget):
Target closed. (Session closed. Most likely the page has been closed.)

Container killed with exit code 137 (OOM killed by kernel)

Error: ECONNREFUSED - Connection refused

Observable symptoms include:

  • Memory usage climbing without corresponding application load
  • ps aux | grep chrome showing dozens of orphaned Chrome processes
  • Application logs reporting successful browser disposal while processes remain
  • Container restarts due to memory limits without application errors
  • Browser launch operations timing out due to resource exhaustion
  • Server becoming unresponsive during peak traffic periods
  • Logs appearing fragmented across multiple Chrome process outputs

Who Is Affected

The zombie process issue affects several categories of deployment:

Container Deployments: Docker and Kubernetes environments suffer most severely. Containers typically run with memory ceilings (512MB-2GB), and each orphaned Chrome process consumes a substantial portion of that allocation. The OOM killer terminates containers abruptly, causing request failures and potential data loss.

Long-Running Server Applications: Web servers and background services running Puppeteer for PDF generation accumulate zombie processes over days or weeks of operation. The gradual memory growth is often mistaken for a memory leak in application code rather than orphaned browser processes.

Serverless Functions: AWS Lambda, Azure Functions, and Google Cloud Functions face unique challenges. Each function invocation may leave behind processes that persist beyond the function's execution timeout, consuming resources from subsequent invocations in warm containers.

CI/CD Pipelines: Build servers running Puppeteer for visual testing or screenshot generation accumulate zombie processes across multiple builds. Shared build agents eventually require manual intervention or restarts.

Linux Environments: Linux deployments face additional complexity because of how the kernel handles terminated processes. A child process that exits remains in the process table as a zombie until its parent reaps it with wait(); if the parent dies first, the orphan is re-parented to PID 1. On a full Linux system, init reaps adopted children automatically, but in containers where the application itself runs as PID 1 and never calls wait(), defunct entries linger until explicitly reaped.

Evidence from the Developer Community

The zombie process problem is documented across GitHub issues, technical blogs, and developer forums spanning several years.

Timeline

| Date       | Event                                           | Source                            |
|------------|-------------------------------------------------|-----------------------------------|
| 2019-01-15 | Page.Dispose() never completing reported        | GitHub Issue #122                 |
| 2020-02-15 | Chrome zombie processes in Docker documented    | GitHub puppeteer Issue #5645      |
| 2021-08-17 | DisposeAsync hanging in Docker reported         | GitHub PuppeteerSharp Issue #1489 |
| 2022-06-01 | Zombie Chrome processes in Kubernetes clusters  | GitHub Issue #8695                |
| 2023-04-10 | Multiple browser instances not closing properly | GitHub puppeteer Issue #10030     |
| 2024-01-15 | Zombie Chrome processes in Docker/Kubernetes    | GitHub puppeteer Issue #12854     |
| 2024-06-20 | Browser version mismatch causing failures       | Multiple GitHub discussions       |

Community Reports

"We found 47 orphaned Chrome processes on our production server. Each was consuming 150-200MB of RAM. The application logs showed all browsers had been properly closed."
— DevOps Engineer, Reddit r/node, 2023

"When running PuppeteerSharp with Docker, I'm finding quite a lot of zombie Chrome processes that never get killed. Even using tini as an entry point didn't resolve the issue."
— Developer, GitHub Issue #1489

"The browser.close() call completes successfully according to our logs, but the Chrome processes remain. We've resorted to periodically running pkill -f chromium in a cron job."
— Developer, Stack Overflow, 2023

"After upgrading Chrome, our entire PDF generation pipeline broke. The library version no longer matched the browser version, and we had no way to know until production failed."
— Developer, GitHub Discussion, 2024

Teams have reported that a healthy Puppeteer deployment typically shows 2-3 Chrome processes, but zombie accumulation can push this to 30+ processes before intervention becomes necessary.

Root Cause Analysis

The zombie process problem stems from several interconnected factors in Puppeteer's architecture and its interaction with operating systems.

WebSocket Communication Failures

Puppeteer communicates with Chrome through WebSocket connections. When network conditions, system load, or timing issues cause the WebSocket connection to fail before the close command completes, the browser process receives no termination signal. The process continues running, believing it still has an active client.
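
One partial mitigation is to treat a dropped connection as a cleanup trigger. The sketch below uses Puppeteer's documented 'disconnected' Browser event; the helper name and the onOrphan callback are illustrative:

```javascript
// Sketch: detect a dropped DevTools WebSocket and reap the now-unreachable
// Chrome process. Once 'disconnected' fires, browser.close() can no longer
// reach Chrome, so the OS-level process must be killed directly.
function watchForDisconnect(browser, onOrphan) {
    browser.on('disconnected', () => {
        const proc = browser.process(); // null if connected via puppeteer.connect()
        if (proc && proc.exitCode === null) {
            // Process still running with no client attached: a zombie candidate.
            try { proc.kill('SIGKILL'); } catch (e) { /* already exited */ }
            if (onOrphan) onOrphan(proc.pid);
        }
    });
}
```

This catches the case where the transport dies while the process survives, but it cannot help when the Node.js process itself crashes before the handler runs.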

Process Tree Management

Chrome spawns multiple child processes that form a process tree. The main browser process is the parent of GPU processes, utility processes, and renderer processes. When the parent process terminates abnormally, its children may not receive termination signals, especially on Linux where SIGTERM propagation depends on process group configuration.

Docker and Container Isolation

Containers complicate process management significantly. The PID namespace isolation means Chrome processes see a different process hierarchy than the host system. Without an init process (like tini or Docker's --init flag), zombie processes cannot be reaped properly because there is no PID 1 process designed to adopt orphans.

Async Disposal Race Conditions

PuppeteerSharp's DisposeAsync() implementation has documented edge cases where the disposal task never completes. The method internally waits for the browser process to acknowledge shutdown, but if the acknowledgment is lost or delayed, the disposal hangs indefinitely. Developers have reported that DisposeAsync() calls simply never return in certain Docker configurations.

Version Management Complexity

Puppeteer downloads and manages its own Chromium version. When library updates change the expected Chromium version, existing deployments can break. Auto-update mechanisms in CI/CD environments can inadvertently upgrade Puppeteer without upgrading Chromium, or vice versa, causing version mismatches that manifest as silent failures or zombie processes.

Scattered Logging

Debugging zombie processes requires examining logs from multiple sources: the Node.js or .NET application, each Chrome process, and system logs. Chrome processes write to their own stdout/stderr streams, which may not be captured by application logging. This makes it difficult to correlate browser behavior with application state.
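
One mitigation is Puppeteer's dumpio launch option, which forwards the browser process's stdout and stderr into the Node.js process's own streams so Chrome output lands in the application logs. The sketch below assumes a standard puppeteer.launch(); the extra Chrome logging flags are optional:

```javascript
// Sketch: launch Chrome so its own stdout/stderr are piped into this
// process's output, keeping browser logs alongside application logs.
// `dumpio` is a documented Puppeteer launch option.
async function launchWithUnifiedLogs() {
    const puppeteer = require('puppeteer'); // required lazily for illustration
    return puppeteer.launch({
        headless: 'new',
        dumpio: true, // forward browser stdout/stderr to this process
        args: ['--enable-logging', '--v=1'],
    });
}
```

This does not fix disposal failures, but it makes the browser's side of the story visible in one log stream when they happen.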

Attempted Workarounds

The developer community has developed various strategies to manage zombie processes, each with significant trade-offs.

Workaround 1: Using Docker's Init Process

Approach: Run containers with an init process that properly reaps zombie children.

FROM node:18-slim

# Install tini for proper process supervision, plus Chromium and its
# dependencies, in a single layer so the apt cache is never stale
RUN apt-get update && apt-get install -y --no-install-recommends \
    tini \
    chromium \
    fonts-liberation \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libcups2 \
    && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "app.js"]

Or using Docker's built-in init:

docker run --init --memory=1g your-puppeteer-app

Limitations:

  • Only addresses zombie reaping, not the root cause of failed disposals
  • Requires container configuration changes that may conflict with existing infrastructure
  • Does not prevent memory consumption before zombies are reaped
  • Cannot prevent processes from becoming zombies in the first place

Workaround 2: Aggressive Process Killing

Approach: Periodically kill all Chrome processes and restart the browser pool.

const { execSync } = require('child_process');

// Kill all Chrome processes - nuclear option
function killAllChromeProcesses() {
    try {
        if (process.platform === 'linux') {
            execSync('pkill -9 -f chromium || true');
            execSync('pkill -9 -f chrome || true');
        } else if (process.platform === 'darwin') {
            execSync('pkill -9 -f "Google Chrome" || true');
        }
    } catch (error) {
        // Ignore errors - processes may not exist
    }
}

// Run periodically
setInterval(killAllChromeProcesses, 60000 * 30); // Every 30 minutes

Limitations:

  • Kills active browser sessions, causing in-flight requests to fail
  • Creates race conditions with ongoing PDF generation
  • Not suitable for high-availability deployments
  • Crude approach that indicates architectural problems

Workaround 3: Browser Instance Pooling with Timeout

Approach: Maintain a pool of browser instances with forced recycling after a time limit.

const puppeteer = require('puppeteer');

class BrowserPool {
    constructor(maxAge = 300000, maxInstances = 3) {
        this.maxAge = maxAge;
        this.maxInstances = maxInstances;
        this.instances = [];
    }

    async getBrowser() {
        // Remove expired instances
        const now = Date.now();
        for (const instance of this.instances) {
            // Only recycle idle instances; destroying an in-use browser
            // would kill a request in flight
            if (!instance.inUse && now - instance.createdAt > this.maxAge) {
                await this.destroyInstance(instance);
            }
        }
        }
        this.instances = this.instances.filter(i => !i.destroyed);

        // Find available instance or create new one
        let instance = this.instances.find(i => !i.inUse);
        if (!instance && this.instances.length < this.maxInstances) {
            instance = await this.createInstance();
            this.instances.push(instance);
        }

        if (instance) {
            instance.inUse = true;
            return instance.browser;
        }

        throw new Error('No available browser instances');
    }

    async createInstance() {
        const browser = await puppeteer.launch({
            headless: 'new',
            args: ['--no-sandbox', '--disable-dev-shm-usage']
        });
        return {
            browser,
            createdAt: Date.now(),
            inUse: false,
            destroyed: false
        };
    }

    async destroyInstance(instance) {
        instance.destroyed = true;
        try {
            await Promise.race([
                instance.browser.close(),
                new Promise((_, reject) =>
                    setTimeout(() => reject(new Error('Close timeout')), 5000)
                )
            ]);
        } catch (error) {
            // Force kill if close times out
            const pid = instance.browser.process()?.pid;
            if (pid) {
                try {
                    process.kill(pid, 'SIGKILL');
                } catch (e) {
                    // Process may already be dead
                }
            }
        }
    }

    release(browser) {
        const instance = this.instances.find(i => i.browser === browser);
        if (instance) {
            instance.inUse = false;
        }
    }
}

Limitations:

  • Complex implementation with potential for bugs
  • Forced destruction can still leave zombie child processes
  • Pool management adds latency and resource overhead
  • Does not solve the fundamental disposal problem

Workaround 4: Monitor and Alert on Process Count

Approach: Track Chrome process count and alert when it exceeds thresholds.

const { exec } = require('child_process');
const { promisify } = require('util');
const execAsync = promisify(exec);

async function countChromeProcesses() {
    const { stdout } = await execAsync('pgrep -c chromium || echo 0');
    return parseInt(stdout.trim(), 10);
}

async function monitorProcesses() {
    const count = await countChromeProcesses();
    const expectedMax = 6; // 2 browsers * 3 processes each

    if (count > expectedMax) {
        console.error(`WARNING: ${count} Chrome processes detected (expected max: ${expectedMax})`);
        // Send alert to monitoring system
        // Consider triggering cleanup or restart
    }
}

setInterval(monitorProcesses, 30000);

Limitations:

  • Reactive rather than preventive
  • Requires external monitoring infrastructure
  • Alert fatigue if thresholds are too sensitive
  • Does not fix the underlying issue

Workaround 5: Version Pinning

Approach: Pin both Puppeteer and Chromium versions to prevent auto-update breakage.

{
  "dependencies": {
    "puppeteer": "21.5.0"
  }
}

const puppeteer = require('puppeteer');

async function launchWithSpecificVersion() {
    const browser = await puppeteer.launch({
        executablePath: '/usr/bin/chromium-browser', // Use system Chromium
        headless: 'new'
    });
    return browser;
}

Limitations:

  • Security updates require manual intervention
  • System Chromium may have different behavior than bundled version
  • Coordination required across development, staging, and production
  • Does not prevent zombie processes, only reduces one cause

A Different Approach: IronPDF

For .NET applications where managing Chromium processes has become a maintenance burden, IronPDF offers an alternative that eliminates the process management problem entirely. Instead of spawning external browser processes that must be tracked and disposed, IronPDF embeds the Chrome rendering engine within the .NET process itself.

Why IronPDF Avoids Zombie Processes

IronPDF's architecture is fundamentally different from Puppeteer's external process model:

Single Process Architecture: The Chrome rendering engine runs as part of the .NET application process, not as a separate executable. There are no external Chrome processes to become zombies because there are no external processes at all.

Standard .NET Disposal: Memory and resources are managed through normal .NET garbage collection and IDisposable patterns. When a PdfDocument is disposed, its resources are released through the same mechanisms as any other .NET object.

No WebSocket Communication: Without external processes, there is no WebSocket protocol layer that can fail, time out, or lose messages. Commands execute directly within the application's memory space.

No Version Coordination: The rendering engine version is fixed to the IronPDF library version. Upgrading IronPDF upgrades the rendering engine automatically, eliminating version mismatch issues.

Unified Logging: All rendering activity occurs within the application process, meaning logs appear in the application's standard output without scattering across multiple process streams.

Code Example

using IronPdf;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// PDF generation service demonstrating process-stable operation
// No external Chrome processes to manage or monitor
public class StablePdfService
{
    private readonly ChromePdfRenderer _renderer;

    public StablePdfService()
    {
        // Initialize renderer once - no browser launch needed
        _renderer = new ChromePdfRenderer();

        // Configure rendering options
        _renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;
        _renderer.RenderingOptions.MarginTop = 25;
        _renderer.RenderingOptions.MarginBottom = 25;
    }

    public byte[] GeneratePdf(string htmlContent)
    {
        // Render HTML to PDF - no external process spawned
        // No zombie process possible because no process is created
        using (var pdf = _renderer.RenderHtmlAsPdf(htmlContent))
        {
            return pdf.BinaryData;
        }
        // Memory released through standard .NET disposal
    }

    public async Task<List<byte[]>> GenerateBatchAsync(IEnumerable<string> htmlDocuments)
    {
        var results = new List<byte[]>();

        // Process multiple PDFs without process accumulation
        foreach (var html in htmlDocuments)
        {
            using (var pdf = _renderer.RenderHtmlAsPdf(html))
            {
                results.Add(pdf.BinaryData);
            }
            // Each PDF is fully released before the next begins
            // No process monitoring required
        }

        return results;
    }

    public byte[] GenerateFromUrl(string url)
    {
        // Render external URL with JavaScript execution
        // Chrome engine runs internally, not as separate process
        using (var pdf = _renderer.RenderUrlAsPdf(url))
        {
            return pdf.BinaryData;
        }
    }
}

// Docker deployment - no init process or process supervision required
public class DockerCompatibleService
{
    public byte[] GenerateInContainer(string html)
    {
        // Works in standard Docker containers without --init flag
        // No zombie processes to reap because no external processes exist
        var renderer = new ChromePdfRenderer();

        using (var pdf = renderer.RenderHtmlAsPdf(html))
        {
            return pdf.BinaryData;
        }
    }
}

Key points about this code:

  • No browser launch, close, or disposal methods to call
  • No WebSocket connections that can fail or time out
  • Standard using statements handle all resource cleanup
  • Process count remains constant regardless of PDF volume
  • Container deployments require no special configuration
  • Logs appear in application output, not scattered across processes

Migration Considerations

Licensing

IronPDF is commercial software requiring a license for production deployment. Licenses are per-developer and include deployment to unlimited servers. A free trial is available for evaluation. Teams should weigh licensing costs against the engineering time currently spent managing Puppeteer processes and the infrastructure costs of running containers with extra headroom for zombie processes.

API Differences

Puppeteer exposes browser automation primitives (launch browser, create page, navigate, execute script, generate PDF). IronPDF provides a direct path from HTML to PDF without the browser automation layer. Migration involves:

  • Removing Puppeteer.launch() / BrowserFetcher code entirely
  • Replacing page creation and navigation with RenderHtmlAsPdf() or RenderUrlAsPdf()
  • Removing disposal code for browsers and pages
  • Removing process monitoring and cleanup infrastructure
  • Removing Docker init process configuration

For applications using Puppeteer purely for PDF generation, this simplifies the codebase significantly. For applications using Puppeteer for other browser automation (testing, scraping), IronPDF addresses only the PDF portion.

What You Gain

  • Zero external processes to manage, monitor, or clean up
  • No zombie process accumulation under any load conditions
  • Predictable memory usage bounded by application behavior
  • Simplified Docker deployments without init processes
  • Unified logging through application output
  • No version coordination between library and browser
  • Consistent behavior across Windows, Linux, and macOS

What to Consider

  • Commercial licensing versus open-source Puppeteer
  • IronPDF is specific to PDF generation; Puppeteer offers broader browser automation
  • Different rendering engine may produce slightly different output
  • Initial integration adds the IronPDF package (approximately 150MB)

Conclusion

Puppeteer's external process architecture creates inherent challenges when browser disposal fails. Zombie Chrome processes accumulate under load, exhaust container memory limits, and scatter debugging information across multiple process outputs. Version management between library and browser adds another failure mode. For .NET applications where PDF generation is the primary use case, IronPDF eliminates these issues by removing external processes from the architecture entirely.


Jacob Mellor is CTO at Iron Software and built the company's core codebase, pioneering C# PDF technology.


References

  1. Page.Dispose() never completing - GitHub Issue #122 - Early disposal issue documentation
  2. DisposeAsync hanging in Docker - GitHub Issue #1489 - PuppeteerSharp disposal hanging
  3. Chrome zombie processes in Docker - GitHub Issue #5645 - Container zombie process reports
  4. Docker container memory always increasing - GitHub Issue #8695 - Memory growth in containers
  5. Zombie Chrome processes in Docker/Kubernetes - GitHub Issue #12854 - Orchestration environment issues
  6. Browser instances not closing properly - GitHub Issue #10030 - Multiple browser disposal failures
  7. Puppeteer Zombie Process Solution - Medium - Community workaround documentation
  8. Puppeteer Troubleshooting Guide - Official troubleshooting documentation
  9. How to workaround RAM-leaking libraries like Puppeteer - DevForth - Memory management strategies
  10. The Hidden Cost of Headless Browsers - Medium - Production memory leak journey

For the latest IronPDF documentation and tutorials, visit ironpdf.com.
