DEV Community

IronSoftware

Puppeteer Sharp Memory Leaks and Chromium Resource Exhaustion (Fixed)

Developers using Puppeteer-Sharp for HTML-to-PDF conversion in .NET applications frequently encounter memory leaks that cause server crashes, out-of-memory exceptions, and zombie Chromium processes that refuse to terminate. These issues become acute in production environments where high traffic and container memory limits transform minor inefficiencies into service outages.

The Problem

Puppeteer-Sharp operates as a .NET wrapper around the Chrome DevTools Protocol, spawning actual Chromium browser processes for rendering operations. Each browser instance consumes a minimum of 100-200MB of RAM, with individual pages adding another 50-100MB depending on content complexity. This memory is allocated in the Chromium processes themselves, outside the .NET garbage collector's control.

The core issue stems from Chromium's process model. When a .NET application calls Puppeteer.LaunchAsync(), it spawns multiple OS-level processes: the main browser process plus renderer processes for each tab. These processes manage their own memory heaps, event loops, and network stacks. The .NET application communicates with them through WebSocket connections using the DevTools Protocol, but cannot directly reclaim their memory through standard disposal patterns.

Memory accumulates through several mechanisms. The Chrome rendering engine caches parsed HTML, CSS, JavaScript bytecode, images, and network responses. Even after Page.CloseAsync() completes, some of this cached data persists in the browser process. Multiple operations compound the problem: each PDF render cycle adds to the memory footprint, and the leaked memory never returns to the operating system until the entire Chromium process terminates.

Error Messages and Symptoms

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at PuppeteerSharp.Connection.SendAsync(String method, Object args)
   at PuppeteerSharp.Page.PdfAsync(PdfOptions options)

Page crashed!
Protocol error (Page.printToPDF): Target closed.

Container killed with exit code 137 (OOM killed)

When running in Docker or Kubernetes, memory limits trigger the OOM killer, abruptly terminating containers without clean shutdown. The Puppeteer-Sharp application receives no warning before the operating system intervenes.

Who Is Affected

The memory leak issue impacts specific deployment scenarios more severely than others.

High-traffic web applications generating PDFs on demand face the most severe problems. Each concurrent request spawns browser resources, and under load, memory consumption spirals beyond container limits within minutes. Web servers handling 50+ concurrent PDF requests routinely crash without intervention.

Dockerized deployments suffer disproportionately because containers typically run with memory ceilings. A 512MB container limit provides roughly 3-5 concurrent Chromium pages before the OOM killer activates. The default shared memory allocation (/dev/shm) in Docker is 64MB, insufficient for Chromium's shared memory requirements, causing additional failures.
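A common first mitigation is to raise the container's shared memory allocation at startup, or to route Chromium's shared memory elsewhere. The commands below are an illustrative sketch; the image name and sizes are placeholders, not recommendations:

```shell
# Raise /dev/shm from Docker's 64MB default (sizes are illustrative)
docker run --shm-size=1g --memory=1g your-image

# Alternatively, have Chromium avoid /dev/shm entirely by passing
# --disable-dev-shm-usage in LaunchOptions.Args, so it falls back to /tmp
```

Raising `--shm-size` keeps Chromium on its fast path; `--disable-dev-shm-usage` trades some performance for predictability when you cannot control the container runtime flags.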

Kubernetes pods exhibit cascading failures when one pod hits memory limits. The orchestrator restarts the pod, but pending requests fail, triggering retry storms that overwhelm replacement pods. Memory limits below 1GB per pod make stable Puppeteer-Sharp deployments nearly impossible.
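Under these constraints, teams commonly give pods generous memory limits and mount a memory-backed volume at /dev/shm. The fragment below is an illustrative sketch (names and sizes are examples, not a tuned recommendation):

```yaml
# Illustrative pod spec fragment: memory headroom plus a memory-backed
# emptyDir mounted at /dev/shm to work around the shared-memory default
spec:
  containers:
    - name: pdf-service
      image: your-image
      resources:
        requests:
          memory: "1Gi"
        limits:
          memory: "2Gi"
      volumeMounts:
        - name: dshm
          mountPath: /dev/shm
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory
        sizeLimit: 1Gi
```

The `emptyDir` with `medium: Memory` is a standard Kubernetes pattern for giving Chromium-based workloads adequate shared memory without changing the container image.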

Long-running batch processes accumulate memory over time. A process converting thousands of documents might start with 200MB footprint and reach multiple gigabytes hours later, eventually failing despite adequate initial resources.

Linux server deployments using .NET Core encounter additional complications. The Chromium sandbox requires specific kernel capabilities; disabling it with --no-sandbox avoids those requirements but introduces security concerns, while leaving it enabled consumes additional memory for process isolation.

Evidence from the Developer Community

The memory leak problem is extensively documented across GitHub issues, technical blogs, and developer forums.

Timeline

Date         Event                                                    Source
2018-09-21   Memory leak in Connection.cs callbacks identified        GitHub Issue #640
2019-01-15   Page.Dispose() never completing reported                 GitHub Issue #122
2020-02-15   Chrome memory leak confirmed across Puppeteer versions   GitHub Issue #5893
2021-08-17   DisposeAsync hanging in Docker reported                  GitHub Issue #1489
2022-11-08   Thread safety causing lockups documented                 GitHub Issue #714
2023-08-17   AsyncDictionaryHelper leak adding ~1KB per iteration     GitHub Discussion #2283
2024-01-15   Zombie processes in Docker/Kubernetes documented         GitHub Issue #12854

Community Reports

"After about 30 seconds of executing requests, the Chromium Helper (Renderer) memory starts ticking up by ~0.5 megabytes/second. This seems to happen on Mac and Windows."
— Developer report, GitHub Issue #9283

"We were seeing memory grow larger and larger with repeated invocations despite proper resource disposal."
— Developer report, GitHub Issue #2125

"The process memory usage continues to increase when you use Puppeteer in a long-running process. Your server monitoring tool starts reporting 'RAM is almost full.'"
— DevForth Technical Blog

"After 40K iterations, memory usage had increased by about 34MB, roughly 1KB per iteration."
— GitHub Discussion #2283, AsyncDictionaryHelper leak investigation

Multiple production teams have documented their experiences with Puppeteer memory issues. One team reported that implementing multiple smaller browser instances instead of one large instance helped manage the problem but introduced zombie process management challenges. A healthy deployment typically runs 2-3 Chrome processes, but failure scenarios can spawn dozens of orphaned instances.

Root Cause Analysis

The memory leak stems from architectural decisions in both Chromium and Puppeteer-Sharp's implementation.

Chromium's Process Model: Chrome spawns multiple processes by design for security isolation. The main browser process manages tabs, while separate renderer processes handle page content. Each renderer maintains its own V8 JavaScript heap, DOM tree, and rendering pipeline. These processes communicate through IPC, and memory cannot be shared or reclaimed across process boundaries without terminating the process entirely.

Callback Dictionary Accumulation: In Puppeteer-Sharp's Connection.cs, the _callbacks dictionary accumulates TaskCompletionSource objects for each DevTools Protocol message. Earlier versions failed to remove these entries after completion, causing managed memory growth. While fixed in newer versions, the pattern reveals how easy it is for memory to accumulate in async communication code.
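The pattern is easy to reproduce in any async request/response bridge. The sketch below uses hypothetical names (it is not Puppeteer-Sharp's actual code) to show why entries must be removed in a finally block rather than only on the success path:

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical message bridge illustrating the callback-dictionary pattern.
public class ProtocolBridge
{
    private readonly ConcurrentDictionary<int, TaskCompletionSource<string>> _callbacks = new();
    private int _nextId;

    public async Task<string> SendAsync(string method)
    {
        var id = Interlocked.Increment(ref _nextId);
        var tcs = new TaskCompletionSource<string>();
        _callbacks[id] = tcs;
        try
        {
            // ... serialize and send the message over the WebSocket here ...
            return await tcs.Task;
        }
        finally
        {
            // Without this removal, every request leaks one
            // TaskCompletionSource for the lifetime of the process.
            _callbacks.TryRemove(id, out _);
        }
    }
}
```

If the response path faults or the connection drops, the finally block still runs, which is exactly the cleanup guarantee the early Connection.cs implementation lacked.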

AsyncDictionaryHelper Leak: The AsyncDictionaryHelper class adds entries to a MultiMap for tracking async operations. When entries are removed, empty collections in the MultiMap persist, causing approximately 1KB of leaked memory per operation. Over thousands of operations, this accumulates to significant memory consumption.

Disposal Timing Issues: The DisposeAsync() method can hang indefinitely in certain scenarios, particularly in Docker environments. Developers have reported that Browser.DisposeAsync() never completes, leaving Chromium processes running as zombies. Switching to synchronous Dispose() sometimes resolves this, but introduces its own timing complications.
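A defensive pattern some teams use is to bound disposal with a timeout and fall back to killing the process tree. The sketch below assumes the `Process` property that Puppeteer-Sharp's browser exposes and the `Task.WaitAsync` overload available in .NET 6+; treat it as an illustration, not a vetted shutdown routine:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class BrowserShutdown
{
    // Dispose the browser, but never wait longer than `timeout` for it.
    public static async Task DisposeWithTimeoutAsync(IBrowser browser, TimeSpan timeout)
    {
        var processId = browser.Process?.Id;
        try
        {
            await browser.DisposeAsync().AsTask().WaitAsync(timeout);
        }
        catch (TimeoutException)
        {
            // DisposeAsync hung: fall back to killing the Chromium process tree.
            if (processId is int pid)
            {
                try { Process.GetProcessById(pid).Kill(entireProcessTree: true); }
                catch (ArgumentException) { /* process already exited */ }
            }
        }
    }
}
```

Capturing the process ID before disposal matters: once DisposeAsync is underway, the browser object may no longer report its process.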

Lack of Thread Safety: Internal collections in Puppeteer-Sharp were not designed for concurrent access. The NetworkManager._attemptedAuthentications dictionary can corrupt under parallel requests, causing request cancellations. Using Parallel.Invoke with Puppeteer-Sharp can cause one browser to seize control of another task's resources, leading to hangs and resource leaks.

Attempted Workarounds

The developer community has documented numerous approaches to mitigate Puppeteer-Sharp's memory issues, each with significant limitations.

Workaround 1: Aggressive Page Disposal

Approach: Close and dispose pages immediately after each operation rather than reusing them.

public async Task<byte[]> GeneratePdfWithDisposal(string html)
{
    await using var browser = await Puppeteer.LaunchAsync(new LaunchOptions
    {
        Headless = true,
        Args = new[] { "--no-sandbox", "--disable-dev-shm-usage" }
    });

    await using var page = await browser.NewPageAsync();
    await page.SetContentAsync(html);
    var pdfBytes = await page.PdfDataAsync();

    // Explicit cleanup before disposal
    await page.CloseAsync();
    // browser disposal happens via await using

    return pdfBytes;
}

Limitations:

  • Browser startup adds 1-3 seconds per PDF
  • Does not prevent Chromium's internal memory accumulation
  • DisposeAsync can hang indefinitely in Docker
  • High CPU overhead from repeated browser launches

Workaround 2: Browser Recycling After N Operations

Approach: Track operation count and restart the browser process periodically.

private IBrowser _browser;
private int _operationCount = 0;
private const int MaxOperations = 100;
private readonly SemaphoreSlim _lock = new(1, 1);

public async Task<byte[]> GeneratePdfWithRecycling(string html)
{
    await _lock.WaitAsync();
    try
    {
        if (_browser == null || _operationCount >= MaxOperations)
        {
            if (_browser != null)
            {
                await _browser.CloseAsync();
                _browser.Dispose();
            }
            _browser = await Puppeteer.LaunchAsync(new LaunchOptions
            {
                Headless = true,
                Args = new[] { "--no-sandbox", "--disable-dev-shm-usage" }
            });
            _operationCount = 0;
        }

        _operationCount++;
    }
    finally
    {
        _lock.Release();
    }

    await using var page = await _browser.NewPageAsync();
    await page.SetContentAsync(html);
    return await page.PdfDataAsync();
}

Limitations:

  • Semaphore serializes all PDF operations
  • Arbitrary operation limit requires tuning per environment
  • Memory still accumulates between recycles
  • Crash during disposal can leave orphaned processes

Workaround 3: Docker Init Process (dumb-init or tini)

Approach: Use an init process to properly reap zombie Chromium processes in containers.

FROM mcr.microsoft.com/dotnet/aspnet:8.0

# Install tini for zombie process reaping
RUN apt-get update && apt-get install -y --no-install-recommends tini chromium \
    && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["dotnet", "YourApp.dll"]

Or using Docker's built-in init:

docker run --init --memory=1g your-image

Limitations:

  • Only addresses zombie cleanup, not memory leaks
  • Requires container configuration changes
  • Does not prevent OOM crashes
  • Adds container startup complexity

Workaround 4: External Process Isolation

Approach: Run Puppeteer in a separate process that can be killed and restarted.

// Main application spawns worker processes
public async Task<byte[]> GeneratePdfIsolated(string html)
{
    var tempInput = Path.GetTempFileName();
    var tempOutput = Path.ChangeExtension(tempInput, ".pdf");

    try
    {
        await File.WriteAllTextAsync(tempInput, html);

        using var process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = "dotnet",
                Arguments = $"run --project PdfWorker -- \"{tempInput}\" \"{tempOutput}\"",
                UseShellExecute = false
            }
        };

        process.Start();
        var completed = process.WaitForExit(30000);

        if (!completed)
        {
            process.Kill(true); // Kill entire process tree
            throw new TimeoutException("PDF generation timed out");
        }

        return await File.ReadAllBytesAsync(tempOutput);
    }
    finally
    {
        // Clean up temp files even when the worker fails or times out
        File.Delete(tempInput);
        if (File.Exists(tempOutput))
        {
            File.Delete(tempOutput);
        }
    }
}

Limitations:

  • Significant performance overhead from process spawning
  • File I/O adds latency and potential security concerns
  • Complex error handling across process boundaries
  • Does not scale well under high load

A Different Approach: IronPDF

For .NET applications requiring reliable PDF generation, IronPDF offers an alternative architecture that sidesteps Puppeteer-Sharp's memory management challenges. Rather than spawning external Chromium processes controlled through WebSocket communication, IronPDF embeds the Chrome rendering engine directly within the .NET process using the Chromium Embedded Framework (CEF).

Why IronPDF Handles Memory Differently

IronPDF's architecture differs fundamentally from Puppeteer-Sharp's process-per-browser model. The Chrome rendering engine initializes once when the first PDF operation occurs, then remains warm for subsequent renders. This eliminates the 1-3 second startup overhead per operation while maintaining the rendering engine under direct .NET process control.

Memory management occurs within the .NET application's address space. The ChromePdfRenderer class implements IDisposable, and calling Dispose() or using using statements releases resources through standard .NET patterns. There are no external processes to orphan, no WebSocket connections to hang, and no zombie processes to accumulate.

The embedded approach also eliminates thread safety concerns present in Puppeteer-Sharp. IronPDF's rendering operations are designed for concurrent use without semaphores or manual synchronization. Multiple threads can generate PDFs simultaneously without risk of one thread seizing another's browser instance.

Code Example

using IronPdf;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// High-volume PDF generation service demonstrating memory-stable operation
public class PdfGenerationService
{
    // ChromePdfRenderer is lightweight and reusable across requests
    private readonly ChromePdfRenderer _renderer;

    public PdfGenerationService()
    {
        // One-time initialization; Chrome engine starts on first render
        _renderer = new ChromePdfRenderer();

        // Configure render options globally if needed
        _renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;
    }

    public async Task<byte[]> GeneratePdfAsync(string htmlContent)
    {
        // RenderHtmlAsPdfAsync handles rendering without spawning new processes
        // Memory is managed within the .NET process boundary
        var pdf = await _renderer.RenderHtmlAsPdfAsync(htmlContent);

        return pdf.BinaryData;
    }

    public async Task GenerateBatchAsync(IEnumerable<string> htmlDocuments)
    {
        // Parallel processing without semaphores or browser pool management
        var tasks = htmlDocuments.Select(async html =>
        {
            var pdf = await _renderer.RenderHtmlAsPdfAsync(html);
            return pdf;
        });

        var results = await Task.WhenAll(tasks);

        // Standard .NET disposal - no hanging or zombie processes
        foreach (var pdf in results)
        {
            pdf.Dispose();
        }
    }
}

// Docker deployment example - no special init process required
public class DockerOptimizedService
{
    public byte[] GeneratePdfInContainer(string html)
    {
        // Works in containers with standard memory limits
        // No --disable-dev-shm-usage or --no-sandbox flags needed
        using var renderer = new ChromePdfRenderer();
        using var pdf = renderer.RenderHtmlAsPdf(html);

        return pdf.BinaryData;
    }
}

Key points about this code:

  • The ChromePdfRenderer can be instantiated once and reused across many requests
  • Async operations do not require external process coordination
  • Standard using statements handle disposal without hanging
  • Parallel operations work without manual synchronization
  • Container deployments require no special init processes or memory workarounds


Migration Considerations

Licensing

IronPDF is commercial software requiring a license for production use. A free trial is available for evaluation, and licensing is per-developer rather than per-server. Organizations should factor licensing costs against the development time spent managing Puppeteer-Sharp's memory issues and infrastructure costs from over-provisioned containers.

API Differences

Puppeteer-Sharp follows the Chrome DevTools Protocol closely, exposing browser automation primitives. IronPDF provides a higher-level API focused specifically on PDF operations. Migration involves:

  • Replacing Puppeteer.LaunchAsync() with new ChromePdfRenderer()
  • Changing page.PdfAsync() to renderer.RenderHtmlAsPdf()
  • Removing browser lifecycle management code (launch, close, dispose patterns)
  • Removing concurrency control code (semaphores, browser pools)

The migration effort is typically measured in hours rather than days for straightforward PDF generation use cases.
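As a sketch of that mapping, here is a minimal Puppeteer-Sharp render next to its IronPDF equivalent (error handling and configuration omitted; treat this as an illustration of the API shapes rather than production code):

```csharp
using System.Threading.Tasks;
using IronPdf;
using PuppeteerSharp;

public static class MigrationSketch
{
    // Before: Puppeteer-Sharp renders via an external Chromium process
    public static async Task<byte[]> BeforeAsync(string html)
    {
        await using var browser = await Puppeteer.LaunchAsync(
            new LaunchOptions { Headless = true });
        await using var page = await browser.NewPageAsync();
        await page.SetContentAsync(html);
        return await page.PdfDataAsync();
    }

    // After: IronPDF renders inside the .NET process
    public static byte[] After(string html)
    {
        var renderer = new ChromePdfRenderer();
        using var pdf = renderer.RenderHtmlAsPdf(html);
        return pdf.BinaryData;
    }
}
```

Note how the browser and page lifecycle management disappears entirely in the second method; disposal of the PdfDocument is the only resource concern left.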

What You Gain

  • Single-process architecture with standard .NET memory management
  • No zombie Chromium processes
  • No WebSocket connection management
  • Thread-safe rendering without manual synchronization
  • Consistent behavior across Windows, Linux, macOS, Docker, and cloud platforms
  • Sub-200ms render times without browser startup overhead

What to Consider

  • IronPDF is focused on PDF operations; general browser automation requires different tools
  • The embedded Chrome engine adds approximately 150MB to application size
  • Initial render incurs Chrome engine startup (subsequent renders are immediate)
  • Some Puppeteer-Sharp features for page interaction have no direct equivalent

Conclusion

Puppeteer-Sharp's architecture of spawning external Chromium processes creates inherent memory management challenges that compound under production workloads. The documented issues with callback dictionary accumulation, async disposal hanging, zombie processes, and thread safety represent fundamental limitations of the process-per-browser model in .NET applications.

For teams requiring stable, high-throughput PDF generation, migrating to IronPDF eliminates these architectural issues by keeping the rendering engine within the .NET process boundary where standard memory management applies.


Jacob Mellor built the original IronPDF and leads Iron Software's technical development with over 25 years of commercial software experience.


References

  1. Managed memory leak in Connection.cs - GitHub Issue #640 - Original callback dictionary leak report
  2. Potential memory leak in AsyncDictionaryHelper - GitHub Discussion #2283 - Memory accumulation analysis
  3. Page.DisposeAsync and Browser.DisposeAsync hang forever - GitHub Issue #1489 - Disposal hanging in Docker
  4. Lack of thread safety causes lock-up - GitHub Issue #714 - Thread safety documentation
  5. Concurrently modified lists not thread-safe - GitHub Issue #1680 - NetworkManager concurrent access issues
  6. Chrome browser requests leak memory - GitHub Issue #9283 - Chromium renderer memory growth
  7. Chrome memory leak - GitHub Issue #5893 - Memory not released after page close
  8. Zombie Chrome processes in Docker - GitHub Issue #12854 - Orphaned process documentation
  9. Puppeteer hangs after page crash due to OOM - GitHub Issue #5846 - Kubernetes OOM behavior
  10. The Hidden Cost of Headless Browsers - Medium - Production memory leak journey
  11. How to workaround RAM-leaking libraries like Puppeteer - DevForth - Workaround documentation
  12. Puppeteer Zombie Process Solution - Medium - Docker init process approach

For the latest IronPDF documentation and tutorials, visit ironpdf.com.
