Developers using PuppeteerSharp for HTML-to-PDF conversion in .NET applications often encounter memory leaks that accumulate over time, eventually causing out-of-memory crashes. The issue stems from the complexity of managing Chromium browser lifecycles, where improper disposal of browser instances and pages leads to orphaned Chrome processes consuming system resources. This article examines the root causes, documents common patterns that cause memory growth, and presents an alternative approach using a library that manages Chrome lifecycle internally.
The Problem
PuppeteerSharp is a .NET port of the popular Node.js Puppeteer library, providing programmatic control over headless Chromium browsers. While it offers powerful browser automation capabilities including PDF generation, it requires developers to manually manage browser and page lifecycles. This manual management introduces multiple opportunities for memory leaks.
Each Chromium browser instance spawned by PuppeteerSharp consumes 50-200MB or more depending on page complexity. Individual tabs and pages consume additional memory, and this memory is not automatically released when operations complete. Without explicit disposal calls in the correct order, Chrome processes remain alive indefinitely, accumulating memory until the application crashes or the system runs out of resources.
The problem is particularly severe in long-running processes like web servers, background services, and Docker containers where PDF generation occurs repeatedly. Memory growth is gradual and insidious - not a sudden spike that would be immediately obvious, but a slow accumulation that eventually triggers OOM (out-of-memory) errors after hours or days of operation.
Error Messages and Symptoms
Developers encountering PuppeteerSharp memory issues typically observe these patterns:
System.OutOfMemoryException: Out of memory.
   at PuppeteerSharp.Page.PdfDataAsync(PdfOptions options)

PuppeteerSharp.NavigationException: Timeout of 180000ms exceeded.
   at PuppeteerSharp.Page.PdfAsync(String file, PdfOptions options)

PuppeteerSharp.TargetClosedException: Protocol error (Target.activateTarget): Target closed. (Session closed. Most likely the page has been closed.)
Symptoms include:
- Memory usage climbing steadily with each PDF generation, never returning to baseline
- Dozens of orphaned Chrome processes visible in Task Manager or ps aux
- PDF generation operations timing out after working successfully for hours
- Docker containers being killed by the OOM killer
- DisposeAsync() calls hanging indefinitely
- Application becoming unresponsive after processing several hundred documents
- Kubernetes pods restarting due to memory limits
Who Is Affected
This issue impacts any .NET application using PuppeteerSharp for PDF generation at scale:
Operating Systems: Windows, Linux, and macOS deployments, with Docker containers being particularly susceptible due to constrained memory limits.
Framework Versions: .NET Core 3.1, .NET 5, .NET 6, .NET 7, and .NET 8. The issue is architectural rather than framework-specific.
Use Cases: Invoice generation services, report generation, HTML-to-PDF conversion APIs, document templating systems, certificate generation, and any high-volume PDF workflow.
Environments: Docker containers, Kubernetes clusters, AWS Lambda (though limited to 15 minutes), Azure Functions, and traditional server deployments. The problem is most visible in containerized environments where memory limits are enforced.
Evidence from the Developer Community
Memory management issues with Puppeteer and PuppeteerSharp have been documented extensively across GitHub issues, blog posts, and community discussions.
Timeline
| Date | Event | Source |
|---|---|---|
| 2019-03-01 | Managed memory leak in Connection.cs identified | GitHub Issue #640 |
| 2020-01-01 | Chrome memory leak pattern documented | GitHub Issue #5893 |
| 2020-05-01 | Docker memory increase issue reported | GitHub Issue #5645 |
| 2021-06-01 | IAsyncDisposable support discussion | GitHub Issue #1456 |
| 2021-10-01 | DisposeAsync hanging forever in Docker | GitHub Issue #1489 |
| 2022-07-01 | Docker container memory always increasing | GitHub Issue #8695 |
| 2023-01-01 | Browser requests leak memory | GitHub Issue #9283 |
| 2024-08-01 | PdfDataAsync timeout in Chromium v127+ | GitHub Issue #2718 |
| 2024-10-01 | Production memory leak journey documented | Medium article |
Community Reports
"This wasn't the classic scenario where memory spikes and then recovers. This was something more insidious - a gradual, implicit memory increase that accumulated over time, slowly and steadily killing the service."
— Developer, Medium, October 2024

"When running PuppeteerSharp with Docker, I'm finding quite a lot of zombie Chrome processes that never get killed. Even using tini as an entry point didn't resolve the issue. The logs showed that some DisposeAsync calls sometimes never complete."
— Developer, GitHub Issue #1489

"The callback needs to be removed from the _callbacks dictionary. This causes a managed leak of memory eventually resulting in OOM."
— Developer, GitHub Issue #640

"Having lack of RAM on server is a terrible thing because it activates operating system's out-of-memory killer who starts killing any processes randomly causing service downtime."
— DevForth Engineering Blog
Production teams have reported that a healthy deployment typically runs 2-3 Chrome processes, but when cleanup fails, dozens of orphaned Chrome instances accumulate. This simple count can reveal when disposal is failing before memory usage spikes catastrophically.
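This check can be automated with standard .NET APIs. The following is a minimal sketch; the process names and the threshold of five are illustrative assumptions, not an official metric:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Sketch: count live Chrome/Chromium processes as a cheap health signal
// that disposal may be failing before memory pressure becomes visible.
public static class ChromeProcessMonitor
{
    // Process names vary by platform and Chromium build; adjust as needed.
    private static readonly string[] Names = { "chrome", "chromium" };

    public static int CountChromeProcesses() =>
        Names.Sum(name => Process.GetProcessesByName(name).Length);

    public static void CheckHealth(int threshold = 5)
    {
        var count = CountChromeProcesses();
        if (count > threshold)
            Console.WriteLine(
                $"WARNING: {count} Chrome processes running - disposal may be failing.");
    }
}
```

Emitting this count to an existing metrics pipeline (rather than the console) makes the failure mode alertable rather than something discovered after an OOM kill.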
Root Cause Analysis
The memory leaks in PuppeteerSharp stem from several architectural factors:
1. Browser Lifecycle Complexity
PuppeteerSharp requires explicit management of multiple disposable resources:
- BrowserFetcher - downloads Chromium binaries
- Browser - the main Chromium process
- Page - individual tabs within the browser
- BrowserContext - incognito contexts for isolation
Each resource must be disposed in the correct order. Missing any disposal, or disposing in the wrong order, leaves resources orphaned.
2. Async Disposal Challenges
PuppeteerSharp implements IAsyncDisposable, but there are known issues where DisposeAsync() hangs indefinitely, particularly in Docker environments. The implementation routes DisposeAsync() to CloseAsync(), but the underlying task management has edge cases where completion is never signaled.
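One defensive pattern (not part of PuppeteerSharp's documented guidance) is to bound disposal with a timeout and fall back to killing the Chrome process tree directly. A sketch, assuming .NET 6+ for Task.WaitAsync and that the browser object exposes its underlying Process handle:

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

// Sketch: prevent a hung DisposeAsync from stalling the caller forever.
public static class BrowserCleanup
{
    public static async Task DisposeWithTimeoutAsync(IBrowser browser, TimeSpan timeout)
    {
        try
        {
            // Convert the ValueTask to a Task so a timeout can be attached.
            await browser.DisposeAsync().AsTask().WaitAsync(timeout);
        }
        catch (TimeoutException)
        {
            // Disposal hung: kill the Chromium process tree directly so
            // no orphaned Chrome processes are left behind.
            browser.Process?.Kill(entireProcessTree: true);
        }
    }
}
```

This does not fix the underlying task-completion bug, but it converts an indefinite hang into a bounded cleanup path.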
3. Chrome Process Management
On Linux (especially in Docker), the process running as PID 1 is responsible for reaping orphaned child processes, and most applications do not perform this duty. Without proper process supervision (such as dumb-init or Docker's --init flag), Chrome child processes become zombies that are never reaped.
4. Callback Dictionary Leak
A documented bug in Connection.cs caused callbacks to accumulate in a dictionary without removal, leading to managed memory growth independent of the Chrome process issues.
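The general shape of this bug, and its fix, can be illustrated with a simplified request dispatcher. This is an illustrative sketch, not PuppeteerSharp's actual Connection.cs code:

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Sketch: a pending-callback dictionary leaks if completed entries are
// read but never removed - each request adds an entry that lives forever.
public class MessageDispatcher
{
    private readonly ConcurrentDictionary<int, TaskCompletionSource<string>> _callbacks = new();
    private int _nextId;

    public Task<string> SendAsync(string message)
    {
        var id = Interlocked.Increment(ref _nextId);
        var tcs = new TaskCompletionSource<string>();
        _callbacks[id] = tcs;
        // ... transmit message with its id over the protocol connection ...
        return tcs.Task;
    }

    public void OnResponse(int id, string payload)
    {
        // The fix: TryRemove the entry instead of only looking it up,
        // so completed callbacks cannot accumulate indefinitely.
        if (_callbacks.TryRemove(id, out var tcs))
            tcs.TrySetResult(payload);
    }
}
```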
5. Tab Memory Growth
Even when reusing browser instances (a recommended optimization), individual tabs consume more memory over time and do not release it automatically. Eventually, tabs must be closed and recreated.
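A hedged sketch of page recycling under that constraint follows; the threshold of 50 renders is an arbitrary illustrative value, and a production version would also need synchronization for concurrent callers:

```csharp
using System.Threading.Tasks;
using PuppeteerSharp;

// Sketch: reuse one Browser but recycle the Page every N renders,
// since tab memory grows over time even when the browser is shared.
public class PageRecycler
{
    private const int MaxRendersPerPage = 50;
    private readonly IBrowser _browser;
    private IPage _page;
    private int _renders;

    public PageRecycler(IBrowser browser) => _browser = browser;

    public async Task<byte[]> RenderAsync(string html)
    {
        if (_page == null || _renders >= MaxRendersPerPage)
        {
            if (_page != null)
            {
                await _page.CloseAsync(); // release the tab's memory
                _page.Dispose();
            }
            _page = await _browser.NewPageAsync();
            _renders = 0;
        }
        _renders++;
        await _page.SetContentAsync(html);
        return await _page.PdfDataAsync();
    }
}
```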
Attempted Workarounds
The PuppeteerSharp community has developed various approaches to mitigate memory issues.
Workaround 1: Explicit Try-Finally Disposal
Approach: Use try-finally blocks with explicit calls to both CloseAsync() and Dispose().
IPage page = null;
try
{
    page = await browser.NewPageAsync();
    await page.GoToAsync("https://example.com");
    var pdfBytes = await page.PdfDataAsync();
    // Process PDF...
}
finally
{
    if (page != null)
    {
        await page.CloseAsync();
        page.Dispose();
    }
}
Limitations:
- Still requires manual tracking of every resource
- Does not prevent the DisposeAsync hanging issue
- Developers must remember to implement this pattern everywhere
- Browser instance itself still needs separate management
Workaround 2: Use Synchronous Dispose Instead of DisposeAsync
Approach: Call Dispose() instead of DisposeAsync() to avoid the hanging issue.
var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
var page = await browser.NewPageAsync();
// ... generate PDF
page.Dispose();    // synchronous Dispose avoids the DisposeAsync hang
browser.Dispose(); // instead of await browser.DisposeAsync()
Limitations:
- May not properly wait for Chrome shutdown
- Can leave orphaned processes in edge cases
- Not idiomatic for async .NET code
Workaround 3: Docker Process Supervision
Approach: Use Docker's --init flag or dumb-init to properly reap zombie processes.
FROM mcr.microsoft.com/dotnet/aspnet:8.0
# Install dumb-init for proper process supervision
RUN apt-get update && apt-get install -y dumb-init
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["dotnet", "YourApp.dll"]
Or with Docker run:
docker run --init your-image
Limitations:
- Only addresses zombie process cleanup, not memory leaks within Chrome
- Requires Docker 1.13.0 or later for --init
- Does not solve the DisposeAsync hanging issue
Workaround 4: Periodic Browser Restart
Approach: Track memory usage and restart the browser instance when it exceeds a threshold.
private IBrowser _browser;
private LaunchOptions _launchOptions;
private int _pdfCount = 0;
private const int MaxPdfsPerBrowser = 100;

public async Task<byte[]> GeneratePdf(string html)
{
    if (_pdfCount >= MaxPdfsPerBrowser)
    {
        await _browser.CloseAsync();
        _browser.Dispose();
        _browser = await Puppeteer.LaunchAsync(_launchOptions);
        _pdfCount = 0;
    }
    _pdfCount++;
    // Generate PDF...
}
Limitations:
- Adds latency when browser restarts (3+ seconds per restart)
- Complex to implement correctly with concurrent requests
- Arbitrary threshold may not match actual memory pressure
Workaround 5: Limit Concurrency
Approach: Restrict concurrent PDF generation to prevent memory spikes.
private static readonly SemaphoreSlim _semaphore =
    new(Math.Max(1, Environment.ProcessorCount - 1)); // guard against single-core hosts

public async Task<byte[]> GeneratePdfWithConcurrencyLimit(string html)
{
    await _semaphore.WaitAsync();
    try
    {
        // Generate PDF
    }
    finally
    {
        _semaphore.Release();
    }
}
Limitations:
- Does not prevent memory accumulation, only slows it
- Reduces throughput
- Does not address the root disposal issues
A Different Approach: IronPDF
For teams where managing the Chromium lifecycle consumes significant engineering effort, a library that handles the browser lifecycle internally eliminates this category of bugs entirely. IronPDF embeds a Chromium rendering engine but manages its lifecycle automatically, removing the burden of browser instance management from application code.
Why IronPDF Avoids This Issue
IronPDF's architecture differs from PuppeteerSharp in how it manages the Chrome rendering engine:
- Automatic lifecycle management: The Chrome engine is started, managed, and terminated internally without developer intervention
- No browser instance tracking: Developers do not need to track browser or page objects
- Memory efficiency: Built-in streaming support for large documents prevents memory spikes
- Proper cleanup: Resources are released when PdfDocument objects are disposed, using familiar .NET patterns
- No external dependencies: Chrome is embedded within the library - no separate download or installation required
The difference is architectural: PuppeteerSharp exposes browser automation as a general-purpose API where PDF generation is one feature. IronPDF is purpose-built for PDF operations, using Chrome rendering internally without exposing the complexity.
Code Example
The following example demonstrates high-volume PDF generation without the lifecycle management overhead:
using IronPdf;
using System;
using System.Threading.Tasks;

public class PdfGenerationService
{
    public PdfGenerationService()
    {
        // Optional: configure once at startup.
        // IronPDF manages the Chrome lifecycle automatically.
        Installation.LinuxAndDockerDependenciesAutoConfig = true;
    }

    public byte[] GeneratePdfFromHtml(string htmlContent)
    {
        // Create renderer - no browser launch delay
        var renderer = new ChromePdfRenderer();

        // Configure PDF options
        renderer.RenderingOptions.MarginTop = 20;
        renderer.RenderingOptions.MarginBottom = 20;
        renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;

        // Render HTML to PDF.
        // The Chrome engine is managed internally - no lifecycle to track.
        using (var pdf = renderer.RenderHtmlAsPdf(htmlContent))
        {
            // Memory is released when the using block exits
            return pdf.BinaryData;
        }
    }

    public void GenerateBatchPdfs(int count)
    {
        var renderer = new ChromePdfRenderer();

        // Process thousands of PDFs without memory accumulation
        for (int i = 0; i < count; i++)
        {
            string html = $@"
                <html>
                <head>
                    <style>
                        body {{ font-family: Arial, sans-serif; padding: 40px; }}
                        h1 {{ color: #333; }}
                        .invoice-number {{ font-size: 24px; color: #666; }}
                    </style>
                </head>
                <body>
                    <h1>Invoice</h1>
                    <p class='invoice-number'>INV-{i:D6}</p>
                    <p>Generated: {DateTime.UtcNow:yyyy-MM-dd HH:mm:ss}</p>
                </body>
                </html>";

            using (var pdf = renderer.RenderHtmlAsPdf(html))
            {
                pdf.SaveAs($"/output/invoice_{i:D6}.pdf");
            }
            // Memory returns to baseline after each iteration:
            // no browser restarts required, no orphaned Chrome processes.
        }
    }

    public byte[] GenerateFromUrl(string url)
    {
        var renderer = new ChromePdfRenderer();

        // Render an external URL - JavaScript executes automatically
        using (var pdf = renderer.RenderUrlAsPdf(url))
        {
            return pdf.BinaryData;
        }
    }
}
Key points about this code:
- No BrowserFetcher.DownloadAsync() - Chrome is embedded
- No Puppeteer.LaunchAsync() - browser lifecycle is automatic
- No browser or page disposal code - managed by the library
- Standard using blocks release memory predictably
- Same code works on Windows, Linux, and macOS without changes
- Docker containers work without the --init flag or process supervisors
Comparison: PuppeteerSharp vs IronPDF Setup
PuppeteerSharp approach:
// Download Chromium (required on each deployment)
var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync();

// Launch browser (3+ second startup time)
await using var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true,
    Args = new[] { "--no-sandbox", "--disable-dev-shm-usage" }
});

// Create page
await using var page = await browser.NewPageAsync();

// Navigate and generate PDF
await page.SetContentAsync(html);
var pdfBytes = await page.PdfDataAsync();

// Must dispose page, then browser, in that order.
// DisposeAsync may hang in Docker.
IronPDF approach:
var renderer = new ChromePdfRenderer();
using var pdf = renderer.RenderHtmlAsPdf(html);
var pdfBytes = pdf.BinaryData;
// Done - no lifecycle management
API Reference
For details on the methods used above:
- ChromePdfRenderer - Main rendering class
- RenderHtmlAsPdf - HTML to PDF conversion
- Docker and Linux Deployment - Container configuration guide
- IronPDF vs PuppeteerSharp Comparison - Detailed feature comparison
Migration Considerations
Licensing
IronPDF is commercial software with per-developer licensing. A free trial is available for evaluation. Teams should verify that IronPDF meets their requirements before committing to migration, particularly if PuppeteerSharp was chosen specifically for its open-source license.
API Differences
The APIs differ significantly in philosophy:
- PuppeteerSharp: General browser automation API with PDF as one capability
- IronPDF: Purpose-built PDF API using Chrome rendering internally
Migration involves replacing browser lifecycle code with direct PDF generation calls. For applications using PuppeteerSharp only for PDF generation, this simplifies the codebase. For applications using browser automation features beyond PDF (screenshots, testing, scraping), IronPDF would only replace the PDF portion.
What You Gain
- Elimination of browser lifecycle management code
- No Chrome process accumulation or zombie processes
- Predictable memory behavior without monitoring infrastructure
- Consistent behavior across Windows, Linux, and macOS
- Docker containers without process supervision requirements
- Faster PDF generation (no browser launch overhead per operation)
What to Consider
- Commercial licensing cost versus engineering time spent on memory management
- Migration effort for existing PuppeteerSharp codebases
- If using PuppeteerSharp for non-PDF browser automation, that code remains separate
- Different rendering engine may produce slightly different output formatting
Conclusion
PuppeteerSharp memory issues stem from the inherent complexity of managing Chromium browser lifecycles in long-running .NET applications. The combination of async disposal challenges, Chrome process management on Linux, and callback dictionary leaks creates a category of bugs that requires ongoing engineering attention. For teams where PDF generation is the primary use case, switching to a library with managed Chrome lifecycle eliminates these issues at the architectural level.
Jacob Mellor leads technical development at Iron Software and has 25+ years experience building developer tools.
References
- Managed memory leak in Connection.cs - GitHub Issue #640 - original memory leak identification
- DisposeAsync hanging forever - GitHub Issue #1489 - Docker disposal hanging issue
- PdfDataAsync timeout in Chromium v127+ - GitHub Issue #2718 - recent PDF generation timeout
- Docker container memory always increasing - GitHub Issue #8695 - container memory growth
- Chrome memory leak - GitHub Issue #5893 - Chrome process memory leak
- Browser requests leak memory - GitHub Issue #9283 - request-related memory leak
- The Hidden Cost of Headless Browsers: A Puppeteer Memory Leak Journey - production memory leak case study
- How to simply work around RAM-leaking libraries like Puppeteer - community workaround guide
- Optimizing Puppeteer PDF generation - performance optimization strategies
- Puppeteer Troubleshooting Documentation - official troubleshooting guide
- PuppeteerSharp Memory Management Considerations - memory management best practices
For IronPDF documentation and tutorials, visit ironpdf.com.