IronSoftware

wkhtmltopdf Memory Leak and High Memory Usage (Issue Fixed)

When wkhtmltopdf generates large documents, memory consumption escalates rapidly and often does not return to baseline after conversion completes. A 4,250-page document can require approximately 5GB of RAM. Tables with 400,000 records cause memory to climb at roughly 20MB per second. In containerized environments, this results in OOMKilled errors that terminate the process mid-conversion. The wkhtmltopdf project was archived in January 2023 with no further updates to address these memory management issues.

The Problem

wkhtmltopdf exhibits several memory-related behaviors that impact production deployments. Memory allocation grows proportionally with document complexity, but deallocation after conversion is incomplete. Successive conversions in a long-running process accumulate unreleased memory until that process is terminated.

The Qt WebKit rendering engine at the core of wkhtmltopdf was designed for interactive browser sessions, not batch document processing. When rendering large HTML tables or complex CSS layouts, WebKit allocates memory for the entire document tree. Elements with JavaScript animations or dynamic content consume additional memory that persists after rendering completes.

Container orchestration systems like Kubernetes enforce memory limits on pods. When wkhtmltopdf exceeds these limits, the Linux OOM killer terminates the container. This presents as sudden process death without meaningful error messages in application logs.

Error Messages and Symptoms

Developers encounter these errors related to wkhtmltopdf memory consumption:

OOMKilled in Docker/Kubernetes:

State:          Terminated
Reason:         OOMKilled
Exit Code:      137

Container killed due to memory limit exceeded
wkhtmltopdf process exited with code 137

System Memory Errors:

Cannot allocate memory

Memory limit too low

Process Hangs or Crashes:

Exit with code 1 due to network error: ContentOperationNotPermittedError

Killed

The symptoms include:

  • Memory usage increasing steadily during conversion (approximately 20MB/second for large tables)
  • Memory not returning to baseline after conversion completes
  • Multiple sequential conversions exhausting available RAM
  • Container restart loops in Kubernetes deployments
  • Process freezing or hanging during large document generation
  • Exit code 137 indicating OOM termination
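
Exit code 137 is 128 plus signal number 9 (SIGKILL), which is the signal the kernel OOM killer sends. The sketch below shows one way a calling application might classify an exit status. The helper name `describe_exit` is hypothetical, and a SIGKILL only suggests an OOM kill; it does not prove one, because other actors can send the same signal.

```python
SIGKILL = 9  # POSIX signal number; the kernel OOM killer sends this signal


def describe_exit(returncode):
    """Classify an exit status from a wkhtmltopdf run (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # Python's subprocess module reports death by signal N as -N
        signal_number = -returncode
    elif returncode > 128:
        # Shells and container runtimes report death by signal N as 128 + N
        signal_number = returncode - 128
    else:
        return f"error (exit code {returncode})"
    if signal_number == SIGKILL:
        return "killed by SIGKILL (possibly the OOM killer)"
    return f"terminated by signal {signal_number}"
```

Logging this classification next to the document size makes silent container deaths easier to correlate with specific inputs.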

Who Is Affected

This wkhtmltopdf memory issue impacts specific deployment scenarios:

Operating Systems: Linux servers, Docker containers (Debian, Ubuntu, Alpine), and cloud platform instances. Windows and macOS local development machines may not exhibit the issue because they usually have more available memory and no enforced container limit.

Container Platforms: Docker with default memory limits, Kubernetes pods with resource constraints, AWS ECS tasks, Azure Container Instances, and Google Cloud Run instances with 512MB-2GB limits.

Use Cases: Large report generation (1000+ pages), data export to PDF with extensive tables (100,000+ rows), batch processing of multiple documents in sequence, long-running services performing repeated conversions.

Scale Factors: The issue becomes critical when documents exceed approximately 500 pages, when tables contain more than 50,000 rows, when generating multiple PDFs without process restart, or when container memory is limited below 4GB.

Frameworks: Any .NET, Python, Ruby, PHP, or Node.js application using wkhtmltopdf through wrapper libraries (DinkToPdf, pdfkit, wicked_pdf, snappy, node-wkhtmltopdf).
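
The figures quoted in this article allow a rough back-of-the-envelope estimate of how quickly a conversion reaches a container limit. The sketch below assumes steady linear growth at the reported 20MB per second, which is a simplification, and both function names are hypothetical.

```python
def seconds_until_limit(limit_mb, growth_mb_per_s=20.0, baseline_mb=0.0):
    """Seconds for linearly growing memory to reach a container limit."""
    if growth_mb_per_s <= 0:
        raise ValueError("growth_mb_per_s must be positive")
    return max(0.0, (limit_mb - baseline_mb) / growth_mb_per_s)


def mb_per_page(total_mb, pages):
    """Average memory per page for a reported total and page count."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return total_mb / pages


# Time to exhaust a 4GB (4096MB) limit at the reported growth rate
time_to_4gb = seconds_until_limit(4096)

# Average cost per page for the reported 4,250-page, roughly 5GB conversion
per_page = mb_per_page(5 * 1024, 4250)
```

Under these assumptions a 4GB container is exhausted in a little under three and a half minutes, and the 4,250-page report averages a bit more than 1MB per page.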

Evidence from the Developer Community

The wkhtmltopdf memory leak has been documented across multiple platforms over several years.

Timeline

| Date | Event | Source |
|------|-------|--------|
| 2016-2017 | Memory issues reported with large documents | GitHub Issues |
| 2018-2019 | Container memory problems widely discussed | Stack Overflow |
| 2020 | Recommendations emerge to limit container memory to 4GB+ | GitHub, Forums |
| 2022-12 | Final wkhtmltopdf release (0.12.6.1-3) | GitHub |
| 2023-01 | Project archived with no memory fixes planned | GitHub |
| 2024-2025 | Legacy deployments continue experiencing OOM issues | Various platforms |

Community Reports

"Generating a 4250-page PDF was using close to 5 gigs of memory."
— Developer, Stack Overflow, 2018

"Memory consumption is increasing around 20 MB per second during the build. My table records are 400k."
— Developer, GitHub Issues, 2019

"Complex CSS is causing memory to grow without bounds. We had to add a memory limit of 4GB to the container."
— Developer, Reddit r/docker, 2021

"Our wkhtmltopdf containers keep getting OOMKilled. We're seeing memory climb and never release between conversions."
— Developer, Stack Overflow, 2022

Multiple GitHub issues document the memory behavior:

  • Issue #3052: "High memory usage with large tables"
  • Issue #4120: "Memory not released after conversion"
  • Issue #4521: "OOM in Docker containers"

Root Cause Analysis

The wkhtmltopdf memory leak stems from several architectural factors:

Qt WebKit Memory Model: The underlying Qt WebKit engine maintains DOM nodes and rendering context in memory. Large documents create extensive node trees that persist beyond their use. WebKit's garbage collection is designed for interactive browsing, not single-use document generation.

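Process Architecture: wkhtmltopdf handles an entire conversion inside a single process, and memory allocated during the rendering phases is not released until that process terminates. When conversions run back to back inside one long-lived process, for example through a wrapper that loads the wkhtmltopdf library in-process rather than launching the command-line tool, the allocations accumulate.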

CSS and Layout Engine: Complex CSS (especially flexible layouts, transforms, and nested elements) requires additional memory for layout calculations. Large tables trigger row-by-row rendering that holds all previous rows in memory.

JavaScript Execution: When JavaScript is enabled, the JavaScriptCore engine bundled with Qt WebKit allocates memory for script execution contexts. Memory associated with completed scripts may not be released.

Image Handling: Embedded or referenced images are decoded and cached in memory. Large images or numerous images multiply memory consumption.

No Streaming Output: wkhtmltopdf builds the entire document in memory before writing output. There is no streaming mode that would allow memory-efficient processing of large documents.

Archived Project: With maintenance ended in January 2023, these memory management issues will not receive fixes. The underlying Qt WebKit has not been updated to modern memory management patterns.
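
To check how much memory a particular document actually needs, the peak resident set size of the child process can be read after it exits. The following is a minimal sketch for Unix-like systems using Python's standard `resource` module. The function name `run_and_measure` is hypothetical, and the reported value is the high-water mark across all child processes the caller has waited for so far, not only the most recent one.

```python
import resource
import subprocess
import sys


def run_and_measure(cmd, timeout=300):
    """Run a command and return (exit code, peak child RSS in MB)."""
    result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return result.returncode, peak / divisor
```

A call such as `run_and_measure(["wkhtmltopdf", "input.html", "output.pdf"])` from a fresh Python process gives a per-document figure that can be compared against the container limit.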

Attempted Workarounds

Workaround 1: Disable JavaScript and Images

Approach: Reduce memory by disabling features that consume additional resources.

wkhtmltopdf --disable-javascript --no-images --lowquality input.html output.pdf

Command-line options:

  • --disable-javascript: Prevents the JavaScript engine from allocating memory for script execution
  • --no-images: Skips image decoding and caching
  • --lowquality: Reduces image quality and processing memory

Limitations:

  • Removes functionality required by many documents
  • JavaScript-dependent content will not render
  • Images will be missing from output
  • Not applicable when documents require these features
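
Because these flags remove functionality, it can help to enable them per document rather than globally. A minimal sketch, assuming a hypothetical helper `build_command` whose keyword arguments are illustrative; only the three flags shown above are used.

```python
def build_command(html_path, pdf_path, javascript=False, images=False,
                  low_quality=True):
    """Build a wkhtmltopdf argument list with memory-saving flags."""
    cmd = ["wkhtmltopdf"]
    if not javascript:
        cmd.append("--disable-javascript")  # skip script execution
    if not images:
        cmd.append("--no-images")           # skip image decoding
    if low_quality:
        cmd.append("--lowquality")          # lower-quality output
    cmd += [html_path, pdf_path]
    return cmd
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting problems with file paths.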

Workaround 2: Increase Container Memory Limits

Approach: Allocate 4GB or more to the container.

# Kubernetes deployment
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: pdf-generator
        resources:
          limits:
            memory: "4Gi"
          requests:
            memory: "2Gi"

# Docker run
docker run --memory=4g myapp-with-wkhtmltopdf

Limitations:

  • Increases infrastructure costs
  • May not be possible on constrained platforms (serverless, shared hosting)
  • Does not fix the leak, only delays OOM
  • 4GB may still be insufficient for very large documents
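
A related option, not part of the original workaround, is to cap the memory of the conversion process itself so that an oversized document fails on its own instead of taking the whole container down. The sketch below assumes Linux, where `RLIMIT_AS` limits a process's virtual address space. The function name `run_with_memory_cap` is hypothetical, and `preexec_fn` should not be combined with multithreaded callers.

```python
import resource
import subprocess


def run_with_memory_cap(cmd, max_bytes, timeout=300):
    """Run cmd with its address space capped at max_bytes (Linux)."""

    def apply_limit():
        # Runs in the child just before exec, so only the child is limited
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(
        cmd,
        preexec_fn=apply_limit,
        capture_output=True,
        timeout=timeout,
    )
```

With a cap below the container limit, a conversion that grows too large should exit with an allocation error that the application can log and handle, while the container keeps running.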

Workaround 3: Process Isolation and Restart

Approach: Run each conversion in a new process and terminate it after completion.

# Python example: subprocess isolation
import subprocess

def convert_with_isolation(html_path, pdf_path, timeout=300):
    """Run wkhtmltopdf in an isolated subprocess to contain memory leaks."""
    # subprocess.run kills the child if the timeout expires, then raises
    # subprocess.TimeoutExpired
    result = subprocess.run(
        ['wkhtmltopdf', html_path, pdf_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )

    if result.returncode != 0:
        raise RuntimeError(
            f"wkhtmltopdf failed ({result.returncode}): "
            f"{result.stderr.decode(errors='replace')}"
        )

    # The child process has exited, so the OS has reclaimed its memory
    return pdf_path

// C# example: process-per-conversion
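using System;
using System.Diagnostics;

public class IsolatedWkhtmltopdf
{
    public void ConvertWithMemoryIsolation(string htmlPath, string pdfPath)
    {
        // Each conversion spawns a new process
        using (var process = new Process())
        {
            process.StartInfo = new ProcessStartInfo
            {
                FileName = "wkhtmltopdf",
                Arguments = $"\"{htmlPath}\" \"{pdfPath}\"",
                UseShellExecute = false,
                RedirectStandardError = true
            };

            process.Start();

            // Read stderr asynchronously so a full pipe cannot block the child
            var stderrTask = process.StandardError.ReadToEndAsync();

            if (!process.WaitForExit(300000)) // 5 minute timeout
            {
                process.Kill(); // Stop the runaway conversion
                throw new TimeoutException("wkhtmltopdf timed out");
            }

            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    $"wkhtmltopdf failed ({process.ExitCode}): {stderrTask.Result}");
            }

            // The child has exited, so the OS has reclaimed its memory
        }
    }
}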

Limitations:

  • Process startup overhead for each conversion
  • Does not help with a single large document that exceeds available memory
  • Adds complexity to application code
  • Kubernetes container restarts may still occur during conversion
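
When each conversion runs in its own process, total memory use depends on how many conversions run at once. One way to keep it bounded is a small worker pool, sketched below under the assumption that each job is an `(html_path, pdf_path)` pair. The names `convert_one` and `convert_batch` are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def convert_one(job):
    """Convert one (html_path, pdf_path) pair in its own process."""
    html_path, pdf_path = job
    subprocess.run(["wkhtmltopdf", html_path, pdf_path],
                   check=True, timeout=300)
    return pdf_path


def convert_batch(jobs, max_workers=2, convert=convert_one):
    """Run conversions with at most max_workers processes alive at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input jobs
        return list(pool.map(convert, jobs))
```

Choosing `max_workers` so that the worker count multiplied by the worst-case document footprint stays under the container limit keeps peak usage predictable. The `convert` parameter exists so the pool logic can be exercised without wkhtmltopdf installed.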

Workaround 4: Document Chunking

Approach: Split large documents into smaller segments and merge PDFs.

# Split large HTML table into chunks
# generate_html_table, convert_to_pdf, and merge_pdfs are application-specific
# helpers that are not shown here
def chunk_table_data(data, output_path="final_output.pdf", chunk_size=10000):
    """Generate separate PDFs for chunks of data, then merge."""
    pdf_chunks = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        html = generate_html_table(chunk)
        pdf_chunks.append(convert_to_pdf(html))

    # Merge PDFs using pdftk or similar
    merge_pdfs(pdf_chunks, output_path)
    return output_path

Limitations:

  • Requires document restructuring
  • Headers/footers may be inconsistent across chunks
  • Page numbering becomes complicated
  • Additional tooling required for PDF merge
  • Not applicable for documents that cannot be segmented
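
The merge step can be as simple as one call to pdftk. The sketch below assumes pdftk is installed and on the PATH. The `merge_pdfs` signature mirrors the call in the chunking example, and `pdftk_merge_command` is a hypothetical helper that only builds the argument list.

```python
import subprocess


def pdftk_merge_command(pdf_paths, output_path):
    """Build the pdftk arguments that concatenate PDFs in order."""
    if not pdf_paths:
        raise ValueError("no PDFs to merge")
    return ["pdftk", *pdf_paths, "cat", "output", output_path]


def merge_pdfs(pdf_paths, output_path):
    """Concatenate the chunk PDFs into one file using pdftk."""
    subprocess.run(pdftk_merge_command(pdf_paths, output_path), check=True)
    return output_path
```

Page numbers and headers printed inside each chunk are not rewritten by the merge, which is why they can end up inconsistent across chunks.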

A Different Approach: IronPDF

For applications experiencing wkhtmltopdf memory issues, IronPDF offers an architecture designed for efficient memory usage during document generation. IronPDF uses an embedded Chromium rendering engine with memory management appropriate for server-side batch processing.

Why IronPDF Has Different Memory Characteristics

The architectural differences address the memory concerns:

  1. Chromium's Memory Model: Chromium includes garbage collection and memory pooling designed for long-running processes, unlike Qt WebKit's browser-session assumptions

  2. Proper Resource Disposal: IronPDF implements IDisposable patterns that release native memory when documents are disposed

  3. Streaming Capabilities: Large documents can be processed with streaming patterns that reduce peak memory consumption

  4. Active Maintenance: Memory issues can be addressed through updates, unlike the archived wkhtmltopdf

Code Example

using IronPdf;
using System;
using System.Collections.Generic;
using System.Linq; // Required for data.Select in BuildLargeTableHtml

/// <summary>
/// Demonstrates memory-efficient PDF generation for large documents.
/// Addresses the wkhtmltopdf memory leak issue by using IronPDF's
/// Chromium-based rendering with proper resource management.
/// </summary>
public class MemoryEfficientPdfGenerator
{
    public void GenerateLargeReport(List<ReportRow> data)
    {
        // Configure for server environments
        Installation.LinuxAndDockerDependenciesAutoConfig = true;

        var renderer = new ChromePdfRenderer();

        // Configure rendering for large documents
        renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;
        renderer.RenderingOptions.Timeout = 300; // 5 minutes for large documents

        // Build HTML with large data table
        string html = BuildLargeTableHtml(data);

        // Using statement ensures proper memory cleanup after conversion
        using (var pdf = renderer.RenderHtmlAsPdf(html))
        {
            pdf.SaveAs("/output/large-report.pdf");
            Console.WriteLine($"Generated PDF: {pdf.PageCount} pages, {pdf.BinaryData.Length} bytes");
        }
        // Memory released when pdf is disposed
    }

    public void ProcessMultipleDocumentsEfficiently(List<string> htmlDocuments)
    {
        // Single renderer instance can be reused without memory accumulation
        var renderer = new ChromePdfRenderer();

        foreach (var html in htmlDocuments)
        {
            // Each document is properly disposed after use
            using (var pdf = renderer.RenderHtmlAsPdf(html))
            {
                string filename = $"/output/doc-{Guid.NewGuid()}.pdf";
                pdf.SaveAs(filename);
            }
            // Memory from previous document is released before next iteration
        }
    }

    public void GenerateWithExplicitMemoryControl()
    {
        var renderer = new ChromePdfRenderer();

        // Configure rendering options that impact memory usage
        renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;

        // For very large tables, consider pagination in HTML
        string html = @"
            <!DOCTYPE html>
            <html>
            <head>
                <style>
                    table { width: 100%; border-collapse: collapse; }
                    th, td { border: 1px solid #ccc; padding: 8px; }
                    tr { page-break-inside: avoid; }
                    thead { display: table-header-group; }
                </style>
            </head>
            <body>
                <h1>Large Data Report</h1>
                <table>
                    <thead>
                        <tr><th>ID</th><th>Name</th><th>Value</th><th>Date</th></tr>
                    </thead>
                    <tbody>
                        <!-- Data rows would be generated here -->
                        " + GenerateTableRows(100000) + @"
                    </tbody>
                </table>
            </body>
            </html>";

        using (var pdf = renderer.RenderHtmlAsPdf(html))
        {
            pdf.SaveAs("/output/large-table.pdf");
        }
    }

    private string BuildLargeTableHtml(List<ReportRow> data)
    {
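        // HTML-encode text fields so characters such as < or & cannot break the markup
        var rows = string.Join("\n", data.Select(r =>
            $"<tr><td>{r.Id}</td><td>{System.Net.WebUtility.HtmlEncode(r.Name)}</td><td>{r.Value:C}</td></tr>"));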

        return $@"
            <!DOCTYPE html>
            <html>
            <head>
                <style>
                    table {{ width: 100%; border-collapse: collapse; }}
                    th, td {{ border: 1px solid #ddd; padding: 8px; }}
                    th {{ background-color: #4CAF50; color: white; }}
                    tr:nth-child(even) {{ background-color: #f2f2f2; }}
                </style>
            </head>
            <body>
                <h1>Report with {data.Count:N0} Records</h1>
                <table>
                    <thead><tr><th>ID</th><th>Name</th><th>Value</th></tr></thead>
                    <tbody>{rows}</tbody>
                </table>
            </body>
            </html>";
    }

    private string GenerateTableRows(int count)
    {
        var sb = new System.Text.StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.AppendLine($"<tr><td>{i}</td><td>Item {i}</td><td>{i * 1.5:F2}</td><td>2025-01-{(i % 28) + 1:D2}</td></tr>");
        }
        return sb.ToString();
    }
}

public class ReportRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Value { get; set; }
}

Docker configuration with appropriate memory:

FROM mcr.microsoft.com/dotnet/aspnet:8.0-bookworm-slim
WORKDIR /app

# IronPDF dependencies - memory-efficient compared to wkhtmltopdf stack
RUN apt-get update && apt-get install -y \
    libc6 \
    libgcc-s1 \
    libgssapi-krb5-2 \
    libicu72 \
    libssl3 \
    libstdc++6 \
    zlib1g \
    && rm -rf /var/lib/apt/lists/*

# Assumes an earlier build stage named "build" (not shown) that publishes the app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "YourApp.dll"]

# Kubernetes deployment - compare to wkhtmltopdf's 4GB requirement
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: pdf-service
        resources:
          limits:
            memory: "2Gi"  # Typically sufficient vs 4GB+ for wkhtmltopdf
          requests:
            memory: "1Gi"

Key points about this code:

  • using statements ensure native memory is released after each document
  • Single renderer instance can process multiple documents without memory accumulation
  • Timeout configuration prevents indefinite hangs on complex documents
  • Disposed resources are released back to the system, unlike wkhtmltopdf's retained allocations

Migration Considerations

Licensing

IronPDF is commercial software with per-developer licensing. A free trial allows evaluation. wkhtmltopdf is open source under LGPLv3. The licensing cost should be evaluated against infrastructure costs (higher memory containers) and engineering time spent managing wkhtmltopdf memory issues.

API Differences

Migration from wkhtmltopdf involves adapting to the IronPDF API:

Command-line flags to IronPDF properties:

| wkhtmltopdf Flag | IronPDF Equivalent |
|------------------|--------------------|
| --disable-javascript | RenderingOptions.EnableJavaScript = false |
| --no-images | RenderingOptions.RenderImages = false |
| --lowquality | RenderingOptions.ImageQuality = 50 |
| --page-size A4 | RenderingOptions.PaperSize = PdfPaperSize.A4 |
| --orientation Landscape | RenderingOptions.PaperOrientation = PdfPaperOrientation.Landscape |

Memory-related differences:

| Aspect | wkhtmltopdf | IronPDF |
|--------|-------------|---------|
| Memory after conversion | Not fully released | Released on dispose |
| Sequential conversions | Memory accumulates | Memory stable |
| Recommended container memory | 4GB+ | 2GB typical |
| Process restart for memory | Often required | Not required |

What You Gain

  • Proper memory release after document generation
  • Ability to process sequential documents without memory accumulation
  • Lower container memory requirements
  • No OOMKilled errors under normal operation
  • Active maintenance and bug fixes

What to Consider

  • Commercial licensing cost
  • Different rendering engine may produce visual differences
  • API migration effort from wrapper libraries
  • Chromium runtime is larger than Qt WebKit binary

Conclusion

wkhtmltopdf's memory management behavior makes it unsuitable for generating large documents or processing multiple conversions in memory-constrained environments. The project's archived status means these issues will not be resolved. For applications experiencing OOMKilled errors, memory accumulation between conversions, or needing to process documents exceeding several hundred pages, migrating to a library with proper resource disposal addresses the root cause rather than working around it with increased memory limits.


Written by Jacob Mellor, the original developer of IronPDF with 25+ years of commercial software experience.


References

  1. wkhtmltopdf GitHub Repository - Archived - Official repository, archived January 2023
  2. wkhtmltopdf Memory Issues on Stack Overflow - Community questions about memory consumption
  3. wkhtmltopdf Issue #3052: High Memory Usage - GitHub issue documenting memory with large tables
  4. Kubernetes OOMKilled Documentation - Understanding container memory limits
  5. wkhtmltopdf Known Issues - Official status page listing limitations

For IronPDF documentation and tutorials, visit ironpdf.com.
