IronSoftware

Posted on Apr 2

Fixing Memory Growth in PDFTron and Apryse SDK Applications (Fixed)

#dotnet #csharp

Developers using PDFTron (now Apryse) SDK frequently encounter a frustrating problem: memory that grows continuously during document processing and never gets released. Applications that work fine with a few documents start crashing with "bad memory allocation" errors when processing batches. Web viewers consume more RAM with each document opened until the browser tab crashes. This article examines why these memory issues occur and explores a different architectural approach that avoids them.

The Problem

The PDFTron/Apryse SDK is built on native C++ libraries with wrappers for .NET, Java, Python, Ruby, and JavaScript. This architecture creates a fundamental tension: developers writing in managed languages like C# expect the garbage collector to handle memory cleanup, but the underlying native memory requires explicit disposal calls that the garbage collector cannot manage.

When a .NET developer creates a PDFDoc object, calls methods on it, and lets it go out of scope, the managed wrapper gets collected but the native memory often remains allocated. This pattern repeats across PDFDraw, ElementBuilder, ElementWriter, TextExtractor, and other core classes. In a batch processing scenario where thousands of documents flow through the system, memory accumulates until the process crashes.
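
The gap between a collected wrapper and a surviving native allocation can be reproduced without the SDK. The following is a minimal sketch, not Apryse code: `NativeBuffer` and `NativeLeakDemo` are hypothetical names, and `Marshal.AllocHGlobal` stands in for whatever the C++ layer allocates. The stand-in deliberately has no finalizer, so the effect is easy to see. Real wrappers often do have finalizers, but those run at a time the garbage collector chooses, not when the object goes out of scope.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for an SDK wrapper: a small managed object that owns
// a block of unmanaged memory the garbage collector knows nothing about.
public sealed class NativeBuffer : IDisposable
{
    // Demo-only bookkeeping of unmanaged bytes not yet freed (not thread-safe).
    public static long LiveBytes { get; private set; }

    private readonly int _size;
    private IntPtr _handle;

    public NativeBuffer(int size)
    {
        _size = size;
        _handle = Marshal.AllocHGlobal(size); // lives outside the GC heap
        LiveBytes += size;
    }

    public void Dispose()
    {
        if (_handle == IntPtr.Zero) return; // safe to call more than once
        Marshal.FreeHGlobal(_handle);
        _handle = IntPtr.Zero;
        LiveBytes -= _size;
    }
}

public static class NativeLeakDemo
{
    private const int OneMegabyte = 1024 * 1024;

    // Creates wrappers and drops them without Dispose; returns bytes left allocated.
    public static long CreateWithoutDispose(int count)
    {
        long before = NativeBuffer.LiveBytes;
        for (int i = 0; i < count; i++)
        {
            _ = new NativeBuffer(OneMegabyte);
        }
        return NativeBuffer.LiveBytes - before;
    }

    // Same loop, but each wrapper is disposed at the end of its iteration.
    public static long CreateWithUsing(int count)
    {
        long before = NativeBuffer.LiveBytes;
        for (int i = 0; i < count; i++)
        {
            using var buffer = new NativeBuffer(OneMegabyte);
        }
        return NativeBuffer.LiveBytes - before;
    }

    public static void Main()
    {
        Console.WriteLine($"Leaked without Dispose: {CreateWithoutDispose(10) / OneMegabyte} MB");
        Console.WriteLine($"Leaked with using:      {CreateWithUsing(10) / OneMegabyte} MB");
    }
}
```

The first loop leaves 10 MB allocated and the second leaves none, even though both loops look equally tidy from the managed side.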

The StreamingPDFConversion API exhibits particularly problematic behavior. This API, designed for converting images and other formats to PDF, has documented memory leaks that persist across multiple SDK versions. Apryse acknowledged in their version 10.11.0 changelog that they "fixed a memory leak that could occur during conversion from TIFF to PDF with Convert.StreamingPDFConversion() or Convert.UniversalConversion()."

Error Messages and Symptoms

Developers report various symptoms depending on their platform and use case:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.

Error code: Out of Memory (browser crash)

bad allocation errors during PDF to image conversion

System.IO.IOException: Stream was too long

The WebViewer JavaScript component shows similar patterns. Memory allocated when loading documents does not get released when documents are closed, leading to progressive degradation in browser tab performance.

Who Is Affected

This memory growth issue spans the entire Apryse/PDFTron product line:

Operating Systems: Windows, Linux, macOS, iOS, Android

SDK Platforms: .NET Framework, .NET Core/.NET 5+, Java, Python, Ruby, Go, PHP, JavaScript/WebViewer

Affected Versions: Reports span from version 5.1.10 (2019) through version 10.1 (2023) and beyond, with some fixes appearing in 10.11.0 (2024) and 11.8.0 (2025)

Use Cases:

Batch document conversion (converting hundreds or thousands of files)
Long-running web applications with PDF viewing
PDF merging operations (combining many documents into one)
Image to PDF conversion via StreamingPDFConversion
Mobile applications where users view multiple documents in a session

The issue is particularly acute in cloud environments like AWS Lambda where memory limits are strict and disk caching may not be available as a workaround.

Evidence from the Developer Community

The PDFTron memory leak problem has been reported consistently for over fifteen years across multiple platforms and languages.

Timeline

Date	Event	Source
2009-02-01	First reports of bad allocation errors during batch PDF to image conversion	Apryse Community
2019-07-01	WebViewer memory leak identified in loadScript.js and webviewer-ui.js	GitHub Issue #317
2021-03-01	iOS PTDocumentController memory leak confirmed by PDFTron on iOS 14	Google Groups
2022-05-01	React WebViewer memory leak reported	Apryse Community
2022-11-01	.NET SDK memory leak in Save() method identified	Apryse Community
2023-06-01	StreamingPDFConversion memory leak confirmed	Apryse Community
2024-07-24	StreamingPDFConversion memory leak fixed in v10.11.0	Apryse Changelog
2025-10-08	Additional memory leaks fixed in v11.8.0	Apryse Changelog

Community Reports

"Ever-growing memory usage with StreamingPDFConversion. Our app was crashing with errors related to bad memory allocation. Memory profiling was done with a simple script and it appeared to potentially be an issue in the SDK."
— Developer report, Apryse Community, June 2023

"~10mb memory leak each time the PDF viewer was loaded. Rather catastrophic memory leak. In src/helpers/loadScript.js there is a window.addEventListener('message'...) call that isn't properly unsubscribed."
— Developer report, GitHub Issue #317, July 2019

"Memory continues to grow as they view documents. Overriding viewWillDisappear and manually calling closeDocument seems to help a LOT, but the memory still grows pretty quickly."
— iOS developer, Apryse Community, March 2021

A developer attempting to merge 8,000 PDF files reported "memory is exceeding what is available on the system" with "roughly linear correlation in memory usage while merging PDFs." Another developer working with a 175MB document containing 7,600 pages experienced browser crashes with out of memory errors.

The issue also manifests when converting HTML to PDF. One developer reported that "HTML to PDF conversion fails when trying to create a PDF with over 3000 pages from an HTML string. Even with the timeout set to approximately 3 hours and 20 minutes (1200000ms), the conversion fails after about an hour." (The two timeout figures in that report do not agree: 1,200,000 ms is 20 minutes, whereas 3 hours and 20 minutes would be 12,000,000 ms.)

Root Cause Analysis: Why PDFTron Memory Leaks Occur

Multiple architectural factors contribute to the persistent memory issues in Apryse/PDFTron:

Native Memory Management: The SDK wraps native C++ libraries. In managed languages, developers expect garbage collection to handle cleanup. However, native memory allocated by the C++ layer cannot be freed by the garbage collector. The SDK requires explicit Dispose() or Close() calls on most objects, but this is not intuitive for developers accustomed to automatic memory management.

Event Listener Leaks in WebViewer: The JavaScript WebViewer component attaches event listeners that are not properly cleaned up when instances are destroyed. The GitHub issue identified specific leaks in window.addEventListener('message'...) calls and closures over options objects in window.WebViewer.l and window.WebViewer.workerTransportPromise.

iOS-Specific Deallocation Issues: On iOS 14, Apryse confirmed that PTDocumentController was not being deallocated when removed from the view hierarchy. This platform-specific bug caused memory to grow with each document viewed.

Linear Memory Growth in Batch Operations: When processing multiple documents, the SDK does not automatically release memory between operations. Without intermediate saves and explicit disposal, memory grows proportionally with the number of documents processed.
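
One way to check whether growth of this kind is native rather than managed is to compare the GC heap with the process's private bytes as the batch runs. The sketch below uses only standard .NET APIs. `MemoryProbe` is a hypothetical helper, and the `processOne` delegate stands in for whatever per-document work the application performs.

```csharp
using System;
using System.Diagnostics;

public static class MemoryProbe
{
    // Returns (managed heap bytes, process private bytes) at this instant.
    public static (long Managed, long Private) Sample()
    {
        using var process = Process.GetCurrentProcess();
        process.Refresh();
        // forceFullCollection: true, so garbage awaiting collection is not counted
        long managed = GC.GetTotalMemory(true);
        return (managed, process.PrivateMemorySize64);
    }

    // Runs the work `iterations` times and logs both counters every `logEvery` runs.
    public static void Run(Action<int> processOne, int iterations, int logEvery)
    {
        var (managedStart, privateStart) = Sample();
        for (int i = 1; i <= iterations; i++)
        {
            processOne(i);
            if (i % logEvery != 0) continue;

            var (managedNow, privateNow) = Sample();
            long managedKb = (managedNow - managedStart) / 1024;
            long privateKb = (privateNow - privateStart) / 1024;
            Console.WriteLine($"{i,6}: managed {managedKb,8} KB, private {privateKb,8} KB");
        }
    }
}

public static class MemoryProbeDemo
{
    public static void Main()
    {
        // Placeholder workload: replace the lambda body with real per-document processing.
        MemoryProbe.Run(i => { _ = new string('x', 1000); }, 1000, 250);
    }
}
```

If the private figure climbs steadily while the managed figure stays flat, the growth is happening outside the GC heap, and forcing collections will not recover it.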

Stream Handling Issues: The Save method in the .NET SDK uses unsafe code with TRN_PDFDocSaveMemoryBuffer. When saving to streams, particularly with large documents, this can trigger Stream was too long exceptions when the accumulated data exceeds internal limits.
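
In .NET, one well-known source of this exact message is `MemoryStream`, which is backed by a single byte array and therefore cannot grow past roughly 2 GB. Whether the SDK's failure comes from that limit is an assumption here, not something the reports confirm. The sketch below only demonstrates the limit itself, using standard library calls; `StreamLimitDemo` is a hypothetical name.

```csharp
using System;
using System.IO;

public static class StreamLimitDemo
{
    // One below 0x7FFFFFC7, the value of Array.MaxLength on current runtimes.
    private const int NearCeiling = 0x7FFFFFC7 - 1;

    // Returns true if MemoryStream rejects a write that would pass its size ceiling.
    public static bool MemoryStreamRejectsWrite()
    {
        using var stream = new MemoryStream();
        stream.Position = NearCeiling; // moving the cursor does not allocate a buffer
        try
        {
            stream.Write(new byte[1024], 0, 1024); // position + count exceeds int.MaxValue
            return false;
        }
        catch (IOException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(MemoryStreamRejectsWrite()
            ? "MemoryStream refused the write with an IOException"
            : "MemoryStream accepted the write");
    }
}
```

A `FileStream` has no such array-backed ceiling, so saving large output to a file path rather than to an in-memory stream is a reasonable thing to try when this exception appears.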

Attempted Workarounds for Apryse SDK Memory Issues

The developer community has documented various approaches to mitigating memory issues in the PDFTron WebViewer and SDK, though none fully resolves the underlying problems.

Workaround 1: Explicit Disposal of All Objects

Approach: Wrap every PDFTron object in a using statement or explicitly call Dispose() after use.

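// Apryse/PDFTron pattern requiring explicit disposal.
// Assumes PDFNet.Initialize(...) was already called at application startup.
using pdftron;
using pdftron.PDF;

using (var doc = new PDFDoc("input.pdf"))
using (var draw = new PDFDraw())
{
    draw.SetDPI(92);
    int pageCount = doc.GetPageCount(); // read once rather than on every iteration
    for (int i = 1; i <= pageCount; i++) // PDF page numbers are 1-based
    {
        using (var page = doc.GetPage(i))
        {
            draw.Export(page, $"page_{i}.png");
        }
    }
} // draw is disposed first, then doc; every object must be disposed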

Limitations:

Requires remembering to dispose every object, including nested ones
Missing even one disposal in a loop can cause cumulative leaks
Not intuitive for developers coming from fully managed environments
Some leaks occur in SDK internals that developers cannot control
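
The first two limitations can be reduced with a small tracking helper. The sketch below is one possible shape, not part of any SDK: `DisposalScope` and `Probe` are hypothetical names, and `Probe` exists only to record the order in which objects are released.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers every IDisposable handed to it and disposes
// them in reverse creation order when the scope itself is disposed.
public sealed class DisposalScope : IDisposable
{
    private readonly Stack<IDisposable> _items = new Stack<IDisposable>();

    public T Track<T>(T item) where T : IDisposable
    {
        _items.Push(item);
        return item;
    }

    public void Dispose()
    {
        // Note: if one Dispose call throws, the remaining items are not disposed.
        while (_items.Count > 0)
        {
            _items.Pop().Dispose();
        }
    }
}

// Demo-only disposable that records its name when released.
public sealed class Probe : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Probe(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose() => _log.Add(_name);
}

public static class DisposalScopeDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        using (var scope = new DisposalScope())
        {
            scope.Track(new Probe("doc", log));
            scope.Track(new Probe("draw", log));
            scope.Track(new Probe("page", log));
        }
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "page, draw, doc"
    }
}
```

In a batch loop, creating one scope per iteration releases objects promptly instead of letting them pile up until the end of the run. A helper like this cannot address the last limitation: leaks inside the SDK remain out of reach.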

Workaround 2: Enable Disk Caching

Approach: Configure the SDK to use disk-based caching instead of keeping everything in memory.

PDFNet.SetDefaultDiskCachingEnabled(true);

Limitations:

Creates temporary files that need cleanup
Slower than in-memory operations
Not available in all environments (AWS Lambda, some containerized deployments)
Does not fix all memory leak sources

Workaround 3: Batch Processing with Intermediate Saves

Approach: Process documents in small batches, saving results and disposing objects between batches.

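// Process in batches of 100 documents.
// Requires: using System; using System.IO; using System.Linq;
// inputDir and ProcessBatch are application-specific. ProcessBatch must dispose
// every SDK object it creates before returning.
const int BatchSize = 100;
var files = Directory.GetFiles(inputDir, "*.pdf");
for (int batch = 0; batch < files.Length; batch += BatchSize)
{
    var batchFiles = files.Skip(batch).Take(BatchSize);
    ProcessBatch(batchFiles);

    // Run finalizers for any wrappers that were missed,
    // then collect the objects those finalizers released
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
}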

Limitations:

Significantly more complex code
Slower overall processing
Requires managing intermediate files
Still may not prevent all memory growth within each batch

Workaround 4: WebViewer Instance Reuse

Approach: Instead of creating new WebViewer instances, reuse existing ones by loading new documents into them.

Limitations:

Requires architectural changes to the application
May not fit all use cases
Still shows memory growth over time, just slower

A Different Approach: IronPDF

For developers who have exhausted workaround options or cannot afford the complexity of managing native memory manually, switching to a library with a different architecture may be the practical solution.

IronPDF uses an embedded Chromium rendering engine that the library hosts and manages on behalf of the calling .NET code. The library implements the standard IDisposable pattern, and its memory management integrates with .NET's garbage collection in a way that does not require developers to manually track and dispose every internal object.

Why IronPDF Handles Memory Differently

The architectural difference lies in who owns the native resources. Chromium is itself native code, but IronPDF keeps it behind a small number of high-level objects (ChromePdfRenderer and PdfDocument) rather than exposing fine-grained native objects that calling code must release one by one. When a document operation completes and those objects are disposed or go out of scope, the memory can be reclaimed through normal .NET garbage collection.

For batch operations, IronPDF does not exhibit the linear memory growth pattern seen with PDFTron. Each document is processed, resources are released, and the next document starts from a clean state.

Code Example

The following example demonstrates batch HTML-to-PDF conversion with proper resource handling:

using IronPdf;
using System;
using System.Collections.Generic;
using System.IO;

/// <summary>
/// Demonstrates batch PDF generation without memory accumulation.
/// Each document is processed independently with automatic resource cleanup.
/// </summary>
public class BatchPdfGenerator
{
    public void ConvertHtmlFilesToPdf(IEnumerable<string> htmlFilePaths, string outputDirectory)
    {
        // ChromePdfRenderer implements IDisposable and manages its own lifecycle
        using var renderer = new ChromePdfRenderer();

        // Configure rendering options once
        renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;
        renderer.RenderingOptions.MarginTop = 25;
        renderer.RenderingOptions.MarginBottom = 25;

        int processedCount = 0;

        foreach (string htmlPath in htmlFilePaths)
        {
            try
            {
                // Read HTML content
                string htmlContent = File.ReadAllText(htmlPath);

                // Render to PDF - memory is managed automatically
                using var pdf = renderer.RenderHtmlAsPdf(htmlContent);

                // Generate output filename
                string outputPath = Path.Combine(
                    outputDirectory,
                    Path.GetFileNameWithoutExtension(htmlPath) + ".pdf"
                );

                // Save to disk
                pdf.SaveAs(outputPath);

                processedCount++;

                // No manual memory cleanup required between documents
                // The 'using' statement ensures proper disposal
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error processing {htmlPath}: {ex.Message}");
            }
        }

        Console.WriteLine($"Processed {processedCount} documents.");
    }
}

For merging multiple documents, IronPDF provides a straightforward approach that does not accumulate memory linearly:

using IronPdf;
using System.Collections.Generic;
using System.Linq;

/// <summary>
/// Merges multiple PDF files without the linear memory growth
/// seen when merging large numbers of documents with some libraries.
/// </summary>
public class PdfMerger
{
    public void MergeDocuments(IEnumerable<string> pdfPaths, string outputPath)
    {
        // All source documents stay open until the merge completes
        var documents = new List<PdfDocument>();

        try
        {
            // Load inside the try block so that documents already opened
            // are still disposed if a later file fails to load
            foreach (string path in pdfPaths)
            {
                documents.Add(PdfDocument.FromFile(path));
            }

            // Merge all documents; 'using' disposes the result even if SaveAs throws
            using var merged = PdfDocument.Merge(documents);

            // Save the result
            merged.SaveAs(outputPath);
        }
        finally
        {
            // Clean up source documents
            foreach (var doc in documents)
            {
                doc.Dispose();
            }
        }
    }
}

Key differences in this code:

Standard IDisposable pattern works as expected
No native memory leaks from missed disposal calls
Memory does not grow linearly during batch processing
No need for explicit GC calls or batch processing workarounds

API Reference

For detailed documentation on the classes and methods used:

ChromePdfRenderer - HTML and URL to PDF conversion
Merge or Split PDFs - Combining multiple documents
HTML to PDF Tutorial - Complete guide to HTML conversion

Migration Considerations

Switching PDF libraries is not a trivial decision. Developers should evaluate several factors.

Licensing

IronPDF is commercial software. Licenses are available per-developer and include free trial periods for evaluation. Pricing information is available on the IronPDF website. For teams currently using Apryse's commercial license, the cost comparison may be straightforward. For those using PDFTron's legacy free tier (if applicable to their use case), there is a licensing cost to consider.

API Differences

The APIs differ significantly. PDFTron uses a lower-level API with explicit element builders and writers. IronPDF uses a higher-level API centered on the ChromePdfRenderer and PdfDocument classes. Migration requires rewriting document generation code, not just swapping class names.

Example comparison:

// PDFTron approach
using (var doc = new PDFDoc())
using (var eb = new ElementBuilder())
using (var ew = new ElementWriter())
{
    var page = doc.PageCreate();
    ew.Begin(page);
    var element = eb.CreateTextBegin(font, 12);
    ew.WriteElement(element);
    // ... many more low-level operations
}

// IronPDF approach
var renderer = new ChromePdfRenderer();
var pdf = renderer.RenderHtmlAsPdf("<html><body>Content</body></html>");
pdf.SaveAs("output.pdf");

What You Gain

Standard .NET memory management without native memory quirks
No need for explicit disposal of every internal object
Batch processing without linear memory growth
Chrome-based rendering for accurate HTML/CSS support

What to Consider

Learning curve for a new API
Testing required to verify output matches expectations
IronPDF uses Chromium, which has its own resource footprint
Some advanced PDFTron features may not have direct equivalents

Conclusion

The PDFTron/Apryse memory leak issue stems from fundamental architectural decisions around native memory management. While workarounds exist, they add complexity and do not fully solve the problem. For development teams spending significant time debugging memory issues in production, evaluating an alternative library with managed memory handling may be more cost-effective than continuing to work around SDK limitations.

Jacob Mellor is CTO at Iron Software, where he leads technical development and built the original IronPDF library. He has over 25 years of experience developing commercial software tools.

References

Possible memory leak with StreamingPDFConversion - Apryse Community forum thread documenting streaming conversion memory issues
Possible memory leak in PDFTron .NET SDK - .NET SDK memory leak investigation
WebViewer Memory Leaks - GitHub Issue #317 - Detailed analysis of JavaScript memory leaks
React: PDFTron Memory Leak - WebViewer memory issues in React applications
Possible memory leak - iOS - iOS-specific memory growth issues
PDFTron WebViewer Out of Memory - Large file handling crashes
Running out of memory when merging PDFs - Batch merging memory issues
Memory Management - Google Groups - iOS PTDocumentController deallocation issue
How to keep memory under control - Java - Early reports of batch processing issues

For the latest IronPDF documentation and tutorials, visit ironpdf.com.
