Developers using Aspose.HTML or Aspose.PDF to convert HTML documents to PDF encounter out-of-memory exceptions even with moderately complex content. The library's memory consumption can spike to 2GB or more during rendering, causing production failures without warning. This issue has been documented since 2020 and continues to affect deployments. This article examines the root cause and presents an alternative with more predictable memory characteristics.
The Problem
When converting HTML to PDF using Aspose.HTML or Aspose.PDF's HTML conversion features, the library allocates memory in an uncontrolled manner. Documents that render instantly in a browser can consume gigabytes of RAM during Aspose's conversion process.
The issue is particularly severe when:
- HTML contains complex CSS layouts
- Multiple images are embedded or referenced
- Tables have many rows or columns
- The document spans many pages
- Multiple conversions run concurrently in a web application
Memory allocation grows rapidly during the conversion and may not be released promptly after completion, compounding the problem in production environments.
Error Messages and Symptoms
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at Aspose.Html.Converters.Converter.ConvertHTML(HTMLDocument document, PdfSaveOptions options, ICreateStreamProvider provider)
From developer reports:
ConvertHTML(htmlDocument, saveOptions, streamProvider); throws an out of memory exception
because it uses 2GB of RAM
Symptoms include:
- Application memory climbing to 2GB+ during conversion
- Conversion succeeding on small documents but failing on larger ones
- Cascading failures when multiple requests trigger OOM simultaneously
- Container kills in memory-limited environments
Memory Profiler Analysis
When analyzing the issue with memory profiling tools like dotMemory or Visual Studio Diagnostic Tools, the following patterns emerge:
Snapshot during HTML-to-PDF conversion:
=====================================
Total Managed Heap: 1.8 GB
- Large Object Heap: 1.2 GB
- System.Byte[]: 890 MB (image buffers)
- System.String: 180 MB (HTML content copies)
- System.Char[]: 130 MB (CSS parsing)
- Generation 2: 450 MB
- Internal layout objects
- Font cache entries
- Generation 0/1: 150 MB
- Temporary parsing objects
Native Memory (untracked by GC): ~400 MB
- Image decoding buffers
- Font rasterization cache
The memory profile reveals that image buffers and internal string copies account for the majority of allocations. These are not released during the conversion, and the Large Object Heap becomes fragmented.
Who Is Affected
This issue impacts production deployments using Aspose's HTML conversion:
Operating Systems: Windows and Linux, though memory limits are often stricter on containerized Linux deployments.
Affected Versions: Reports span from version 20.8 through current versions.
Use Cases: Report generation systems, invoice creation, document automation pipelines, any application converting user-provided or dynamically generated HTML.
Environments: Azure App Service, AWS ECS/Lambda, Kubernetes, Docker, and any environment with memory limits.
Evidence from the Developer Community
Timeline
| Date | Event | Source |
|---|---|---|
| 2020-03-18 | Out of memory rendering HTML reported | Aspose Forums |
| 2020-05-15 | Issue escalated, marked under investigation | Aspose Forums |
| 2020-07-28 | Issue still unresolved, developer reports production impact | Aspose Forums |
Community Reports
"ConvertHTML(htmlDocument, saveOptions, streamProvider); throws an out of memory exception because it uses 2gb of ram."
— Developer, Aspose Forums, March 2020"As we had this problem in our production so for me is important in which time the problem can be resolved because it is a blocking error."
— Developer, Aspose Forums, May 2020
Official Response
The Aspose team acknowledged the issue:
"We regret to share that the issue is not yet resolved. However, it is under the phase of investigation and requires more time to get fixed. We have recorded your concerns and escalated the issue to next level."
— Aspose Support, July 2020
HTML Patterns That Trigger High Memory Usage
Certain HTML patterns cause disproportionate memory consumption in Aspose's converter:
Large Data Tables
Tables with hundreds of rows cause memory to scale non-linearly:
<!-- This pattern causes excessive memory allocation -->
<table>
<thead><tr><th>Col1</th><th>Col2</th><th>Col3</th></tr></thead>
<tbody>
<!-- 500+ rows causes 2GB+ memory -->
<tr><td>Data</td><td>Data</td><td>Data</td></tr>
<!-- ... repeated hundreds of times ... -->
</tbody>
</table>
Embedded Base64 Images
Inline images multiply memory usage:
<!-- Each embedded image is decoded and held in memory -->
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." />
<!-- Multiple embedded images compound the problem -->
Complex CSS Selectors
Deep selector chains increase style calculation memory:
/* Deep selectors increase memory during style resolution */
.container .wrapper .content .section .item .inner .text p span {
color: #333;
}
Print Stylesheets with Media Queries
Complex @media print rules trigger additional layout calculations.
Memory Usage by HTML Pattern
| HTML Pattern | Typical Memory Usage |
|---|---|
| Simple text (10 pages) | 200-400 MB |
| Data table (100 rows) | 400-600 MB |
| Data table (500 rows) | 1.2-1.8 GB |
| 10 embedded images (1MB each) | 800 MB - 1.2 GB |
| Complex CSS with nested selectors | +200-400 MB overhead |
| Print media queries | +100-200 MB overhead |
Root Cause Analysis
Aspose's HTML-to-PDF conversion does not use a browser rendering engine. Instead, it implements its own HTML parser and layout engine. This custom implementation has different memory characteristics than browser-based rendering:
- Document Model Loading: The entire HTML document is parsed into memory before rendering begins
- CSS Calculation: Style calculations are performed on the full document tree
- Layout Computation: Layout passes may require multiple iterations for complex CSS
- Image Processing: Images are decoded and held in memory during rendering
- Font Loading: Font data is loaded for each font family used
These operations compound in ways that browser engines have optimized over decades but custom implementations have not. A document that Chrome renders in 50MB might consume 2GB in Aspose's converter.
The issue is architectural rather than a simple bug. Reducing memory consumption would require fundamental changes to how the converter processes documents.
Attempted Workarounds
Workaround 1: Increase Memory Limits
Approach: Configure the application or container with more available memory.
<!-- App.config or Web.config -->
<configuration>
<runtime>
<gcAllowVeryLargeObjects enabled="true" />
</runtime>
</configuration>
# Kubernetes
resources:
limits:
memory: 4Gi
Limitations:
- Increases infrastructure costs
- Does not solve the root cause
- Memory usage is unbounded; larger documents still fail
- May cause other applications on the same host to be memory-starved
Workaround 2: Split Large Documents
Approach: Break HTML into smaller chunks and convert separately.
// Convert in chunks, then merge PDFs
List<byte[]> chunks = new List<byte[]>();
foreach (var htmlChunk in SplitHtml(fullHtml))
{
chunks.Add(ConvertChunk(htmlChunk));
}
byte[] merged = MergePdfs(chunks);
Limitations:
- Complex implementation
- Breaks page numbering, headers, footers
- Tables and other elements cannot span chunks
- Significant development effort
Workaround 3: Queue with Limited Concurrency
Approach: Process conversions one at a time to prevent memory accumulation.
private static SemaphoreSlim _conversionSemaphore = new SemaphoreSlim(1, 1);
public async Task<byte[]> ConvertWithLimit(string html)
{
await _conversionSemaphore.WaitAsync();
try
{
return Convert(html);
}
finally
{
_conversionSemaphore.Release();
GC.Collect(); // Attempt to free memory
}
}
Limitations:
- Reduces throughput significantly
- Conversions queue up during peak load
- Memory may still accumulate if GC doesn't release native resources
A Different Approach: IronPDF
IronPDF uses an embedded Chromium browser engine with process isolation, providing predictable memory behavior that differs fundamentally from Aspose's architecture.
Why IronPDF Handles Memory Differently
IronPDF's rendering happens in a separate Chromium subprocess. This architecture provides several memory advantages:
- Process Isolation: Chromium's memory is separate from the .NET application
- OS Memory Management: The subprocess memory is managed by the operating system
- Clean Termination: When rendering completes, subprocess memory is fully released
- Battle-Tested Engine: Chromium's memory management has been optimized for years
The result is predictable memory consumption that scales with document complexity in a linear, manageable way.
Code Example
using IronPdf;
public class HtmlConverter
{
public byte[] ConvertHtmlToPdf(string html)
{
var renderer = new ChromePdfRenderer();
// These options affect rendering quality, not memory consumption
renderer.RenderingOptions.MarginTop = 20;
renderer.RenderingOptions.MarginBottom = 20;
// Render HTML - memory usage is predictable and bounded
using var pdf = renderer.RenderHtmlAsPdf(html);
return pdf.BinaryData;
}
public async Task<byte[]> ConvertComplexReportAsync(ReportData data)
{
var renderer = new ChromePdfRenderer();
// Enable JavaScript for complex rendering
renderer.RenderingOptions.EnableJavaScript = true;
renderer.RenderingOptions.RenderDelay = 1000; // Wait for charts to render
// Generate complex HTML with charts, tables, images
string html = GenerateReportHtml(data);
using var pdf = renderer.RenderHtmlAsPdf(html);
return pdf.BinaryData;
}
private string GenerateReportHtml(ReportData data)
{
return $@"
<!DOCTYPE html>
<html>
<head>
<style>
body {{ font-family: Arial, sans-serif; }}
table {{ width: 100%; border-collapse: collapse; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background-color: #4a90a4; color: white; }}
tr:nth-child(even) {{ background-color: #f9f9f9; }}
</style>
</head>
<body>
<h1>{data.Title}</h1>
<table>
<thead>
<tr>
<th>Column 1</th>
<th>Column 2</th>
<th>Column 3</th>
</tr>
</thead>
<tbody>
{GenerateTableRows(data.Rows)}
</tbody>
</table>
</body>
</html>";
}
private string GenerateTableRows(IEnumerable<RowData> rows)
{
return string.Join("", rows.Select(r =>
$"<tr><td>{r.Col1}</td><td>{r.Col2}</td><td>{r.Col3}</td></tr>"));
}
}
Key points about this code:
- Memory usage does not spike unpredictably
- Large documents with many rows complete without OOM errors
- The
usingstatement ensures proper cleanup - No special configuration needed for memory management
API Reference
For more details on the methods used:
Migration Considerations
Licensing
- IronPDF is commercial software with perpetual licensing
- Free trial available for evaluation
- Licensing information
API Differences
- Aspose:
Converter.ConvertHTML()with HTMLDocument objects - IronPDF:
ChromePdfRenderer.RenderHtmlAsPdf()with HTML strings - Migration involves replacing conversion calls, not changing HTML templates
What You Gain
- Predictable, bounded memory consumption
- Same HTML renders regardless of document size
- No need to split documents or limit concurrency
What to Consider
- Chromium binaries add to deployment size
- Different licensing model
- Slightly different API surface
Conclusion
Aspose's HTML-to-PDF conversion can exhaust memory on moderately complex documents due to its custom rendering implementation. For applications where memory predictability is important—especially containerized and serverless deployments—a Chromium-based converter provides the stability that custom HTML parsers cannot match.
Written by Jacob Mellor, CTO at Iron Software.
References
- Aspose Forum Thread #210253{:rel="nofollow"} - Out of memory when rendering HTML
For the latest IronPDF documentation and tutorials, visit ironpdf.com.
Top comments (0)