IronSoftware

Posted on Feb 11

MigraDoc HTML Conversion: What the Library Actually Does

#csharp #dotnet

Developers searching for MigraDoc HTML to PDF conversion capabilities often discover a fundamental mismatch between their expectations and reality. MigraDoc is a document definition and generation library built on top of PDFsharp, but it does not parse HTML or CSS. When developers attempt to convert HTML content using MigraDoc, they encounter errors, unexpected output, or simply find no methods that accept HTML input.

This article explains what MigraDoc actually provides, documents the confusion around HTML conversion, and presents alternatives for developers who need to convert HTML content into PDF documents.

The Problem

MigraDoc appears in search results and forum discussions alongside other PDF generation libraries, leading developers to assume it offers similar capabilities. The library's documentation describes it as creating "complex documents just like you do it with Microsoft Word," which suggests rich document formatting. However, MigraDoc implements a programmatic document object model (DOM), not an HTML converter.

When developers attempt to use MigraDoc for HTML-to-PDF conversion, they encounter these issues:

No RenderHtml() or ConvertHtml() methods exist in the API
HTML strings passed to text methods appear as literal HTML markup in the output
CSS styles have no effect since MigraDoc uses its own formatting system
Web page URLs cannot be rendered or converted

The fundamental architectural difference is that MigraDoc requires developers to construct documents using C# method calls, defining each paragraph, table, and style through code. HTML is a markup language and CSS is a style sheet language; MigraDoc includes no parser for either.

Common Error Patterns

Developers attempting HTML conversion with MigraDoc typically try approaches like this:

// This does NOT work - HTML renders as literal text
var document = new Document();
var section = document.AddSection();
section.AddParagraph("<h1>Invoice</h1><p>Thank you for your order.</p>");

The output contains the literal string <h1>Invoice</h1><p>Thank you for your order.</p> rather than formatted content. MigraDoc treats HTML tags as plain text because it has no HTML parser.

Similarly, attempting to apply CSS styling produces no effect:

// CSS has no effect in MigraDoc
var paragraph = section.AddParagraph("Invoice");
// There is no way to apply "font-size: 24px" or "color: #333"
// Styling must use MigraDoc's own formatting objects
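
For comparison, the same visual intent can be expressed through MigraDoc's formatting objects. The sketch below assumes the usual 96 dpi convention, under which 24 CSS pixels correspond to 18 points, and writes the color #333 as RGB bytes. The class and method names are illustrative and not part of MigraDoc.

```csharp
using MigraDoc.DocumentObjectModel;

public static class StyledHeadingSketch
{
    // Illustrative helper (not part of MigraDoc): builds a heading roughly
    // equivalent to the CSS "font-size: 24px; color: #333".
    public static Paragraph AddStyledHeading(Section section, string text)
    {
        var paragraph = section.AddParagraph(text);

        // MigraDoc font sizes are expressed in points: 24px * 72 / 96 = 18pt.
        paragraph.Format.Font.Size = 18;

        // #333 is shorthand for #333333, so R = G = B = 0x33.
        paragraph.Format.Font.Color = new Color(0x33, 0x33, 0x33);

        return paragraph;
    }
}
```

Each CSS declaration has to be translated by hand in this way, which is why existing stylesheets cannot simply be reused.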

Who Is Affected

This limitation impacts developers in several scenarios:

Teams with existing HTML templates: Organizations that have invested in HTML/CSS templates for invoices, reports, or correspondence cannot directly use these with MigraDoc. The templates must be completely rewritten as MigraDoc code.

Web application developers: Projects that generate HTML dynamically and need PDF output must either implement a separate MigraDoc document builder or switch to a library that supports HTML rendering.

CMS and content management systems: Applications storing rich text content as HTML cannot use MigraDoc without building a custom HTML-to-MigraDoc converter.

Migration projects: Teams migrating from other PDF libraries that support HTML find MigraDoc requires a fundamentally different approach to document generation.

Evidence from the Developer Community

The confusion about MigraDoc HTML support is well-documented across developer forums and support channels.

Timeline

Date	Event	Source
2018-02-19	Stack Overflow question on HTML string conversion	Stack Overflow
2019-08-15	Question about parsing HTML and adding to MigraDoc	Stack Overflow
2020-10-14	Forum thread on HTML invoice conversion	PDFsharp Forum
2024-09-06	Reddit discussion recommending MigraDoc for HTML conversion	Reddit r/csharp

Community Reports

"Neither MigraDoc nor PDFsharp parse HTML. It is up to you to parse the HTML and replace it with MigraDoc calls like AddFormattedText()."
— Thomas Hoevel (PDFsharp developer), Stack Overflow, February 2018

This response from one of the library maintainers directly addresses the misconception. The answer received 7 upvotes and has been viewed over 8,000 times, indicating the question is frequently encountered.
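
To make the maintainer's suggestion concrete, here is a minimal sketch of what "parse the HTML yourself and replace it with MigraDoc calls" might look like for inline formatting only. It assumes the input is a well-formed XHTML fragment (so that System.Xml.Linq can parse it) and handles just the b, strong, i, em, and u tags. The TinyHtmlToMigraDoc class is hypothetical; real-world HTML would need a proper HTML parser and far more cases.

```csharp
using System.Xml.Linq;
using MigraDoc.DocumentObjectModel;

public static class TinyHtmlToMigraDoc
{
    // Hypothetical helper: appends an inline XHTML fragment to a paragraph.
    public static void AddInlineHtml(Paragraph paragraph, string xhtmlFragment)
    {
        // XElement.Parse needs a single root element, so wrap the fragment.
        var root = XElement.Parse("<root>" + xhtmlFragment + "</root>",
                                  LoadOptions.PreserveWhitespace);
        Walk(paragraph, root, default(TextFormat));
    }

    private static void Walk(Paragraph paragraph, XElement element, TextFormat format)
    {
        foreach (var node in element.Nodes())
        {
            if (node is XText text)
            {
                // Plain text uses AddText; formatted text carries the
                // accumulated flags (TextFormat is a flags enum).
                if (format == default(TextFormat))
                    paragraph.AddText(text.Value);
                else
                    paragraph.AddFormattedText(text.Value, format);
            }
            else if (node is XElement child)
            {
                var childFormat = format;
                switch (child.Name.LocalName.ToLowerInvariant())
                {
                    case "b":
                    case "strong":
                        childFormat |= TextFormat.Bold;
                        break;
                    case "i":
                    case "em":
                        childFormat |= TextFormat.Italic;
                        break;
                    case "u":
                        childFormat |= TextFormat.Underline;
                        break;
                    // Unknown tags are ignored; their text is still emitted.
                }
                Walk(paragraph, child, childFormat);
            }
        }
    }
}
```

Even this small subset takes a few dozen lines, and it covers no block elements, tables, images, or CSS, which illustrates how much work a general converter would involve.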

Another developer asked about creating PDFs from HTML strings in .NET Core:

"I need to create Pdf files from Html strings from an API on .Net Core. The library must be free (Not payments or anything related). I found that PDFSharp was a good library for this but it is kind of dead for .Net Core right now."
— Stack Overflow user, March 2020

The accepted answer confirms that PDFsharp (and by extension MigraDoc) does not support HTML conversion, recommending instead a combination of libraries or alternative approaches.

On Reddit, a 2024 thread titled "Recommend a free library to convert HTML to PDF" shows developers continuing to suggest MigraDoc despite its lack of HTML support:

"Check out PDFSharp along with MigraDoc which is a part of the same project."

This recommendation, while well-intentioned, perpetuates the misconception. Other commenters in the thread noted that additional libraries like HtmlRenderer.PdfSharp are required for any HTML conversion, and even then, only limited HTML and CSS features are supported.

What MigraDoc Actually Does

MigraDoc is a document definition library that provides a programmatic API for constructing PDF documents. It works by creating an in-memory document model that can be rendered to PDF using PDFsharp.

Core Capabilities

MigraDoc provides:

A document object model (DOM) for defining document structure in C#
Paragraph and text formatting with fonts, colors, and styles
Table creation with cell merging, borders, and alignment
Image embedding and positioning
Page headers, footers, and page numbering
Sections with different page layouts
Styles similar to Word document styles
Charts and basic graphics

MigraDoc Code Example

Here is how MigraDoc expects developers to create documents:

using MigraDoc.DocumentObjectModel;
using MigraDoc.Rendering;

public class MigraDocInvoiceExample
{
    public void CreateInvoice()
    {
        // Create a new document
        var document = new Document();

        // Define styles
        var style = document.Styles["Normal"];
        style.Font.Name = "Arial";
        style.Font.Size = 10;

        var headingStyle = document.Styles.AddStyle("Heading1", "Normal");
        headingStyle.Font.Size = 18;
        headingStyle.Font.Bold = true;
        headingStyle.ParagraphFormat.SpaceAfter = "6pt";

        // Add a section
        var section = document.AddSection();

        // Add heading - must use MigraDoc API, not HTML
        var heading = section.AddParagraph("Invoice #12345");
        heading.Style = "Heading1";

        // Add body text
        section.AddParagraph("Thank you for your order.");

        // Create table
        var table = section.AddTable();
        table.Borders.Width = 0.5;

        // Define columns
        table.AddColumn("3cm");
        table.AddColumn("8cm");
        table.AddColumn("3cm");

        // Add header row
        var headerRow = table.AddRow();
        headerRow.Shading.Color = Colors.LightGray;
        headerRow.Cells[0].AddParagraph("Qty");
        headerRow.Cells[1].AddParagraph("Description");
        headerRow.Cells[2].AddParagraph("Price");

        // Add data row
        var dataRow = table.AddRow();
        dataRow.Cells[0].AddParagraph("2");
        dataRow.Cells[1].AddParagraph("Widget Pro");
        dataRow.Cells[2].AddParagraph("$49.99");

        // Render to PDF
        var renderer = new PdfDocumentRenderer(true);
        renderer.Document = document;
        renderer.RenderDocument();
        renderer.PdfDocument.Save("invoice.pdf");
    }
}

This code produces a formatted invoice, but requires explicit C# calls for every element. There is no way to pass an HTML string and have MigraDoc interpret it.
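
The same holds for page furniture such as footers and page numbering from the capability list above. As a rough sketch, a centered "Page X of Y" footer could be added like this (the FooterSketch class is an illustrative name, not part of the library):

```csharp
using MigraDoc.DocumentObjectModel;

public static class FooterSketch
{
    // Illustrative helper: adds a centered "Page X of Y" footer to a section.
    public static void AddPageNumberFooter(Section section)
    {
        var footer = section.Footers.Primary.AddParagraph();
        footer.Format.Alignment = ParagraphAlignment.Center;

        footer.AddText("Page ");
        footer.AddPageField();      // current page number, resolved at render time
        footer.AddText(" of ");
        footer.AddNumPagesField();  // total page count, resolved at render time
    }
}
```

Calling FooterSketch.AddPageNumberFooter(section) before rendering would attach the footer to every page of that section.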

What MigraDoc Cannot Do

MigraDoc does not:

Parse HTML markup of any kind
Interpret CSS stylesheets or inline styles
Execute JavaScript
Render web pages from URLs
Convert existing HTML templates
Support CSS layout models (Flexbox, Grid)
Handle responsive or adaptive layouts

The PDFsharp FAQ explicitly addresses this limitation:

"Can I use PDFsharp to convert HTML or RTF to PDF? No, not 'out of the box', and we do not plan to write such a converter in the near future."

Since MigraDoc is built on PDFsharp, this limitation applies equally to both libraries.

When MigraDoc Is Appropriate

MigraDoc excels in scenarios where programmatic document construction is preferred:

Report generation with dynamic data: When documents are constructed from database queries and business logic, MigraDoc's API allows fine-grained control over layout and formatting.

Document templates defined in code: Teams that prefer defining document structure in C# rather than HTML can use MigraDoc's style and document definition features.

Scenarios requiring precise pagination control: MigraDoc provides explicit page break control and widow/orphan handling that can be difficult to achieve with HTML-to-PDF conversion.

Open-source licensing requirements: MigraDoc is MIT licensed, making it suitable for projects with strict licensing requirements that preclude commercial libraries.
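
The pagination controls mentioned above can be sketched as follows. This is a minimal illustration under the assumption that each report chapter starts on a new page; the PaginationSketch class and its parameters are hypothetical.

```csharp
using MigraDoc.DocumentObjectModel;

public static class PaginationSketch
{
    // Illustrative helper: adds a chapter that starts on a fresh page.
    public static void AddReportChapter(Section section, string title, string body)
    {
        // Explicit page break before the chapter.
        section.AddPageBreak();

        var heading = section.AddParagraph(title);
        heading.Format.Font.Bold = true;
        // Keep the heading on the same page as the paragraph that follows it.
        heading.Format.KeepWithNext = true;

        var paragraph = section.AddParagraph(body);
        // Avoid leaving a single line stranded at the top or bottom of a page.
        paragraph.Format.WidowControl = true;
        // Alternative: never split this paragraph across pages.
        // paragraph.Format.KeepTogether = true;
    }
}
```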

When You Need HTML Conversion

MigraDoc is not appropriate when:

HTML templates already exist and must be preserved
CSS styling is required for formatting
Web content must be captured as PDF
Dynamic HTML content from a CMS needs rendering
Modern CSS features (Flexbox, Grid) are used in layouts
JavaScript execution affects document content

For these scenarios, a library with actual HTML rendering capability is required.

Third-Party Extension Attempts

The demand for MigraDoc HTML support has led developers to create extension libraries. The existence of these projects confirms both the demand and the fact that MigraDoc lacks native support.

MigraDoc.Extensions

Developer Ben Foster created MigraDoc.Extensions, which adds basic HTML and Markdown conversion to MigraDoc documents.

The project description states:

"The biggest feature provided by this library is the ability to convert from HTML and Markdown to PDF, via MigraDoc's Document Object Model."

However, this extension has significant limitations:

Supports only a subset of HTML tags (p, strong, em, span, ul, ol, li)
Limited CSS support through inline styles only
No support for tables, images, or complex layouts
Requires manual integration and maintenance
Has not been updated for recent MigraDoc versions
Many CSS properties are simply ignored

Sample code using MigraDoc.Extensions:

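using MigraDoc.DocumentObjectModel;
using MigraDoc.Extensions.Html;

// AddHtml is an extension method supplied by MigraDoc.Extensions, not by MigraDoc itself
var document = new Document();
var section = document.AddSection();

var html = "<p>This is <strong>bold</strong> and <em>italic</em> text.</p>";
section.AddHtml(html);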

This works for simple text formatting but fails for anything more complex. Developers attempting to use this extension for invoices, reports, or documents with tables quickly encounter its limitations.

HtmlRenderer.PdfSharp

Another approach is using HtmlRenderer.PdfSharp, which bypasses MigraDoc entirely and renders HTML directly to PDFsharp's drawing surface. However, this library only supports HTML 4.01 and CSS Level 2, excluding modern CSS features like Flexbox and Grid.
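
For reference, a minimal use of HtmlRenderer.PdfSharp typically looks like the sketch below. It is based on the PdfGenerator.GeneratePdf call commonly shown in that project's samples; the exact namespaces and overloads are assumptions here and should be checked against the package version actually installed.

```csharp
using PdfSharp;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;

public static class HtmlRendererSketch
{
    public static void RenderSimpleHtml()
    {
        // Simple HTML 4.01-style markup; Flexbox or Grid rules would be ignored.
        string html = "<h1>Invoice</h1><p>Thank you for your order.</p>";

        // Renders the HTML onto PDFsharp pages without involving MigraDoc.
        PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4);
        pdf.Save("simple-invoice.pdf");
    }
}
```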

A Different Approach: IronPDF

For developers who need actual HTML-to-PDF conversion, IronPDF provides a different architectural approach. Rather than requiring manual document construction, IronPDF embeds a Chromium browser engine, so HTML and CSS are rendered the way a Chromium-based browser renders them.

Why IronPDF Handles HTML Natively

IronPDF uses the same rendering engine as Google Chrome. This means:

Full HTML5 support including semantic elements
Complete CSS3 support including Flexbox and Grid
JavaScript execution for dynamic content
Web fonts and custom typography
Modern layout techniques
Responsive design rendering

The architectural difference is fundamental. MigraDoc requires developers to translate HTML concepts into its own API. IronPDF accepts HTML directly and renders it using battle-tested browser technology.

Code Example

Converting an HTML invoice to PDF with IronPDF:

using IronPdf;

public class IronPdfHtmlConversion
{
    public void ConvertHtmlToPdf()
    {
        // IronPDF accepts HTML directly - no translation required
        string htmlContent = @"
            <!DOCTYPE html>
            <html>
            <head>
                <style>
                    body { font-family: Arial, sans-serif; padding: 40px; }
                    h1 { color: #333; font-size: 24px; margin-bottom: 20px; }
                    .invoice-table {
                        width: 100%;
                        border-collapse: collapse;
                        margin-top: 20px;
                    }
                    .invoice-table th, .invoice-table td {
                        border: 1px solid #ddd;
                        padding: 12px;
                        text-align: left;
                    }
                    .invoice-table th {
                        background-color: #f5f5f5;
                        font-weight: bold;
                    }
                    .total-row {
                        font-weight: bold;
                        background-color: #fafafa;
                    }
                    /* Flexbox layout - works in IronPDF, not in MigraDoc */
                    .header-row {
                        display: flex;
                        justify-content: space-between;
                        align-items: center;
                        margin-bottom: 30px;
                    }
                </style>
            </head>
            <body>
                <div class='header-row'>
                    <h1>Invoice #12345</h1>
                    <span>Date: January 20, 2026</span>
                </div>
                <p>Thank you for your order.</p>
                <table class='invoice-table'>
                    <thead>
                        <tr>
                            <th>Qty</th>
                            <th>Description</th>
                            <th>Price</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td>2</td>
                            <td>Widget Pro</td>
                            <td>$49.99</td>
                        </tr>
                        <tr>
                            <td>1</td>
                            <td>Premium Support</td>
                            <td>$99.00</td>
                        </tr>
                        <tr class='total-row'>
                            <td colspan='2'>Total</td>
                            <td>$198.98</td>
                        </tr>
                    </tbody>
                </table>
            </body>
            </html>";

        // Create renderer and convert HTML to PDF
        var renderer = new ChromePdfRenderer();

        // Optional: Configure PDF output settings (margins are in millimeters)
        renderer.RenderingOptions.MarginTop = 10;
        renderer.RenderingOptions.MarginBottom = 10;

        // Render HTML to PDF - CSS Flexbox is fully supported
        var pdf = renderer.RenderHtmlAsPdf(htmlContent);

        // Save the PDF
        pdf.SaveAs("invoice.pdf");
    }
}

Key differences from MigraDoc:

HTML and CSS are passed directly to the renderer
No manual translation of HTML concepts to API calls
Flexbox layout in the header renders correctly
CSS styling applies exactly as in a browser
Existing HTML templates work without modification
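
When the HTML lives in a template file rather than in a string literal, the same calls apply. The following is a sketch only: the {{Key}} placeholder syntax, the file paths, and the TemplateRenderingSketch class are assumptions made for illustration, and only ChromePdfRenderer, RenderHtmlAsPdf, and SaveAs come from the example above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net;
using IronPdf;

public static class TemplateRenderingSketch
{
    // Replaces {{Key}} placeholders with HTML-encoded values.
    // The placeholder syntax is an assumption of this sketch.
    public static string FillTemplate(string template, IDictionary<string, string> values)
    {
        foreach (var pair in values)
        {
            template = template.Replace(
                "{{" + pair.Key + "}}",
                WebUtility.HtmlEncode(pair.Value));
        }
        return template;
    }

    public static void RenderInvoice(string templatePath, string outputPath)
    {
        // Load the existing HTML/CSS template unchanged.
        string template = File.ReadAllText(templatePath);

        string html = FillTemplate(template, new Dictionary<string, string>
        {
            ["InvoiceNumber"] = "12345",
            ["CustomerName"] = "Example Customer"
        });

        var renderer = new ChromePdfRenderer();
        var pdf = renderer.RenderHtmlAsPdf(html);
        pdf.SaveAs(outputPath);
    }
}
```

Encoding the substituted values keeps user-supplied text from being interpreted as markup when the template is rendered.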

Converting Web Pages

IronPDF can also convert live web pages:

using IronPdf;

public class WebPageConversion
{
    public void ConvertUrlToPdf()
    {
        var renderer = new ChromePdfRenderer();

        // Convert a URL directly to PDF
        var pdf = renderer.RenderUrlAsPdf("https://example.com/invoice/12345");

        pdf.SaveAs("web-invoice.pdf");
    }
}

MigraDoc has no equivalent capability. Web page conversion requires a browser engine, which MigraDoc does not include.

API Reference

For more details on IronPDF's HTML rendering capabilities:

ChromePdfRenderer - Main class for HTML-to-PDF conversion
RenderHtmlAsPdf - Convert HTML strings to PDF
RenderUrlAsPdf - Convert web pages to PDF
HTML to PDF Tutorial - Step-by-step guide

Migration Considerations

API Differences

The fundamental approach differs between MigraDoc and IronPDF:

Aspect	MigraDoc	IronPDF
Document input	C# API calls	HTML/CSS strings
Styling	MigraDoc Format objects	CSS stylesheets
Layout engine	Custom MigraDoc engine	Chromium browser engine
HTML support	None native	Full HTML5
CSS support	None	Full CSS3
JavaScript	Not supported	Supported
Learning curve	MigraDoc-specific API	Standard HTML/CSS

Licensing

MigraDoc is open-source under the MIT license and free for commercial use.

IronPDF is commercial software with a free trial for development and testing. Production use requires a license. Pricing information is available at ironpdf.com/licensing.

What You Gain

Switching to IronPDF for HTML conversion provides:

Ability to use existing HTML templates without modification
Full CSS3 support including Flexbox, Grid, and custom properties
JavaScript execution for dynamic content generation
Consistent rendering matching Chrome browser output
Reduced development time since HTML skills transfer directly
Web page and URL conversion capabilities

What to Consider

IronPDF requires a commercial license for production deployment
Chromium engine increases package size compared to MigraDoc
Learning IronPDF's specific API for advanced features takes time
Applications already using MigraDoc successfully may not need to change

Conclusion

MigraDoc is a document generation library that constructs PDFs through C# code, not an HTML converter. Developers who need to convert HTML content to PDF should evaluate libraries specifically designed for HTML rendering, such as IronPDF, rather than attempting workarounds with MigraDoc.

For projects that prefer programmatic document construction and do not require HTML input, MigraDoc remains a capable choice. For projects with HTML templates, CSS styling requirements, or web content rendering needs, a library with native HTML support provides a more direct solution.

Jacob Mellor is CTO at Iron Software and originally built IronPDF. He has spent over 25 years developing commercial software.

References

Convert HTML string to PDF with MigraDoc - Stack Overflow - Primary source confirming MigraDoc lacks HTML parsing
How to parse HTML text and add it to MigraDoc document - Stack Overflow - Developer discussion on HTML parsing challenges
Can PDFSharp create Pdf file from a Html string in .Net Core? - Stack Overflow - Confirms PDFsharp family limitations
Convert HTML To PDF Help - PDFsharp Forum - Official forum discussion
Introduction to MigraDoc - Official Documentation - Official MigraDoc capabilities description
PDFsharp FAQ - Official FAQ confirming no HTML conversion
MigraDoc.Extensions - GitHub - Third-party extension attempting HTML support
Recommend a free library to convert HTML to PDF - Reddit - Community discussion showing misconceptions

For the latest IronPDF documentation and tutorials, visit ironpdf.com.
