IronSoftware

MigraDoc HTML Conversion: What the Library Actually Does

Developers searching for MigraDoc HTML to PDF conversion capabilities often discover a fundamental mismatch between their expectations and reality. MigraDoc is a document definition and generation library built on top of PDFsharp, but it does not parse HTML or CSS. When developers attempt to convert HTML content using MigraDoc, they encounter errors, unexpected output, or simply find no methods that accept HTML input.

This article explains what MigraDoc actually provides, documents the confusion around HTML conversion, and presents alternatives for developers who need to convert HTML content into PDF documents.

The Problem

MigraDoc appears in search results and forum discussions alongside other PDF generation libraries, leading developers to assume it offers similar capabilities. The library's documentation describes it as creating "complex documents just like you do it with Microsoft Word," which suggests rich document formatting. However, MigraDoc implements a programmatic document object model (DOM), not an HTML converter.

When developers attempt to use MigraDoc for HTML-to-PDF conversion, they encounter these issues:

  • No RenderHtml() or ConvertHtml() methods exist in the API
  • HTML strings passed to text methods appear as literal HTML markup in the output
  • CSS styles have no effect since MigraDoc uses its own formatting system
  • Web page URLs cannot be rendered or converted

The fundamental architectural difference is that MigraDoc requires developers to construct documents through C# method calls, defining each paragraph, table, and style in code. MigraDoc has no parser for HTML markup or CSS stylesheets, so neither is recognized as input.

Common Error Patterns

Developers attempting HTML conversion with MigraDoc typically try approaches like this:

// This does NOT work - HTML renders as literal text
var document = new Document();
var section = document.AddSection();
section.AddParagraph("<h1>Invoice</h1><p>Thank you for your order.</p>");

The output contains the literal string <h1>Invoice</h1><p>Thank you for your order.</p> rather than formatted content. MigraDoc treats HTML tags as plain text because it has no HTML parser.

Similarly, attempting to apply CSS styling produces no effect:

// CSS has no effect in MigraDoc
var paragraph = section.AddParagraph("Invoice");
// There is no way to apply "font-size: 24px" or "color: #333"
// Styling must use MigraDoc's own formatting objects
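For comparison, the sketch below shows roughly how the same intent ("font-size: 24px; color: #333") could be expressed with MigraDoc's own formatting objects. The class name StylingSketch is hypothetical, and the 18pt size assumes the usual 96 DPI mapping of 24px; treat it as an illustration rather than a drop-in equivalent of the CSS.

```csharp
using MigraDoc.DocumentObjectModel;

// Hypothetical helper class, used only for illustration.
public static class StylingSketch
{
    public static Document Build()
    {
        var document = new Document();
        var section = document.AddSection();

        var paragraph = section.AddParagraph("Invoice");

        // Roughly "font-size: 24px": 24px is about 18pt at 96 DPI (assumption).
        paragraph.Format.Font.Size = Unit.FromPoint(18);

        // Roughly "color: #333": an RGB color built from byte components.
        paragraph.Format.Font.Color = new Color(0x33, 0x33, 0x33);

        paragraph.Format.Font.Bold = true;

        return document;
    }
}
```

Each CSS declaration becomes an explicit property assignment, which is the translation work MigraDoc leaves to the developer.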

Who Is Affected

This limitation impacts developers in several scenarios:

Teams with existing HTML templates: Organizations that have invested in HTML/CSS templates for invoices, reports, or correspondence cannot directly use these with MigraDoc. The templates must be completely rewritten as MigraDoc code.

Web application developers: Projects that generate HTML dynamically and need PDF output must either implement a separate MigraDoc document builder or switch to a library that supports HTML rendering.

Content management systems (CMS): Applications storing rich text content as HTML cannot use MigraDoc without building a custom HTML-to-MigraDoc converter.

Migration projects: Teams migrating from other PDF libraries that support HTML find MigraDoc requires a fundamentally different approach to document generation.

Evidence from the Developer Community

The confusion about MigraDoc HTML support is well-documented across developer forums and support channels.

Timeline

| Date | Event | Source |
|---|---|---|
| 2018-02-19 | Stack Overflow question on HTML string conversion | Stack Overflow |
| 2019-08-15 | Question about parsing HTML and adding to MigraDoc | Stack Overflow |
| 2020-10-14 | Forum thread on HTML invoice conversion | PDFsharp Forum |
| 2024-09-06 | Reddit discussion recommending MigraDoc for HTML conversion | Reddit r/csharp |

Community Reports

"Neither MigraDoc nor PDFsharp parse HTML. It is up to you to parse the HTML and replace it with MigraDoc calls like AddFormattedText()."
— Thomas Hoevel (PDFsharp developer), Stack Overflow, February 2018

This response from one of the library maintainers directly addresses the misconception. The answer received 7 upvotes and has been viewed over 8,000 times, indicating the question is frequently encountered.
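The manual translation described in that answer can be sketched as follows. This is a minimal illustration, not a general converter: it assumes the input is a single well-formed (XML-parseable) paragraph element, handles only bold and italic inline tags, and uses System.Xml.Linq instead of a real HTML parser. The names InlineHtmlSketch and AddSimpleParagraph are hypothetical.

```csharp
using System.Xml.Linq;
using MigraDoc.DocumentObjectModel;

// Hypothetical helper: maps a tiny subset of inline markup to MigraDoc calls.
public static class InlineHtmlSketch
{
    public static Paragraph AddSimpleParagraph(Section section, string xhtmlFragment)
    {
        // Assumption: the fragment is well-formed XML, e.g. "<p>Some <strong>bold</strong> text.</p>".
        var root = XElement.Parse(xhtmlFragment, LoadOptions.PreserveWhitespace);
        var paragraph = section.AddParagraph();

        foreach (var node in root.Nodes())
        {
            switch (node)
            {
                case XText text:
                    // Plain text between tags.
                    paragraph.AddText(text.Value);
                    break;
                case XElement el when el.Name.LocalName == "strong" || el.Name.LocalName == "b":
                    paragraph.AddFormattedText(el.Value, TextFormat.Bold);
                    break;
                case XElement el when el.Name.LocalName == "em" || el.Name.LocalName == "i":
                    paragraph.AddFormattedText(el.Value, TextFormat.Italic);
                    break;
                case XElement el:
                    // Unknown tags: keep the text, drop the formatting.
                    paragraph.AddText(el.Value);
                    break;
            }
        }

        return paragraph;
    }
}
```

Even this small subset needs explicit handling per tag; nested tags, attributes, CSS, tables, and malformed real-world HTML are all outside what the sketch covers.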

Another developer asked about creating PDFs from HTML strings in .NET Core:

"I need to create Pdf files from Html strings from an API on .Net Core. The library must be free (Not payments or anything related). I found that PDFSharp was a good library for this but it is kind of dead for .Net Core right now."
— Stack Overflow user, March 2020

The accepted answer confirms that PDFsharp (and by extension MigraDoc) does not support HTML conversion, recommending instead a combination of libraries or alternative approaches.

On Reddit, a 2024 thread titled "Recommend a free library to convert HTML to PDF" shows developers continuing to suggest MigraDoc despite its lack of HTML support:

"Check out PDFSharp along with MigraDoc which is a part of the same project."

This recommendation, while well-intentioned, perpetuates the misconception. Other commenters in the thread noted that additional libraries like HtmlRenderer.PdfSharp are required for any HTML conversion, and even then, only limited HTML and CSS features are supported.

What MigraDoc Actually Does

MigraDoc is a document definition library that provides a programmatic API for constructing PDF documents. It works by creating an in-memory document model that can be rendered to PDF using PDFsharp.

Core Capabilities

MigraDoc provides:

  • A document object model (DOM) for defining document structure in C#
  • Paragraph and text formatting with fonts, colors, and styles
  • Table creation with cell merging, borders, and alignment
  • Image embedding and positioning
  • Page headers, footers, and page numbering
  • Sections with different page layouts
  • Styles similar to Word document styles
  • Charts and basic graphics

MigraDoc Code Example

Here is how MigraDoc expects developers to create documents:

using MigraDoc.DocumentObjectModel;
using MigraDoc.Rendering;

public class MigraDocInvoiceExample
{
    public void CreateInvoice()
    {
        // Create a new document
        var document = new Document();

        // Define styles
        var style = document.Styles["Normal"];
        style.Font.Name = "Arial";
        style.Font.Size = 10;

        var headingStyle = document.Styles.AddStyle("Heading1", "Normal");
        headingStyle.Font.Size = 18;
        headingStyle.Font.Bold = true;
        headingStyle.ParagraphFormat.SpaceAfter = "6pt";

        // Add a section
        var section = document.AddSection();

        // Add heading - must use MigraDoc API, not HTML
        var heading = section.AddParagraph("Invoice #12345");
        heading.Style = "Heading1";

        // Add body text
        section.AddParagraph("Thank you for your order.");

        // Create table
        var table = section.AddTable();
        table.Borders.Width = 0.5;

        // Define columns
        table.AddColumn("3cm");
        table.AddColumn("8cm");
        table.AddColumn("3cm");

        // Add header row
        var headerRow = table.AddRow();
        headerRow.Shading.Color = Colors.LightGray;
        headerRow.Cells[0].AddParagraph("Qty");
        headerRow.Cells[1].AddParagraph("Description");
        headerRow.Cells[2].AddParagraph("Price");

        // Add data row
        var dataRow = table.AddRow();
        dataRow.Cells[0].AddParagraph("2");
        dataRow.Cells[1].AddParagraph("Widget Pro");
        dataRow.Cells[2].AddParagraph("$49.99");

        // Render to PDF
        var renderer = new PdfDocumentRenderer(true);
        renderer.Document = document;
        renderer.RenderDocument();
        renderer.PdfDocument.Save("invoice.pdf");
    }
}

This code produces a formatted invoice, but requires explicit C# calls for every element. There is no way to pass an HTML string and have MigraDoc interpret it.

What MigraDoc Cannot Do

MigraDoc does not:

  • Parse HTML markup of any kind
  • Interpret CSS stylesheets or inline styles
  • Execute JavaScript
  • Render web pages from URLs
  • Convert existing HTML templates
  • Support CSS layout models (Flexbox, Grid)
  • Handle responsive or adaptive layouts

The PDFsharp FAQ explicitly addresses this limitation:

"Can I use PDFsharp to convert HTML or RTF to PDF? No, not 'out of the box', and we do not plan to write such a converter in the near future."

Since MigraDoc is built on PDFsharp, this limitation applies equally to both libraries.

When MigraDoc Is Appropriate

MigraDoc excels in scenarios where programmatic document construction is preferred:

Report generation with dynamic data: When documents are constructed from database queries and business logic, MigraDoc's API allows fine-grained control over layout and formatting.

Document templates defined in code: Teams that prefer defining document structure in C# rather than HTML can use MigraDoc's style and document definition features.

Scenarios requiring precise pagination control: MigraDoc provides explicit page break control and widow/orphan handling that can be difficult to achieve with HTML-to-PDF conversion.

Open-source licensing requirements: MigraDoc is MIT licensed, making it suitable for projects with strict licensing requirements that preclude commercial libraries.
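The pagination control mentioned above is exposed through paragraph format properties and explicit page breaks. The sketch below shows the general shape under the assumption of a simple three-part report; the class name and text content are hypothetical.

```csharp
using MigraDoc.DocumentObjectModel;

// Hypothetical helper class, used only for illustration.
public static class PaginationSketch
{
    public static Document Build()
    {
        var document = new Document();
        var section = document.AddSection();

        // Keep the heading on the same page as the paragraph that follows it.
        var heading = section.AddParagraph("Quarterly Summary");
        heading.Format.KeepWithNext = true;

        // Avoid single stranded lines and keep the paragraph's lines together.
        var body = section.AddParagraph("Summary text for the quarter.");
        body.Format.WidowControl = true;
        body.Format.KeepTogether = true;

        // Explicit page break before the next block of content.
        section.AddPageBreak();
        section.AddParagraph("Detailed figures start on a new page.");

        // Alternative: let a paragraph request its own page break.
        var appendix = section.AddParagraph("Appendix");
        appendix.Format.PageBreakBefore = true;

        return document;
    }
}
```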

When You Need HTML Conversion

MigraDoc is not appropriate when:

  • HTML templates already exist and must be preserved
  • CSS styling is required for formatting
  • Web content must be captured as PDF
  • Dynamic HTML content from a CMS needs rendering
  • Modern CSS features (Flexbox, Grid) are used in layouts
  • JavaScript execution affects document content

For these scenarios, a library with actual HTML rendering capability is required.

Third-Party Extension Attempts

The demand for MigraDoc HTML support has led developers to create extension libraries. The existence of these projects confirms both the demand and the fact that MigraDoc lacks native support.

MigraDoc.Extensions

Developer Ben Foster created MigraDoc.Extensions, which adds basic HTML and Markdown conversion to MigraDoc documents.

The project description states:

"The biggest feature provided by this library is the ability to convert from HTML and Markdown to PDF, via MigraDoc's Document Object Model."

However, this extension has significant limitations:

  • Supports only a subset of HTML tags (p, strong, em, span, ul, ol, li)
  • Limited CSS support through inline styles only
  • No support for tables, images, or complex layouts
  • Requires manual integration and maintenance
  • Has not been updated for recent MigraDoc versions
  • Many CSS properties are simply ignored

Sample code using MigraDoc.Extensions:

using MigraDoc.Extensions.Html;

var html = "<p>This is <strong>bold</strong> and <em>italic</em> text.</p>";
section.AddHtml(html);

This works for simple text formatting but fails for anything more complex. Developers attempting to use this extension for invoices, reports, or documents with tables quickly encounter its limitations.

HtmlRenderer.PdfSharp

Another approach is using HtmlRenderer.PdfSharp, which bypasses MigraDoc entirely and renders HTML directly to PDFsharp's drawing surface. However, this library only supports HTML 4.01 and CSS Level 2, excluding modern CSS features like Flexbox and Grid.
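A minimal sketch of that approach is shown below. It assumes the HtmlRenderer.PdfSharp NuGet package and its PdfGenerator.GeneratePdf method; exact namespaces and overloads may differ between package versions and forks, so check the version in use. The class name and output path are hypothetical.

```csharp
using PdfSharp;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;

// Hypothetical helper class, used only for illustration.
public static class HtmlRendererSketch
{
    public static void Convert()
    {
        // Simple HTML 4.01 / CSS 2 style markup; Flexbox or Grid would not be honored.
        string html = "<h1 style='color:#333'>Invoice</h1><p>Thank you for your order.</p>";

        // Renders the HTML onto PDFsharp pages; MigraDoc is not involved.
        PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4);
        pdf.Save("simple-invoice.pdf");
    }
}
```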

A Different Approach: IronPDF

For developers who need actual HTML-to-PDF conversion, IronPDF provides a different architectural approach. Rather than requiring manual document construction, IronPDF embeds a Chromium browser engine that renders HTML and CSS exactly as a web browser would.

Why IronPDF Handles HTML Natively

IronPDF uses the same rendering engine as Google Chrome. This means:

  • Full HTML5 support including semantic elements
  • Complete CSS3 support including Flexbox and Grid
  • JavaScript execution for dynamic content
  • Web fonts and custom typography
  • Modern layout techniques
  • Responsive design rendering

The architectural difference is fundamental. MigraDoc requires developers to translate HTML concepts into its own API. IronPDF accepts HTML directly and renders it with its embedded Chromium engine.

Code Example

Converting an HTML invoice to PDF with IronPDF:

using IronPdf;

public class IronPdfHtmlConversion
{
    public void ConvertHtmlToPdf()
    {
        // IronPDF accepts HTML directly - no translation required
        string htmlContent = @"
            <!DOCTYPE html>
            <html>
            <head>
                <style>
                    body { font-family: Arial, sans-serif; padding: 40px; }
                    h1 { color: #333; font-size: 24px; margin-bottom: 20px; }
                    .invoice-table {
                        width: 100%;
                        border-collapse: collapse;
                        margin-top: 20px;
                    }
                    .invoice-table th, .invoice-table td {
                        border: 1px solid #ddd;
                        padding: 12px;
                        text-align: left;
                    }
                    .invoice-table th {
                        background-color: #f5f5f5;
                        font-weight: bold;
                    }
                    .total-row {
                        font-weight: bold;
                        background-color: #fafafa;
                    }
                    /* Flexbox layout - works in IronPDF, not in MigraDoc */
                    .header-row {
                        display: flex;
                        justify-content: space-between;
                        align-items: center;
                        margin-bottom: 30px;
                    }
                </style>
            </head>
            <body>
                <div class='header-row'>
                    <h1>Invoice #12345</h1>
                    <span>Date: January 20, 2026</span>
                </div>
                <p>Thank you for your order.</p>
                <table class='invoice-table'>
                    <thead>
                        <tr>
                            <th>Qty</th>
                            <th>Description</th>
                            <th>Price</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td>2</td>
                            <td>Widget Pro</td>
                            <td>$49.99</td>
                        </tr>
                        <tr>
                            <td>1</td>
                            <td>Premium Support</td>
                            <td>$99.00</td>
                        </tr>
                        <tr class='total-row'>
                            <td colspan='2'>Total</td>
                            <td>$198.98</td>
                        </tr>
                    </tbody>
                </table>
            </body>
            </html>";

        // Create renderer and convert HTML to PDF
        var renderer = new ChromePdfRenderer();

        // Optional: configure page margins (IronPDF measures these in millimeters)
        renderer.RenderingOptions.MarginTop = 10;
        renderer.RenderingOptions.MarginBottom = 10;

        // Render HTML to PDF - CSS Flexbox is fully supported
        var pdf = renderer.RenderHtmlAsPdf(htmlContent);

        // Save the PDF
        pdf.SaveAs("invoice.pdf");
    }
}

Key differences from MigraDoc:

  • HTML and CSS are passed directly to the renderer
  • No manual translation of HTML concepts to API calls
  • Flexbox layout in the header renders correctly
  • CSS styling applies exactly as in a browser
  • Existing HTML templates work without modification
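When the template already exists as a file rather than a string, the same renderer can load it from disk. The sketch below assumes a hypothetical template at templates/invoice.html and uses the renderer's JavaScript option for templates that build content with script; the class name is also hypothetical.

```csharp
using IronPdf;

// Hypothetical helper class, used only for illustration.
public class TemplateFileConversion
{
    public void ConvertTemplateFile()
    {
        var renderer = new ChromePdfRenderer();

        // Allow scripts in the template to run before the page is captured.
        renderer.RenderingOptions.EnableJavaScript = true;

        // Hypothetical path: an existing HTML template with its own CSS.
        var pdf = renderer.RenderHtmlFileAsPdf("templates/invoice.html");

        pdf.SaveAs("invoice-from-template.pdf");
    }
}
```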

Converting Web Pages

IronPDF can also convert live web pages:

using IronPdf;

public class WebPageConversion
{
    public void ConvertUrlToPdf()
    {
        var renderer = new ChromePdfRenderer();

        // Convert a URL directly to PDF
        var pdf = renderer.RenderUrlAsPdf("https://example.com/invoice/12345");

        pdf.SaveAs("web-invoice.pdf");
    }
}

MigraDoc has no equivalent capability. Web page conversion requires a browser engine, which MigraDoc does not include.

API Reference

For more details on IronPDF's HTML rendering capabilities, see the documentation at ironpdf.com.

Migration Considerations

API Differences

The fundamental approach differs between MigraDoc and IronPDF:

| Aspect | MigraDoc | IronPDF |
|---|---|---|
| Document input | C# API calls | HTML/CSS strings |
| Styling | MigraDoc Format objects | CSS stylesheets |
| Layout engine | Custom MigraDoc engine | Chromium browser engine |
| HTML support | None native | Full HTML5 |
| CSS support | None | Full CSS3 |
| JavaScript | Not supported | Supported |
| Learning curve | MigraDoc-specific API | Standard HTML/CSS |

Licensing

MigraDoc is open-source under the MIT license and free for commercial use.

IronPDF is commercial software with a free trial for development and testing. Production use requires a license. Pricing information is available at ironpdf.com/licensing.

What You Gain

Switching to IronPDF for HTML conversion provides:

  • Ability to use existing HTML templates without modification
  • Full CSS3 support including Flexbox, Grid, and custom properties
  • JavaScript execution for dynamic content generation
  • Consistent rendering matching Chrome browser output
  • Reduced development time since HTML skills transfer directly
  • Web page and URL conversion capabilities

What to Consider

  • IronPDF requires a commercial license for production deployment
  • Chromium engine increases package size compared to MigraDoc
  • Learning IronPDF's specific API for advanced features takes time
  • Applications already using MigraDoc successfully may not need to change

Conclusion

MigraDoc is a document generation library that constructs PDFs through C# code, not an HTML converter. Developers who need to convert HTML content to PDF should evaluate libraries specifically designed for HTML rendering, such as IronPDF, rather than attempting workarounds with MigraDoc.

For projects that prefer programmatic document construction and do not require HTML input, MigraDoc remains a capable choice. For projects with HTML templates, CSS styling requirements, or web content rendering needs, a library with native HTML support provides a more direct solution.


Jacob Mellor is CTO at Iron Software and originally built IronPDF. He has spent over 25 years developing commercial software.


References

  1. Convert HTML string to PDF with MigraDoc - Stack Overflow - Primary source confirming MigraDoc lacks HTML parsing
  2. How to parse HTML text and add it to MigraDoc document - Stack Overflow - Developer discussion on HTML parsing challenges
  3. Can PDFSharp create Pdf file from a Html string in .Net Core? - Stack Overflow - Confirms PDFsharp family limitations
  4. Convert HTML To PDF Help - PDFsharp Forum - Official forum discussion
  5. Introduction to MigraDoc - Official Documentation - Official MigraDoc capabilities description
  6. PDFsharp FAQ - Official FAQ confirming no HTML conversion
  7. MigraDoc.Extensions - GitHub - Third-party extension attempting HTML support
  8. Recommend a free library to convert HTML to PDF - Reddit - Community discussion showing misconceptions

For the latest IronPDF documentation and tutorials, visit ironpdf.com.
