IronSoftware

Posted on Jan 5

Merge and Split PDFs in C# .NET 10

#csharp #dotnet

Merging and splitting PDFs enables document workflow automation — combining invoice batches for accounting, splitting contracts for individual signing, extracting specific pages for review, consolidating reports from multiple sources. I've built document processing systems that merge thousands of PDFs nightly (customer statements combining multiple billing periods), split legal filings into individual exhibits, and extract signature pages from executed contracts for archival workflows.

The challenge is that PDF libraries make these operations unnecessarily complex. Stack Overflow's top-voted answer for merging PDFs (from 2009) recommends iTextSharp's PdfCopy and PdfImportedPage classes — requiring 40+ lines of boilerplate managing readers, writers, and page copying loops. The accepted answer hasn't been updated since, despite iTextSharp's AGPL license trap and its successor iText7 changing APIs entirely. Newer developers implement this outdated code, hit licensing issues, and scramble to find alternatives.

PDFsharp requires opening each source with PdfReader.Open() in import mode, iterating its pages with foreach, adding each page to an output document with AddPage(), and managing disposal. Spire.PDF offers PdfDocument.MergeFiles() (which returns a PdfDocumentBase) for merging but lacks intuitive splitting: you read page counts, create new documents, and import pages individually. These libraries work but require understanding internal PDF structure, memory management, and disposal patterns.

IronPDF uses single-method operations: PdfDocument.Merge(pdf1, pdf2) combines files, pdf.CopyPage(0) extracts one page, pdf.CopyPages(1, 3) extracts a range. There are no page-copying loops and no reader/writer bookkeeping; the only lifecycle step left is disposing PdfDocument objects when you are done with them. The library handles font subsetting, annotation copying, and form field preservation automatically.

using IronPdf;
// Install via NuGet: Install-Package IronPdf

var invoice1 = PdfDocument.FromFile("invoice-january.pdf");
var invoice2 = PdfDocument.FromFile("invoice-february.pdf");
var invoice3 = PdfDocument.FromFile("invoice-march.pdf");

var quarterly = PdfDocument.Merge(invoice1, invoice2, invoice3);
quarterly.SaveAs("Q1-invoices.pdf");

This merges three monthly invoices into a single quarterly PDF. The Merge() method accepts any number of PdfDocument parameters, combining them sequentially. If you're merging PDFs generated from HTML, pass [ChromePdfRenderer](https://ironpdf.com/blog/videos/how-to-render-html-string-to-pdf-in-csharp-ironpdf/) results directly without saving intermediate files.

What NuGet Packages Do I Need?

Install IronPDF via Package Manager Console:

Install-Package IronPdf

Or .NET CLI:

dotnet add package IronPdf

IronPDF includes merge and split functionality in the core library. No additional packages for page manipulation.

How Do I Merge Multiple PDFs?

Merging combines PDFs sequentially — first document's pages appear first, second document's pages follow, etc.

var pdf1 = PdfDocument.FromFile("cover-letter.pdf");
var pdf2 = PdfDocument.FromFile("resume.pdf");
var pdf3 = PdfDocument.FromFile("references.pdf");

var jobApplication = PdfDocument.Merge(pdf1, pdf2, pdf3);
jobApplication.SaveAs("complete-application.pdf");

This creates a single PDF containing cover letter, then resume, then references — ready for job application submission.

For dynamic merging (unknown file count at compile time):

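// Directory.GetFiles does not guarantee any ordering, so sort for a predictable page order
var pdfFiles = Directory.GetFiles("invoices/", "*.pdf").OrderBy(f => f, StringComparer.Ordinal).ToArray();
var pdfs = pdfFiles.Select(f => PdfDocument.FromFile(f)).ToArray();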

var merged = PdfDocument.Merge(pdfs);
merged.SaveAs("all-invoices.pdf");

This loads all PDFs from a directory, converts to PdfDocument[], and merges. Useful for batch operations where PDF count varies.

I've used this pattern in automated billing systems merging customer invoices — the system generates PDFs throughout the month, then merges them into monthly statements at month-end.

Can I Merge HTML-Generated PDFs Without Saving Intermediate Files?

Yes. Render HTML to PDF, then merge the in-memory PdfDocument objects:

var renderer = new ChromePdfRenderer();

var reportHtml = "<h1>Sales Report</h1><p>Revenue: $1.2M</p>";
var chartHtml = "<h1>Charts</h1><img src='chart.png' />";
var summaryHtml = "<h1>Summary</h1><p>Overall positive trend.</p>";

var reportPdf = renderer.RenderHtmlAsPdf(reportHtml);
var chartPdf = renderer.RenderHtmlAsPdf(chartHtml);
var summaryPdf = renderer.RenderHtmlAsPdf(summaryHtml);

var fullReport = PdfDocument.Merge(reportPdf, chartPdf, summaryPdf);
fullReport.SaveAs("complete-report.pdf");

This generates three sections from HTML, merges them into one PDF, saving only the final merged file. Eliminates intermediate file I/O — faster and cleaner for automated report generation.

I generate executive dashboards this way — HTML templates render charts and tables, merge into comprehensive reports, email as single PDF attachments.

How Do I Split a PDF into Separate Pages?

Extract individual pages using CopyPage():

var contract = PdfDocument.FromFile("multi-party-contract.pdf");

// Extract page 1 (index 0) - signature page for Party A
var partyAPage = contract.CopyPage(0);
partyAPage.SaveAs("party-a-signature.pdf");

// Extract page 2 (index 1) - signature page for Party B
var partyBPage = contract.CopyPage(1);
partyBPage.SaveAs("party-b-signature.pdf");

Page indices are zero-based: first page is index 0, second is 1, etc. CopyPage() returns a new PdfDocument containing only that page.

For batch page extraction:

var largePdf = PdfDocument.FromFile("100-page-manual.pdf");

for (int i = 0; i < largePdf.PageCount; i++)
{
    using var singlePage = largePdf.CopyPage(i); // disposed at the end of each iteration
    singlePage.SaveAs($"page-{i + 1}.pdf");
}

This splits a 100-page manual into 100 individual PDF files. Useful for distributing specific pages (chapter extracts, individual forms) without sharing entire documents.
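One practical wrinkle with the loop above: names like page-2.pdf and page-10.pdf sort in the wrong order lexicographically, which matters if the files are later re-merged from a sorted directory listing. A minimal sketch of zero-padded naming follows; PageFiles and Name are hypothetical helpers introduced here for illustration, not part of IronPDF.

```csharp
using System;
using System.Globalization;

public static class PageFiles
{
    // Hypothetical helper: builds "page-001.pdf" style names whose padding width
    // is derived from the total page count, so the names sort in page order.
    public static string Name(int zeroBasedIndex, int pageCount)
    {
        if (zeroBasedIndex < 0 || zeroBasedIndex >= pageCount)
            throw new ArgumentOutOfRangeException(nameof(zeroBasedIndex));

        // Width of the largest page number, e.g. 3 digits for a 100-page document.
        int width = pageCount.ToString(CultureInfo.InvariantCulture).Length;

        // File names stay 1-based for human readers; only the index is 0-based.
        string number = (zeroBasedIndex + 1)
            .ToString(CultureInfo.InvariantCulture)
            .PadLeft(width, '0');

        return $"page-{number}.pdf";
    }
}
```

Inside the split loop, the save call would then become singlePage.SaveAs(PageFiles.Name(i, largePdf.PageCount)).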

How Do I Extract a Range of Pages?

Use CopyPages() to extract multiple consecutive pages:

var report = PdfDocument.FromFile("annual-report.pdf");

// Extract pages 5-10 (indices 4-9, inclusive)
var executiveSummary = report.CopyPages(4, 9);
executiveSummary.SaveAs("executive-summary.pdf");

// Extract pages 20-30
var financials = report.CopyPages(19, 29);
financials.SaveAs("financial-section.pdf");

Both indices are inclusive and zero-based. CopyPages(4, 9) extracts pages at indices 4, 5, 6, 7, 8, 9 — six pages total (pages 5-10 in human numbering).
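Because off-by-one mistakes are easy here, it can help to keep the conversion in one place. The sketch below is a hypothetical helper (PageRange is not an IronPDF type) that turns a human "pages 5-10" range into the inclusive zero-based pair described above and rejects ranges that fall outside the document.

```csharp
using System;

public static class PageRange
{
    // Hypothetical helper: converts an inclusive 1-based range ("pages 5-10")
    // into the inclusive 0-based indices that CopyPages(start, end) expects.
    public static (int Start, int End) ToZeroBased(int firstPage, int lastPage, int pageCount)
    {
        if (firstPage < 1 || lastPage < firstPage || lastPage > pageCount)
            throw new ArgumentOutOfRangeException(
                nameof(firstPage),
                $"Range {firstPage}-{lastPage} is not valid for a {pageCount}-page document.");

        return (firstPage - 1, lastPage - 1);
    }

    // Number of pages covered by an inclusive index range.
    public static int Count(int start, int end) => end - start + 1;
}
```

A call site might read: var (start, end) = PageRange.ToZeroBased(5, 10, report.PageCount); followed by report.CopyPages(start, end).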

I've split regulatory filings this way — extracting specific sections (financials, disclosures, exhibits) from master documents for targeted distribution to different stakeholders.

Can I Combine Multiple Pages Into a Grid Layout?

Yes, using CombinePages() to create thumbnail sheets or print layouts:

var slides = PdfDocument.FromFile("presentation.pdf");

// Combine into 2x2 grid (4 slides per page)
int pageWidth = 210;  // A4 width in mm
int pageHeight = 297; // A4 height in mm
int rows = 2;
int columns = 2;

var handout = slides.CombinePages(pageWidth, pageHeight, rows, columns);
handout.SaveAs("presentation-handout.pdf");

This takes a presentation's slides and creates handout pages with 4 slides per page in a 2x2 grid. Dimensions are in millimeters. Common page sizes: A4 (210×297mm), Letter (216×279mm).

For 6-up layouts (2 rows, 3 columns) on landscape Letter:

var gridPdf = slides.CombinePages(279, 216, 2, 3);
gridPdf.SaveAs("6-per-page.pdf");

I use this for creating thumbnail overviews of large documents — previewing 100-page manuals as 25-page thumbnail sheets where each page shows 4 thumbnails.
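The sheet count is a ceiling division of the page count by the cells per sheet. The sketch below shows that arithmetic; GridLayout.SheetCount is a hypothetical helper, and it assumes the grid fills every cell before starting a new sheet, which is an assumption about CombinePages() rather than something stated in its documentation here.

```csharp
using System;

public static class GridLayout
{
    // Hypothetical helper: how many output sheets a grid layout needs,
    // assuming each sheet is filled completely before the next one starts.
    public static int SheetCount(int pageCount, int rows, int columns)
    {
        if (pageCount < 0 || rows < 1 || columns < 1)
            throw new ArgumentOutOfRangeException();

        int perSheet = rows * columns;

        // Integer ceiling division: a partially filled last sheet still counts.
        return (pageCount + perSheet - 1) / perSheet;
    }
}
```

Under that assumption, a 100-page manual needs 25 sheets in a 2x2 grid and 17 sheets in a 2x3 grid, with the last 2x3 sheet only partly filled.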

How Does IronPDF Compare to iTextSharp for Merging?

iTextSharp (the 5.x predecessor of iText 7) requires managing readers, writers, and page copying manually:

// iTextSharp approach (complex)
using (var output = new FileStream("merged.pdf", FileMode.Create))
using (var document = new Document())
using (var writer = new PdfCopy(document, output))
{
    document.Open();

    foreach (var file in files)
    {
        using (var reader = new PdfReader(file))
        {
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                writer.AddPage(writer.GetImportedPage(reader, i));
            }
        }
    }
}

This is 15+ lines for basic merging. You manage disposal (using statements), iterate pages manually (for loop), import pages (GetImportedPage), and track indices (1-based, not 0-based).

IronPDF accomplishes the same in three lines, including the save:

var pdfs = files.Select(f => PdfDocument.FromFile(f)).ToArray();
var merged = PdfDocument.Merge(pdfs);
merged.SaveAs("merged.pdf");

I migrated a document processing system from iTextSharp specifically because maintenance became costly. Developers couldn't modify merge logic without breaking disposal patterns. IronPDF eliminated this complexity — declarative operations replaced imperative loops.

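Additionally, iTextSharp 5 and iText 7 are licensed under the AGPL: an application that uses them, including one offered over a network as SaaS, must either release its own source code under the AGPL or buy a commercial license. IronPDF is sold under a commercial license, so there is no copyleft obligation to evaluate.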

What About Splitting PDFs with iTextSharp?

iTextSharp splitting also requires manual page management:

// iTextSharp splitting (verbose)
var reader = new PdfReader("input.pdf");
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    Document doc = new Document();
    PdfCopy copy = new PdfCopy(doc, new FileStream($"page-{i}.pdf", FileMode.Create));
    doc.Open();
    copy.AddPage(copy.GetImportedPage(reader, i));
    doc.Close();
}

Compare to IronPDF:

for (int i = 0; i < pdf.PageCount; i++)
{
    pdf.CopyPage(i).SaveAs($"page-{i + 1}.pdf");
}

IronPDF is clearer — no document lifecycle management, no reader/writer coordination, no manual page import.

What Common Issues Should I Watch For?

Page index confusion: IronPDF uses 0-based indexing (first page = 0), while humans count from 1. When extracting "page 5", use CopyPage(4). For ranges, "pages 10-20" is CopyPages(9, 19).

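Memory with large PDFs: Merging many large PDFs loads all of them into memory simultaneously. For hundreds of PDFs, load and merge in batches, starting from the file paths (pdfFiles) rather than from an array of already-loaded documents:

// Load each batch from file paths so earlier sources can be released first
var docs1 = pdfFiles.Take(50).Select(f => PdfDocument.FromFile(f)).ToArray();
var batch1 = PdfDocument.Merge(docs1);
foreach (var d in docs1) d.Dispose(); // assumes the merged result no longer needs its sources
var batch2 = PdfDocument.Merge(pdfFiles.Skip(50).Take(50).Select(f => PdfDocument.FromFile(f)).ToArray());
var final = PdfDocument.Merge(batch1, batch2);

This keeps at most 50 source PDFs loaded at a time, which reduces peak memory usage. The partial results still hold every merged page, so the saving comes from releasing the source documents early, not from the merged output being smaller.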
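For an arbitrary number of files, the batching idea can be sketched as a small generic helper. Everything named here (BatchMerger, MergeInBatches, the load and merge delegates) is hypothetical and not part of IronPDF; the sketch assumes that merging returns a new document that stays valid after its inputs are disposed, and it uses Enumerable.Chunk, which needs .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchMerger
{
    // Hypothetical helper: loads and merges `batchSize` inputs at a time, disposing
    // each batch's sources before the next batch is loaded, then merges the partials.
    // Assumption: `merge` returns a new object that does not depend on its inputs.
    public static T MergeInBatches<T>(
        IEnumerable<string> paths,
        int batchSize,
        Func<string, T> load,
        Func<IReadOnlyList<T>, T> merge) where T : class, IDisposable
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var partials = new List<T>();
        try
        {
            foreach (var chunk in paths.Chunk(batchSize))
            {
                var docs = new List<T>();
                try
                {
                    foreach (var path in chunk) docs.Add(load(path));
                    partials.Add(merge(docs));
                }
                finally
                {
                    // Release this batch's sources even if loading or merging failed.
                    docs.ForEach(d => d.Dispose());
                }
            }

            if (partials.Count == 0)
                throw new InvalidOperationException("No input files to merge.");

            // Final pass: combine the per-batch results in their original order.
            return merge(partials);
        }
        finally
        {
            // The intermediate results are no longer needed once the final merge exists.
            partials.ForEach(d => d.Dispose());
        }
    }
}
```

With IronPDF the call would presumably look like BatchMerger.MergeInBatches(pdfFiles, 50, f => PdfDocument.FromFile(f), docs => PdfDocument.Merge(docs.ToArray())), followed by SaveAs() on the result. Passing delegates keeps the helper independent of any particular PDF library and makes the disposal order easy to test with stand-in objects.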

Form fields and annotations: Merge() preserves form fields and annotations from source PDFs. If merging contracts with signature fields, the merged PDF retains those fields. Be aware that field names must be unique — merging two PDFs with identically named fields may cause conflicts.
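A pre-merge check for such collisions could look like the sketch below. It works on plain lists of field names, one list per document; how those names are read from a PDF depends on the library and is not shown. FieldNames and FindCollisions are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FieldNames
{
    // Hypothetical helper: returns the field names that occur in more than one
    // document, sorted, so they can be renamed or reported before merging.
    public static IReadOnlyList<string> FindCollisions(
        IEnumerable<IEnumerable<string>> namesPerDocument)
    {
        return namesPerDocument
            // Count each name once per document; repeats inside one document
            // are that document's own concern, not a merge conflict.
            .SelectMany(names => names.Distinct(StringComparer.Ordinal))
            .GroupBy(name => name, StringComparer.Ordinal)
            .Where(group => group.Count() > 1)
            .Select(group => group.Key)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

If the returned list is non-empty, the safer options are to rename the fields in one of the sources or to flatten forms that no longer need to be filled in before merging.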

File locking: PdfDocument.FromFile() locks the file until the PdfDocument is disposed. If processing fails mid-operation, files may remain locked. Use proper disposal:

using (var pdf = PdfDocument.FromFile("document.pdf"))
{
    // Operations here
}
// File released after using block

I've debugged production issues where batch merge jobs failed halfway through, leaving files locked and preventing subsequent runs. Proper using statements and error handling prevent this.

Quick Reference

| Operation | Code | Use Case |
| --- | --- | --- |
| Merge 2 PDFs | `PdfDocument.Merge(pdf1, pdf2)` | Combine documents |
| Merge many PDFs | `PdfDocument.Merge(pdfArray)` | Batch operations |
| Merge from HTML | `Merge(renderer.RenderHtmlAsPdf(html1), ...)` | Avoid intermediate files |
| Extract one page | `pdf.CopyPage(0)` | Single page extraction |
| Extract page range | `pdf.CopyPages(4, 9)` | Section extraction |
| Split into individual pages | `for (int i = 0; i < pdf.PageCount; i++) pdf.CopyPage(i).SaveAs(...)` | All pages separate |
| Grid layout | `pdf.CombinePages(210, 297, 2, 2)` | Thumbnail sheets |
| Page count | `pdf.PageCount` | Loop boundaries |

Key Principles:

Merge() accepts any number of PdfDocument parameters or an array
Page indices are 0-based (first page = index 0)
CopyPage() extracts one page, CopyPages(start, end) extracts inclusive range
Both methods return new PdfDocument objects — original remains unchanged
CombinePages() creates grid layouts with dimensions in millimeters
IronPDF is dramatically simpler than iTextSharp's reader/writer/copy pattern
Use using statements or explicit disposal to prevent file locking
Merge in batches for hundreds of PDFs to manage memory

The complete merge and split guide includes examples for advanced scenarios like selective page merging and bookmark preservation.

Written by Jacob Mellor, CTO at Iron Software. Jacob created IronPDF and leads a team of 50+ engineers building .NET document processing libraries.
