IronSoftware

Posted on Nov 26

Work with PDF MemoryStream in C#

#csharp #dotnet #pdf #tutorial

Our API returned PDFs as byte arrays. We needed to process them without hitting the file system. Temporary files cluttered our Azure App Service. Disk I/O became a bottleneck.

MemoryStream solved this. Here's how to load and save PDFs in memory.

How Do I Load a PDF from MemoryStream?

Pass the byte array to PdfDocument.FromBinary:

using IronPdf;
// Install via NuGet: Install-Package IronPdf

byte[] pdfBytes = GetPdfFromDatabase(); // Your data source

var pdf = PdfDocument.FromBinary(pdfBytes);

// Modify the PDF
pdf.MetaData.Title = "Updated Document";

pdf.SaveAs("output.pdf");

The PDF loads entirely in memory, and no temporary files are created. Only the final SaveAs call writes to disk; to stay in memory, read pdf.BinaryData instead.
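Byte arrays from a database or an API are not always what they claim to be. Here is a minimal sketch, assuming you want to reject obviously invalid input before handing it to the PDF library. A well-formed PDF begins with the ASCII marker %PDF-, so a cheap header check catches empty or mislabeled payloads. LooksLikePdf is a hypothetical helper name, not part of IronPDF.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of IronPDF): checks for the "%PDF-" marker
// that opens a well-formed PDF file.
static bool LooksLikePdf(byte[] data)
{
    byte[] marker = Encoding.ASCII.GetBytes("%PDF-");
    if (data is null || data.Length < marker.Length)
        return false;

    // Compare only the leading bytes against the marker.
    return data.AsSpan(0, marker.Length).SequenceEqual(marker);
}

byte[] sample = Encoding.ASCII.GetBytes("%PDF-1.7\n");
Console.WriteLine(LooksLikePdf(sample));            // input that starts with the marker
Console.WriteLine(LooksLikePdf(new byte[] { 1 }));  // input too short to be a PDF
```

This only inspects the header. It does not prove the rest of the file is valid, so keep your normal error handling around the load call.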

Why Use MemoryStream for PDFs?

Performance: Avoids disk I/O overhead
Serverless compatibility: AWS Lambda and Azure Functions provide only limited, ephemeral local storage
Security: Temporary files can leak sensitive data
Scalability: Reduces disk space constraints in containerized environments
API integration: Direct handling of PDFs from HTTP responses

I use MemoryStream for all PDF processing in our Kubernetes cluster. No file system dependency.

How Do I Save a PDF to MemoryStream?

Use the BinaryData or Stream property:

var renderer = new ChromePdfRenderer();
var pdf = renderer.RenderHtmlAsPdf("<h1>Invoice</h1>");

// Get as byte array
byte[] pdfBytes = pdf.BinaryData;

// Or get as Stream
using var stream = pdf.Stream;

Send pdfBytes to a database, upload to blob storage, or return from an API.
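One pitfall when you hand a stream to an uploader or an HTTP response: if you wrote the bytes into a MemoryStream yourself, its Position sits at the end and the consumer reads nothing. The sketch below uses placeholder bytes in place of pdf.BinaryData and only standard library calls.

```csharp
using System;
using System.IO;

// Placeholder bytes standing in for pdf.BinaryData.
byte[] pdfBytes = { 0x25, 0x50, 0x44, 0x46, 0x2D };

using var buffer = new MemoryStream();
buffer.Write(pdfBytes, 0, pdfBytes.Length);

// After writing, Position is at the end, so a reader would get zero bytes.
long bytesLeftBeforeRewind = buffer.Length - buffer.Position;

// Rewind so the next reader starts at the first byte.
buffer.Position = 0;
long bytesLeftAfterRewind = buffer.Length - buffer.Position;

Console.WriteLine($"Before rewind: {bytesLeftBeforeRewind}, after: {bytesLeftAfterRewind}");
```

A stream created with new MemoryStream(bytes) already starts at position 0, so it needs no rewind.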

Can I Load PDFs from Network Streams?

Yes. Read the stream into a byte array first:

using var httpClient = new HttpClient();
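var response = await httpClient.GetAsync("https://example.com/document.pdf");
response.EnsureSuccessStatusCode(); // Fail fast on non-2xx responses instead of parsing an error page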

byte[] pdfData = await response.Content.ReadAsByteArrayAsync();

var pdf = PdfDocument.FromBinary(pdfData);

This works for any HTTP endpoint returning PDF content.
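If you only have a raw Stream (a socket or a response stream, for example) rather than an HttpContent, you can buffer it yourself. This is a sketch under assumptions: ReadAllBytesAsync and its maxBytes parameter are hypothetical names, and the size cap is my own addition so that one oversized download cannot exhaust memory.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: buffers any readable stream into a byte array,
// refusing input larger than maxBytes.
static async Task<byte[]> ReadAllBytesAsync(Stream source, int maxBytes)
{
    using var buffer = new MemoryStream();
    var chunk = new byte[81920];
    int read;
    while ((read = await source.ReadAsync(chunk, 0, chunk.Length)) > 0)
    {
        if (buffer.Length + read > maxBytes)
            throw new InvalidDataException($"Stream exceeds the {maxBytes}-byte limit.");
        buffer.Write(chunk, 0, read);
    }
    return buffer.ToArray();
}

// Usage with an in-memory stand-in for a network stream.
using var fakeNetworkStream = new MemoryStream(new byte[] { 0x25, 0x50, 0x44, 0x46 });
byte[] pdfData = await ReadAllBytesAsync(fakeNetworkStream, maxBytes: 50 * 1024 * 1024);
Console.WriteLine($"Buffered {pdfData.Length} bytes");
```

The resulting array can then go to the same load call used in the example above.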

How Do I Load PDFs from a Database?

Assuming your database stores PDFs as varbinary or BLOB:

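// Example with Entity Framework
var document = await dbContext.Documents
    .FirstOrDefaultAsync(d => d.Id == documentId);

// FirstOrDefaultAsync returns null when no row matches
if (document is null)
{
    throw new InvalidOperationException($"Document {documentId} not found.");
}

var pdf = PdfDocument.FromBinary(document.PdfData);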

// Modify and save back
pdf.MetaData.Author = "Updated Author";
document.PdfData = pdf.BinaryData;
await dbContext.SaveChangesAsync();

No file system interaction. PDFs stay in memory throughout the pipeline.

Can I Convert FileStream to PdfDocument?

Yes, read the stream:

using var fileStream = File.OpenRead("existing.pdf");
using var memoryStream = new MemoryStream();

await fileStream.CopyToAsync(memoryStream);
byte[] pdfBytes = memoryStream.ToArray();

var pdf = PdfDocument.FromBinary(pdfBytes);

Or use PdfDocument.FromFile() if you have file access:

var pdf = PdfDocument.FromFile("existing.pdf");

How Do I Upload PDFs to Azure Blob Storage?

Generate PDF in memory and upload directly:

using Azure.Storage.Blobs;

var renderer = new ChromePdfRenderer();
var pdf = renderer.RenderHtmlAsPdf("<h1>Report</h1>");

var blobClient = new BlobClient(connectionString, containerName, "report.pdf");

using var stream = new MemoryStream(pdf.BinaryData);
await blobClient.UploadAsync(stream, overwrite: true);

No temporary file needed. PDF goes directly from memory to blob storage.

Can I Return PDFs from ASP.NET Core APIs?

Yes. The controller's File() helper wraps a byte array in a FileContentResult (pass a stream instead to get a FileStreamResult):

[HttpGet("download")]
public IActionResult DownloadPdf()
{
    var renderer = new ChromePdfRenderer();
    var pdf = renderer.RenderHtmlAsPdf("<h1>Document</h1>");

    return File(pdf.BinaryData, "application/pdf", "document.pdf");
}

The client receives the PDF as a download. No server-side file created.

How Do I Merge PDFs from Multiple Sources?

Load each PDF from memory and merge:

byte[] pdf1Bytes = await GetPdfFromApi1();
byte[] pdf2Bytes = await GetPdfFromDatabase();

var pdf1 = PdfDocument.FromBinary(pdf1Bytes);
var pdf2 = PdfDocument.FromBinary(pdf2Bytes);

var merged = PdfDocument.Merge(pdf1, pdf2);

byte[] mergedBytes = merged.BinaryData;

All operations happen in memory.

What's the Performance Impact?

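Memory usage: PDFs load entirely into RAM. A 10MB PDF needs at least 10MB for its raw bytes, plus whatever the parsed document adds.
Speed: Reading from memory is faster than file I/O (no disk seeks)
Garbage collection: Byte arrays of 85,000 bytes or more go on the .NET large object heap, so large PDFs increase GC pressure. Dispose objects promptly.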

For large PDFs (>100MB), consider streaming directly to disk if memory is constrained.

I process thousands of PDFs daily this way. Works well for documents under 50MB.
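To see why large PDFs weigh on the garbage collector, here is a small standard-library sketch. It assumes the default .NET runtime settings, where arrays of 85,000 bytes or more are allocated on the large object heap and reported as the highest generation. The buffers are stand-ins for PDF bytes.

```csharp
using System;

// A small array starts life in a young generation.
byte[] smallBuffer = new byte[1_000];

// Stand-in for a 10MB PDF: large enough to land on the large object heap.
byte[] pdfSizedBuffer = new byte[10 * 1024 * 1024];

Console.WriteLine($"Small buffer generation: {GC.GetGeneration(smallBuffer)}");
Console.WriteLine($"10MB buffer generation:  {GC.GetGeneration(pdfSizedBuffer)}");
Console.WriteLine($"Highest generation:      {GC.MaxGeneration}");
```

Objects in the highest generation are only reclaimed by full collections, which is why I avoid holding several copies of the same large PDF at once.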

How Do I Handle Large PDFs?

For very large files, load from disk instead of a byte array and work through the document one page at a time:

var pdf = PdfDocument.FromFile("large.pdf");

// Process page by page if possible
for (int i = 0; i < pdf.PageCount; i++)
{
    var singlePage = pdf.CopyPage(i);
    // Process this page
}

Or enable streaming mode if your PDF library supports it.

Can I Load PDFs from Base64 Strings?

Yes, decode first:

string base64Pdf = "JVBERi0xLjQKJeLjz9MK..."; // Base64 encoded PDF

byte[] pdfBytes = Convert.FromBase64String(base64Pdf);
var pdf = PdfDocument.FromBinary(pdfBytes);

Common when receiving PDFs from JSON APIs.
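JSON payloads are not always clean Base64. Some senders prepend a data URI prefix, and some send garbage. The sketch below is one way to handle both without throwing. TryDecodePdfBase64 is a hypothetical helper name, and the data URI handling is an assumption about what your upstream API might send.

```csharp
using System;

// Hypothetical helper: decodes a Base64 PDF payload, tolerating an optional
// "data:application/pdf;base64," prefix. Returns false instead of throwing on bad input.
static bool TryDecodePdfBase64(string input, out byte[] pdfBytes)
{
    pdfBytes = Array.Empty<byte>();
    if (string.IsNullOrWhiteSpace(input))
        return false;

    // Drop everything up to and including the comma of a data URI, if present.
    int comma = input.IndexOf(',');
    string payload = input.StartsWith("data:", StringComparison.OrdinalIgnoreCase) && comma >= 0
        ? input.Substring(comma + 1)
        : input;

    // Base64 never decodes to more bytes than the encoded text has characters.
    var buffer = new byte[payload.Length];
    if (!Convert.TryFromBase64String(payload, buffer, out int written))
        return false;

    pdfBytes = buffer.AsSpan(0, written).ToArray();
    return true;
}

bool ok = TryDecodePdfBase64("data:application/pdf;base64,JVBERi0=", out byte[] decoded);
Console.WriteLine($"Decoded: {ok}, {decoded.Length} bytes");
```

When the helper returns true, pass the decoded bytes to the same load call as in the example above.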

How Do I Cache PDFs in Memory?

Use MemoryCache for temporary storage:

using Microsoft.Extensions.Caching.Memory;

private readonly IMemoryCache _cache;

public byte[] GetOrGeneratePdf(string cacheKey)
{
    if (!_cache.TryGetValue(cacheKey, out byte[] pdfBytes))
    {
        var renderer = new ChromePdfRenderer();
        var pdf = renderer.RenderHtmlAsPdf("<h1>Cached Document</h1>");

        pdfBytes = pdf.BinaryData;

        _cache.Set(cacheKey, pdfBytes, TimeSpan.FromMinutes(30));
    }

    return pdfBytes;
}

Avoid re-generating the same PDF repeatedly.
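The cache only helps if identical input maps to the same key. One option, sketched here as an assumption rather than something the cache requires, is to derive the key from a hash of the HTML being rendered. BuildCacheKey and the "pdf:" prefix are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: builds a cache key from the HTML that will be rendered,
// so identical input maps to the same cached PDF.
static string BuildCacheKey(string html)
{
    byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(html));
    return "pdf:" + Convert.ToHexString(hash);
}

string key = BuildCacheKey("<h1>Cached Document</h1>");
Console.WriteLine(key);
```

The key can then be passed as the cacheKey argument of a method like GetOrGeneratePdf.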

What About Concurrent Access?

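Byte arrays are safe to read from multiple threads as long as nothing writes to them. A single MemoryStream instance is not: every read advances its Position, so give each thread its own stream over the shared bytes. For concurrent writes, use locking or immutable patterns: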

private static readonly object _lock = new object();

lock (_lock)
{
    var pdf = PdfDocument.FromBinary(sharedBytes);
    // Modify safely
}

Or create separate PDF instances per thread.
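The per-thread approach can be sketched with the standard library alone. The example below assumes the shared bytes are never modified after creation. Each worker wraps them in its own read-only MemoryStream, so no read position is shared. The random bytes are a stand-in for a cached PDF.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Shared, never-modified bytes (stand-in for a cached PDF).
byte[] sharedBytes = new byte[4096];
new Random(42).NextBytes(sharedBytes);

long expectedSum = 0;
foreach (byte b in sharedBytes) expectedSum += b;

int mismatches = 0;
Parallel.For(0, 8, _ =>
{
    // Each worker gets its own read-only stream over the shared array,
    // so no Position state is shared between threads.
    using var stream = new MemoryStream(sharedBytes, writable: false);
    long sum = 0;
    int value;
    while ((value = stream.ReadByte()) != -1) sum += value;

    if (sum != expectedSum) Interlocked.Increment(ref mismatches);
});

Console.WriteLine($"Workers with wrong results: {mismatches}");
```

Passing writable: false makes any accidental write throw, which keeps the shared array effectively immutable.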

Do I Need to Dispose MemoryStream?

IronPDF handles disposal internally. If you create your own streams:

using (var stream = new MemoryStream(pdfBytes))
{
    // Work with stream
}
// Disposed automatically

Use using statements or await using for async scenarios.
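For the async case, a minimal sketch looks like this. It uses placeholder bytes rather than a real PDF, and relies on Stream implementing IAsyncDisposable (.NET Core 3.0 and later).

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

byte[] pdfBytes = { 0x25, 0x50, 0x44, 0x46, 0x2D }; // placeholder for real PDF bytes

byte[] copy;
await using (var source = new MemoryStream(pdfBytes))
await using (var destination = new MemoryStream())
{
    await source.CopyToAsync(destination);
    copy = destination.ToArray();
}
// Both streams have been disposed asynchronously at this point.

Console.WriteLine($"Copied {copy.Length} bytes");
```

For a pure MemoryStream the difference is small, but the same pattern applies unchanged to file and network streams, where asynchronous disposal can flush pending writes.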

Written by Jacob Mellor, CTO at Iron Software. Jacob created IronPDF and leads a team of 50+ engineers building .NET document processing libraries.
