DEV Community

jelizaveta

How to Download a PDF from a URL in C#

In everyday development, we often need to retrieve resources from the internet, especially PDF documents. Whether it is automatically backing up online reports, batch-downloading electronic invoices, or fetching dynamically generated contract files, efficiently and reliably saving remote PDFs locally is a very practical skill.

This article explains how to download a PDF document from a specified URL with C# and save it locally using the Spire.PDF for .NET library. The download itself is performed by .NET's WebClient class; Spire.PDF loads the downloaded data and writes it to disk, and it also provides a rich set of PDF processing features beyond loading and saving files.

Prerequisites

First, you need to install Spire.PDF for .NET in your project. You can do this via the NuGet Package Manager Console:

Install-Package Spire.PDF

Or via the .NET CLI:

dotnet add package Spire.PDF

This library supports .NET Framework 4.0 and above, .NET Core 3.1, .NET 5.0, and later versions.

Implementation Code

Below is the complete code example:

using System.IO;
using System.Net;
using Spire.Pdf;

namespace DownloadPdfFromUrl
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            // Create a WebClient object for downloading web resources
            WebClient webClient = new WebClient();

            // Download PDF data from the URL into a memory stream
            using (MemoryStream ms = new MemoryStream(
                webClient.DownloadData("http://www.example.com/sample.pdf")))
            {
                // Load PDF data from the stream into the PdfDocument object
                doc.LoadFromStream(ms);
            }

            // Save the PDF document to a local file
            doc.SaveToFile("result.pdf", FileFormat.PDF);

            // Release resources
            webClient.Dispose();
            doc.Close();
        }
    }
}

Code Explanation

1. Creating a PdfDocument Object

PdfDocument is the core class of Spire.PDF, representing a PDF document instance. It is used to hold and manipulate the PDF data downloaded from the internet.

2. Using WebClient to Download Data

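WebClient is a simple HTTP download class in .NET. Its DownloadData method blocks until the download finishes and returns a byte[] containing the raw binary content of the PDF file. Note that WebClient is marked obsolete starting with .NET 6, where HttpClient is the recommended replacement; it still works, but the compiler emits a warning.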
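
Because DownloadData hands back whatever bytes the server sent, it can be useful to check that they look like a PDF before passing them on. The following is a minimal sketch, not part of Spire.PDF: the class name PdfSniffer and the method LooksLikePdf are hypothetical helpers introduced here for illustration. It relies only on the fact that a PDF file begins with the ASCII marker "%PDF-".

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of Spire.PDF or .NET).
public static class PdfSniffer
{
    // A PDF file starts with the ASCII marker "%PDF-" followed by a version number.
    private static readonly byte[] Magic = Encoding.ASCII.GetBytes("%PDF-");

    // Returns true when the buffer starts with the PDF marker.
    // This is a strict check: it rejects data with any leading bytes before the marker.
    public static bool LooksLikePdf(byte[] data)
    {
        if (data == null || data.Length < Magic.Length)
        {
            return false;
        }

        for (int i = 0; i < Magic.Length; i++)
        {
            if (data[i] != Magic[i])
            {
                return false;
            }
        }

        return true;
    }
}
```

One way to use it would be to store the result of webClient.DownloadData(...) in a local variable, call PdfSniffer.LooksLikePdf on it, and stop with an error message if the check fails, which typically means the server returned an HTML page instead of the document.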

3. Using MemoryStream as a Bridge

Wrapping the byte array into a MemoryStream allows us to use the doc.LoadFromStream(ms) method. This avoids the inefficient process of saving the file to disk before reading it again, enabling in-memory processing.

4. Loading and Saving the PDF

The LoadFromStream method parses the memory stream into a usable PDF document. Finally, SaveToFile persists the document to local storage with the filename result.pdf.

Notes

  • Exception Handling: In production environments, it is recommended to add try-catch blocks to handle network timeouts, invalid URLs, PDF format errors, and other exceptions.
  • Memory Management: Both WebClient and PdfDocument implement the IDisposable interface, so resources should be properly released. In the example, MemoryStream is handled with a using statement, while webClient and doc are released by explicit calls at the end of Main. Wrapping them in using blocks as well ensures they are released even if an exception is thrown.
  • Asynchronous Version: For large files, consider using WebClient.DownloadDataTaskAsync or switching to HttpClient with async methods to avoid blocking the calling thread (for example, the UI thread in a desktop application).
  • URL Validity: Ensure the URL points directly to a PDF file rather than to an HTML page, such as a landing or login page, that merely links to it.
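
The notes above can be combined into one variant of the example. This is a sketch under stated assumptions rather than a definitive implementation: it swaps WebClient for HttpClient, needs a runtime where HttpClient and async Main are available (.NET Framework 4.5 or later with C# 7.1, or .NET Core/.NET 5+), and the 30-second timeout and the console messages are arbitrary choices. The URL is the same placeholder used in the original example. Since the exception types Spire.PDF throws for malformed input are not covered here, the last catch block is deliberately generic.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Spire.Pdf;

namespace DownloadPdfFromUrl
{
    class Program
    {
        // HttpClient is designed to be created once and reused.
        private static readonly HttpClient Http = new HttpClient
        {
            Timeout = TimeSpan.FromSeconds(30) // assumed value; tune for your files
        };

        static async Task<int> Main(string[] args)
        {
            // Placeholder URL, as in the original example
            string url = "http://www.example.com/sample.pdf";

            try
            {
                using (HttpResponseMessage response = await Http.GetAsync(url))
                {
                    // Throws HttpRequestException for non-success status codes
                    response.EnsureSuccessStatusCode();

                    // Warn if the server does not claim to be sending a PDF
                    string mediaType = response.Content.Headers.ContentType?.MediaType;
                    if (!string.Equals(mediaType, "application/pdf", StringComparison.OrdinalIgnoreCase))
                    {
                        Console.Error.WriteLine($"Warning: unexpected Content-Type '{mediaType}'.");
                    }

                    byte[] data = await response.Content.ReadAsByteArrayAsync();

                    // using blocks release the stream and the document even if loading fails
                    using (MemoryStream ms = new MemoryStream(data))
                    using (PdfDocument doc = new PdfDocument())
                    {
                        doc.LoadFromStream(ms);
                        doc.SaveToFile("result.pdf", FileFormat.PDF);
                    }
                }

                return 0;
            }
            catch (HttpRequestException ex)
            {
                Console.Error.WriteLine($"Download failed: {ex.Message}");
                return 1;
            }
            catch (TaskCanceledException)
            {
                // HttpClient reports a timeout as a canceled task
                Console.Error.WriteLine("Download timed out.");
                return 1;
            }
            catch (Exception ex)
            {
                // Generic fallback, e.g. for data that cannot be parsed as a PDF
                Console.Error.WriteLine($"Could not process the PDF: {ex.Message}");
                return 1;
            }
        }
    }
}
```

Saving inside the using blocks keeps the stream open for as long as the document is in use, which is the more cautious ordering when it is unclear whether the library reads the stream eagerly.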

Extended Applications

With Spire.PDF, you can perform additional operations immediately after downloading a PDF, such as:

  • Extracting text or images
  • Merging multiple PDF files
  • Adding watermarks or headers/footers
  • Converting PDFs to images or Word format
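
As a small illustration of the last item, the sketch below reloads the saved file, prints its page count, and writes a Word copy. It assumes that result.pdf from the example above exists in the working directory and that the installed Spire.PDF version exposes FileFormat.DOCX and Pages.Count as used here; check the documentation for your version, since conversion quality and available formats can differ between editions.

```csharp
using System;
using Spire.Pdf;

namespace DownloadPdfFromUrl
{
    class ConvertDownloadedPdf
    {
        static void Main(string[] args)
        {
            using (PdfDocument doc = new PdfDocument())
            {
                // Load the file produced by the download example
                doc.LoadFromFile("result.pdf");

                Console.WriteLine($"Pages: {doc.Pages.Count}");

                // Write a Word version next to the original (output name is an arbitrary choice)
                doc.SaveToFile("result.docx", FileFormat.DOCX);
            }
        }
    }
}
```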

Summary

This article demonstrated how to download a PDF from a URL and save it locally using C# and Spire.PDF for .NET. The entire process is simple and efficient, requiring only a few lines of core code.

Spire.PDF is not only a document loading and saving tool but also a powerful PDF processing library worth exploring further.
