How to Download a PDF from a URL in C#
In everyday development, we often need to retrieve resources from the internet, especially PDF documents. Whether it is automatically backing up online reports, batch-downloading electronic invoices, or fetching dynamically generated contract files, efficiently and reliably saving remote PDFs locally is a very practical skill.
This article explains how to use the Spire.PDF for .NET library with C# to download a PDF document from a specified URL and save it locally. Spire.PDF provides a rich set of PDF processing features beyond just downloading and saving files.
Prerequisites
First, you need to install Spire.PDF for .NET in your project. You can do this via the NuGet Package Manager Console:
Install-Package Spire.PDF
Or via the .NET CLI:
dotnet add package Spire.PDF
This library supports .NET Framework 4.0 and above, .NET Core 3.1, .NET 5.0, and later versions.
Implementation Code
Below is the complete code example:
using System.IO;
using System.Net;
using Spire.Pdf;

namespace DownloadPdfFromUrl
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            // Create a WebClient object for downloading web resources
            WebClient webClient = new WebClient();

            // Download PDF data from the URL into a memory stream
            using (MemoryStream ms = new MemoryStream(
                webClient.DownloadData("http://www.example.com/sample.pdf")))
            {
                // Load PDF data from the stream into the PdfDocument object
                doc.LoadFromStream(ms);
            }

            // Save the PDF document to a local file
            doc.SaveToFile("result.pdf", FileFormat.PDF);

            // Release resources
            webClient.Dispose();
            doc.Close();
        }
    }
}
Code Explanation
1. Creating a PdfDocument Object
PdfDocument is the core class of Spire.PDF, representing a PDF document instance. It is used to hold and manipulate the PDF data downloaded from the internet.
2. Using WebClient to Download Data
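WebClient is a simple HTTP download class in .NET; note that it is marked obsolete in .NET 6 and later, where HttpClient is the recommended replacement. The DownloadData method blocks until the download completes and returns a byte[] holding the raw binary content of the PDF file.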
3. Using MemoryStream as a Bridge
Wrapping the byte array into a MemoryStream allows us to use the doc.LoadFromStream(ms) method. This avoids the inefficient process of saving the file to disk before reading it again, enabling in-memory processing.
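Because the downloaded bytes sit in memory before they reach LoadFromStream, they can be inspected first. The sketch below checks for the "%PDF-" marker that PDF files begin with, which is a cheap way to notice that a server returned an HTML page instead of a document. PdfSignature and LooksLikePdf are hypothetical names introduced for illustration, not part of Spire.PDF, and the check is only a sanity test, not full validation.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Spire.PDF.
public static class PdfSignature
{
    // PDF files start with the ASCII marker "%PDF-".
    private static readonly byte[] Marker = Encoding.ASCII.GetBytes("%PDF-");

    // Returns true when the buffer begins with the PDF marker.
    public static bool LooksLikePdf(byte[] data)
    {
        if (data == null || data.Length < Marker.Length)
        {
            return false;
        }
        for (int i = 0; i < Marker.Length; i++)
        {
            if (data[i] != Marker[i])
            {
                return false;
            }
        }
        return true;
    }
}

public static class SignatureDemo
{
    public static void Main()
    {
        byte[] pdfLike = Encoding.ASCII.GetBytes("%PDF-1.7\n...");
        byte[] htmlPage = Encoding.ASCII.GetBytes("<!DOCTYPE html>");

        Console.WriteLine(PdfSignature.LooksLikePdf(pdfLike));   // True
        Console.WriteLine(PdfSignature.LooksLikePdf(htmlPage));  // False
    }
}
```

In the download code, a call such as PdfSignature.LooksLikePdf(bytes) could sit between DownloadData and the MemoryStream constructor, so that obviously wrong responses are rejected before parsing.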
4. Loading and Saving the PDF
The LoadFromStream method parses the memory stream into a usable PDF document. Finally, SaveToFile persists the document to local storage with the filename result.pdf.
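The example hard-codes result.pdf as the output name. When several documents are downloaded, it may be more convenient to derive the local name from the URL itself. The following is a minimal sketch using only the standard library; LocalFileName and FromUrl are hypothetical names, and the fallback to result.pdf is an assumption chosen to match the example.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration.
public static class LocalFileName
{
    // Picks a local file name from the URL path, ignoring any query string.
    // Falls back to "result.pdf" when the path has no usable name.
    public static string FromUrl(string url)
    {
        Uri uri = new Uri(url);
        string name = Path.GetFileName(uri.LocalPath);
        if (string.IsNullOrEmpty(name))
        {
            return "result.pdf";
        }
        // Append ".pdf" when the last path segment has no extension.
        return Path.HasExtension(name) ? name : name + ".pdf";
    }
}

public static class FileNameDemo
{
    public static void Main()
    {
        Console.WriteLine(LocalFileName.FromUrl("http://www.example.com/sample.pdf"));       // sample.pdf
        Console.WriteLine(LocalFileName.FromUrl("http://www.example.com/reports/q1?x=1"));   // q1.pdf
        Console.WriteLine(LocalFileName.FromUrl("http://www.example.com/"));                 // result.pdf
    }
}
```

The result can then be passed to SaveToFile in place of the literal name. URL paths may contain characters that are not valid in file names on every platform, so real code would likely sanitize the name further.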
Notes
- Exception Handling: In production environments, wrap the download and load steps in try-catch blocks to handle network timeouts, invalid URLs, PDF format errors, and other exceptions.
- Memory Management: Both WebClient and PdfDocument implement the IDisposable interface, so their resources should be released properly. The example wraps MemoryStream in a using statement and releases webClient and doc with explicit calls at the end; wrapping those two in using blocks as well ensures cleanup even when an exception is thrown.
- Asynchronous Version: For large files, consider using WebClient.DownloadDataTaskAsync or switching to HttpClient with async methods to avoid blocking the UI thread.
- URL Validity: Ensure the URL points directly to a PDF file rather than to a redirect page.
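The notes on exception handling, resource cleanup, and asynchronous downloading can be combined into one variant of the example. The sketch below is one possible arrangement, not the only one: it assumes C# 7.1 or later (for async Main) and a runtime where HttpClient is available, the 30-second timeout is an arbitrary illustrative value, and the using block around PdfDocument relies on the statement above that the class implements IDisposable. The exception types thrown by Spire.PDF for malformed files are not assumed, so they fall through to the general catch.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Spire.Pdf;

namespace DownloadPdfFromUrlAsync
{
    class Program
    {
        // A single HttpClient instance is meant to be reused across requests.
        private static readonly HttpClient Http = new HttpClient
        {
            Timeout = TimeSpan.FromSeconds(30) // assumed value; tune as needed
        };

        static async Task<int> Main(string[] args)
        {
            // Placeholder URL, same as in the synchronous example.
            string url = "http://www.example.com/sample.pdf";

            try
            {
                // Throws HttpRequestException on a non-success status code.
                byte[] data = await Http.GetByteArrayAsync(url);

                // Both objects are disposed even if loading or saving throws.
                using (MemoryStream ms = new MemoryStream(data))
                using (PdfDocument doc = new PdfDocument())
                {
                    doc.LoadFromStream(ms);
                    doc.SaveToFile("result.pdf", FileFormat.PDF);
                }
                return 0;
            }
            catch (HttpRequestException ex)
            {
                Console.Error.WriteLine("Download failed: " + ex.Message);
            }
            catch (TaskCanceledException)
            {
                // HttpClient reports an elapsed Timeout as a canceled task.
                Console.Error.WriteLine("Download timed out.");
            }
            catch (Exception ex)
            {
                // Invalid URLs and PDF parsing errors end up here.
                Console.Error.WriteLine("Unexpected error: " + ex.Message);
            }
            return 1;
        }
    }
}
```

Returning a non-zero exit code on failure makes the program easier to use from scripts and scheduled jobs, which fits the batch-download scenarios mentioned at the start of the article.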
Extended Applications
With Spire.PDF, you can perform additional operations immediately after downloading a PDF, such as:
- Extracting text or images
- Merging multiple PDF files
- Adding watermarks or headers/footers
- Converting PDFs to images or Word format
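As an illustration of the first item, the sketch below reads the result.pdf file saved by the example and collects the text of every page. It uses the ExtractText() method on PdfPageBase as shown in older Spire.PDF tutorials; newer releases may flag that method as obsolete in favor of a dedicated extractor class, so treat the exact call as an assumption and check it against the version installed. The output file name result.txt is also just an illustrative choice.

```csharp
using System.IO;
using System.Text;
using Spire.Pdf;

namespace ExtractTextFromDownloadedPdf
{
    class Program
    {
        static void Main(string[] args)
        {
            PdfDocument doc = new PdfDocument();

            // Load the file written by the download example.
            doc.LoadFromFile("result.pdf");

            // Append the text of each page in document order.
            StringBuilder buffer = new StringBuilder();
            foreach (PdfPageBase page in doc.Pages)
            {
                // Assumed API: may be marked obsolete in recent versions.
                buffer.Append(page.ExtractText());
            }

            doc.Close();

            // Write the collected text next to the PDF (hypothetical name).
            File.WriteAllText("result.txt", buffer.ToString());
        }
    }
}
```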
Summary
This article demonstrated how to download a PDF from a URL and save it locally using C# and Spire.PDF for .NET. The entire process is simple and efficient, requiring only a few lines of core code.
Spire.PDF is not only a document loading and saving tool but also a powerful PDF processing library worth exploring further.