DEV Community: Zubin Ajmera

What Contributes to Slow PDF Rendering?

Zubin Ajmera — Thu, 20 Mar 2025 16:35:57 +0000

PDFs usually render pretty quickly. But sometimes, they really start taxing your hardware and it takes a while for them to show up. In this blog post, I’ll briefly cover how rendering a PDF works and then go into some of the reasons why this process might be slow.

How Does Rendering a PDF Work?

Rendering a PDF is like executing a program written in a small graphics language. Each page in the PDF contains one or more streams of (usually) compressed data that tell reader applications what to draw at each point on the page.

This PDF content stream can contain references to other objects, like XObjects and images, which also have to be processed while the PDF is rendering.
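To make the “programming language” comparison concrete, here’s a minimal sketch of walking an already-decompressed content stream and counting its operators. The function name `countOperators` and the sample stream are illustrative assumptions, and the tokenizer only handles numbers and bare operators; real streams also contain strings, names, arrays, and inline images.

```javascript
// Minimal sketch: split a decompressed content stream into tokens and count operators.
// Assumption: the stream holds only numbers and operators separated by whitespace.
function countOperators(contentStream) {
  const counts = new Map();
  for (const token of contentStream.trim().split(/\s+/)) {
    // Anything that doesn't parse as a number is treated as an operator.
    if (token !== '' && Number.isNaN(Number(token))) {
      counts.set(token, (counts.get(token) || 0) + 1);
    }
  }
  return counts;
}

// Two line segments: move to (m), line to (l), stroke (S).
const sampleStream = '10 10 m 200 10 l S\n10 20 m 200 20 l S';
console.log(countOperators(sampleStream));
```

Each operator here stands for work the renderer has to do, which is why the number of operations on a page matters so much for speed.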

Now that you know how this works, let’s look at a handful of scenarios where this is slow.

Large Images

Some PDFs contain really high-resolution images. Before they can be rendered, they have to be loaded from the PDF and decoded into memory. Depending on how big they are, they can take up tens or even hundreds of megabytes. In environments where memory is constrained, like on mobile platforms, this can cause problems.
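As a rough illustration of where those megabytes come from, the sketch below estimates the size of a decoded bitmap. It assumes 4 bytes per pixel (8-bit RGBA), which is a common in-memory layout but not the only one, and the image dimensions are hypothetical.

```javascript
// Estimate the memory a decoded (uncompressed) bitmap occupies.
// Assumption: 4 bytes per pixel (8-bit RGBA) unless stated otherwise.
function decodedImageBytes(widthPx, heightPx, bytesPerPixel = 4) {
  return widthPx * heightPx * bytesPerPixel;
}

// A hypothetical 6000 x 4000 pixel photo embedded in a PDF.
const photoBytes = decodedImageBytes(6000, 4000);
console.log(`${(photoBytes / (1024 * 1024)).toFixed(1)} MiB`); // 91.6 MiB
```

The compressed image inside the PDF may be only a few megabytes, but the decoded size depends on the pixel dimensions alone.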

Images can also be encoded in various formats, some of which are quicker to decode than others. In our experience, JPEG2000 images are usually the slowest.

Lots of Path Operations

Another big cause of slowdowns is a large number of vector path operations. Imagine a PDF floorplan made up of hundreds of thousands of little lines. Each of these lines needs to be read from the content stream and rendered on the screen.

When using vector graphics, the expectation is that they render pixel-perfect and don’t start pixelating. This means we can’t render them once and cache them — we have to render them for each zoom level we encounter.

Broken but Recoverable PDFs

Sometimes PDFs are broken. Lots of PDF software, including PSPDFKit, has support for recovering broken PDFs. One issue that can cause severe performance issues is if the cross-reference table gets damaged.

The cross-reference table is used to quickly access objects in the PDF document. Without it, we’d have to look through an entire file to find objects like pages or images. This would be very slow in large files.

If the table is damaged, we go through the whole file once and remember the byte offset of each object so we can look it up afterward.

While this is usually a quick process with small PDFs, a file that’s a hundred megabytes or larger takes noticeably longer, because we have to read through the complete file. This is still better than having to do the same each time we need to read an object.
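The recovery pass can be sketched roughly as follows. This isn’t our actual implementation: the function name `rebuildObjectOffsets` is hypothetical, and the sketch assumes every object begins with `<number> <generation> obj` at the start of a line, which real-world broken files don’t always respect.

```javascript
// Sketch of rebuilding object offsets by scanning a whole file once.
// Assumption: objects start with "<number> <generation> obj" at the start of a line.
function rebuildObjectOffsets(fileBytes) {
  // latin1 maps every byte to exactly one character, so string indices equal byte offsets.
  const text = Buffer.from(fileBytes).toString('latin1');
  const offsets = new Map();
  const objectHeader = /^(\d+) (\d+) obj\b/gm;
  let match;
  while ((match = objectHeader.exec(text)) !== null) {
    // A later definition replaces an earlier one with the same number and generation.
    offsets.set(`${match[1]} ${match[2]}`, match.index);
  }
  return offsets;
}

const samplePdf = Buffer.from(
  '%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n2 0 obj\n<< /Type /Pages >>\nendobj\n',
  'latin1',
);
console.log(rebuildObjectOffsets(samplePdf));
```

After the single pass, looking up an object is a map access instead of another scan through the file.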

Named Destinations

PDFs can contain a lookup table — called Named Destinations — that allows you to link to different parts of a document.

The table is organized as a tree that groups the names into sections, which makes looking up a named destination quick and efficient. This way, you don’t have to go through the whole table but can just go through the section you need.

However, these lookup tables are sometimes broken. They have to be laid out in a PDF in a specific way, and if they aren’t, names might end up in the wrong part, which can slow everything down.

When you click on a link that has a named destination as a target, Nutrient has to go through the lookup table and find the proper destination.

If we can’t find the named destination, it might mean the link is invalid. Alternatively, it could mean that the named destination table was set up incorrectly, and instead of being able to look things up quickly, we have to fall back to going through the whole table (which can be massive).

We do this because our first priority is to find the correct destination. We’ve encountered many documents with incorrect tables, and customers still need the links in them to go to the right place.
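The two lookup strategies can be sketched like this. The node shape (`limits`, `kids`, `names`) is a simplified assumption modeled on how PDF name trees are laid out, the function names are hypothetical, and the sketch compares JavaScript strings rather than raw bytes.

```javascript
// Sketch of a name-tree lookup with a slow fallback for badly built trees.
// Assumption: intermediate nodes look like { limits: [min, max], kids: [...] } and
// leaves look like { limits: [min, max], names: [[name, destination], ...] }.
function fastLookup(node, name) {
  if (node.limits && (name < node.limits[0] || name > node.limits[1])) {
    return undefined; // The limits say the name can't be in this subtree.
  }
  if (node.names) {
    const entry = node.names.find(([key]) => key === name);
    return entry ? entry[1] : undefined;
  }
  for (const kid of node.kids || []) {
    const found = fastLookup(kid, name);
    if (found !== undefined) return found;
  }
  return undefined;
}

function fullScan(node, name) {
  // Ignore the limits and visit every node.
  const entry = (node.names || []).find(([key]) => key === name);
  if (entry) return entry[1];
  for (const kid of node.kids || []) {
    const found = fullScan(kid, name);
    if (found !== undefined) return found;
  }
  return undefined;
}

function findDestination(root, name) {
  const fast = fastLookup(root, name);
  return fast !== undefined ? fast : fullScan(root, name);
}

// "chapter9" sits in a leaf whose limits wrongly claim it only holds "a".."b".
const brokenTree = {
  kids: [
    { limits: ['a', 'b'], names: [['appendix', 'page 40'], ['chapter9', 'page 30']] },
    { limits: ['c', 'z'], names: [['intro', 'page 1']] },
  ],
};
console.log(findDestination(brokenTree, 'intro'));    // found on the fast path
console.log(findDestination(brokenTree, 'chapter9')); // found only by the full scan
```

A link to a name that doesn’t exist at all still pays for the full scan, which is why invalid links in documents with massive tables are the worst case.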

Summary

This post outlines some of the biggest problems when it comes to rendering performance, but there’s no need to worry! We at PSPDFKit always try our best to mitigate these issues by using lots of tricks we’ve learned over the years.

Whenever you’re ready, I’m here to help. We have specific PDF SDK solutions that could be helpful for your applications, requirements, and use cases. If PDFs are core to your apps, explore how they could fit into your workflow. They’re trusted by thousands of developers at companies like Dropbox, IBM, and Disney, and used by ~1 billion end users in more than 150 countries.

How to use iOS Data Protection (with recent changes)

Zubin Ajmera — Tue, 18 Mar 2025 10:26:13 +0000

Modern iOS devices support Data Protection, which secures user data using built-in encryption hardware. As of iOS 17, significant enhancements have been made, and understanding these updates is crucial for app developers.

This post covers how apps can use iOS Data Protection to secure files, highlighting recent changes in the frameworks and file encryption methods.

Every file stored on an iOS device falls under one of four Data Protection classes, which dictate when the file can be read or written to.

These protection classes can be set using entitlements or programmatically, and it’s recommended to protect users’ data as securely as possible. From most to least secure, these classes are:

NSFileProtectionComplete — Files are only accessible when the device is unlocked. In iOS 17, improvements in this protection class ensure faster performance and enhanced encryption algorithms. This is the most secure option.

NSFileProtectionCompleteUnlessOpen — While the device is locked, files that are already open can still be accessed. This is useful for background tasks.

NSFileProtectionCompleteUntilFirstUserAuthentication — The file is accessible after the user enters their passcode once post-boot, even if the device is locked.

NSFileProtectionNone — Files can be accessed at any time. This option should be avoided unless absolutely necessary.

In iOS 17, file protection is now further streamlined through updated system APIs, providing better integration with background tasks, and improved performance when handling sensitive data during app lifecycle events.

When trying to access a protected file outside its allowed conditions, the operation fails with a relevant error code, as before (NSFileReadNoPermissionError, NSFileWriteNoPermissionError).

Setting default protection levels

As of iOS 17, the process to configure Data Protection entitlements has remained largely the same, though improvements have been made in how provisioning profiles handle Data Protection settings. When setting default file protection, developers can define it in the entitlements file (com.apple.developer.default-data-protection) or programmatically.

NSFileProtectionCompleteUntilFirstUserAuthentication — This remains the default since iOS 7.

NSFileProtectionComplete — You can set this as the default using Xcode’s Capabilities tab. However, developers should note that changing the entitlement post-installation has become more reliable in iOS 17, but it still doesn’t apply to existing files. Developers can now more easily manage Data Protection for shared containers programmatically due to improved APIs.

Programmatic file protection

To ensure full control over file protection, developers can programmatically set protection levels.

For single files

Use this for new files:

try data.write(to: fileURL, options: .completeFileProtection)

For directories

On iOS 17, when creating a new directory, files within that directory inherit the parent directory’s protection level. The FileManager API can be used as before:

try FileManager.default.createDirectory(at: directoryURL, withIntermediateDirectories: true, attributes: [.protectionKey: FileProtectionType.complete])

iOS 17 update — Atomic writes

iOS 17 introduced new behavior with the .atomic option when creating files. Now, files created with .atomic inherit the protection level of the directory they’re created in, which addresses the issue present in earlier versions where files would inherit protection from the temporary directory instead of the target directory.

Managing file protection for existing files in a directory

When you set the protection level of a directory, it doesn’t automatically apply to existing files in that directory. You’ll need to update the protection level for each individual file. This can be done using FileManager.DirectoryEnumerator or NSDirectoryEnumerator:

guard let directoryEnumerator = FileManager.default.enumerator(at: directoryURL, includingPropertiesForKeys: [.fileProtectionKey], options: [], errorHandler: { url, error -> Bool in
    print(error)
    return true
}) else {
    print("Could not create directory enumerator at \(directoryURL.path)")
    return
}

for case let fileURL as URL in directoryEnumerator {
    do {
        // Apply complete protection to each existing file through its file attributes.
        try FileManager.default.setAttributes([.protectionKey: FileProtectionType.complete], ofItemAtPath: fileURL.path)
    } catch {
        print("Failed to set protection level for file \(fileURL.path): \(error)")
    }
}

Compliance and data privacy

As security and compliance requirements evolve, iOS 17 strengthens encryption methods in line with global data privacy regulations. Developers should always prioritize protecting user data, especially sensitive information, by using the most secure protection levels supported by iOS.

Conclusion

Data Protection on iOS 17 provides enhanced encryption methods and improved API performance, making it easier for developers to ensure user data remains private.

The updated guidelines make iOS an even more secure platform for storing sensitive information. Implementing file protection in your apps is straightforward, but crucial, to stay compliant with modern security standards.

For further reading, see Apple’s latest platform security documentation.

FAQ

What is iOS Data Protection?
iOS Data Protection is a feature that secures user data by using built-in encryption hardware, ensuring files are protected when a device is locked.

How many types of Data Protection are there in iOS?
There are four Data Protection types in iOS: NSFileProtectionComplete, NSFileProtectionCompleteUnlessOpen, NSFileProtectionCompleteUntilFirstUserAuthentication, and NSFileProtectionNone.

Can I change the protection level for existing files?
Yes, you can change the protection level for existing files programmatically using APIs like FileManager or NSURL.

Does Data Protection work in iOS Simulator?
No, Data Protection features do not function in iOS Simulator; testing should be conducted on actual devices.

How can I set a default protection level for my app?
You can set a default protection level in your app’s entitlements file in Xcode, specifically under the Data Protection capability.

"Is my document a valid PDF?"

Zubin Ajmera — Tue, 11 Mar 2025 09:08:10 +0000

PDF documents are widely used for their ability to faithfully represent and preserve information. However, determining whether a document has an invalid PDF format is crucial for ensuring it can be correctly processed.

In this post, we’ll cover the basics of identifying an invalid PDF format and see how Nutrient handles such cases.

What Is a PDF?

From a technical perspective, a PDF is a file format with a special syntax that must be adhered to. Conceptually, it represents data whose integrity we want to preserve across different systems.

Understanding this distinction is vital when checking if a PDF document has an invalid PDF format. A file might have valid PDF syntax but still be considered invalid if it has other issues.

How Can a PDF Become Invalid?

A PDF can be deemed invalid for several reasons, such as:

No pages — A PDF without page information is invalid.

Encryption — Encrypted PDFs are considered invalid until decrypted.

Missing header — A valid PDF must include a header defining the specification version within the first 1,024 bytes. Missing this header renders the PDF invalid.

The PDF specification doesn’t explicitly detail how to determine an invalid PDF format, which leaves software vendors to use their judgment.

Nutrient, for example, deems a PDF invalid if it’s encrypted or if it fails certain internal checks.

The important thing to note is that the official PDF specification doesn’t define explicit checks that software can use to determine whether a PDF is invalid. In the first section, Scope, it states:

This standard does not specify the following:

specific processes for converting paper or electronic documents to the PDF format;
specific technical design, user interface or implementation or operational details of rendering;
specific physical methods of storing these documents such as media and storage conditions;
methods for validating the conformance of PDF files or readers;
required computer hardware and/or operating system.

This leaves a gaping hole for PDF software vendors, and it requires that they use their best judgement to determine in which instances a PDF file can be considered invalid.

In our case, within the context of Nutrient, we also deem a PDF invalid if it’s encrypted, due to the fact that you effectively can’t interact with it until it’s unlocked.

Things can get even more complicated if we consider other file format standards related to PDF. One such example of this is PDF/A, which is another ISO standard that’s specialized in the archiving and long-term preservation of electronic documents.

PDF/A comprises a set of really specific ways in which data needs to be laid out to accomplish its goal. Because of that, a whole new level of complexity is added for us to be able to determine whether or not a PDF/A is valid.

As a result, there are even specialized resources, such as the Isartor Test Suite and veraPDF, that provide tests that can be used as a starting point for creating validation software for this specific format.

Understanding PDF Files and File Format

PDF (Portable Document Format) files are widely used for sharing and exchanging documents due to their compatibility, security, and stability.

A PDF file is a self-contained document that can include text, images, vector graphics, and other media, allowing it to be viewed and printed consistently across different devices and platforms.

The PDF file format is based on the PostScript language and is designed to be platform-independent. This means users can share and view PDF files without worrying about compatibility issues, as the formatting remains intact regardless of the operating system or application used.

Structure of a PDF File

A valid PDF file consists of four main components:

Header: The first line of the file, which identifies it as a PDF and states the version of the specification it follows (for example, %PDF-1.7).

Body: The body holds the objects that make up the document, including pages, text, images, fonts, and other elements.

Cross-reference table: This table records the byte offset of each object in the body so that readers can access objects without scanning the whole file.

Trailer: The trailer tells readers where to find the cross-reference table and the document’s root object, and it’s followed by the %%EOF end-of-file marker.

When a PDF file is created or edited, its internal structure is updated to reflect any changes made, ensuring that the document remains consistent and retains its formatting.

Using Nutrient to Validate PDFs

At Nutrient, we take a rather pragmatic approach to checking if we can work with a file as a PDF or not. Internally, Nutrient performs a series of checks to determine if a PDF is valid:

Is this even a PDF? — We look for the %PDF- directive in the file header. If this is missing, we abort any subsequent operations, as we can’t rely on the file to contain PDF syntax.

Is the file large enough to be a valid PDF? — We check the total file size to see if it’s larger than the size of the header (%PDF) and the end-of-file marker (%%EOF) added together. If this test fails, the file is automatically deemed invalid.

Do we have an end-of-file marker at all? — We’ll try to load the last 1,024 bytes of the file to look for an %%EOF marker. Not having an %%EOF marker makes the file an invalid PDF.

Does the file contain more PDF syntax after %%EOF? — If this is the case, then we’re dealing with a malformed file, and trying to perform any other operations with it would be a waste of resources, so we say this case is also grounds to deem a PDF invalid.
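The four checks above can be sketched in a few lines. This is an illustration rather than Nutrient’s internal code: the function name `looksLikeValidPdf` is hypothetical, the header is assumed to sit at the very first byte, and “more PDF syntax” is simplified to “anything other than whitespace after the last %%EOF.”

```javascript
// Sketch of the four checks, applied to a file that is already held in memory.
// Assumptions: the header sits at byte 0, and any non-whitespace content after
// the last %%EOF counts as trailing PDF syntax.
function looksLikeValidPdf(fileBytes) {
  const header = '%PDF-';
  const eofMarker = '%%EOF';
  const text = Buffer.from(fileBytes).toString('latin1');

  // 1. Is this even a PDF?
  if (!text.startsWith(header)) return false;
  // 2. Is the file large enough to hold a header and an end-of-file marker?
  if (text.length <= header.length + eofMarker.length) return false;
  // 3. Is there an %%EOF marker in the last 1,024 bytes?
  const tail = text.slice(-1024);
  const eofIndex = tail.lastIndexOf(eofMarker);
  if (eofIndex === -1) return false;
  // 4. Is there anything other than whitespace after the marker?
  return tail.slice(eofIndex + eofMarker.length).trim() === '';
}

const minimalFile = Buffer.from('%PDF-1.7\n1 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n', 'latin1');
console.log(looksLikeValidPdf(minimalFile));                   // true
console.log(looksLikeValidPdf(Buffer.from('hello', 'latin1'))); // false
```

Passing these checks only means the file is worth parsing; it says nothing yet about whether the pages inside can be read.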

From an end user perspective, it’s easy to see if a PDF is valid or not: if it is, you’ll see it displayed onscreen. If it’s not, you’ll see an error message instead.

If you’d like to do a manual check on a document before even attempting to present it, you can do so as follows:

let url = URL(fileURLWithPath: "/path/to/document.pdf") // Placeholder path; substitute your document URL.
let document = PSPDFDocument(url: url)

// Check if the document is valid before continuing.
guard document.isValid else {
    // Perform appropriate cleanup actions.
    return
}

Calling PSPDFDocument.isValid will lazily load the document. If the document is valid and we were able to parse it correctly, then the document’s pages will be available to us.

How to Check if a Document is a Valid PDF
Open the Document in a PDF Viewer or Editor: Use software like Adobe Acrobat, Foxit Reader, or a web-based PDF viewer. If the file opens without issues, it’s likely a valid PDF.

Check the File Extension: Ensure the file has a .pdf extension. However, note that the extension alone doesn’t guarantee the file is a valid PDF.

Check the File Size: A valid PDF file usually has a reasonable file size based on its content. Extremely small files (e.g., a few bytes) may be suspicious.

Check the File Structure: A valid PDF should have a defined structure, including a header, body, and trailer. This can be checked using specialized tools or text editors that can read binary files.

Check for Errors: If you see error messages or warnings when trying to open the file, it may be corrupted or invalid. Common errors include “File is damaged” or “Unsupported PDF version.”

Repair the PDF: If you encounter an invalid PDF format error, you can try using a PDF repair tool to fix it. Tools like Adobe Acrobat’s “Preflight” or online services can help.

Convert the File: Sometimes, converting the PDF to another format (like Word or image formats) can help if you just need the content.

Update Your PDF Viewer: Ensure your PDF viewer or editor is up-to-date to support the latest PDF specifications and formats.

Conclusion

As we saw in this post, there are multiple aspects to consider when determining whether or not a PDF is valid. Given the broad field of applications the PDF format has, it can be very difficult to come to an agreement of what exactly constitutes a “valid” PDF.

At Nutrient, we interpret the PDF specification as closely as we can to make sure we can deliver the reliability our customers expect from us. Nevertheless, as with many aspects of dealing with PDF technologies, this is an ongoing effort, and we’ll always be looking to improve the ways in which we can provide the best experience possible.

FAQs

What are the most common signs of an invalid PDF format?
Common signs of an invalid PDF format include missing pages, corrupted content, an unreadable file header, or an absence of the end-of-file marker (%%EOF). If the PDF fails to open or display correctly, it may be due to these issues.

How can I check if a PDF is encrypted?
A PDF is considered encrypted if it requires a password to open or if it isn’t accessible until decrypted. Most PDF readers will prompt for a password if encryption is present. In Nutrient, encrypted PDFs are treated as invalid until they’re decrypted.

What tools can I use to validate a PDF file?
Tools like Nutrient can validate PDF files by checking the file header, end-of-file marker, and overall file size. For specific formats like PDF/A, specialized tools such as the Isartor Test Suite and veraPDF are available for more detailed validation.

Can an invalid PDF format be repaired?
Repairing an invalid PDF can be challenging, depending on the issue. Some tools may attempt to fix structural problems, but if the file is severely corrupted or malformed, it may not be recoverable.

How can I prevent my PDFs from becoming invalid?
To prevent PDFs from becoming invalid, ensure that you use reliable software for creating and handling PDFs, adhere to proper PDF standards, and verify the integrity of your files before distribution.

How to fill a PDF form in React

Zubin Ajmera — Tue, 04 Mar 2025 09:22:31 +0000

Filling out PDF forms programmatically can be a useful feature in many applications. In this tutorial, you’ll learn how to fill a PDF form using the pdf-lib library within a React project. The pdf-lib library provides a straightforward way to create, modify, and save PDF documents.

Prerequisites

Before beginning, make sure you have the following installed:

Node.js and npm (Node Package Manager)

A text editor of your choice

Setting up the project

Create a new React project by running the following command in your terminal:

npx create-react-app pdf-form-filler

Change into the project directory:

cd pdf-form-filler

Install the pdf-lib library by running the following command:

npm install pdf-lib

Loading and filling the PDF form

In this step, you’ll load the PDF form and fill it with the desired values. Open the src/App.js file in your text editor and replace its content with the following code:

import { PDFDocument } from 'pdf-lib';

const App = () => {
    const fillForm = async () => {
        // Step 1: Load the PDF form.
        const formUrl = 'https://pdf-lib.js.org/assets/dod_character.pdf';
        const formPdfBytes = await fetch(formUrl).then((res) =>
            res.arrayBuffer(),
        );
        const pdfDoc = await PDFDocument.load(formPdfBytes);

        // Step 2: Retrieve the form fields.
        const form = pdfDoc.getForm();
        const nameField = form.getTextField('CharacterName 2');
        const ageField = form.getTextField('Age');
        const heightField = form.getTextField('Height');
        const weightField = form.getTextField('Weight');
        const eyesField = form.getTextField('Eyes');
        const skinField = form.getTextField('Skin');
        const hairField = form.getTextField('Hair');

        // Step 3: Set values for the form fields.
        nameField.setText('Mario');
        ageField.setText('24 years');
        heightField.setText(`5' 1"`);
        weightField.setText('196 lbs');
        eyesField.setText('blue');
        skinField.setText('white');
        hairField.setText('brown');

        // Step 4: Save the modified PDF.
        const pdfBytes = await pdfDoc.save();

        // Step 5: Create a `Blob` from the PDF bytes.
        const blob = new Blob([pdfBytes], { type: 'application/pdf' });

        // Step 6: Create a download URL for the `Blob`.
        const url = URL.createObjectURL(blob);

        // Step 7: Create a link element and simulate a click event to trigger the download.
        const link = document.createElement('a');
        link.href = url;
        link.download = 'filled_form.pdf';
        link.click();
    };

    return (
        <div>
            <h1>PDF Form Filler</h1>
            <button onClick={fillForm}>Fill Form</button>
        </div>
    );
};

export default App;

Let’s break down the steps involved:

Load the PDF form — You start by fetching the PDF form using its URL and loading it into a PDF document using the PDFDocument.load method.

Retrieve the form fields — Next, you get a reference to the form within the PDF document using pdfDoc.getForm(). You then retrieve individual form fields using their field names.

Set values for the form fields — After obtaining references to the form fields, you can set the desired values using the setText method provided by each form field.

Save the modified PDF — Once the form fields are filled, you save the modified PDF using pdfDoc.save() to generate the updated PDF document with the filled form fields.

Create a Blob from the PDF bytes — You create a Blob object from the generated PDF bytes using the Blob constructor. The Blob represents the PDF data as a file-like object.

Create a download URL for the Blob — You create a download URL for the Blob using URL.createObjectURL. This URL allows you to download the PDF file.

Trigger the download — Finally, you create a link element (<a>), dynamically set its href attribute to the download URL, specify the desired file name in the download attribute, and simulate a click event on the link to trigger the download of the filled PDF form.

Running the application

To run the application, execute the following command in the project directory:

npm start

Downsides of using pdf-lib

While the pdf-lib library is a powerful tool for working with PDF documents, it does have a few downsides and limitations:

Limited UI interactivity — The pdf-lib library primarily focuses on programmatic manipulation of PDF documents. It doesn’t provide built-in user interface (UI) components or interactivity for form filling. As a result, users cannot directly type into the form fields within the application’s UI; the form fields need to be programmatically filled using code, as demonstrated in the example.

Lack of real-time updates — Since the form fields are filled programmatically on the server or within the application code, there’s no real-time update or synchronization with the UI. If the user wants to see their input reflected in the filled form fields, they need to generate and download the updated PDF file.

Form filling with Nutrient

Nutrient offers comprehensive form filling capabilities, providing both a user-friendly interface and programmable options.

UI form filling — Nutrient’s prebuilt UI components allow users to easily navigate and interact with PDF forms. They can fill in text fields, select options from dropdown menus, and interact with checkboxes and radio buttons. Check out the demo to see it in action.

Programmatic form filling — Nutrient Web SDK offers versatile programmatic form filling options:

Document Engine — Easily persist, restore, and synchronize form field values across devices without building this ability yourself.

XFDF — Exchange form field data with other PDF readers and editors seamlessly.

Instant JSON — Efficiently export and import changes made to form fields.

Manual API — Have full control over extracting, saving, and manipulating form field values.

In addition, Nutrient also provides an option for creating PDF forms:

PDF Form Creator — Simplify PDF form creation with a point-and-click UI. You can create PDF forms from scratch using an intuitive UI or via the API. Convert static forms into fillable forms, or modify existing forms by letting your users create, edit, and remove form fields in a PDF.

Conclusion

In this tutorial, you learned how to fill a PDF form in a React project using the pdf-lib library. It covered the steps required to load a PDF form, retrieve form fields, set values for the fields, save the modified PDF, and trigger the download of the filled form.

You also read about the limitations of using pdf-lib for form filling and learned how Nutrient offers a better alternative.

To learn more about Nutrient Web SDK, start your free trial. Or, launch our demo to see our viewer in action.

FAQ

Here are a few frequently asked questions about form filling in React.

What is the purpose of filling a PDF form programmatically in React?
Filling a PDF form programmatically allows developers to automate the completion of documents, which can save time and reduce errors.

Which libraries are commonly used to fill PDF forms in React?
Libraries like pdf-lib and Nutrient are popular for working with PDF forms in React projects.

Can I fill PDF forms with user input in real time?
With pdf-lib, form fields are filled programmatically, so real-time input isn’t visible in the UI; you need to generate and download a PDF to see changes.

Does Nutrient offer more advanced form filling capabilities than pdf-lib?
Yes, Nutrient offers a more robust suite of features, including UI-based form filling and additional programmatic controls.

Is it possible to save and synchronize form data across devices?
Yes, Document Engine allows you to persist and sync form data across devices, which can be useful for collaborative workflows.

Why Does PDF Use Floats and Word Use EMUs? The Surprising Answer

Zubin Ajmera — Tue, 25 Feb 2025 10:19:02 +0000

When dealing with Microsoft Office documents, it’s common to encounter many different units of length, most of which are constrained to integer values. In contrast, PDFs almost always specify real (as in a mathematical real number) values.

Furthermore, in PDFs, there are only a few different contexts where the same number has a different meaning.

This post will give some examples of the units of length in Word documents and discuss why the choice of integers vs. real values might make sense for each domain.

Units of Length in a Word Document

This next section will cover units of length used in Office and Word documents.

Integer-Only by Default

By default, Word will store integer-only values for the following:

point

This is the smallest unit of measure from typography with a de facto standard size of 1/72 inch. The space between a table border and cell content is specified in points.

half point

This is, as the name says, half a point. Word uses it for the font size. Fun fact: Word will inform you that 11.6 is not a valid number when entered as a font point size, whereas 11.5 is fine and can be used.

eighth point

The Office Open XML (OOXML) standard loves to divide points down even further. You can specify the width of page borders in eighths of a point.

twip

Line heights get a little more precision than font sizes or page border widths. They can be specified in twentieths of a point.

pixel

You can even specify the dimensions of some elements on the page in pixels. Depending on the context, this can have a different meaning.

English Metric Unit (EMU)

This is the base unit for DrawingML, which deals with shapes and the positioning of graphical elements. It’s defined as 1 EMU = 1/914400 imperial inch = 1/360000 cm.

We’ll discuss EMUs in a moment.

Additional Units Not Used by Default

The OOXML standard permits specifying real values for some of the above units, and the following units can also be specified with real values, although Word will usually not generate documents containing them.

pica

Typographic unit of measure corresponding to approximately 1/6 of an inch. There are three different measures in use today (American, French, and PostScript). A pica is further divided into 12 points. Want to specify the height of a shape? Why not use picas?

inch

This is the imperial inch, which is based on the international yard, which in turn is defined based on the metric system and defined exactly as 1 inch = 25.4 mm.

centimeter, millimeter

These are boring measures based on the metric system. 1 cm = 10 mm.

em, ex

These are the same as in CSS: em refers to the height of the element’s font, and ex refers to the height of the letter x of the element’s font.

English Metric Unit

This is what it boils down to: one unit to rule them all (well, almost all).

Let’s just quote the standard here: “The EMU was created in order to be able to evenly divide in both English and Metric units, in order to avoid rounding errors during the calculation.

The usage of EMUs also facilitates a more seamless system switch and interoperability between different locales utilizing different units of measurement. EMUs define an integer based, high precision coordinate system.”

Although it doesn’t work out for every unit, most integer input values can be converted into EMUs without rounding (and therefore without loss of information) using only integer-based data types.
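As a quick illustration, the conversion factors below follow from the definition quoted earlier (1 inch = 914400 EMU = 72 points, 1 cm = 360000 EMU). The constant and function names are made up for this sketch. Eighths of a point are one case that doesn’t work out, because 12700 isn’t divisible by 8.

```javascript
// Conversion factors into EMUs, derived from 1 inch = 914400 EMU and 1 cm = 360000 EMU.
const EMU_PER_INCH = 914400;
const EMU_PER_CM = 360000;
const EMU_PER_MM = EMU_PER_CM / 10;              // 36000
const EMU_PER_POINT = EMU_PER_INCH / 72;         // 12700
const EMU_PER_HALF_POINT = EMU_PER_POINT / 2;    // 6350
const EMU_PER_TWIP = EMU_PER_POINT / 20;         // 635
const EMU_PER_EIGHTH_POINT = EMU_PER_POINT / 8;  // 1587.5, not an integer

// Convert an integer amount of some unit into EMUs.
function toEmu(value, emuPerUnit) {
  if (!Number.isInteger(value)) throw new RangeError('expected an integer input');
  return value * emuPerUnit;
}

console.log(toEmu(22, EMU_PER_HALF_POINT));           // 139700 (an 11 pt font size)
console.log(toEmu(2, EMU_PER_CM));                    // 720000
console.log(Number.isInteger(EMU_PER_EIGHTH_POINT));  // false
```

Every factor except the last is a whole number, so any integer amount of those units maps to a whole number of EMUs.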

Integers Everywhere

The pattern of using integers and not real numbers to specify values continues for other measures too, for example:

thousandths of an arcminute

Image rotations are specified in this unit.

thousandths of a percent

If a percentage needs to be specified, just multiply it by 1,000 and forget the rest.
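
As a small illustration, the two encodings above can be sketched in Swift like this. The helper names are hypothetical; the only assumption is that one degree is 60 arcminutes, so one degree is 60,000 thousandths of an arcminute.

```swift
// Hypothetical helpers that encode a rotation and a percentage as plain integers.
func thousandthsOfArcminute(degrees: Int) -> Int {
    degrees * 60 * 1_000   // 60 arcminutes per degree, 1_000 units per arcminute
}

func thousandthsOfPercent(percent: Int) -> Int {
    percent * 1_000
}

let rotation = thousandthsOfArcminute(degrees: 45)  // 2_700_000
let half = thousandthsOfPercent(percent: 50)        // 50_000
```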

Integers vs. Reals

While Word and the whole Office suite prefer integer-only values, the PDF specification and documents almost always use real numbers to specify values for layout elements like font size, positioning, etc. Why is that? It comes down to decimal vs. binary and rounding.

Fractions and Numeral Systems

We humans are fond of and mostly use the decimal system, while computers mostly use the binary number system. This has some implications because different numeral systems can’t represent the same set of numbers using only a finite number of digits.

For example, just as we can’t represent 1/3 in the decimal system using a finite number of digits, computers can’t represent 1/10 accurately using a finite number of bits.

This means that if we, for example, specify the line height of a paragraph as 0.1 points (because who needs the fine print, right?) and put 10 lines on the page, Word will need to save this information in the document.

File formats often use the decimal system for storing numbers, as does OOXML, even though computers don’t normally use this number system for calculations. So Word would store the line height as 0.1 in the decimal system in the document.

But if a word processor reads that back in, it has to convert this number into binary, which will involve some rounding, and it won’t be stored in memory as exactly 0.1, but as slightly more or less.

This in turn means that the accumulated height of the 10 lines will be either slightly more or slightly less than 1. That’s a problem for Word documents, as there might only be space for, let’s say, exactly 1 (in whatever unit). And depending on the height of the 10 lines, the paragraph might need to break and continue on the next page — or not.
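
The effect is easy to reproduce. The following Swift sketch adds up ten line heights of 0.1 using `Double` and compares the result with an integer version in twentieths of a point; the variable names are ours, and the unit choice is only an illustration.

```swift
let lineHeight = 0.1            // Stored as the nearest binary double, not exactly 1/10.
var total = 0.0
for _ in 0..<10 {
    total += lineHeight         // Each addition may round again.
}

print(total == 1.0)             // false
print(total < 1.0)              // true: the ten lines come up slightly short

// The integer alternative: 0.1 pt is 2 twentieths of a point, and integers add exactly.
let lineHeightInTwips = 2
let totalInTwips = lineHeightInTwips * 10
print(totalInTwips == 20)       // true
```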

Rounding and Order of Operations

So we need to round numbers because of fractions, but it gets worse, as the order we’re executing operations in will also affect the outcome. For example, 2*(1/2) will equal 0, while (2*1)/2 will equal 1 on almost all computers using integers. For floating-point systems, this gets worse because even additions may need rounding: 1e30-(1e30+1e-30) will yield 0, while (1e30-1e30)+1e-30 will yield 1e-30 using floats on current hardware.
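
Both examples can be checked with a few lines of Swift, using `Int` for the integer case and `Double` for the floating-point case:

```swift
// Integer division truncates, so the grouping matters.
let a = 2 * (1 / 2)           // 0
let b = (2 * 1) / 2           // 1

// Floating-point addition is not associative either.
let big = 1e30
let tiny = 1e-30
let c = big - (big + tiny)    // 0.0: `tiny` is lost when it is added to `big`
let d = (big - big) + tiny    // 1e-30
```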

Parsing Reals Is Really Hard

With IEEE 754, almost all modern CPUs should give the same results when doing floating-point math. But there’s another issue: It turns out parsing reals to floating-point numbers is really hard.

You want your reals parser to yield the exact same value on all platforms and CPU architectures given the same string. Amazingly, .NET Core struggled with this until at least .NET Core 3.0.

If we look at rendering floating-point numbers as strings, it’s surprising again how difficult this problem is.

You can use an algorithm called Grisu3, which can quickly convert 99.5 percent of floating-point numbers to strings, but you have to fall back to an older, slower algorithm called Dragon4 for the remaining tricky cases.

On top of that, the rendered string should yield the exact same number when it’s parsed again.
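
Here’s a minimal Swift sketch of that round-trip requirement. It relies on the standard library’s `String(_:)` and `Double(_:)` conversions, which in current Swift versions are designed to round-trip; the sample values are arbitrary.

```swift
let values: [Double] = [0.1, 1.0 / 3.0, 1e30, 2.5e-7]

for value in values {
    let text = String(value)       // double -> decimal string
    let parsed = Double(text)      // decimal string -> double
    // The comparison is true when the round trip is exact.
    print(text, parsed == value)
}
```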

Reals for PDF

In contrast to Word documents, PDFs don’t need to lay out content. In general, there’s a clearly defined way of deriving each element’s position, and if the position or size of that element turns out slightly different on different platforms, it has no consequences on the rest of the document.

Conclusion

The zoo of different units of length in Word is probably there for historical reasons; using EMUs everywhere would be sufficient, for example. Floating-point numbers introduce a lot of pitfalls if reproducibility is required.

Reproducibility is important for Word and Office documents, as minor differences in reconstructing the elements of a document can heavily influence the appearance of the rest of the document. As such, the preference of integer-only values makes sense in this domain.

For PDFs, just using floating-point numbers is fine, and it’s much easier than, for example, building transformation matrices from integers.

Optical Character Recognition in Scanned PDFs

Zubin Ajmera — Tue, 18 Feb 2025 14:41:22 +0000

The ability to physically scan documents to a computer and store them in an electronic format is amazing. No longer do we have to keep reams of paperwork in filing cabinets for years after they were produced.

However, scanning documents does come with drawbacks:

There’s no ability to search text.

There isn’t a way to highlight text, underline text, etc.

Users are unable to copy and paste from documents.

It’s hard to redact information quickly and efficiently.

This is where Optical Character Recognition (OCR) comes in.

What Is OCR?

To define OCR, let’s look at the definition from Wikipedia:

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast).

Essentially, OCR is a process our brain performs every day when we read. OCR can also be performed by machines to interpret text from an image in order to enhance its functionality.

But why is this important?

As mentioned in the introduction, more and more analog material is being digitized, and usually in an image-like format. These image formats hold no information about the text contained within the image, meaning a layer of information is missing.

By using OCR technology, you can create a digitized textual layer of information, which allows users to perform text-based software operations.

Why Use OCR on a PDF?

Because of the flexibility of the PDF format, it’s long been a go-to standard for the digitization of various sources. For example, when you scan a document, you often have the option to save it as a PDF. Sometimes it’s even the default.

When a document is scanned, you’re essentially taking a picture. Scanning software then embeds the image directly into a PDF for further use.

The issue with this is that such PDFs have no textual information, which means you’ll miss out on a variety of PDF features that enhance the way data can later be analyzed.

OCR in PDF Use Cases

To give an example of how OCR might come into play with PDFs, I’ve written up a few potential use cases.

Searching an Archived Document

Scanning documents to save them in an electronic format is a great way of filing away data.

But, as a hypothetical example, what happens when you’re trying to look for all documents relating to your Tesla car purchase? Well, you could open every document on your device and look through to find the related content. Or you could use a search tool to scan through every document for the word “Tesla.”

If a document is scanned, there’s no way to search for the word “Tesla” because the PDF only holds an image representation of the text and not any textual content.

But if you perform OCR on your documents, you can create a text layer for searching at a later date. This would allow you to search for the word “Tesla” and find all the relevant documents in no time.

Marking Up a Document

Another useful feature of PDFs is the ability to mark up documents with annotations. With many readers, you can simply highlight, underline, add notes to text, and more.

Again, with a scanned PDF, you have no textual information to mark up, meaning you have no ability to use the features of a PDF reader, thereby making the content less interactive.

However, with a document where OCR has been performed, there’s text to work with, and you can then annotate it accordingly.

Redacting Sensitive Information

Going back to our Tesla document example, imagine now you want to hide the make of car you bought before you forward the relevant documents to someone else.

With a simple scanned image, this would make for a laborious task of scrolling through each document and creating redaction squares to cover the word “Tesla.”

If OCR has already been applied, you’ll have a text content layer that can be searched, and you can then extract the area from that search to easily create the redaction for all the instances found.

With a quick and easy command, it’s possible to redact all mentions of “Tesla,” thereby saving huge amounts of time.
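
Conceptually, that search-then-redact step could look like the Swift sketch below. `RecognizedWord` and `redactionAreas` are hypothetical names for illustration only; they aren’t PSPDFKit API, and a real text layer would carry more detail than one rectangle per word.

```swift
import Foundation

// Hypothetical shape of one OCR result: a recognized word and where it sits on the page.
struct RecognizedWord {
    let text: String
    let bounds: CGRect
}

/// Returns the page areas of every word matching `term`, ignoring case.
func redactionAreas(for term: String, in words: [RecognizedWord]) -> [CGRect] {
    words
        .filter { $0.text.caseInsensitiveCompare(term) == .orderedSame }
        .map { $0.bounds }
}

let words = [
    RecognizedWord(text: "Tesla", bounds: CGRect(x: 72, y: 700, width: 40, height: 12)),
    RecognizedWord(text: "invoice", bounds: CGRect(x: 116, y: 700, width: 50, height: 12)),
    RecognizedWord(text: "TESLA", bounds: CGRect(x: 72, y: 500, width: 44, height: 12)),
]
let areas = redactionAreas(for: "Tesla", in: words)  // two rectangles to cover
```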

Copying and Pasting Ability

What if you open a document and find an amazing quotable sentence, only to discover the document is just a scanned image? Well, no one wants to type the whole sentence out, do they?

With OCR applied to the document, you can highlight and copy from the document with any quality PDF reader, again, saving yourself the time and eliminating the margin of error that comes from rewriting it yourself.

How Does OCR Work?

Now that we’ve covered why you might need to use OCR, you may be wondering how it works. A machine is not equal to a human brain (for now at least), and you can’t hand a computer an image and say “read this.” Instead, algorithms have to be created to teach the computer how to read textual information from an image.

There are various ways of implementing OCR algorithms, but here we’ll describe the steps PSPDFKit takes to perform OCR.

Identifying Areas of Text

The first step is recognizing where text exists in an image. When humans look at an image, we’re able to see vibrant colors and detect edges of letters — even when the contrast is low or if there is a noise source like uneven light. For software to do the same, many filters and algorithms are used to “clean up” an image to help a computer better recognize text.

Various image manipulation techniques such as binarization, skew correction, and noise reduction filters are used to enhance the original image and reveal text in the document with the greatest contrast to allow the following steps to run with the highest quality.

With a clean image to work with, it’s much easier to detect possible text on a page and mark it for processing.
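
To give a feel for one of these techniques, here’s a minimal Swift sketch of binarization on a row of 8-bit grayscale pixels. The fixed threshold of 128 is an assumption made for brevity; this isn’t how PSPDFKit implements it, and real pipelines usually choose the threshold adaptively.

```swift
/// Turns 8-bit grayscale pixels into pure black (0) or pure white (255).
/// The fixed default threshold is an assumption for this sketch.
func binarize(_ pixels: [UInt8], threshold: UInt8 = 128) -> [UInt8] {
    pixels.map { $0 < threshold ? 0 : 255 }
}

let scanRow: [UInt8] = [12, 40, 200, 130, 90, 250]
let cleaned = binarize(scanRow)  // [0, 0, 255, 255, 0, 255]
```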

Reading the Text Lines

Once lines of text are found in the image, further analysis of these blocks determines the characters making up the text.

Recognizing characters is hard because they come in all shapes and sizes. Imagine how many fonts are out there, and then imagine how many ways a human can write a single character. The variations are near endless.

Using heuristics to predict each character is possible, but it’s also prone to errors. Instead, PSPDFKit uses Machine Learning (ML), which has the advantage of being able to analyze great quantities of data in order to make best guesses, much like humans do.

ML has the ability to look over a set of examples provided and make estimates in the future if they are similar to the examples seen before.

The ability of ML to analyze large amounts of data allows support for many combinations of text formats. Even when the exact combination is not accounted for in the original dataset, it’s possible to use the model to make a best guess of what is being represented. Again, this works similarly to how humans estimate.

Embedding Textual Information Back into a PDF

Now that we know what characters are written and the area they’re located in, it’s time to embed them back into the PDF. In the PDF specification, there’s a concept of the content of a page, which instructs PDF readers how and what text to draw.

But in the case of OCR, there’s no need to render the text. The image representing the text is already part of the PDF, and the reader should not write over it.

Because the PDF specification is so vast, it allows for invisible text to be placed in the content of a page so that rendered text doesn’t impair the view of the image behind the text.

The work doesn’t stop at making the text invisible though; the correct character needs to be represented on the correct area of the page.

Normally this is determined by the font used, with scaling dimensions applied to the font glyphs. The question is, “What font are we using?”

With OCR, a fake (or dummy) font is used to represent the characters present in the document at the scale and position required. Therefore, in the PDF, a special font that can represent any character is embedded so any reader can access the textual information PSPDFKit created!
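
For the curious, the invisible text mentioned above maps to text rendering mode 3 in the PDF specification, set with the `3 Tr` operator. The Swift sketch below builds such a content stream fragment as a string. The function names and the `F1` font name are placeholders, and the sketch ignores font encoding, so it only illustrates the operators rather than a complete OCR text layer.

```swift
/// Escapes the characters that are special inside a PDF literal string.
func pdfLiteral(_ text: String) -> String {
    var escaped = ""
    for character in text {
        if character == "\\" || character == "(" || character == ")" {
            escaped.append("\\")
        }
        escaped.append(character)
    }
    return "(" + escaped + ")"
}

/// Builds content stream operators that place `text` invisibly at (`x`, `y`).
/// `fontName` must match a font in the page's resources; "F1" is a placeholder.
func invisibleTextOperators(_ text: String, x: Int, y: Int, fontSize: Int, fontName: String = "F1") -> String {
    [
        "BT",
        "/\(fontName) \(fontSize) Tf",
        "3 Tr",                          // Rendering mode 3: neither fill nor stroke.
        "1 0 0 1 \(x) \(y) Tm",
        "\(pdfLiteral(text)) Tj",
        "ET",
    ].joined(separator: "\n")
}

print(invisibleTextOperators("Tesla (Model 3)", x: 72, y: 700, fontSize: 12))
```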

Conclusion

From this blog post, you should now know how powerful OCR can make your simple scanned document. You should also have a high-level understanding of how OCR is performed.

Obviously, there are many more nitty-gritty details to achieving good OCR results, but you can leave those parts up to us.

How to Convert a PDF to an Image in Swift

Zubin Ajmera — Tue, 11 Feb 2025 06:44:53 +0000

Converting a PDF file to an image is a common use case for any app that displays PDFs.

An app might want to display the image representation of a PDF file’s page at different resolutions — some examples include a low-resolution image for a thumbnail, a medium-resolution image for a full-page display, and a high-resolution cropped image to show when zoomed in or for printing purposes.
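
As a rough guide to picking those resolutions, the Swift sketch below computes the pixel dimensions for a given DPI. It assumes the default PDF user space unit of 1/72 inch; the `pixelSize(for:dpi:)` helper is a name invented for this example.

```swift
import Foundation

/// Pixel dimensions needed to render a page of `pageSize` (in PDF points) at `dpi`.
/// PDF points are 1/72 inch by default, so the scale factor is dpi / 72.
func pixelSize(for pageSize: CGSize, dpi: CGFloat) -> CGSize {
    let scale = dpi / 72
    return CGSize(width: (pageSize.width * scale).rounded(),
                  height: (pageSize.height * scale).rounded())
}

let a4 = CGSize(width: 595, height: 842)        // A4 in points
let thumbnail = pixelSize(for: a4, dpi: 36)     // half scale for a thumbnail
let forPrint = pixelSize(for: a4, dpi: 300)     // high resolution for printing
```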

This post will show you how to convert your PDF file to an image using Core Graphics, PDFKit, and PSPDFKit for iOS.

Apple has a rich history of supporting PDF files that dates back to its old NeXTSTEP days. As such, iOS has built-in support for reading and rendering PDF files. Starting with iOS 11, Apple introduced a framework to display and manipulate PDF files: PDFKit.

In turn, PDFKit is built on top of Core Graphics, which is a framework used primarily for lightweight 2D graphics rendering. For our use case, we can rely on the functionality of either one of these to convert a PDF file to an image. So let’s get started.

Core Graphics

Here’s how to render a page from a PDF file to an image using Core Graphics:

// Create a URL for the PDF file.
guard let path = Bundle.main.path(forResource: "filename", ofType: "pdf") else { return }
let url = URL(fileURLWithPath: path)

// Instantiate a `CGPDFDocument` from the PDF file's URL.
guard let document = CGPDFDocument(url as CFURL) else { return }

// Get the first page of the PDF document. Note that page indices start from 1 instead of 0.
guard let page = document.page(at: 1) else { return }

// Fetch the page rect for the page we want to render.
let pageRect = page.getBoxRect(.mediaBox)

// Optionally, specify a cropping rect. Here, we don’t want to crop so we keep `cropRect` equal to `pageRect`.
let cropRect = pageRect

let renderer = UIGraphicsImageRenderer(size: cropRect.size)
let img = renderer.image { ctx in
    // Set the background color.
    UIColor.white.set()
    ctx.fill(CGRect(x: 0, y: 0, width: cropRect.width, height: cropRect.height))

    // Translate the context so that we only draw the `cropRect`.
    ctx.cgContext.translateBy(x: -cropRect.origin.x, y: pageRect.size.height - cropRect.origin.y)

    // Flip the context vertically because the Core Graphics coordinate system starts from the bottom.
    ctx.cgContext.scaleBy(x: 1.0, y: -1.0)

    // Draw the PDF page.
    ctx.cgContext.drawPDFPage(page)
}

Here, we haven’t made use of the cropping rect, but we can use it for something like tiled rendering, which is where we split the PDF into multiple smaller images. Doing this has several benefits, such as faster rendering and lower memory usage when displaying a zoomed-in document.
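
To sketch the tiling idea, the following Swift function splits a page rect into a grid of smaller rects, each of which could serve as a `cropRect` for a separate render. The `tileRects(for:tileSize:)` helper and the 256-point tile size are assumptions made for this example.

```swift
import Foundation

/// Splits `pageRect` into tiles of at most `tileSize`, row by row.
func tileRects(for pageRect: CGRect, tileSize: CGSize) -> [CGRect] {
    precondition(tileSize.width > 0 && tileSize.height > 0, "Tile size must be positive")
    var tiles: [CGRect] = []
    var y = pageRect.minY
    while y < pageRect.maxY {
        var x = pageRect.minX
        while x < pageRect.maxX {
            // Clamp the last column and row so tiles never extend past the page.
            let width = min(tileSize.width, pageRect.maxX - x)
            let height = min(tileSize.height, pageRect.maxY - y)
            tiles.append(CGRect(x: x, y: y, width: width, height: height))
            x += tileSize.width
        }
        y += tileSize.height
    }
    return tiles
}

let tiles = tileRects(for: CGRect(x: 0, y: 0, width: 595, height: 842),
                      tileSize: CGSize(width: 256, height: 256))
// 3 columns by 4 rows; the tiles in the last column and row are smaller.
```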

One caveat of the UIGraphicsImageRenderer drawing API is that only a single page can be rendered at a time. If you want concurrent rendering, you’ll need to create multiple UIGraphicsImageRenderer instances for the same document and enqueue the rendering jobs in background queues.

PDFKit

We can use Apple’s PDFKit instead of Core Graphics for the same task. The code is almost the same as before. The differences are that we use PDFPage’s draw(with:to:) API to do the actual drawing and that PDFKit page indices start at 0.

Here’s how the code looks:

// Create a URL for the PDF file.
guard let path = Bundle.main.path(forResource: "filename", ofType: "pdf") else { return }
let url = URL(fileURLWithPath: path)

// Instantiate a `PDFDocument` from the PDF file's URL.
guard let document = PDFDocument(url: url) else { return }

// Get the first page of the PDF document.
guard let page = document.page(at: 0) else { return }

// Fetch the page rect for the page we want to render.
let pageRect = page.bounds(for: .mediaBox)

let renderer = UIGraphicsImageRenderer(size: pageRect.size)
let img = renderer.image { ctx in
    // Set and fill the background color.
    UIColor.white.set()
    ctx.fill(CGRect(x: 0, y: 0, width: pageRect.width, height: pageRect.height))

    // Translate the context so the flipped page lands inside the image bounds.
    ctx.cgContext.translateBy(x: -pageRect.origin.x, y: pageRect.size.height - pageRect.origin.y)

    // Flip the context vertically because the Core Graphics coordinate system starts from the bottom.
    ctx.cgContext.scaleBy(x: 1.0, y: -1.0)

    // Draw the PDF page.
    page.draw(with: .mediaBox, to: ctx.cgContext)
}

PSPDFKit for iOS

The PDF manipulation API provided by Apple in Core Graphics and PDFKit is quite decent. But, if you need extra customization — such as adding custom drawing (e.g. watermarks) on a PDF page or controlling the page color — you’ll need to write your own code.

PSPDFKit offers a comprehensive PDF solution for iOS and other platforms, along with first-class support.

The SDK comes with a fully featured PDF document viewer with a modern customizable user interface and a range of additional advanced features such as text extraction and search, full annotation and forms support, document editing, redaction, and much more.

To render a PDF as an image in PSPDFKit, you need to do the following:

// Create a URL for the PDF file.
guard let path = Bundle.main.path(forResource: "report", ofType: "pdf") else { return }
let url = URL(fileURLWithPath: path)

// Instantiate a `Document` from the PDF file's URL.
let document = Document(url: url)

let pageIndex: PageIndex = 0
guard let pageImageSize = document.pageInfoForPage(at: pageIndex)?.mediaBox.size else { return }

// Create a render request from your `Document`.
let request = MutableRenderRequest(document: document)
request.imageSize = pageImageSize
request.pageIndex = pageIndex

do {
    // Create a render task using the `MutableRenderRequest`.
    let task = try RenderTask(request: request)
    task.priority = .utility
    PSPDFKit.SDK.shared.renderManager.renderQueue.schedule(task)

    // The page is rendered as a `UIImage`.
    let image = try PSPDFKit.SDK.shared.cache.image(for: request, imageSizeMatching: [.allowLarger])
} catch {
    // Handle error.
}

The RenderTask API that we used in the above example is an asynchronous API that won’t block the main thread when rendering a PDF page to an image.

It includes an image cache by default, so repeated render calls for the same page are served directly from the cache. We also have a full-featured guide article about how to use the RenderTask API, and it outlines the process of rendering PDF pages.

Additionally, we provide another API to render a PDF page to an image: imageForPage(at:size:clippedTo:annotations:options:). One thing to keep in mind is that this method is synchronous, so calling it on the main thread will block the UI while a particularly large or complex PDF page renders.

Processing Rendered Images

When rendering a PDF page to an image, there are many ways to process the image before displaying or sharing it. We can add filters, transforms, and watermarks, as well as use a bunch of other image processing tools provided by Core Graphics to alter an image. In the following examples, we’ll showcase a few use cases where we do exactly that.

Using a Filter to Invert Image Colors

Inverting the colors of an image or showing it in grayscale is a common requirement. It could be used to show an image in the all-too-popular Dark Mode, for accessibility purposes, or to improve the reading experience with something like a sepia filter.

In the following example, we’ll show how to invert the color of an image.

The initial steps are almost identical to the simple page rendering example above:

// Create a URL for the PDF file.
guard let path = Bundle.main.path(forResource: "filename", ofType: "pdf") else { return }
let url = URL(fileURLWithPath: path)

// Instantiate a `CGPDFDocument` from the PDF file's URL.
guard let document = CGPDFDocument(url as CFURL) else { return }

// Get the first page of the PDF document. Note that page indices start from 1 instead of 0.
guard let page = document.page(at: 1) else { return }

// Fetch the page rect for the page we want to render.
let pageRect = page.getBoxRect(.mediaBox)

let renderer = UIGraphicsImageRenderer(size: pageRect.size)
let img = renderer.image { ctx in
    // Set and fill the background color.
    UIColor.white.set()
    ctx.fill(CGRect(x: 0, y: 0, width: pageRect.width, height: pageRect.height))

    // Translate the context.
    ctx.cgContext.translateBy(x: -pageRect.origin.x, y: pageRect.size.height - pageRect.origin.y)

    // Flip the context vertically because the Core Graphics coordinate system starts from the bottom.
    ctx.cgContext.scaleBy(x: 1.0, y: -1.0)

    // Draw the PDF page.
    ctx.cgContext.drawPDFPage(page)
}

Now, once we have the image object, we’ll create a filter and run the image through the filter:

// Create a `CIImage` from the input image.
guard let cgImage = img.cgImage else { return }
let inputImage = CIImage(cgImage: cgImage)

// Create an inverting filter and set its input image.
guard let filter = CIFilter(name: "CIColorInvert") else { return }
filter.setValue(inputImage, forKey: kCIInputImageKey)

// Get the output `CIImage` from the filter.
guard let outputCIImage = filter.outputImage else { return }

Finally, we convert the output CIImage into a UIImage to be able to use it:

// Convert the `CIImage` to `UIImage`
let outputImage = UIImage(ciImage: outputCIImage)

The images below show the result.

PSPDFKit directly supports adding CIFilters to a document’s render options. To get the same output as the example above, in PSPDFKit you need to do this:

guard let path = Bundle.main.path(forResource: "filename", ofType: "pdf") else { return }
let url = URL(fileURLWithPath: path)

// Instantiate a `Document` from the PDF file's URL.
let document = Document(url: url)
guard let pageInfo = document.pageInfoForPage(at: 0) else { return }

// Create an inverting filter.
guard let filter = CIFilter(name: "CIColorInvert") else { return }

// Set the inverting filter as a rendering option.
document.updateRenderOptions(for: .all) { (options) in
    options.additionalCIFilters = [filter]
}

do {
    let image = try document.imageForPage(at: 0, size: pageInfo.size, clippedTo: .zero, annotations: nil, options: nil)
    print(image)
    // Use the image.
} catch {
    // Handle error.
}

Watermarks

Adding a watermark to a PDF page is another common use case. In the Watermarking a PDF on iOS blog post, we describe how to add either a temporary or a permanent watermark to a PDF file, and also how to perform the same tasks using PSPDFKit’s API.

For reference, here’s how to add a permanent watermark to a PDF using PSPDFKit:

// Create default configuration.
guard let configuration = Processor.Configuration(document: document) else {
    // Handle failure.
    abort()
}

configuration.drawOnAllCurrentPages { context, page, cropBox, _ in
    let text = "PSPDF Live Watermark"

    // Add text over the diagonal of the page.
    context.translateBy(x: 50, y: cropBox.size.height / 2)
    let attributes: [NSAttributedString.Key: Any] = [
        .font: UIFont.boldSystemFont(ofSize: 100),
        .foregroundColor: UIColor.red.withAlphaComponent(0.5)
    ]
    text.draw(with: cropBox, options: .usesLineFragmentOrigin, attributes: attributes, context: NSStringDrawingContext())
}

do {
    // Start the conversion from `document` to `processedDocumentURL`.
    let processor = Processor(configuration: configuration, securityOptions: nil)
    try processor.write(toFileURL: processedDocumentURL)
} catch {
    // Handle failure.
    abort()
}

Miscellaneous

Apart from the two use cases mentioned above, there are a couple of other interesting components provided by PSPDFKit that I want to highlight. These allow you to annotate images or perform optical character recognition (OCR) on your images.

Annotating Images: PSPDFKit for iOS provides the ImageDocument class, which makes the process of annotating images easy. All you need to do is pass your image to this class, and we handle the rest. We even simplified the PDF controller configuration by providing a prebuilt configuration that adjusts the UI so that it works great for images. To learn more, check out our Annotate Images guide.

Here’s a code snippet that shows how simple it is to open an image as a PDF document:

// Create a URL for the image file.
guard let path = Bundle.main.path(forResource: "img", ofType: "png") else { return }
let url = URL(fileURLWithPath: path)

// Instantiate an image document from the image's URL and present it.
let imageDocument = ImageDocument(imageURL: url)
let pdfViewController = PDFViewController(document: imageDocument, configuration: PDFConfiguration.image)
let pdfNavigationController = PDFNavigationController(rootViewController: pdfViewController)

// Present the view controller.
self.present(pdfNavigationController, animated: true)

OCR: When dealing with PDF documents, you’ll sometimes encounter documents with inaccessible text — for example, when working with scanned or photographed pages.

PSPDFKit’s OCR component unlocks inaccessible text and makes it available for selecting, copying, and searching. If you’d like to check out the OCR component, we also have a handy integration guide.

Conclusion

It is quite easy to render a PDF file to an image using Core Graphics, PDFKit, and PSPDFKit for iOS.

Along with rendering an image, we also covered several tasks related to images in PDFs such as:

Applying a filter to an image
Adding a watermark to a PDF file
Annotating images
Performing OCR

Hope this helps.

PDF Syntax 101: A Simple Guide to PDF Object Types and How They Work

Zubin Ajmera — Thu, 06 Feb 2025 10:16:54 +0000

We’ll cover some aspects of how a PDF is structured internally and provide an overview of some of the building blocks the PDF format consists of.

The PDF format is composed of various objects organized and indexed within the file, enabling efficient data retrieval and display in PDF viewers.

Everything about the structure of a PDF is covered in the PDF Specification, although the spec is sometimes a bit vague, and the actual behavior of implementations, even Adobe’s own products, might differ slightly from it.

So when parsing a PDF, you’ll need to adjust for some edge cases and parse some things loosely, so as to not strictly reject everything that varies from the spec.

Since Nutrient already handles parsing and interpreting PDF files, even in the weirdest of edge cases, you don’t have to manually handle PDFs. But if you’re still interested in how a PDF looks under the hood and how the visual page representations are created, read on.

Introduction to PDF File Format

The Portable Document Format (PDF) is a versatile file format developed by Adobe in the 1990s. Designed to represent documents in a manner independent of application software, hardware, and operating systems, PDF files have become a staple for sharing and archiving information.

The PDF file format ensures that documents appear the same on any device, preserving the layout, fonts, and graphics. This fixed-layout format is ideal for presenting information consistently, making it a popular choice for everything from official reports to eBooks.

The structure and content representation of PDF files are governed by a set of rules and conventions, ensuring uniformity and reliability across different platforms.

File structure

This is what a simple PDF with one page and the text “Hello PSPDFKit” looks like when shown in raw text:

%PDF-1.7
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [ 3 0 R ] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [ 0 0 595 842 ] /Resources 4 0 R /Contents 5 0 R >>
endobj
4 0 obj
<< /ProcSet[ /PDF /Text ] /Font <</Font1 << /Type /Font /Subtype /TrueType /BaseFont /Helvetica >> >> >>
endobj
5 0 obj
<< /Length 55 >>
stream
BT
 /Font1 35 Tf
 1 0 0 1 170 450 Tm
 (Hello PSPDFKit) Tj
ET
endstream
endobj

xref
0 6
0000000000 65535 f
0000000009 00000 n
0000000062 00000 n
0000000125 00000 n
0000000239 00000 n
0000000343 00000 n
trailer
<< /Root 1 0 R  /Size 6 >>
startxref
382
%%EOF

Although this example makes PDFs look like text-based documents, that impression is misleading: PDFs are binary files. Streams are usually compressed, and the cross-reference table relies on exact byte offsets.

Header and Trailer

A PDF file is composed of several key components: the header, body, cross-reference table, and trailer. The header, located at the beginning of the file, states the PDF version the document conforms to (%PDF-1.7 in the example above).

This sets the stage for the rest of the file’s structure. At the opposite end, the trailer plays a crucial role in organizing the document. It includes the location of the cross-reference table and the root object, which is pivotal for the document’s structure.

The cross-reference table itself is a vital part of the PDF file format, mapping object numbers to their specific locations within the file. This mapping allows for random access to objects, enabling quick retrieval and efficient navigation within the document.

PDF objects

A PDF consists of so-called objects that can have varying types, like null, Boolean, integer, real, name, string, array, dictionary, and stream.

These objects can be referenced either directly or indirectly in the file. Direct objects are placed inline where they are used, while indirect objects are referenced and placed somewhere else inside the document.

Direct object reference

Direct objects are constructed inline, directly in the place where they are used.

This snippet shows how to use a font as a direct object:

<< /ProcSet[ /PDF /Text ] /Font <</Font1<</BaseFont/Helvetica/Subtype/TrueType/Type/Font>> >> >>

Indirect object reference

Indirect objects are referenced and placed somewhere else inside the document. This requires PDF viewers to look the actual object up.

Indirect objects are defined in the PDF starting with their unique ID (a positive object number), followed by a generation number, which is usually 0, and are wrapped in the obj and endobj keywords.

This snippet shows how to define and use a font as an indirect object:

3 0 obj
<</Name/Font1/BaseFont/Helvetica/Subtype/TrueType/Type/Font>>
endobj

4 0 obj
<< /ProcSet[ /PDF /Text ] /Font  <</Font1 3 0 R >> >>
endobj
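
To show what a viewer has to recognize here, the Swift sketch below parses a reference such as `3 0 R` into its object number and generation number. `IndirectReference` and `parseReference` are hypothetical names, and the sketch only handles a reference that stands alone in a string, not one embedded in a larger dictionary.

```swift
/// A parsed indirect reference such as `3 0 R`. Hypothetical type for this sketch.
struct IndirectReference: Equatable {
    let objectNumber: Int
    let generation: Int
}

/// Parses text of the form "<object number> <generation> R"; returns nil for anything else.
func parseReference(_ text: String) -> IndirectReference? {
    let parts = text.split(separator: " ")
    guard parts.count == 3, parts[2] == "R",
          let objectNumber = Int(parts[0]), objectNumber > 0,
          let generation = Int(parts[1]), generation >= 0 else {
        return nil
    }
    return IndirectReference(objectNumber: objectNumber, generation: generation)
}

let fontReference = parseReference("3 0 R")   // object 3, generation 0
let notAReference = parseReference("/Font1")  // nil
```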

Document Catalog and Page Tree

At the heart of a PDF file lies the document catalog, the root object that serves as the gateway to the document’s contents. The document catalog contains references to other objects that define the structure and content of the PDF.

One of the most important structures referenced by the document catalog is the page tree. The page tree is a hierarchical structure that organizes the pages of the document.

Each page in the PDF is represented by a page object, a dictionary that includes references to the page’s contents, such as text, images, and annotations.

The page tree is typically implemented as a balanced tree, ensuring efficient access and navigation, but it can also be a simple array of pages in smaller documents.

Cross reference

Now the question arises: How does a PDF viewer look up where an indirect object is located? This is done via the cross-reference table.

You might have noticed that the startxref keyword sits near the bottom of the PDF. Viewers start reading a PDF from the end of the file, so the pointer to the cross-reference table is placed at the bottom rather than the top.

The number after startxref states at which byte the cross reference (xref) table starts:

startxref
382
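Here's a minimal sketch of how a reader might pick up that number. It assumes the keyword appears in the last 1,024 bytes of the file, which is a window chosen for this example, and find_startxref is a hypothetical helper. The trailer dictionary in the sample bytes is also made up.

```python
def find_startxref(data: bytes) -> int:
    """Return the byte offset stored after the last startxref keyword."""
    # Only the tail of the file is needed; the 1,024-byte window is an assumption.
    tail = data[max(0, len(data) - 1024):]
    position = tail.rfind(b"startxref")
    if position == -1:
        raise ValueError("no startxref keyword found; the file may be damaged")
    # The first token after the keyword is the offset of the xref table.
    tokens = tail[position + len(b"startxref"):].split()
    if not tokens or not tokens[0].isdigit():
        raise ValueError("startxref is not followed by a byte offset")
    return int(tokens[0])


# Hypothetical end of a small PDF file.
example_tail = b"trailer\n<< /Size 6 /Root 1 0 R >>\nstartxref\n382\n%%EOF\n"
offset = find_startxref(example_tail)
```

With the offset in hand, the viewer can seek straight to that byte and start reading the table.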

The actual cross reference table defines the location for every object in the PDF:

xref
0 6
0000000000 65535 f
0000000009 00000 n
0000000062 00000 n
0000000125 00000 n
0000000239 00000 n
0000000343 00000 n

The line after the xref keyword, 0 6, shows that the table starts at object number 0 and contains entries for six objects. In addition to the location of every object in the document, the table needs an entry for object 0 at the top. This entry is marked as free (f) and has the generation number 65535.

Since our example PDF has five objects, the cross-reference table lists six entries (including the free entry for object 0). Each in-use (n) entry gives the byte offset of an object, followed by its generation number. This makes it easy for PDF viewers to directly jump to the defined object without having to parse the entire document.
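The table above can be read with a few lines of code. This is only a sketch: parse_xref is a hypothetical helper, and it assumes a table with a single subsection like the one in the example.

```python
XREF_TABLE = """xref
0 6
0000000000 65535 f
0000000009 00000 n
0000000062 00000 n
0000000125 00000 n
0000000239 00000 n
0000000343 00000 n
"""


def parse_xref(text: str) -> dict:
    """Map each object number to its byte offset, generation, and in-use flag."""
    lines = text.strip().splitlines()
    if lines[0].strip() != "xref":
        raise ValueError("table does not start with the xref keyword")
    # The subsection header holds the first object number and the entry count.
    first_object, count = (int(part) for part in lines[1].split())
    entries = {}
    for index, line in enumerate(lines[2:2 + count]):
        offset, generation, kind = line.split()
        entries[first_object + index] = {
            "offset": int(offset),
            "generation": int(generation),
            "in_use": kind == "n",  # "f" marks a free entry.
        }
    return entries


table = parse_xref(XREF_TABLE)
# Object 3 starts at byte 125, so a viewer can seek there directly.
font_offset = table[3]["offset"]
```

If this table is damaged, the offsets can no longer be trusted, and a viewer has to fall back to scanning the file for object definitions.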

FAQ

Here are a few frequently asked questions about PDF syntax.

What is the internal structure of a PDF?
A PDF consists of objects like null, Boolean, integer, real, name, string, array, dictionary, and stream, which form the building blocks of the document structure. Together, these objects define both the logical outline and the physical layout of the document.

What is an indirect object in a PDF?
Indirect objects are referenced and placed elsewhere in a PDF document and require PDF viewers to look them up.

How does a PDF viewer find objects in a PDF?
PDF viewers use a cross-reference table, which lists the locations of objects, enabling the viewer to jump directly to the object.

What does the cross-reference table do in a PDF?
The cross-reference table maps object locations in the PDF, making it easy for the viewer to access them without parsing the whole file.

How are PDFs read by viewers?
Viewers typically start reading a PDF from the end of the file, where the startxref keyword indicates the byte offset at which the cross-reference table starts.

Why PDFium Remains the Most Trusted PDF Rendering Platform: Debunking the Myths

Zubin Ajmera — Tue, 04 Feb 2025 07:58:52 +0000

In recent years, myths about the security of open source technology, including PDFium, have circulated.

To clarify these misconceptions, today’s focus is on PDFium, a popular open source PDF rendering platform.

About PDFium

PDFium is a powerful and liberally licensed library designed for PDF rendering, inspection, manipulation, and creation. It’s widely used in various applications, including web browsers, document viewers, and editors.

It provides a comprehensive set of functions for working with PDF files, such as rendering pages and extracting text and images. Its versatility and robust feature set make it a great library for developers working with PDF files.

Purpose and intended audience
This article targets decision makers who are choosing a reliable PDF processing technology. Whether you’re developing a web, desktop, or mobile app, understanding the true capabilities of PDFium is crucial.

I’ll analyze some of the myths and misconceptions regarding the security of open source, as well as the open source technology used and trusted by literally billions of people (no, that’s not a typo) around the world.

My goal is to help you, the reader, come to your own conclusions about what really is fact versus what is fiction.

Myth #1 — Open source technology is insecure because the source code is open to the public

Have you ever heard anyone say: “Open source technology is insecure because all the source code is completely open to the public?”

Unfortunately, this is one of the biggest myths regarding open source technology, and it’s typically used by companies who’d rather spend their resources attacking their competition as opposed to innovating or contributing to a community.

So let’s analyze this argument a bit further.

Fact #1 — You’re probably already using open source technology and you don’t even know it

Just take a look at the latest statistics in Figure 1 below from the independent market research site statcounter.com regarding the global usage of web browsers.

Figure 1 — Browser market share worldwide (June 2022)

As you can see, Google Chrome dominates the market with a whopping 65 percent of the global market share. And when you add up all the statistics for the top four browsers (Chrome, Safari, Edge, and Firefox), you see that all the major browsers command a combined total of more than 91 percent of the global market.

And in case you were unaware, all of those web browsers are either fully open sourced, or they embed open source technology.

So now think about this conclusion personally: If you use Google Chrome, Apple Safari, Microsoft Edge, or Mozilla Firefox, you’re already using open source technology. That’s a fact.

Another major thing to consider: If your company (or business) standardizes on any of the web browsers above, then they’re standardizing on tools that are (or embed) open source technology.

Now take a look at Figure 2 below, illustrating how many people worldwide use open source web browsers, to see how ubiquitous and pervasive open source software is.

Figure 2 — Infographic: Worldwide usage of open source web browsers (June 2022)

At Nutrient, we adopted the use of the open source platform PDFium within our tools and APIs for developers. And with that, let’s address another myth.

Myth #2 — PDFium is an insecure PDF rendering engine

Now, without getting into the technical details of the various PDF specifications and how PDF toolkits (such as Nutrient) work, understand that PDF tools and toolkits are typically split into two parts.

One part reads and processes the text and binary information encapsulated inside a PDF document (this part is typically called the PDF parser). The other part is responsible for taking the parsed information (text, images, etc.) inside the PDF document and visualizing it for the user (this part is called the PDF renderer).

Figure 3 — The architecture for PDF tools such as Nutrient is split into two parts: PDF parsers and PDF renderers

Now, although we’ve demonstrated the widespread and ubiquitous use of open source technology, some may argue that the open source PDF renderer PDFium, in particular, is inherently insecure. So, again, let’s look at the facts.

Fact #2 — Major companies contribute to or use PDFium

I love arguing this point because I can let the facts speak for themselves. Guess what Google, Microsoft, Amazon, Dropbox, and (yes) Nutrient all have in common?

All of us are either contributors to the publicly available PDFium open source project, and/or we directly embed PDFium in the products we create for our end users. That’s a fact.

Google uses PDFium inside Chrome (the most widely used browser in the world).

Microsoft uses PDFium inside Edge (the default web browser in Windows 10 and 11).

Amazon uses PDFium inside Amazon Echo and Fire TV products.

Dropbox uses PDFium inside its client tools to preview files.

Figure 4 — Nutrient participates in a community of users and contributors to the open source PDFium project, alongside Google, Microsoft, Amazon, and Dropbox

Fact #3 — PDFium is an active and well-maintained open source project

As an active member of this vibrant and evolving community, Nutrient is passionate about and dedicated to the success, stability, and security of the open source PDFium project, which is continuously maintained and improved with new features that are channeled back to our customers.

Have you ever heard the phrase, “If you want to go FAST, then go alone, but if you want to go FAR, then go together?”

This is the mindset I instill in every employee at Nutrient, and it’s why we participate in the community of PDFium users and contributors.

In this community, each company has its own business case and reasoning for embedding PDFium within individual platforms. However, we’re jointly committed to the success of the project.

Conclusion

I hope this helps you get a good understanding of the common myths and why PDFium is a reliable PDF rendering platform you can trust.

FAQ

Here are a few frequently asked questions about PDFium.

What makes PDFium a trusted platform for PDF rendering?
PDFium’s trustworthiness stems from its open source nature, contributions from major tech companies, and its integration in widely used products like Google Chrome and Microsoft Edge.

How does open source technology contribute to the security of PDFium?
Open source technology allows a global community of developers to continuously audit and enhance PDFium, ensuring vulnerabilities are quickly addressed and security is maintained.

What is the role of PDFium in popular web browsers?
PDFium powers the PDF rendering capabilities in browsers like Google Chrome and Microsoft Edge, providing reliable and efficient document handling.

Why is PDFium considered a well-maintained open source project?
PDFium is actively maintained by a broad community, including major corporations, which contributes to its stability, security, and ongoing feature improvements.

How does Nutrient utilize PDFium in its products?
Nutrient integrates PDFium to offer advanced PDF rendering and manipulation features, benefiting from PDFium’s reliability and ongoing development.

Sources
The list of companies that contribute to PDFium, which includes Google, Microsoft, and Dropbox
The list of open source software used by Amazon Echo devices, which includes PDFium
The Dropbox.Tech blog discussing the performance of PDFium in Dropbox software
Microsoft Edge uses Chromium, which includes PDFium

How to Merge PDFs Using PHP

Zubin Ajmera — Thu, 30 Jan 2025 09:42:54 +0000

In this quick post, I’ll show you how to combine multiple PDF files.

With Nutrient's API, you receive 100 credits with the free plan. Different operations on a document consume different amounts of credits, so the number of PDF documents you can generate may vary.

You’ll just need to create a free account to get access to your API key.

This post will be especially helpful for developers working with PHP in document-heavy workflows where users upload a large number of documents. This API will enable you to automate merging documents in your workflows.

A simple example could be an HR application where users upload resumes, cover letters, and references. By integrating a PDF merging API into the workflow, you’ll be able to automatically merge these documents and provide a consolidated document to your end users.

Nutrient API

Document merging is just one of the 30+ PDF API tools. You can combine our merging tool with other tools to create complex document processing workflows, such as:

Converting MS Office files and images into PDFs before merging

Performing OCR on several documents before merging

Merging, watermarking, and flattening PDFs

Once you create your account, you’ll be able to access all our PDF API tools.

Step 1 — Creating a Free Account

Create your free account here.

Once you’ve created your account, you’ll be welcomed by the page below, which shows an overview of your plan details.

As you can see in the bottom-left corner, you’ll start with 100 credits to process, and you’ll be able to access all our PDF API tools.

Step 2 — Obtaining the API Key

After you’ve verified your email, you can get your API key from the dashboard. In the menu on the left, click API Keys. You’ll see the following page, which is an overview of your keys:

Copy the Live API Key, because you’ll need this for the Merge PDF API.

Step 3 — Setting Up Files and Folders

Now, create a folder called merge_pdf and open it in a code editor. For this tutorial, you’ll use VS Code as your primary code editor. Next, create two folders inside merge_pdf and name them input_documents and processed_documents.

Then, in the root folder, merge_pdf, create a file called processor.php. This is the file where you’ll keep your code.

Step 4 — Writing the Code

Open the processor.php file and paste the code below into it (you can get the full click-to-copy code by visiting here).

ℹ️ Note: Make sure to replace YOUR_TOKEN_HERE with your API key.

Code Explanation

In the code above, you created and opened the php_result.pdf file under processed_documents. Then, you created a variable called instructions that contains all the information regarding the payload.

In the next part, you made a cURL POST request to your API endpoint. In the parameters, first_half contains the first PDF, while second_half contains the second PDF.
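As a rough sketch of the instructions payload described above, the snippet below builds the JSON that lists the two uploaded files in merge order. It's shown in Python only to illustrate the shape of the JSON; in processor.php, the same structure would be produced with PHP's json_encode. The parts and file keys are assumptions about the API's schema, and build_merge_instructions is a hypothetical helper.

```python
import json


def build_merge_instructions(part_names):
    """Build the JSON instructions string listing the uploaded files in merge order."""
    # Assumed schema: each part refers to a form field that carries one PDF.
    return json.dumps({"parts": [{"file": name} for name in part_names]})


# first_half and second_half are the form field names used for the two PDFs.
instructions = build_merge_instructions(["first_half", "second_half"])
```

The order of the parts determines the page order of the merged document, so first_half ends up before second_half.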

Step 5 — Output

To execute the code, run the command below:

php processor.php

On successful execution, you’ll see a new processed file, php_result.pdf, which is located in the processed_documents folder.

The folder structure will look like this:

Final Words

You learned how to easily and seamlessly merge files for your PHP application into a single PDF using our Merge PDF API.

If you have a more complex use case, you can use other tools to add watermarks, perform OCR, and edit (split, flatten, delete, duplicate) documents — and you can even combine these tools.

Hope this helps. Whenever you're ready, I'm here to help. We have specific PDF SDK solutions that could be helpful for your applications, requirements, and use case.

PDF Bookmarks vs. Outlines, What Developers Need to Know

Zubin Ajmera — Mon, 27 Jan 2025 14:13:13 +0000

Think of your favorite book in its physical format. Most books have a section that outlines the contents of the book with section titles and the pages where they can be found.

Being able to quickly glance at a document’s contents can really help with the experience of consuming content — especially when dealing with lengthy pieces of work.

In the same way, the PDF spec defines support for document outlines that let users navigate documents with ease and in a speedy manner, allowing them to jump from one section of a document to another one immediately.

One of the main characteristics of the document outline (also referred to as a table of contents) is that its structure resembles a tree of items.

That is, an outline item can have subitems. This allows for the outline to be able to present a rather detailed view of the contents of the entire document in a really convenient way.
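To picture the tree, here's a minimal sketch of an outline modeled as nested items. OutlineItem, its fields, and the sample titles are hypothetical names for this illustration and don't correspond to any particular PDF library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OutlineItem:
    """One entry in the outline; children make it a tree."""
    title: str
    page_index: Optional[int] = None
    children: List["OutlineItem"] = field(default_factory=list)


def flatten(item: OutlineItem, depth: int = 0) -> list:
    """Walk the tree depth-first and return (depth, title, page_index) rows."""
    rows = [(depth, item.title, item.page_index)]
    for child in item.children:
        rows.extend(flatten(child, depth + 1))
    return rows


# A made-up outline with one nested level.
outline = OutlineItem("Contents", children=[
    OutlineItem("Chapter 1", page_index=0, children=[
        OutlineItem("Section 1.1", page_index=1),
        OutlineItem("Section 1.2", page_index=4),
    ]),
    OutlineItem("Chapter 2", page_index=9),
])

for depth, title, page_index in flatten(outline):
    print("  " * depth + title, page_index)
```

A viewer's outline panel is essentially this walk, with the depth turned into indentation and each row linked to its page.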

Outlines and bookmarks

Outlines and bookmarks often confuse those who are not familiar with how PDFs work, as they’re pretty similar in both definition and how they function.

However, they do have subtle differences that can end up frustrating someone if they’re not taken into consideration.

Is a bookmark an outline element?

Fact #1: The PDF spec conflates outline elements and bookmarks. It states the following:

“The outline consists of a tree-structured hierarchy of outline items (sometimes called bookmarks), which serve as a visual table of contents to display the document’s structure to the user.”

As a matter of fact, Adobe’s software treats outline elements and bookmarks the same, which can be even more confusing.

When you open a PDF in Acrobat and click on the bookmark icon, what it shows you is actually the outline of the PDF.

When you add a bookmark to the document using Acrobat, what it really is doing is modifying the document’s outline to include the user-defined item.

Fact #2: The PDF spec contains no official way to support bookmarks, which means every PDF software vendor gets to decide how they implement bookmark support.

You can test this if you have a copy of Acrobat. Open a PDF in Acrobat, open the Bookmarks (Outline) panel, and add a new bookmark in the current page you’re on.

Save the PDF and then open the updated document in a third-party PDF reader (in this blog post, I’ll be using Preview.app for the Mac, but your results shouldn’t vary too much if you’re using a different PDF viewer).

Click on the View Menu icon and select Bookmarks from the dropdown menu. You’ll be presented with an empty list.

But if you select Table of Contents from that menu, you’ll see the “bookmark” you created in Acrobat listed there.

You can also test it the other way around. Open the PDF first in Preview.app and press ⌘ + D to add a bookmark on the current page. Save the PDF and open it with Acrobat. The bookmark you added is nowhere to be found.

Philosophical differences

So what’s all this about? Why can’t PDF software vendors agree on a way to handle bookmarks in PDFs across the industry? Well, the answer is that everyone has their own idea of what a bookmark is.

Let’s go back to the example at the beginning of this post, where you were thinking of your favorite book in a physical format.

When you read a really interesting passage in that book, you might be inclined to highlight that page for future reference. You bookmark it.

Would you say that highlight should be part of the book’s table of contents?

For some people, it should. For others, it would be difficult to argue that everyone reading the same book in the future would be interested in the same passage that you highlighted.

What’s for sure, though, is that there’s no right answer. Like books, PDF documents are used in a wide range of industries and applications, all with different objectives.

There’s no one-size-fits-all approach for how references to a page should be saved, so the interpretation is up for grabs for any vendor that wants to include some sort of bookmarking support in its software — if at all.

Nutrient views the outline as part of the document: It provides a hierarchical structure to it, along with a way for users to navigate its different sections in a nimble manner.

Bookmarks, on the other hand, are seen as information that’s laid on top of the document. They help the user, and they’re heavily context-dependent, without hierarchical order.

Unfortunately, this means that the bookmarking experience is certainly going to vary for the end user when they carry the same document across platforms and different vendor software (e.g. Acrobat on the Mac, but PDF Viewer on their iPad).

Nutrient tries to be a good citizen, so we’ve gone to great lengths to make sure that bookmarks created with Preview.app are available for our users, by storing them inside the PDF file in a format that can be shared across platforms.

Bookmarks created with Acrobat, on the other hand, are really outline elements, which we support. They won’t, however, be available as bookmarks on Nutrient.

Navigating a document with Nutrient

Nutrient can display document outlines via PSPDFOutlineViewController. This view is interactive, and it lets the user navigate the outline of a document.

Most outline items define a simple GoTo action that jumps to the appropriate page index when selected.

The PDF spec also defines a way to specify the location of the window on the page and the zoom level to be applied when the user is taken to the page associated with the outline item. This can be achieved via PDF Destinations, which are currently not supported by Nutrient.

Nutrient offers a way to programmatically access the outline of a document with the PSPDFOutlineParser and PSPDFOutlineElement classes.

An instance of PSPDFOutlineParser is automatically created for you when you query the document’s outline via -[PSPDFDocument outline].

For instance, consider the following document outline:

Quickstart Guide

Introduction

Getting Started

Integration
- Swift 4
- Objective-C
- More

…

Get a reference to the Objective-C item and execute its GoTo action:

The example above programmatically retrieves an outline element and executes its associated PDF action, mimicking what PSPDFOutlineViewController does when the user interacts with it.

You can read more about how Nutrient supports bookmarks on our blog.

Conclusion

When it comes to navigating a document, there’s no one-size-fits-all solution. Some people may prefer the hierarchical structure that an outline provides.

Others will mainly prefer interacting with one-off page markers that they can arrange how they please and store where they want.

However, it all comes down to offering a good experience for the final user. The approach we take at Nutrient strikes a nice balance that offers compatibility with one of the most popular applications on the planet, by supporting Preview.app bookmarks, but still respects the PDF specification by allowing the user to interact with “bookmarks” created with Acrobat.

FAQ

Here are a few frequently asked questions about PDF bookmarks and outlines.

What is the difference between bookmarks and outlines in a PDF?

Bookmarks are user-defined references that can vary by software, while outlines provide a hierarchical structure of a document’s contents.

How are outlines structured in PDFs?
Outlines in PDFs resemble a tree structure, allowing for subitems, and providing a detailed view of a document’s sections.

Why do different PDF viewers handle bookmarks differently?
Each PDF software vendor has its own interpretation of bookmarks, leading to inconsistencies in how they’re implemented across platforms.

Can bookmarks created in one PDF viewer appear in another?
Not always. Bookmarks may not transfer between viewers, as support differs based on the software used.

How does Nutrient handle bookmarks and outlines?
Nutrient supports outlines as part of the document structure and provides compatibility for bookmarks created in certain applications.

Hope this helps. Whenever you're ready, I'm here to help. We have specific PDF SDK solutions that could be helpful for your applications and use case.

The 6 Best PDF Generator APIs

Zubin Ajmera — Fri, 24 Jan 2025 08:01:02 +0000

In this post, I’ll provide a detailed breakdown of the most popular PDF generator APIs. To help you pick the best one for your use case, we’ll cover:

What a PDF generator API is
Common use cases
The selection criteria for PDF generator APIs
How pricing models are set up
The pros and cons of the most popular PDF generator APIs

Let's get started...

What is a PDF generator API?

A PDF generator API is a hosted service where you send structured data via an API call to a server and receive a PDF document back.

Traditionally, if you wanted your application to generate a PDF, you’d have to set up a server and integrate a PDF generation library on that server.

Using a PDF generator API simplifies that process. Now, you just need to add a small piece of code to your application to start generating PDFs.

Not only is using an API to generate PDFs easier to set up initially, but it’s also easier to maintain.

You won’t have to deal with server management issues (keeping it secure, keeping the operating system up to date, fixing breakdowns, upgrading it to keep it fast, etc.) or keeping your PDF generation library updated.

However, forgoing a server and using a PDF generation API isn’t always the best solution. In situations where you want to control your data completely, it makes more sense to manage your own server.

You can use a self-hosted Document Engine that you add on your server to do this.

To sum up: If you want a simple, convenient way to generate PDFs for your application, using an API is the perfect solution. If you’re looking for total control over your data, or if you have a specific use case in mind, using your server may serve your needs better.

Benefits of using a PDF generation API

Using a PDF generation API can bring numerous benefits to your business. Below are some of the most significant advantages.

Increased efficiency — Automate the process of generating PDF documents, saving time and resources. By integrating a PDF generation API, you can streamline workflows and reduce manual effort, allowing your team to focus on more critical tasks.
Improved accuracy — Reduce errors and inconsistencies in your PDF documents with automated generation. Manual document creation can lead to mistakes, but a PDF generation API ensures your documents are accurate and consistent every time.
Enhanced customization — Easily create custom PDF templates and merge data to generate personalized documents. Whether you need to create invoices, reports, or certificates, a PDF generation API allows you to design stunning PDF documents that reflect your brand.
Scalability — Handle large volumes of PDF generation with ease, without compromising on performance. As your business grows, a PDF generation API can scale with you, ensuring you can generate PDF documents efficiently, regardless of volume.
Cost savings — Reduce costs associated with manual document generation and maintenance. By automating the process, you’ll save on labor costs and minimize the need for additional software or hardware.

Common use cases for PDF generator APIs

The following section outlines a few standard use cases for when a PDF generator API might be practical or necessary.

Automating invoice generation
One of the most common PDF generation API use cases we see is from customers who need to automate their customers’ invoicing processes.

By integrating an API into your workflow, you can start passing billing or purchase information into an invoice template to automatically generate a PDF invoice. This can significantly speed up the process and reduce the risk of human error.

Streamlining report generation
If you have an application that generates custom reports for your users, you likely also need to provide them with PDF versions of those reports.

A PDF generation API lets you streamline this process by pushing data into a report template and creating the PDF report. For multiple reports, you can merge them into a consolidated document.

Creating financial and legal documents
Within financial and legal software, data is often captured in a custom workflow. Users are regularly required to fill out lengthy loan applications, insurance contracts, or homeownership agreements, and they’ll need copies of the completed contracts for their records.

With a PDF generation API, you can push a user’s data into standardized templates and create a PDF document that consolidates all their information.

Enhancing human resources documentation
By integrating your HR management and payroll software with a PDF generation API, you’ll be able to pull data directly from your databases and populate documents automatically. Easily generate payslips, employment contracts, employee reviews, employment policies, and other important documents.

Logistics
Documentation is at the heart of logistics operations. Bills of lading, consignment notes, packing lists, various certificates of origin, and invoices all need to be prepared, printed out, and shared with authorities and recipients of goods globally.

This process can be simplified by integrating a PDF generation API with logistics software, enabling all of this documentation to be generated automatically.

Education
An API can help you generate a certificate with your students’ names and completion dates for online courses or training programs.

It can be integrated with a learning management system (LMS) to support dynamically generating academic transcripts, report cards, program outlines, application forms, and more.

Key selection criteria for PDF generator APIs

There are a few factors to consider when choosing a PDF generator API, which we’ve outlined below.

Evaluating API accuracy
One of the most critical factors when evaluating PDF generation APIs is accuracy. In this context, accuracy is all about how closely the generated document matches your original template.

When you generate a document, does the API match the fonts defined in your template? Are images like your logo, product photos, and watermarks rendered correctly in the final PDF?

Assessing reliability and uptime
When integrating an API, reliability means your service isn’t interrupted. To evaluate reliability, focus on the following factors:

Test the API’s ability to process large documents.

Track and record any service disruptions or downtime.

Check if the provider offers a service-level agreement (SLA) for guaranteed uptime.

Monitor latency, or how quickly the API responds to requests.

Security considerations
When integrating a hosted API solution into your software or application, there are several security factors to consider:

Encryption — Confirm that the API provider encrypts your data in transit. Encrypting data in transit protects your data if communications are intercepted when data moves between your application and the API service used for generating your PDF.
Storing data — Investigate what happens to your data once you’ve passed it to the API. Does the API provider immediately delete that data from its servers, or does it purge it after a set time period?
Security certifications — Determine if the API provider is investing in security certifications and compliance. The most common are SOC reports and compliance with HIPAA and GDPR.

Information
The most secure PDF generation option is a self-hosted option. At Nutrient, we offer a PDF processor you can host in your own environment.

Features
When it comes to evaluating features in a PDF generation API, there are two questions worth considering:

Does the API offer the features I need to accomplish my tasks?
Will this API meet any additional document processing needs I might have in the future?

Common PDF generator features

PDF generation APIs generally fall into two categories — those that convert HTML to PDF, and those that use templates to generate the PDF.

Here are some of the most common features offered by these APIs:

Adding a repeating header and footer
Out-of-the-box templates
The ability to convert from URLs
HTML-to-PDF conversion
Drag-and-drop editors

Future-proof APIs

If you plan to provide additional PDF processing capabilities in the future, consider finding an API that offers a wider set of features.

Working with one API solution for all document processing will often save setup time and have a lower total cost. The most popular document processing API features are:

eSigning API — Adding electronic signatures to PDFs
OCR — Extracting text from an image and converting to PDF
Editing — Deleting, merging, adding, and rotating PDF pages
Conversion — Converting MS Office and image files to PDF

Key features to look for in a PDF generation API

When selecting a PDF generation API, consider the following key features:

Template support — Look for an API that supports reusable PDF templates and allows for easy customization. This feature enables you to create consistent and professional-looking documents quickly.
Data merge — Ensure the API can merge JSON data with templates to generate dynamic PDF documents. This capability is crucial for creating personalized documents that pull data from various sources.
PDF conversion — Choose an API that supports PDF conversion from various file formats, such as HTML, images, and more. This flexibility allows you to convert different types of content into PDF files seamlessly.
Security — Opt for an API that prioritizes data security and provides features like encryption and access controls. Protecting sensitive information is essential, especially when dealing with financial or legal documents.
Integration — Select an API with easy integration options, such as REST APIs, SDKs, and tutorials. A well-documented API with robust integration options ensures a smooth implementation process.
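To illustrate the data merge idea from the list above, here's a minimal sketch that fills a reusable template with JSON data using Python's standard library. The template, the field names, and the merge helper are all made up for this example. It only produces the HTML that an HTML-to-PDF API would then convert; it doesn't generate a PDF itself.

```python
import json
from string import Template

# A reusable template with named placeholders (hypothetical fields).
INVOICE_TEMPLATE = Template(
    "<h1>Invoice $number</h1>\n"
    "<p>Billed to: $customer</p>\n"
    "<p>Total due: $total</p>"
)


def merge(template: Template, json_payload: str) -> str:
    """Fill the template with values from a JSON object."""
    data = json.loads(json_payload)
    # substitute() raises KeyError if the data is missing a placeholder,
    # which catches incomplete records before a document is generated.
    return template.substitute(data)


payload = '{"number": "INV-001", "customer": "Example Corp", "total": "120.00 USD"}'
html = merge(INVOICE_TEMPLATE, payload)
```

Keeping the template separate from the data is what makes the output consistent: the layout is defined once, and only the values change from one document to the next.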

Pricing models for PDF generator APIs

The most common PDF generation API packages start between 500 and 2,500 documents but can range from 50 documents all the way up to 1,000,000.

When exploring different pricing options, you’ll typically find the following:

Free tier — Provides you with a small number of free API calls per month.
Monthly/annual subscription — Provides several packages with varying amounts of API calls that are replenished every month.
Pay-as-you-go option — Lets you buy a set number of API calls that you can use over a specified time period.

Document-based pricing
This is the simplest payment model. You’ll be charged per document processed and won’t have to consider file size, datasets being merged, or different API actions being called.

Credits consumed by file size
Some API vendors require more credits for processing large files. For example, every megabyte of data could consume one credit.

In this scenario, it can be hard to predict how many credits you need, because you have to understand the size of documents you’re processing every month.
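As a rough illustration of size-based pricing, the sketch below estimates credits for a batch of files at one credit per megabyte. Rounding each file up to a whole credit is an assumption made for the example, as are the file sizes; actual vendors may round or bill differently.

```python
import math


def credits_for_files(sizes_mb: list[float], credits_per_mb: float = 1.0) -> int:
    """Estimate credits for a batch of files.

    Assumption: each file is rounded up to a whole number of credits.
    """
    return sum(math.ceil(size * credits_per_mb) for size in sizes_mb)


# Hypothetical month: three files of 0.4 MB, 2.5 MB, and 12 MB.
monthly_files_mb = [0.4, 2.5, 12.0]
print(credits_for_files(monthly_files_mb))  # 1 + 3 + 12 = 16
```

The estimate is only as good as your knowledge of the file sizes, which is exactly why this model is hard to budget for.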

Credits consumed by dataset requests
Some API providers will charge you for each dataset contained in your request. If you plan on using an API with this model, it’s crucial to understand how many datasets you’ll be pulling data from to accurately calculate pricing.

Credits consumed by API action
Some solutions consume a credit each time an API action is performed. If you’re only using the API to perform one action per document, this pricing is straightforward.

However, if you have a workflow with multiple processing actions, it’ll be harder to predict the total cost of the solution. For example, if your workflow involves generating a PDF and then overlaying it with a watermark, you’ll need to spend two credits.
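The watermark example can be written as a small calculation. This is a sketch that assumes every action costs the same number of credits; the action names and the document volume are illustrative only.

```python
def action_credits(actions: list[str], documents: int, credits_per_action: int = 1) -> int:
    """Credits consumed when every action on every document costs the same."""
    return len(actions) * documents * credits_per_action


# Generate a PDF, then overlay a watermark: two actions per document.
workflow = ["generate", "watermark"]
print(action_credits(workflow, documents=1))    # 2
print(action_credits(workflow, documents=500))  # 1000
```

Adding a third step to the workflow raises the monthly bill by half again, so the cost scales with workflow complexity as well as volume.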

Paying for processing time
With this model, you purchase processing time per month. If each request you make takes a minute to process, you’ll need to purchase 100 minutes to process 100 documents in a month.

This allows API vendors to limit use and charge more for larger files and more complex processes. However, this method isn’t that great for the user, as it’s difficult to estimate how much processing time you’ll need.
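To see why the estimate is difficult, consider this sketch of the underlying arithmetic. It assumes you already know how long each request takes and that time is sold in whole minutes; both are assumptions for illustration, and in practice the per-request durations are the part you usually can't predict.

```python
import math


def minutes_to_purchase(seconds_per_request: list[float]) -> int:
    """Total processing time for the month, rounded up to whole minutes."""
    return math.ceil(sum(seconds_per_request) / 60)


# 100 documents that each take one minute to process.
print(minutes_to_purchase([60] * 100))  # 100

# Two requests of 90 and 45 seconds add up to 2.25 minutes.
print(minutes_to_purchase([90, 45]))    # 3
```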

Top 6 best PDF generator APIs

Now that you have an overview of what a PDF generator API is, how it’s used, how to choose one, and how costs are determined, here are six of the best PDF generator APIs available.

1. Nutrient DWS API

This might seem like a bit of a brag, but we pride ourselves on offering the best solution on the market. (Humble brag allowed?)

Since 2011, we at Nutrient have been working with the PDF specification and have developed one of the most comprehensive PDF SDKs available on the market.

During that time, we’ve been able to work with clients like Disney, IBM, UBS, and Dropbox to help them improve how their users work with PDF documents.

Nutrient DWS API is our first hosted product — an API that offers more than 30 document processing tools (with many more tools planned).

Our PDF Generator API gives you the ability to generate PDF files from HTML templates. You can style the CSS, add unique images and fonts, and persist your headers and footers across multiple pages.

Additionally, Nutrient DWS API offers:

Support for Postman collections
Language-specific documentation with sample code
Support for custom CSS and HTML
Support for customizing headers and footers

In addition to generating PDFs, one of the key benefits of Nutrient DWS API is that you can combine additional document processing tools in your PDF generation workflow, including:

PDF editing
OCR
Watermarking
Document conversion (supports more than 10 file types)

Documents supported
DOCX
DOC
XLSX
XLS
PPTX
PPT
PNG
JPG
TIFF
HTML

Pricing
Nutrient has a simple pricing policy, which is based strictly on the number of documents you need to generate.

You’re not limited by document size or the number of datasets requested. Additionally, you get an unlimited number of API actions for each document.

All API actions — such as conversion, OCR, watermarking, flattening, merging, splitting, and editing — spend just one credit when combined in a single document.

This is helpful when you’re building complex applications and workflows. Many other vendors charge based on the number of actions, which results in workflows being both difficult to develop and costly.

Nutrient offers multiple plans with subscription-based pricing, allowing a varying number of documents to be processed. You can learn more about the pricing on our website.

If you want to test it out, you can create a free account that comes with 100 free credits. Different operations on a document consume different amounts of credits, so the number of PDF documents you can generate may vary.

2. PDF Generator API

PDF Generator API is one of the most popular solutions and offers a flexible REST API and template editor you can use to generate PDF documents.

You first need to create a template, and then you need to pass the template ID and the JSON data through the API to generate the PDF.

Documents supported
JSON

Pricing

PDF Generator API has five different pricing plans that vary in the number of documents you’ll be able to generate. It charges its users by the number of “merges” per document generated.

What this means is that if you pull data from three different sources to generate your PDF, you’ll be charged for three “merges” for generating that document, instead of just one.

3. APITemplate.io

APITemplate.io is a flexible template editor you can use to generate PDFs. It has an online editor that accepts HTML, Markdown, and WYSIWYG and converts these formats into PDF.

You can also integrate this tool into Zapier, n8n, and Integromat. The best feature of this tool is that you can preview the live PDF before publishing it. Additionally, APITemplate.io supports headers and footers with page metadata like the page number and the total number of pages.

Documents supported
HTML
Markdown
WYSIWYG

Pricing

APITemplate.io has a simple pricing structure with three pricing tiers. Each tier increases the number of PDFs you can generate and the number of templates you can use.

4. PDFMonkey

PDFMonkey is a simple REST API that takes HTML or JSON data to generate a PDF. You can also load external resources — like images, CSS, and JavaScript — in a template to generate the PDF. The API is straightforward and easy to use.

Documents supported
HTML
JSON

Pricing

PDFMonkey has a simple pricing model. The free tier subscription allows you to generate 300 PDFs per month.

In addition, there are three paid tiers that increase the number of PDFs you can generate. There aren’t any catches or complex pricing strategies. Customers who pay annually also get a 10 percent discount.

5. Anvil PDF Generation API

This is another very simple API that takes HTML and CSS or Markdown to generate the PDF. You just need to pass the HTML and CSS as a string, and it returns the PDF.

It also has a Postman collection that you can import into Postman.

The documentation is nicely written and has code for rendering PDFs in React and Vue.js. You can also use Anvil’s Node client library to encrypt the data payloads with your public key.

Documents supported
HTML and CSS
Markdown

Pricing
Anvil doesn’t have monthly plans for its PDF generation API; you’re charged when you make a PDF generation request.

6. Paperplane

The way Paperplane generates PDFs is different to other APIs. You need to have an AWS S3 bucket ready with upload permissions.

You then have to put the API keys of your AWS account under the destinations, and only then is it ready to convert. After this, you can download the PDF directly or have it generated and uploaded to your S3 bucket.

Documents supported
HTML

Pricing
Paperplane has no free plan, but it offers a 14-day trial, which should be enough to test the product’s capabilities.

There are fixed-price plans, plus overage charges. Each of the packages has varying overage charges (meaning how much you have to pay for each PDF over the limit).
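A fixed-price plan with overage charges can be modeled as below. All figures are made up for illustration and are not Paperplane's actual prices; amounts are kept in cents to avoid floating-point rounding.

```python
def monthly_cost_cents(
    base_price_cents: int,
    included_pdfs: int,
    overage_cents_per_pdf: int,
    generated_pdfs: int,
) -> int:
    """Fixed plan price plus a per-PDF charge for anything over the limit."""
    extra_pdfs = max(0, generated_pdfs - included_pdfs)
    return base_price_cents + extra_pdfs * overage_cents_per_pdf


# Hypothetical plan: $20.00 for 1,000 PDFs, then $0.02 per extra PDF.
print(monthly_cost_cents(2000, 1000, 2, generated_pdfs=800))   # 2000
print(monthly_cost_cents(2000, 1000, 2, generated_pdfs=1500))  # 3000
```

Running the same numbers against the next plan up shows the volume at which upgrading becomes cheaper than paying overage.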

It’s also important to note that the Basic package doesn’t include the company’s 99.8 percent uptime guarantee.

Choosing the best PDF generation API for your business

When choosing a PDF generation API, consider the following factors:

Business needs — Assess your business requirements and choose an API that meets your specific needs. Whether you need to generate invoices, reports, or certificates, ensure the API can handle your use cases.
Scalability — Select an API that can handle your expected volume of PDF generation. As your business grows, the API should be able to scale with you, ensuring consistent performance.
Customization — Opt for an API that offers flexible customization options to meet your branding and design requirements. The ability to create custom templates and styles is crucial for maintaining a professional appearance.
Security — Prioritize data security and choose an API that provides robust security features. Look for encryption, access controls, and compliance with industry standards to protect your data.
Support — Look for an API with reliable customer support and resources. Good documentation, tutorials, and responsive support can make a significant difference in your implementation experience.

Final words

The best PDF generator for you depends entirely on your project’s requirements.

While you should never compromise on the security and reliability an API offers, you can be flexible with other factors and features.

If you only need to generate PDFs, and you don’t think your needs will expand in the future, then it’s a good idea to go with the most affordable solution.

However, if you have — or predict having — more complex workflows, then the feature list becomes the crucial factor when selecting the right API.

If you want to test Nutrient’s PDF generation API, you can create a free account.

It comes with 100 free credits and has access to all the tools, such as watermarking, PDF editing, OCR, and much more.

FAQs

Here are a few frequently asked questions about PDF generation APIs.

What is a PDF generator API?
A PDF generator API is a hosted service that allows you to create PDF documents by sending structured data to an API, which then returns a PDF file.

What are common use cases for PDF generator APIs?
Common use cases include automating invoice generation, generating reports, creating financial and legal documents, and producing educational materials.

How do I choose the right PDF generator API?
Consider factors like accuracy, reliability, security, feature set, and pricing models to ensure the API meets your project’s needs and budget.

What features should I look for in a PDF generator API?
Look for high-quality rendering, customizable templates, support for dynamic data, and additional functionalities like annotations and form handling.

Can PDF generator APIs be integrated with web applications?
Yes, most PDF generator APIs offer SDKs or RESTful interfaces that facilitate integration with web applications for seamless PDF creation and management.

What are the common challenges with PDF generator APIs?
Challenges include handling large documents, ensuring compatibility with various file formats, and managing API usage limits or costs.