Introduction
In some circumstances, you may receive a PDF document that contains empty or blank pages. Before you pass it on to your colleagues or clients, it is a good idea to delete those pages. In this article, you will learn how to programmatically remove blank pages from a PDF document by using Spire.PDF for Java.
Prerequisite Knowledge
There are two kinds of blank pages: the first is the completely empty page, which has nothing inside; the second is the page that looks blank but actually contains spaces or blank images. Spire.PDF offers the PdfPageBase.isBlank() method to detect whether a page is absolutely blank (empty) or not. For pages of the second kind, we convert each page into an image and check whether that image is blank. If it is, the page is also considered blank.
Pay special attention to one detail: the converted images carry an evaluation watermark, which will affect the result of the check. Therefore, a temporary license that removes the evaluation message is required in this scenario. Feel free to request one from sales@e-iceblue.com, and obtain the license key from it. The license key is applied in your program this way:
com.spire.license.LicenseProvider.setLicenseKey("your license key");
Once you have identified the blank pages in a document, you can remove them one by one using the PdfDocument.getPages().removeAt(int pageIndex) method.
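Removing items by index while looping over a collection is easy to get wrong, because every removal shifts the indexes of the items that follow. The sketch below illustrates the usual remedy, walking from the last index to the first, on a plain java.util.List of strings that stands in for the pages of a document. The class name BackwardRemoval and the helper removeMatching are hypothetical and are not part of Spire.PDF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class BackwardRemoval {

    // Removes every element that matches the predicate, visiting indexes from last to first
    // so that a removal never shifts an index that still has to be visited.
    static <T> int removeMatching(List<T> items, Predicate<? super T> isBlank) {
        int removed = 0;
        for (int i = items.size() - 1; i >= 0; i--) {
            if (isBlank.test(items.get(i))) {
                items.remove(i); // remove(int index), the list counterpart of removeAt(i)
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Empty strings play the role of blank pages in this stand-in example
        List<String> pages = new ArrayList<>(List.of("cover", "", "chapter 1", "", "", "index"));
        int removed = removeMatching(pages, String::isEmpty);
        System.out.println(removed + " removed, remaining: " + pages);
        // prints "3 removed, remaining: [cover, chapter 1, index]"
    }
}
```

A forward loop would skip the second of two adjacent blank entries unless the index were adjusted after each removal, which is why the backward traversal is the simpler choice here.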
Installing Spire.Pdf.jar
If you use Maven, you can easily import Spire.Pdf.jar into your application by adding the following code to your project’s pom.xml file. For non-Maven projects, download the jar file from this link and manually add it as a dependency in your application.
<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>4.11.2</version>
    </dependency>
</dependencies>
Using the code
The following are the steps to remove blank pages from a PDF document.
- Apply your temporary license by registering the license key in your Java program.
- Create a PdfDocument object, and load a sample PDF file using the PdfDocument.loadFromFile() method.
- Traverse all the pages in the document, from the last page to the first, so that removing a page does not shift the indexes of the pages still to be checked.
- Check whether a page is absolutely blank using the PdfPageBase.isBlank() method. If it is not, convert the page into an image and check whether the image is blank using the custom method isImageBlank().
- When a blank page is found, use the PdfDocument.getPages().removeAt() method to delete it.
- Save the changes to another file using the PdfDocument.saveToFile() method.
import com.spire.pdf.*;
import java.awt.*;
import java.awt.image.*;
import static com.spire.pdf.graphics.PdfImageType.Bitmap;

public class RemoveBlankPages {

    public static void main(String[] args) {
        //Register the license key
        com.spire.license.LicenseProvider.setLicenseKey("your license key");

        //Create a PdfDocument object
        PdfDocument pdf = new PdfDocument();

        //Load a sample PDF file
        pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\blank.pdf");

        //Traverse all the pages, from last to first, so removals do not shift pending indexes
        for (int i = pdf.getPages().getCount() - 1; i >= 0; i--) {
            //Determine if a page is absolutely blank (empty)
            if (pdf.getPages().get(i).isBlank()) {
                //Remove the blank page
                pdf.getPages().removeAt(i);
            } else {
                //Convert the PDF page to image
                BufferedImage image = pdf.saveAsImage(i, Bitmap);
                //Determine whether the image is blank or not
                if (isImageBlank(image)) {
                    //Delete the corresponding PDF page if the image is blank
                    pdf.getPages().removeAt(i);
                }
            }
        }

        //Save the changes to another PDF file, once, after all pages have been checked
        pdf.saveToFile("output/RemoveBlankPages.pdf");
    }

    //Determine if an image is blank: every pixel must be near white (all channels >= 240)
    public static boolean isImageBlank(BufferedImage image) {
        for (int i = 0; i < image.getWidth(); i++) {
            for (int j = 0; j < image.getHeight(); j++) {
                int pixel = image.getRGB(i, j);
                Color c = new Color(pixel);
                if (c.getRed() < 240 || c.getGreen() < 240 || c.getBlue() < 240) {
                    return false;
                }
            }
        }
        return true;
    }
}
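To see how the near-white threshold in isImageBlank() behaves, it can help to run the same check on synthetic images, without Spire.PDF or a license key. The following is a minimal sketch under that assumption: the class name BlankImageCheckDemo and the helper solid() are hypothetical, and the images are built in memory rather than rendered from PDF pages.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

class BlankImageCheckDemo {

    // Same rule as the article's isImageBlank(): a pixel with any channel below 240 counts as content
    static boolean isImageBlank(BufferedImage image) {
        for (int x = 0; x < image.getWidth(); x++) {
            for (int y = 0; y < image.getHeight(); y++) {
                Color c = new Color(image.getRGB(x, y));
                if (c.getRed() < 240 || c.getGreen() < 240 || c.getBlue() < 240) {
                    return false;
                }
            }
        }
        return true;
    }

    // Hypothetical helper: builds an in-memory image filled with a single color
    static BufferedImage solid(int width, int height, Color color) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                image.setRGB(x, y, color.getRGB());
            }
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage white = solid(50, 50, Color.WHITE);
        BufferedImage offWhite = solid(50, 50, new Color(245, 245, 245));
        BufferedImage marked = solid(50, 50, Color.WHITE);
        marked.setRGB(10, 10, Color.BLACK.getRGB()); // a single dark pixel

        System.out.println(isImageBlank(white));    // prints "true"
        System.out.println(isImageBlank(offWhite)); // prints "true"
        System.out.println(isImageBlank(marked));   // prints "false"
    }
}
```

Because a single dark pixel is enough to mark a page as non-blank, scanned pages with specks or noise would not be removed by this check; the threshold of 240 only tolerates slightly off-white backgrounds.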