Usman Aziz

Posted on • Originally published at blog.groupdocs.com

Find and Remove Watermarks from Documents in Java

This article is for Java developers who want to find and remove text or image watermarks from PDF, Word, Excel, PowerPoint, Visio, and email documents.

The GroupDocs.Watermark for Java API supports adding text and image watermarks to a wide range of document formats. It can also find and remove watermarks, including watermark objects that were added with third-party tools. The steps below show how to remove the watermarks from a document in Java.

Before we begin, have a look at the following PDF document, which contains both a text watermark and an image watermark. We'll remove both watermarks from this document.
[Image: the source PDF document with a text watermark and an image watermark]

Steps to remove watermarks from a document

1. Create a new project.

2. Add the following imports.

 import com.groupdocs.watermark.Document; 
 import com.groupdocs.watermark.ImageDctHashSearchCriteria;
 import com.groupdocs.watermark.ImageSearchCriteria; 
 import com.groupdocs.watermark.PossibleWatermarkCollection;
 import com.groupdocs.watermark.SearchCriteria;
 import com.groupdocs.watermark.TextSearchCriteria; 

3. Create an instance of the Document class by loading the source document.

 Document doc = Document.load("watermarked.pdf");

4. Find the watermarks that match the search criteria using the findWatermarks method. If you don't pass any search criteria, findWatermarks returns all the possible watermark objects.

 // configure the search criteria for image watermark
 ImageSearchCriteria imageSearchCriteria = new ImageDctHashSearchCriteria("watermark.png");
 // configure the search criteria for text watermark
 TextSearchCriteria textSearchCriteria = new TextSearchCriteria("CONFIDENTIAL");
 // combine the search criteria
 SearchCriteria combinedSearchCriteria = imageSearchCriteria.or(textSearchCriteria);
 // find possible watermarks
 PossibleWatermarkCollection possibleWatermarks = doc.findWatermarks(combinedSearchCriteria);
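
To make the effect of combining criteria with or concrete, here is a minimal sketch that uses only the JDK. It assumes the combined criteria accept a candidate when either the image check or the text check accepts it. Candidate, find, and countCombinedMatches are hypothetical names invented for this illustration and are not part of GroupDocs.Watermark. The image check is reduced to a simple flag and the text check to an equality test, so the library's real matching rules may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CriteriaOrSketch {

    // Hypothetical stand-in for a watermark candidate; not a GroupDocs type.
    static final class Candidate {
        final String text;      // null when the candidate carries no text
        final boolean hasImage; // true when the candidate carries image data

        Candidate(String text, boolean hasImage) {
            this.text = text;
            this.hasImage = hasImage;
        }
    }

    // Keeps only the candidates accepted by the given criteria.
    static List<Candidate> find(List<Candidate> candidates, Predicate<Candidate> criteria) {
        List<Candidate> matches = new ArrayList<>();
        for (Candidate c : candidates) {
            if (criteria.test(c)) {
                matches.add(c);
            }
        }
        return matches;
    }

    // Counts how many of three sample candidates match "image OR text".
    static int countCombinedMatches() {
        // Assumption: simplified checks standing in for the library's criteria.
        Predicate<Candidate> imageCriteria = c -> c.hasImage;
        Predicate<Candidate> textCriteria = c -> "CONFIDENTIAL".equals(c.text);
        Predicate<Candidate> combined = imageCriteria.or(textCriteria);

        List<Candidate> candidates = Arrays.asList(
                new Candidate("CONFIDENTIAL", false), // matches the text check
                new Candidate(null, true),            // matches the image check
                new Candidate("Page 1", false));      // matches neither check
        return find(candidates, combined).size();
    }

    public static void main(String[] args) {
        System.out.println("combined matches: " + countCombinedMatches());
    }
}
```

In this sketch the page-number candidate is left out because it satisfies neither check, which is the behavior you would want from a combined search so that ordinary content is not treated as a watermark.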

5. Iterate over the watermark collection and remove each watermark using the removeAt method.

 // iterate through the collection and remove watermarks
 while (possibleWatermarks.getCount() > 0)
 {
     if (possibleWatermarks.get_Item(0).getImageData() != null)
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed image watermark.");
     }
     else
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed text watermark.");
     }
 }
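
The loop above always removes the item at index 0 instead of walking the collection with an increasing index. The sketch below shows why that matters, assuming the watermark collection behaves like an ordinary index-based list whose remaining items shift down after each removal. It uses java.util.ArrayList as a stand-in; RemovalLoopSketch and its methods are hypothetical names for this illustration, not GroupDocs.Watermark calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemovalLoopSketch {

    // Removes everything by repeatedly deleting index 0; returns the number removed.
    static int drainFromFront(List<String> items) {
        int removed = 0;
        while (!items.isEmpty()) {
            items.remove(0); // the remaining items shift down, so index 0 is always valid
            removed++;
        }
        return removed;
    }

    // Forward index loop: each removal shifts the next item into the slot
    // that was just visited, so every second item is skipped.
    static int removeWithForwardIndex(List<String> items) {
        int removed = 0;
        for (int i = 0; i < items.size(); i++) {
            items.remove(i);
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println("front removal removed: " + drainFromFront(first)
                + ", left: " + first.size());

        List<String> second = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println("forward index removed: " + removeWithForwardIndex(second)
                + ", left: " + second.size());
    }
}
```

With four items, the front-removal loop empties the list, while the forward index loop removes only two items and leaves two behind.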

6. Save the resulting document using the save method, then close the document.

 doc.save("without_watermark.pdf");
 doc.close(); 
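
If save throws an exception, the close call that follows it never runs. One common way to guard against that in Java is a try/finally block, sketched below with a hypothetical FakeDocument class that stands in for a document handle. FakeDocument, its methods, and saveAndClose are assumptions made for this illustration and are not the GroupDocs.Watermark Document class.

```java
public class CloseOnFailureSketch {

    // Hypothetical stand-in for a document handle; not the GroupDocs Document class.
    static final class FakeDocument {
        boolean closed = false;

        // Pretends to save; rejects a missing output path to simulate a failure.
        void save(String path) {
            if (path == null || path.isEmpty()) {
                throw new IllegalArgumentException("output path is required");
            }
        }

        void close() {
            closed = true;
        }
    }

    // Saves the document and always closes it; returns true when the save succeeded.
    static boolean saveAndClose(FakeDocument doc, String path) {
        try {
            doc.save(path);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        } finally {
            doc.close(); // runs on both the success path and the failure path
        }
    }

    public static void main(String[] args) {
        FakeDocument ok = new FakeDocument();
        System.out.println("saved: " + saveAndClose(ok, "without_watermark.pdf")
                + ", closed: " + ok.closed);

        FakeDocument failing = new FakeDocument();
        System.out.println("saved: " + saveAndClose(failing, "")
                + ", closed: " + failing.closed);
    }
}
```

The same shape can be applied to the real code by placing doc.save(...) inside the try block and doc.close() inside finally.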

Complete Code

 // load the source document
 Document doc = Document.load("watermarked.pdf");
 // configure the search criteria for image watermark
 ImageSearchCriteria imageSearchCriteria = new ImageDctHashSearchCriteria("watermark.png");
 // configure the search criteria for text watermark
 TextSearchCriteria textSearchCriteria = new TextSearchCriteria("CONFIDENTIAL");
 // combine the search criteria
 SearchCriteria combinedSearchCriteria = imageSearchCriteria.or(textSearchCriteria);
 // find possible watermarks
 PossibleWatermarkCollection possibleWatermarks = doc.findWatermarks(combinedSearchCriteria);
 // iterate through the collection and remove watermarks
 while (possibleWatermarks.getCount() > 0)
 {
     if (possibleWatermarks.get_Item(0).getImageData() != null)
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed image watermark.");
     }
     else
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed text watermark.");
     }
 }
 // save the result and release the document
 doc.save("without_watermark.pdf");
 doc.close();

Results

The following screenshot shows the resulting PDF document after the watermarks have been removed.
[Image: the resulting PDF document without watermarks]
