Usman Aziz

Posted on • Originally published at blog.groupdocs.com

Find and Remove Watermarks from Documents in Java

This article is for Java developers who want to find and remove text or image watermarks from PDF, Word, Excel, PowerPoint, Visio, and email documents.

The GroupDocs.Watermark for Java API supports adding text and image watermarks to a wide range of document formats. It can also find and remove watermarks from documents, including watermark objects that were added with third-party tools. The steps below show how to remove the watermarks from a document in Java.

Before we begin, have a look at the following PDF document, which contains both a text watermark and an image watermark. We’ll use this document and remove the watermarks from it.
[Screenshot: source PDF document containing a text watermark and an image watermark]

Steps to remove watermarks from a document

1. Create a new project.

2. Add the following imports.

 import com.groupdocs.watermark.Document; 
 import com.groupdocs.watermark.ImageDctHashSearchCriteria;
 import com.groupdocs.watermark.ImageSearchCriteria; 
 import com.groupdocs.watermark.PossibleWatermarkCollection;
 import com.groupdocs.watermark.SearchCriteria;
 import com.groupdocs.watermark.TextSearchCriteria; 

3. Create an instance of the Document class and load the source document.

 Document doc = Document.load("watermarked.pdf");

4. Find the watermarks that match the search criteria using the findWatermarks method. If you don’t pass any search criteria, findWatermarks returns all the possible watermark objects.

 // configure the search criteria for the image watermark
 ImageSearchCriteria imageSearchCriteria = new ImageDctHashSearchCriteria("watermark.png");
 // configure the search criteria for the text watermark
 TextSearchCriteria textSearchCriteria = new TextSearchCriteria("CONFIDENTIAL");
 // combine the search criteria: match either the image or the text
 SearchCriteria combinedSearchCriteria = imageSearchCriteria.or(textSearchCriteria);
 // find possible watermarks
 PossibleWatermarkCollection possibleWatermarks = doc.findWatermarks(combinedSearchCriteria);
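
To make the effect of combining criteria with or more concrete, here is a minimal, library-free sketch of the idea. It does not use GroupDocs.Watermark: Candidate, textContains, isImage, and find are hypothetical names introduced only for illustration, and the real image criterion compares image hashes rather than checking a flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class CriteriaSketch {

    // Hypothetical stand-in for a possible watermark: either text or an image.
    static final class Candidate {
        final String text;    // null for image candidates
        final boolean image;

        Candidate(String text, boolean image) {
            this.text = text;
            this.image = image;
        }
    }

    // Hypothetical text criterion: matches candidates whose text contains the phrase.
    static Predicate<Candidate> textContains(String phrase) {
        return c -> c.text != null && c.text.contains(phrase);
    }

    // Hypothetical image criterion: matches any image candidate.
    static Predicate<Candidate> isImage() {
        return c -> c.image;
    }

    // Returns the candidates accepted by the criteria; null criteria means "return everything".
    static List<Candidate> find(List<Candidate> all, Predicate<Candidate> criteria) {
        List<Candidate> found = new ArrayList<>();
        for (Candidate c : all) {
            if (criteria == null || criteria.test(c)) {
                found.add(c);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Candidate> all = new ArrayList<>();
        all.add(new Candidate("CONFIDENTIAL", false));
        all.add(new Candidate("Page 1", false));
        all.add(new Candidate(null, true));

        // "or" accepts a candidate when either criterion matches.
        Predicate<Candidate> combined = textContains("CONFIDENTIAL").or(isImage());

        System.out.println(find(all, combined).size()); // 2
        System.out.println(find(all, null).size());     // 3
    }
}
```

In this sketch the ordinary page text "Page 1" is left alone, because it satisfies neither criterion.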

5. Iterate over the watermark collection and remove each watermark using the removeAt method.

 // iterate through the collection and remove watermarks
 while (possibleWatermarks.getCount() > 0)
 {
     // an item that carries image data is an image watermark; otherwise it is text
     if (possibleWatermarks.get_Item(0).getImageData() != null)
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed image watermark.");
     }
     else
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed text watermark.");
     }
 }
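
The loop above always removes the item at index 0 instead of walking forward with a counter. The library-free sketch below shows why that matters, using a plain ArrayList as a stand-in for the watermark collection. removeForward and removeFromHead are hypothetical helper names, and the assumption that the watermark collection shifts its remaining items down after each removal is inferred from the loop above rather than confirmed against the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemovalLoopSketch {

    // Buggy pattern: advancing the index after each removal skips every other item.
    static List<String> removeForward(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        for (int i = 0; i < copy.size(); i++) {
            copy.remove(i);
        }
        return copy;
    }

    // Pattern used in the article: keep removing the first item until none are left.
    static List<String> removeFromHead(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        while (copy.size() > 0) {
            copy.remove(0);
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> marks = Arrays.asList("text-1", "image-1", "text-2", "image-2");
        System.out.println(removeForward(marks));  // [image-1, image-2]
        System.out.println(removeFromHead(marks)); // []
    }
}
```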

6. Save the resulting document using the save method, then close it.

 doc.save("without_watermark.pdf");
 doc.close(); 
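
If save fails, the close call above is never reached. One way to guard against that is a try/finally block, sketched below with a hypothetical FakeDocument class in place of the real Document. The sketch only assumes that the document exposes save and close as shown in the article; whether the real class supports try-with-resources is not confirmed here, so plain try/finally is used.

```java
public class SaveAndCloseSketch {

    // Hypothetical stand-in for a document object that has save() and close().
    static final class FakeDocument {
        boolean closed = false;
        final boolean failOnSave;

        FakeDocument(boolean failOnSave) {
            this.failOnSave = failOnSave;
        }

        void save(String path) {
            if (failOnSave) {
                throw new IllegalStateException("cannot write " + path);
            }
        }

        void close() {
            closed = true;
        }
    }

    // Saves the document and always closes it, even when save() throws.
    // Returns true when the save succeeded.
    static boolean saveAndClose(FakeDocument doc, String path) {
        try {
            doc.save(path);
            return true;
        } catch (IllegalStateException e) {
            return false;
        } finally {
            doc.close();
        }
    }

    public static void main(String[] args) {
        FakeDocument ok = new FakeDocument(false);
        FakeDocument broken = new FakeDocument(true);
        System.out.println(saveAndClose(ok, "without_watermark.pdf") + " " + ok.closed);         // true true
        System.out.println(saveAndClose(broken, "without_watermark.pdf") + " " + broken.closed); // false true
    }
}
```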

Complete Code

 Document doc = Document.load("watermarked.pdf");
 // configure the search criteria for the image watermark
 ImageSearchCriteria imageSearchCriteria = new ImageDctHashSearchCriteria("watermark.png");
 // configure the search criteria for the text watermark
 TextSearchCriteria textSearchCriteria = new TextSearchCriteria("CONFIDENTIAL");
 // combine the search criteria: match either the image or the text
 SearchCriteria combinedSearchCriteria = imageSearchCriteria.or(textSearchCriteria);
 // find possible watermarks
 PossibleWatermarkCollection possibleWatermarks = doc.findWatermarks(combinedSearchCriteria);
 // iterate through the collection and remove watermarks
 while (possibleWatermarks.getCount() > 0)
 {
     if (possibleWatermarks.get_Item(0).getImageData() != null)
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed image watermark.");
     }
     else
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed text watermark.");
     }
 }
 // save the result and release the document
 doc.save("without_watermark.pdf");
 doc.close();

Results

The following screenshot shows the resulting PDF document after the watermarks are removed.
[Screenshot: resulting PDF document with the watermarks removed]
