Extract text and metadata from a range of text and presentation templates on the Java platform using the GroupDocs.Parser for Java API. The following template formats are supported:
- dotx (Word template)
- dotm (Word macro-enabled template)
- ott (OpenDocument Text Template)
- potx (PowerPoint template)
- potm (PowerPoint macro-enabled template)
- ppsm (PowerPoint macro-enabled slideshow)
- pptm (PowerPoint macro-enabled presentation)
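Before handing a file to the parser, it can help to check that its extension is one of those listed above. The following is a minimal sketch in plain Java, not part of the GroupDocs.Parser API: the class name `TemplateFormats` and the method `isSupportedTemplate` are hypothetical, and the extension set is copied by hand from the list rather than queried from the library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of GroupDocs.Parser.
public class TemplateFormats {
    // Extensions copied by hand from the list of supported template formats.
    private static final Set<String> SUPPORTED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "dotx", "dotm", "ott", "potx", "potm", "ppsm", "pptm")));

    // Returns true when the file name ends with one of the listed extensions.
    public static boolean isSupportedTemplate(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        // No extension, or the name ends with a dot
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        // Compare case-insensitively so "REPORT.DOTX" is accepted too
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(extension);
    }

    public static void main(String[] args) {
        System.out.println(isSupportedTemplate("report.DOTX")); // prints "true"
        System.out.println(isSupportedTemplate("slides.pptx")); // prints "false"
    }
}
```

A check like this only looks at the file name, so it does not guarantee that the file content actually matches the extension.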
The code samples below demonstrate how to extract text and metadata from templates.
// Extracting Text
void extractText(String fileName) {
    // Extract the text from the file
    String text = Extractor.DEFAULT.extractText(fileName);
    // Print the extracted text
    System.out.println(text);
}
// Extracting Metadata
void extractMetadata(String fileName) {
    // Extract metadata from the file
    MetadataCollection metadata = Extractor.DEFAULT.extractMetadata(fileName);
    // Print extracted metadata
    for (String key : metadata.getKeys()) {
        // Print a metadata key
        System.out.print(key);
        System.out.print(": ");
        // Print a metadata value
        System.out.println(metadata.get_Item(key));
    }
}
In addition, the parsing API supports retrieving tables from PDF documents and identifying the media type of secure Office Open XML documents: http://bit.ly/2CCy7bX
 

 
    