DEV Community

Discussion on: Extract the text data from PDF file using Aspose.PDF for .NET

Andriy Andruhovski

Thanks for your interest!
Currently, you can use TextFragmentAbsorber with a regular expression as the input parameter.

    // Create a TextFragmentAbsorber that searches for all words starting with 'h'
    // and ending with 'o' using a regular expression.
    // Passing true to TextSearchOptions enables regular-expression matching.
    TextFragmentAbsorber absorber = new TextFragmentAbsorber(@"h\w*?o", 
         new TextSearchOptions(true));

Unfortunately, ParagraphAbsorber doesn't support searching by regular expression, so you need to analyze the paragraphs it extracts yourself.
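
One way to do that manual step is to collect the text of each extracted paragraph into strings and filter them with the standard .NET Regex class. This is only a minimal sketch: ParagraphFilter and FilterByRegex are hypothetical names that are not part of Aspose.PDF, and the code assumes you have already pulled the paragraph text out of ParagraphAbsorber.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.PDF.
public static class ParagraphFilter
{
    // Returns the paragraphs whose text contains at least one match of the pattern.
    // Assumes the paragraph text was already extracted (e.g. via ParagraphAbsorber).
    public static List<string> FilterByRegex(IEnumerable<string> paragraphs, string pattern)
    {
        var regex = new Regex(pattern);
        return paragraphs.Where(p => regex.IsMatch(p)).ToList();
    }
}
```

For example, calling ParagraphFilter.FilterByRegex(paragraphTexts, @"h\w*?o") would keep only the paragraphs that contain a match for the same pattern used above. Note that matching is case-sensitive unless you pass RegexOptions.IgnoreCase to the Regex constructor.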