Discussion on: Extract the text data from PDF file using Aspose.PDF for .NET

Replies for: Hi Team, I need to extract content from PDF, by giving a paragraph heading or some phrase. How to achieve this. ParagraphAbsober, does get all...

Thanks for your interest!
Currently, you can use TextFragmentAbsorber with a regular expression as the input parameter.

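    // Requires: using Aspose.Pdf.Text;
    // Create a TextFragmentAbsorber that finds text starting with 'h' and
    // ending with 'o' using a regular expression. Passing true to
    // TextSearchOptions enables regular-expression matching.
    TextFragmentAbsorber absorber = new TextFragmentAbsorber(@"h\w*?o",
        new TextSearchOptions(true));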

Unfortunately, ParagraphAbsorber does not support regular-expression search, so you need to extract the paragraphs with it and then filter their text yourself.
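
As a rough sketch of that manual step, the helper below filters paragraph text with the standard .NET Regex class. It assumes you have already collected the text of each paragraph from ParagraphAbsorber into a list of strings (that part is not shown). The ParagraphFilter class and its method names are hypothetical and are not part of Aspose.PDF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not an Aspose.PDF type. It works on plain strings
// that are assumed to hold the text of already extracted paragraphs.
public static class ParagraphFilter
{
    // Returns every paragraph whose text matches the pattern (case-insensitive).
    public static List<string> FindMatching(IEnumerable<string> paragraphs, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.IgnoreCase);
        return paragraphs.Where(p => regex.IsMatch(p)).ToList();
    }

    // Returns the paragraph that directly follows the first paragraph matching
    // headingPattern, or null if there is no match or no following paragraph.
    public static string FindAfterHeading(IList<string> paragraphs, string headingPattern)
    {
        var regex = new Regex(headingPattern, RegexOptions.IgnoreCase);
        for (int i = 0; i < paragraphs.Count - 1; i++)
        {
            if (regex.IsMatch(paragraphs[i]))
            {
                return paragraphs[i + 1];
            }
        }
        return null;
    }
}
```

For example, calling ParagraphFilter.FindAfterHeading(paragraphs, "^Summary") returns the paragraph that comes right after the first one starting with "Summary". Whether a heading is reported as its own paragraph depends on how the PDF is laid out, so treat this as a starting point.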