I need to extract content from a PDF by specifying a paragraph heading or a phrase.
ParagraphAbsorber returns all of the text. However, I need only a particular paragraph, or a particular portion of a paragraph, not the complete page.
How can I achieve this?
Thanks for your interest!
Currently, you can use TextFragmentAbsorber with a regular expression as an input parameter.
// Create TextFragmentAbsorber object that searches all words starting 'h'
// and ending 'o' using regular expression.
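// Passing true to TextSearchOptions treats the search phrase as a regular expression.
TextFragmentAbsorber absorber = new TextFragmentAbsorber(@"h\w*?o", new TextSearchOptions(true));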
Unfortunately, ParagraphAbsorber doesn't support searching by regular expression, so you need to analyze the paragraphs it extracts manually.
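As a rough illustration of what that manual analysis could look like, here is a minimal sketch in plain .NET. It assumes you have already copied the text of each extracted paragraph into a list of strings, in reading order; that step is not shown. The names ParagraphFilter, FindMatching and FindAfterHeading are hypothetical helpers made up for this example and are not part of Aspose.PDF, and the sample strings are placeholder data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.PDF. It works on plain strings only.
public static class ParagraphFilter
{
    // Returns every paragraph whose text matches the regular expression (case-insensitive).
    public static List<string> FindMatching(IEnumerable<string> paragraphs, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.IgnoreCase);
        return paragraphs.Where(p => regex.IsMatch(p)).ToList();
    }

    // Returns the paragraph that directly follows the first paragraph whose
    // trimmed text equals the heading (case-insensitive), or null if none is found.
    public static string FindAfterHeading(IList<string> paragraphs, string heading)
    {
        for (int i = 0; i < paragraphs.Count - 1; i++)
        {
            if (string.Equals(paragraphs[i].Trim(), heading.Trim(),
                              StringComparison.OrdinalIgnoreCase))
            {
                return paragraphs[i + 1];
            }
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder data standing in for paragraph text taken from the PDF.
        var paragraphs = new List<string>
        {
            "Introduction",
            "This report covers the last quarter.",
            "Payment terms",
            "Invoices are due within 30 days."
        };

        // Look up the paragraph under a known heading.
        Console.WriteLine(ParagraphFilter.FindAfterHeading(paragraphs, "Payment terms"));

        // Look up paragraphs containing a phrase, expressed as a regular expression.
        foreach (string p in ParagraphFilter.FindMatching(paragraphs, @"\bdue\b"))
        {
            Console.WriteLine(p);
        }
    }
}
```

Treating the heading as its own paragraph is an assumption about how the document is laid out. If the heading and body end up in the same paragraph string, FindMatching with a pattern anchored on the heading text may be the better fit.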