DEV Community

Discussion on: How to Extract Tabular Data from PDF [part 2]

AnupJoseph

Can you please suggest some tools that can handle a PDF with multiple tables on a single page, and sometimes tables that span multiple pages?

Upsilon

Thanks for the question!

All tools except PDFTables coped well with multiple tables on a page. PDFTables detected the separate tables in the original PDF as one big table.

If you work with a multi-page table, you will need to 'glue' its parts together yourself, either manually or with a custom script (if you can come up with an algorithm). As far as we know, no tool does that for you.
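
To give a rough idea of what such a custom script could look like, here is a minimal sketch in plain Python. It assumes your extraction tool has already given you each page's fragment as a list of rows, that all fragments have the same number of columns, and that the first row of the first fragment is the header (which the PDF may repeat on later pages). The function name `glue_table_parts` and the sample data are made up for illustration; they do not come from any of the tools in the study.

```python
from typing import List

Row = List[str]
Table = List[Row]


def glue_table_parts(parts: List[Table]) -> Table:
    """Concatenate per-page fragments of one table into a single table.

    Assumptions (not guaranteed by any extraction tool):
    - the first row of the first non-empty fragment is the header;
    - every row in every fragment has the same number of columns.
    """
    # Take the header from the first fragment that has any rows.
    header = next((part[0] for part in parts if part), None)
    if header is None:
        return []

    glued: Table = [header]
    for part in parts:
        for row in part:
            if len(row) != len(header):
                raise ValueError(f"column count mismatch: {row!r}")
            # Skip header rows, including ones repeated on later pages.
            # Caveat: a data row identical to the header is dropped too.
            if row == header:
                continue
            glued.append(row)
    return glued


# Hypothetical fragments of one table split across two pages.
page_1 = [["id", "name"], ["1", "Ann"], ["2", "Bob"]]
page_2 = [["id", "name"], ["3", "Cid"]]
merged = glue_table_parts([page_1, page_2])
```

Real PDFs are usually messier: a row can be cut in half at the page break, or the column count can drift between pages, so you will probably have to adapt the checks to your documents.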

On the other criteria, Excalibur is the winner of the study.