IronSoftware

Originally published at ironsoftware.com

Make any PDF have Searchable, Copyable Text (Code Example)

We can use Iron's advanced Tesseract engine to make scanned PDF documents searchable and indexable, with text that users can copy and paste.

C#:

using IronOcr;

var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    // load the scanned PDF (the second argument is the PDF's password)
    Input.AddPdf("scan.pdf", "password");

    // straighten skewed (rotated) pages before OCR
    Input.Deskew();

    var Result = Ocr.Read(Input);
    Result.SaveAsSearchablePdf("searchable.pdf");
}

VB:

Imports IronOcr

Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
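    ' load the scanned PDF (the second argument is the PDF's password)
    Input.AddPdf("scan.pdf", "password")

    ' straighten skewed (rotated) pages before OCR
    Input.Deskew()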

    Dim Result = Ocr.Read(Input)
    Result.SaveAsSearchablePdf("searchable.pdf")
End Using
