IronSoftware

Posted on • Originally published at ironsoftware.com

Additional OCR Language Packs

IronOCR supports 125 international languages, but only English is installed with IronOCR as standard.

Additional language packs can be added to your C#, VB, or ASP.NET project via NuGet, or as DLLs that can be downloaded and added as project references.

Code Examples

International Language Example

C#:

//PM> Install-Package IronOcr.Languages.ChineseSimplified
using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.ChineseSimplified;
using (var input = new OcrInput())
{
    input.AddImage("img/chinese.gif");
    // Add image filters if needed
    // input.Deskew();
    // input.DeNoise();
    var Result = Ocr.Read(input);
    string TestResult = Result.Text;
    // The console may not display Unicode text correctly. Save to disk instead.
    Result.SaveAsTextFile("chinese.txt");
}

VB:

'PM> Install-Package IronOcr.Languages.ChineseSimplified
Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.ChineseSimplified
Using input = New OcrInput()
    input.AddImage("img/chinese.gif")
    ' Add image filters if needed
    ' input.Deskew()
    ' input.DeNoise()
    Dim Result = Ocr.Read(input)
    Dim TestResult As String = Result.Text
    ' The console may not display Unicode text correctly. Save to disk instead.
    Result.SaveAsTextFile("chinese.txt")
End Using

Custom Language Example

Use any Tesseract .traineddata language file that you have downloaded or trained yourself.

C#:

using IronOcr;
var Ocr = new IronTesseract();
Ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata")
Using Input = New OcrInput("images\image.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Multiple Language Example

Read more than one language at a time.

C#:

//PM> Install-Package IronOcr.Languages.Arabic
using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.English;
Ocr.AddSecondaryLanguage(OcrLanguage.Arabic);
// Add any number of languages
using (var Input = new OcrInput(@"images\multi-lang.pdf"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

'PM> Install-Package IronOcr.Languages.Arabic
Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.English
Ocr.AddSecondaryLanguage(OcrLanguage.Arabic)
' Add any number of languages
Using Input = New OcrInput("images\multi-lang.pdf")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Faster Language Example

Dictionaries tuned for speed. Use the 'Fast' variant of any OcrLanguage.

C#:

using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.EnglishFast;
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.EnglishFast
Using Input = New OcrInput("images\image.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Higher Accuracy Language Example

Dictionaries tuned for accuracy, at the cost of much slower results. Use the 'Best' variant of any OcrLanguage.

C#:

//PM> Install-Package IronOcr.Languages.French
using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.FrenchBest;
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

'PM> Install-Package IronOcr.Languages.French
Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.FrenchBest
Using Input = New OcrInput("images\image.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

How To Install OCR Language Packs

Additional OCR language packs are available for download below. Either:

  • Install the NuGet package (search NuGet for "IronOcr Languages"), or
  • Download the "ocrdata" file and add it to your .NET project in any folder you like, then set CopyToOutputDirectory = CopyIfNewer.
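
For the download option, the Visual Studio "Copy if newer" setting corresponds to the MSBuild value PreserveNewest in the project file. The fragment below is a minimal sketch, assuming an SDK-style .csproj; the folder and file name "languages\example.ocrdata" are placeholders for illustration, not names taken from IronOCR.

```xml
<!-- Sketch only: "languages\example.ocrdata" is a placeholder path. -->
<!-- Replace it with the location of the language file you downloaded. -->
<ItemGroup>
  <None Update="languages\example.ocrdata">
    <!-- Equivalent to "Copy if newer" in the Visual Studio Properties window -->
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```

With this in place, the file should be copied next to the built assembly whenever it is newer than the copy already in the output folder.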

Download OCR Language Packs

Download OCR language packs directly from the IronOCR website.

Help

If the language you are looking to read is not available in the list above, please get in touch with us. Many other languages are available on request.

Priority on production resources is given to IronOCR licensees, so please also consider licensing IronOCR for access to your desired language pack.

