IronSoftware

Originally published at ironsoftware.com

Additional OCR Language Packs

IronOCR supports 125 international languages, but only English is installed by default.

Additional language packs can be added to your C#, VB, or ASP.NET project via NuGet, or as DLLs that you download and add as project references.

Code Examples

International Language Example

C#:

//PM> Install-Package IronOcr.Languages.ChineseSimplified
using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.ChineseSimplified;
using (var input = new OcrInput())
{
    input.AddImage("img/chinese.gif");
    // Add image filters if needed
    // input.Deskew();
    // input.DeNoise();
    var Result = Ocr.Read(input);
    string TestResult = Result.Text;
    // Many consoles cannot display Chinese characters, so save the text to disk instead.
    Result.SaveAsTextFile("chinese.txt");
}

VB:

'PM> Install-Package IronOcr.Languages.ChineseSimplified
Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.ChineseSimplified
Using input = New OcrInput()
    input.AddImage("img/chinese.gif")
    ' Add image filters if needed
    ' input.Deskew()
    ' input.DeNoise()
    Dim Result = Ocr.Read(input)
    Dim TestResult As String = Result.Text
    ' Many consoles cannot display Chinese characters, so save the text to disk instead.
    Result.SaveAsTextFile("chinese.txt")
End Using

Custom Language Example

Use any Tesseract .traineddata language file that you have downloaded or trained yourself.

C#:

using IronOcr;
var Ocr = new IronTesseract();
Ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata")
Using Input = New OcrInput("images\image.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Multiple Language Example

Read more than one language at a time.

C#:

//PM> Install-Package IronOcr.Languages.Arabic
using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.English;
Ocr.AddSecondaryLanguage(OcrLanguage.Arabic);
// Add any number of languages
using (var Input = new OcrInput(@"images\multi-lang.pdf"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

'PM> Install-Package IronOcr.Languages.Arabic
Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.English
Ocr.AddSecondaryLanguage(OcrLanguage.Arabic)
' Add any number of languages
Using Input = New OcrInput("images\multi-lang.pdf")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Faster Language Example

Dictionaries tuned for speed. Use the 'Fast' variant of any OcrLanguage.

C#:

using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.EnglishFast;
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.EnglishFast
Using Input = New OcrInput("images\image.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Higher Accuracy Language Example

Dictionaries tuned for accuracy, at the cost of much slower reads. Use the 'Best' variant of any OcrLanguage.

C#:

//PM> Install-Package IronOcr.Languages.French
using IronOcr;
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.FrenchBest;
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

VB:

'PM> Install-Package IronOcr.Languages.French
Imports IronOcr
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.FrenchBest
Using Input = New OcrInput("images\image.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

How To Install OCR Language Packs

Additional OCR language packs are available for download below. Either:

  • Install the NuGet package. Search NuGet for IronOcr Languages.
  • Or download the "ocrdata" file and add it to your .NET project in any folder you like. Set CopyToOutputDirectory = CopyIfNewer.
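
As a rough sketch, the two options above might look like this in an SDK-style project file. The project layout, the floating version, and the language file path are assumptions for illustration only; the package ID is the one used in the Chinese example earlier, and PreserveNewest is the MSBuild value that Visual Studio displays as "Copy if newer".

```xml
<!-- Hypothetical .csproj fragment: a sketch, not an official IronOCR template.
     Other settings such as TargetFramework are omitted. -->
<Project Sdk="Microsoft.NET.Sdk">

  <!-- Option 1: reference a language pack from NuGet.
       Version="*" floats to the latest stable release and is an assumption;
       pin the version you have tested in a real project. -->
  <ItemGroup>
    <PackageReference Include="IronOcr.Languages.ChineseSimplified" Version="*" />
  </ItemGroup>

  <!-- Option 2: ship a downloaded language file with the build output.
       The folder and file name are placeholders; use the file you downloaded. -->
  <ItemGroup>
    <None Update="ocrdata\your-language-file">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
  </ItemGroup>

</Project>
```

You would normally pick one option per language rather than both. In a project that is not SDK-style, the file item typically needs Include instead of Update.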

Download OCR Language Packs

Download OCR language packs directly from the IronOCR website.

Help

If the language you want to read is not available in the list above, please get in touch with us. Many other languages are available on request.

Priority on production resources is given to IronOCR licensees, so please also consider licensing IronOCR for access to your desired language pack.

