Mehr Muhammad Hamza

Posted on Sep 23, 2021

Explore Azure OCR in 125 Languages with IronOCR

#azureocr #ironocr #cloud #azure

This article is about Azure OCR and IronOCR. If you are looking for Tesseract OCR, you can refer to my previous article from this link.

Azure OCR by Microsoft:

Optical Character Recognition (OCR) allows you to extract handwritten or printed text from images such as documents, bills, and articles. Microsoft supports 73 languages for extracting printed text.

The Read API in the latest Azure OCR technology extracts printed and handwritten text, digits, and currency symbols from images and PDF documents. It is optimized for extracting text from text-heavy images and multi-page PDF documents. Its features include:

Print text extraction in 73 languages
Handwritten text extraction in English
Text lines and words with location and confidence scores
No language identification required
Support for mixed languages, mixed mode (print and handwritten)
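
The words, locations and confidence scores listed above can be read with the Azure Computer Vision client library for .NET. The following is only a minimal sketch, not a definitive implementation. It assumes the Microsoft.Azure.CognitiveServices.Vision.ComputerVision NuGet package is installed; the class name ReadApiSketch and the endpoint, key and imageUrl parameters are placeholders introduced for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

public static class ReadApiSketch
{
    // The Read call is asynchronous: the service answers with an
    // Operation-Location URL whose last path segment is the operation ID.
    public static Guid ExtractOperationId(string operationLocation)
    {
        var trimmed = operationLocation.TrimEnd('/');
        return Guid.Parse(trimmed.Substring(trimmed.LastIndexOf('/') + 1));
    }

    // endpoint, key and imageUrl are placeholders for your own values.
    public static async Task PrintWordsAsync(string endpoint, string key, string imageUrl)
    {
        var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
        {
            Endpoint = endpoint
        };

        // Submit the image, then poll until the operation leaves the queue.
        var headers = await client.ReadAsync(imageUrl);
        var operationId = ExtractOperationId(headers.OperationLocation);

        ReadOperationResult result;
        do
        {
            await Task.Delay(1000);
            result = await client.GetReadResultAsync(operationId);
        }
        while (result.Status == OperationStatusCodes.Running ||
               result.Status == OperationStatusCodes.NotStarted);

        if (result.Status != OperationStatusCodes.Succeeded)
        {
            Console.WriteLine($"Read operation ended with status {result.Status}.");
            return;
        }

        foreach (var page in result.AnalyzeResult.ReadResults)
        foreach (var line in page.Lines)
        foreach (var word in line.Words)
        {
            // BoundingBox holds the four corner points as x,y pairs.
            Console.WriteLine(
                $"{word.Text} (confidence {word.Confidence:F2}) at [{string.Join(", ", word.BoundingBox)}]");
        }
    }
}
```

Note that every image processed this way costs one or more API calls, which is the point the rest of this article comes back to.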

IronOCR:

IronOCR is a C# software library that lets .NET developers read text from images and PDF documents. It is a pure .NET OCR library. Azure OCR, by contrast, is a cloud service that extracts text from an image through API calls. It gives developers access to advanced algorithms that process images and return information. To analyze an image, you can either upload it or specify an image URL.

How Is IronOCR better than AzureOCR?

The cloud-based Azure OCR API gives developers access to advanced algorithms that read text in images and return structured content, and Microsoft provides quickstarts, tutorials, and samples that show how to extract printed and handwritten text in multiple languages. IronOCR, however, supports 125 international languages, compared with 73 for printed text in Azure OCR.

Benefits of Using IronOCR in Azure:

IronOCR provides all of the following:

Azure offers Azure Cognitive Services, which includes a Computer Vision API. But if you want to run OCR without making API calls, and without paying more as you scale up, IronOCR is a one-time fee.
It supports multiple languages, whereas many other libraries support English only.
It provides smart cleanup: if the scanned document or image is a bit scratchy, the library cleans it up before reading it.
It works for both printed and handwritten text.
You can easily use it in the cloud, which means you don’t need to install anything: you may not be using a VM at all, and it can be entirely serverless.
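
The cleanup step mentioned in the list above might look like the following sketch. It assumes the Deskew and DeNoise filters that IronOCR documented on OcrInput around the time of writing; newer versions of the library may name or organize them differently, so check the version you have installed. The class name CleanupSketch and the imagePath parameter are made up for illustration.

```csharp
using System;
using IronOcr;

public static class CleanupSketch
{
    // imagePath is a hypothetical path to a low-quality scan.
    public static string ReadNoisyScan(string imagePath)
    {
        var ocr = new IronTesseract();
        using (var input = new OcrInput(imagePath))
        {
            input.Deskew();   // straighten a page that was scanned at an angle
            input.DeNoise();  // remove speckles and other digital noise
            return ocr.Read(input).Text;
        }
    }
}
```

Filters cost processing time, so it is usually worth applying them only to inputs that actually read poorly without them.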

IronOCR features:

The ability to perform an OCR on almost any file, image, or PDF
Lightning-fast speed
Exceptional accuracy
Reads bar codes and QR codes
Runs locally, with no SaaS required
Can turn PDFs and images into searchable documents
Excellent Alternative to Azure OCR from Microsoft Cognitive Services.
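
Two of the features listed above, barcode reading and searchable PDF output, can be sketched with IronOCR roughly as follows. This assumes the Configuration.ReadBarCodes setting, the Barcodes collection and the SaveAsSearchablePdf method as IronOCR documented them around the time of writing; the class name FeatureSketch and both file path parameters are hypothetical.

```csharp
using System;
using IronOcr;

public static class FeatureSketch
{
    // Barcode detection is opt-in, so switch it on before reading.
    public static IronTesseract CreateBarcodeReader()
    {
        var ocr = new IronTesseract();
        ocr.Configuration.ReadBarCodes = true;
        return ocr;
    }

    // imagePath and pdfPath are hypothetical file locations.
    public static void ReadAndExport(string imagePath, string pdfPath)
    {
        var ocr = CreateBarcodeReader();
        using (var input = new OcrInput(imagePath))
        {
            var result = ocr.Read(input);

            // Print the decoded value of every barcode or QR code found.
            foreach (var barcode in result.Barcodes)
            {
                Console.WriteLine(barcode.Value);
            }

            // Write a PDF whose text layer can be searched and copied.
            result.SaveAsSearchablePdf(pdfPath);
        }
    }
}
```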

IronOCR With AzureOCR:

Let’s look at an example: a screenshot of the Google Books page for Frankenstein by Mary Shelley.

I then took my C#/.NET console application and ran the following in the NuGet Package Manager console to install IronOCR:
Install-Package IronOcr
You can also download it directly from this link.
Then, on to the code. I OCR'd this image to extract the text, including line breaks and everything, using just a few lines of code.

using System;
using IronOcr;

var ocr = new IronTesseract();
// OcrInput is disposable, so wrap it in a using block.
using (var input = new OcrInput("Frankenstein.PNG"))
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}

And here is the result:

Frankenstein
Annotated for Scientists, Engineers, and Creators of All Kinds
By Mary Wollstonecraft Shelley - 2017

We are using Azure Functions in a microservices architecture. Here is a simple function that takes an image URL as a parameter, OCRs the image, and returns the text:

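using System.Net.Http;
using System.Threading.Tasks;
using IronOcr;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;

public static class OCRFunction
{
    // Reuse a single HttpClient for the lifetime of the function host.
    private static readonly HttpClient _httpClient = new HttpClient();

    [FunctionName("OCRFunction")]
    public static async Task<IActionResult> Run([HttpTrigger] HttpRequest req, ExecutionContext context)
    {
        // The image URL arrives as the "image" query-string parameter.
        string imageUrl = req.Query["image"];
        if (string.IsNullOrEmpty(imageUrl))
        {
            return new BadRequestObjectResult("Missing 'image' query parameter.");
        }

        var imageStream = await _httpClient.GetStreamAsync(imageUrl);

        var ocr = new IronTesseract();
        using (var input = new OcrInput(imageStream))
        {
            var result = ocr.Read(input);
            return new OkObjectResult(result.Text);
        }
    }
}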

We take the image parameter, download the picture, and OCR it right away. The good thing about doing this inside an Azure Function is that it can support various pieces of our application in a microservice architecture without us duplicating the code everywhere.

You may currently be paying for a service that charges a fee for each OCR call. The cost can seem modest at first, but at scale the monthly bill can quickly spiral out of control. Contrast this with the one-time fee for IronOCR: you get what is basically a callable API, hosted entirely in the Azure cloud, without any ongoing per-call fees.
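
To make the comparison concrete, here is a small break-even calculation. It is only a sketch: the class and method names are made up for illustration, and the figures in the usage example are hypothetical placeholders, not real prices for IronOCR or Azure. It also ignores whatever you pay to host the function itself.

```csharp
using System;

public static class OcrCostSketch
{
    // Returns the number of whole months after which cumulative per-call
    // fees reach or exceed a one-time license fee.
    public static int BreakEvenMonths(decimal oneTimeFee, decimal feePerCall, long callsPerMonth)
    {
        if (feePerCall <= 0 || callsPerMonth <= 0)
        {
            throw new ArgumentException("Fee per call and calls per month must be positive.");
        }

        decimal monthlyCost = feePerCall * callsPerMonth;
        return (int)Math.Ceiling(oneTimeFee / monthlyCost);
    }
}
```

For example, with a hypothetical one-time fee of 500, a hypothetical per-call fee of 0.001 and 100,000 calls per month, OcrCostSketch.BreakEvenMonths(500m, 0.001m, 100_000) returns 5. Doubling the call volume halves the time to break even, which is why the difference matters most at scale.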

Non-English Support:

Many OCR libraries support only the English language.

However, IronOCR currently supports 125 languages, and you can add as many or as few as you like by simply installing the applicable NuGet language pack.
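
Once a language pack is installed, selecting it might look like the sketch below. It assumes the IronOcr.Languages.Arabic NuGet package is installed alongside IronOcr, and uses the Language property and AddSecondaryLanguage method as IronOCR documented them around the time of writing. Arabic is just an example choice, and the class name LanguageSketch and the imagePath parameter are hypothetical.

```csharp
using System;
using IronOcr;

public static class LanguageSketch
{
    // Assumes the IronOcr.Languages.Arabic language pack is installed.
    public static IronTesseract CreateArabicReader()
    {
        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.Arabic;
        // A secondary language helps with documents that mix scripts.
        ocr.AddSecondaryLanguage(OcrLanguage.English);
        return ocr;
    }

    // imagePath is a hypothetical path to an image containing Arabic text.
    public static string ReadArabic(string imagePath)
    {
        var ocr = CreateArabicReader();
        using (var input = new OcrInput(imagePath))
        {
            return ocr.Read(input).Text;
        }
    }
}
```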


Summary:

Over the past couple of years, computer vision and optical character recognition have been in high demand. IronOCR supports 125 international languages, compared with the 73 languages that Microsoft Azure supports. IronOCR offers an API that can OCR any image, file, or PDF with its advanced algorithms. Interestingly, if you buy the complete Iron Suite, you get all 5 products for the price of 2. For further details about licensing, please follow this link to purchase the complete package.
I hope that you have liked this article. Feel free to ask any questions in the comment section.

Top comments (4)

mahditaheri2004 • Nov 24 '21

Hi,
I see you posted that IronOCR supports the Persian language. Does it really extract Persian from an image?
I have attached a car plate number image to this post. Can IronOCR extract the Iranian number from this image?

Mehr Muhammad Hamza • Nov 25 '21 • Edited

Hi mahditaheri2004,
Yes, it can extract an Iranian number in the Persian language from the image. Just install the NuGet package for the Persian language, "IronOcr.Languages.Persian", as attached below.

mahditaheri2004 • Nov 25 '21

Yes, I already tested this library in Persian, but it gives me better results in Arabic!
However, it is not very accurate in my project, because I want to extract car plate numbers and I am still not successful!

Mehr Muhammad Hamza • Dec 2 '21

The image quality might be too low. Furthermore, car number plates don't work well with scanned-document OCR; they need computer vision software.
