DEV Community

Omo Agbagbara

Implementing OCR in Azure: A Comparison of Logic App Connectors and Function Apps

I recently started learning more about the Azure Applied AI Services. They are quite straightforward to set up and understand; furthermore, Azure provides a lot of starter samples. See samples.

Being an integration consultant, I tend to implement simple integrations using Logic Apps, so I decided to try creating a Logic App implementation with the provided connector. Below is the simple OCR process to be implemented.
OCR Process

To perform OCR on an image, a Computer Vision API connector is required. This connector uses the API endpoint and API key, which are generated when an Azure AI services multi-service account is first created.

Setting Up Connector

Some of the available actions are:
Actions

For the OCR process, the Optical Character Recognition to JSON V3 action will be used with the following setup.
OCR To Json Action

Input Image

This input image was taken on the day of my graduation ceremony in 1998; I recently converted the negative film to digital.
Using the action above, the output for the given input image did not contain any valid text or bounding boxes.
Input Image

Output Response

Output

However, despite the ease of setup, the results were not as expected. Even after testing with various images, the Logic App's output consistently lacked valid text or bounding boxes. This specific connector action may need additional configuration.
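To make "lacked valid text" something a workflow can test for, here is a minimal sketch that counts the words in an OCR response. It assumes the connector returns the classic Computer Vision OCR layout (a `regions` array containing `lines`, each containing `words`); the post does not show the raw response, so treat that layout as an assumption. `ConnectorOcrCheck` and `CountWords` are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.Json;

public static class ConnectorOcrCheck
{
    // Counts the words in an OCR JSON document, ASSUMING the layout
    // regions[] -> lines[] -> words[]. Missing or non-array properties
    // count as zero words rather than raising an error.
    public static int CountWords(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        if (!TryGetArray(doc.RootElement, "regions", out JsonElement regions))
        {
            return 0;
        }

        int count = 0;
        foreach (JsonElement region in regions.EnumerateArray())
        {
            if (!TryGetArray(region, "lines", out JsonElement lines))
            {
                continue;
            }

            foreach (JsonElement line in lines.EnumerateArray())
            {
                if (TryGetArray(line, "words", out JsonElement words))
                {
                    count += words.GetArrayLength();
                }
            }
        }

        return count;
    }

    // TryGetProperty throws on non-object elements, so check the kind first.
    private static bool TryGetArray(JsonElement parent, string name, out JsonElement array)
    {
        array = default;
        return parent.ValueKind == JsonValueKind.Object
            && parent.TryGetProperty(name, out array)
            && array.ValueKind == JsonValueKind.Array;
    }
}
```

A result of zero for an image that clearly contains text is the symptom described above, and a reasonable signal to fall back to another OCR path.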

Alternate Implementation

An alternate implementation is to use a function app to call the Azure Applied AI Service to perform the OCR. This function app uses the same API key and API endpoint.
Below is the process to be implemented.

The logic app was set up with a function app action as follows.
Function app

Program.cs is updated as follows:

using Azure;
using Azure.AI.Vision.ImageAnalysis;
.
.
// Register ImageAnalysisClient as a singleton
builder.Services.AddSingleton<ImageAnalysisClient>(sp =>
{
    // Endpoint and key are read from environment variables; fail fast if either is missing
    var endpoint = Environment.GetEnvironmentVariable("AI_ENDPOINT")
        ?? throw new InvalidOperationException("AI_ENDPOINT is not set.");
    var key = Environment.GetEnvironmentVariable("AI_APIKEY")
        ?? throw new InvalidOperationException("AI_APIKEY is not set.");

    return new ImageAnalysisClient(new Uri(endpoint), new AzureKeyCredential(key));
});

The HTTP trigger within the Function App was designed to read the image data, perform analysis using VisualFeatures.Read, and then extract the text, bounding polygons, and word confidence scores.

var response = new List<dynamic>();

// Read the request body (the raw image bytes) into BinaryData
BinaryData requestData = await BinaryData.FromStreamAsync(req.Body);

// Run OCR on the image by requesting the Read feature
ImageAnalysisResult result = _ImageAnalysisClient.Analyze(requestData, VisualFeatures.Read);

// Flatten the blocks into lines and project each line into a serializable shape
foreach (var line in result.Read.Blocks.SelectMany(block => block.Lines))
{
    var lineObject = new
    {
        Text = line.Text,
        BoundingPolygon = line.BoundingPolygon.ToList(),
        Words = line.Words.Select(word => new
        {
            Text = word.Text,
            // "0.####" keeps the leading zero (0.98); "#.####" would print .98
            Confidence = word.Confidence.ToString("0.####"),
            BoundingPolygon = word.BoundingPolygon.ToList()
        }).ToList()
    };

    response.Add(lineObject);
}
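Since the Logic App consumes whatever JSON the function returns, it can help to pin the response shape down with named types instead of `dynamic` and anonymous objects. The sketch below is one possible way to do that using only `System.Text.Json`. `OcrPoint`, `OcrWord`, `OcrLine`, and `OcrResponse` are hypothetical names, not part of the Azure SDK, and the confidence is kept as a number rather than a formatted string, which is a design choice of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical DTOs mirroring the anonymous objects built in the loop above.
public record OcrPoint(int X, int Y);

public record OcrWord(string Text, float Confidence, IReadOnlyList<OcrPoint> BoundingPolygon);

public record OcrLine(string Text, IReadOnlyList<OcrPoint> BoundingPolygon, IReadOnlyList<OcrWord> Words);

public static class OcrResponse
{
    // Web defaults emit camelCase property names ("text", "boundingPolygon", ...).
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions(JsonSerializerDefaults.Web);

    // Serializes the recognized lines into the JSON payload returned to the caller.
    public static string ToJson(IEnumerable<OcrLine> lines) =>
        JsonSerializer.Serialize(lines, Options);
}
```

In the loop, each SDK line would then be mapped to an `OcrLine`, with every polygon point copied into an `OcrPoint` (assuming the SDK's point type exposes integer `X` and `Y` values). A fixed, typed shape also makes it easier to define a matching Parse JSON schema on the Logic App side.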

This approach produces significantly better results, in line with the expected OCR output.
Below is a summary example of the result based on the original input image.
Function app result summary

Conclusion

While the connectors in Azure Logic Apps provide a simple and convenient way to interact with Azure AI services, my experience suggests that the results may not always be as intended for certain operations like OCR. One solution is to create system APIs that act as proxies to the backend services.
The drawback of this approach is that there is now an extra set of resources to implement and maintain, but this is an easy task.

Have you had similar experiences with Azure AI services and different integration patterns? Share your thoughts in the comments!
