DEV Community

Hari Haran😎

How to build an Image processor bot using Microsoft Bot framework

Bots and machine learning are getting a lot of hype these days. I recently took part in the Global Azure Bootcamp as a presenter, which gave me the chance to learn a lot about building bots, and I am sharing it with you all here.

Prerequisites

Objectives or Project Goals

My goal is to give the bot an image or an image URL and have it reply with an analysis of that image.

Getting Started

Let's go to Visual Studio and create a Bot Framework V4 project.

[Image: NewProject]

But wait, the Bot Framework template is not available out of the box. You need to install the Bot Framework v4 SDK Templates for Visual Studio from the Marketplace.

Now that you have the template installed, select it and create an **Echo Bot**, and you will get a solution like this.

[Image: Solution structure]

Exploring the Solution

This has the same structure as a .NET Core Web API application. It has a controller for interacting with the bot, marked with [Route("api/messages")]. It also has a Startup class for injecting your dependencies and an ErrorHandler class.

EchoBot.cs

If you have a look at EchoBot.cs under the Bots folder, this is where the actual bot processing and logic live. The main things to note here are the OnMessageActivityAsync and OnMembersAddedAsync methods.

OnMembersAddedAsync

As the name suggests, this method is called whenever a new member is added to the conversation, or when the bot is connected for the first time, so it is the place to greet the user. Let's modify this first.

var welcomeText = "Hello and welcome to ImageDescriber Bot. Enter an image URL or upload an image to begin analysis:";

foreach (var member in membersAdded)
{
  // Greet everyone except the bot itself.
  if (member.Id != turnContext.Activity.Recipient.Id)
  {
    await turnContext.SendActivityAsync(MessageFactory.Text(welcomeText), cancellationToken);
  }
}

I have removed CreateActivityWithTextAndSpeak and replaced it with the code above. All it does is welcome the user.

OnMessageActivityAsync

This is where we need to process the image or the URL. Let's look at the options. In case you are not aware, you can use **Azure Cognitive Services** for AI operations like this.

There are typically two ways to interact with Azure services:

  • REST API

  • Client SDK

I'm a lazy guy, so I won't dig through the REST API browser to find the right API, its headers, its mandatory parameters, and so on. This is the same reason I created my own library for interacting with the StackExchange API. It's called StackExchange.NET and you can find it on NuGet.

Azure Cognitive Services SDK

So I'm going to install the Azure Cognitive Services SDK in my project to process the image. But before that, you need to create an Azure Cognitive Services resource.

  • Open Azure portal and click on Add new resource

  • Select the AI & Machine Learning category and click on Computer Vision

[Image: ComputerVision]

Give it any name you like, and when selecting the pricing plan, choose **F0** because it's free and it will serve our purpose. We're not going to production, so the free tier should be fine.

**Note:** If you already have a free Computer Vision resource, you cannot create another free one. You can reuse the existing one instead.

[Image: Already existing resource]

Connecting to the Cognitive service

Once the resource has been created, open it and you will find a Key and an Endpoint.

[Image: Azure cognitive service]

Now go back to the solution, open appsettings.json, and make it look like the JSON below. Copy the key and the endpoint from the portal and paste them in.

{
  "MicrosoftAppId": "",
  "MicrosoftAppPassword": "",
  "Credentials": {
    "ComputerVisionKey": "enter your key",
    "ComputerVisionEndpoint": "enter endpoint URL"
  }
}

Injecting the credentials

  • Create a new class like below.

    public class Credentials
    {
      public string ComputerVisionKey { get; set; }
      public string ComputerVisionEndpoint { get; set; }
    }

  • Now open Startup.cs and add the line below to the ConfigureServices method.

    services.Configure<Credentials>(Configuration.GetSection("Credentials"));

  • This is the usual way of reading values from appsettings.json: the "Credentials" section is bound to the Credentials class.

Installing the SDK

  • Install the Microsoft.Azure.CognitiveServices.Vision.ComputerVision package from the NuGet package manager.

  • We will create a new class to perform operations using the client SDK.


  using Microsoft.Extensions.Options;

  public class ImageAnalyzer
  {
    private readonly string _computerVisionEndpoint;
    private readonly string _computerVisionKey;

    // IOptions<Credentials> is supplied by dependency injection.
    public ImageAnalyzer(IOptions<Credentials> options)
    {
      _computerVisionKey = options.Value.ComputerVisionKey;
      _computerVisionEndpoint = options.Value.ComputerVisionEndpoint;
    }
  }

  • This is a simple class whose constructor argument is injected automatically at runtime.
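To make the binding and the constructor injection described above a bit more concrete, here is a minimal console sketch. It is not the article's bot project: the in-memory configuration stands in for appsettings.json, the values are placeholders, ImageAnalyzer is trimmed down to a stand-in that only stores what it receives, and registering it with AddSingleton is my assumption (the original does not show how the class is registered). It assumes the Microsoft.Extensions.Configuration, Microsoft.Extensions.DependencyInjection, and Microsoft.Extensions.Options.ConfigurationExtensions packages are available.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

public class Credentials
{
    public string ComputerVisionKey { get; set; }
    public string ComputerVisionEndpoint { get; set; }
}

// Trimmed-down stand-in for the article's ImageAnalyzer: it only stores the injected values.
public class ImageAnalyzer
{
    public string Endpoint { get; }
    public bool HasKey { get; }

    public ImageAnalyzer(IOptions<Credentials> options)
    {
        Endpoint = options.Value.ComputerVisionEndpoint;
        HasKey = !string.IsNullOrEmpty(options.Value.ComputerVisionKey);
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for appsettings.json; both values are placeholders.
        var configuration = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string>
            {
                ["Credentials:ComputerVisionKey"] = "placeholder-key",
                ["Credentials:ComputerVisionEndpoint"] = "https://example.invalid/"
            })
            .Build();

        var services = new ServiceCollection();

        // Same call as in Startup.ConfigureServices: bind the section to the class.
        services.Configure<Credentials>(configuration.GetSection("Credentials"));

        // Assumption: the analyzer is registered so the container can construct it.
        services.AddSingleton<ImageAnalyzer>();

        using (var provider = services.BuildServiceProvider())
        {
            // The container builds ImageAnalyzer and passes IOptions<Credentials> to it.
            var analyzer = provider.GetRequiredService<ImageAnalyzer>();
            Console.WriteLine($"{analyzer.Endpoint} (key set: {analyzer.HasKey})");
        }
    }
}
```

In the real project the same two registration calls would go into ConfigureServices, with Configuration coming from appsettings.json instead of the in-memory dictionary.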

Computing with the SDK

Any API or SDK that we use needs to authenticate first, so I have created a method like this:

public static ComputerVisionClient Authenticate(string endpoint, string key)
{
  var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
  {
    Endpoint = endpoint
  };
  return client;
}

Since we are going to analyze either a Stream or a URL, I am creating two methods, one for each.

// Analyzes an uploaded image that is available as a stream.
public async Task<ImageAnalysis> AnalyzeImageAsync(Stream image)
{
  var client = Authenticate(_computerVisionEndpoint, _computerVisionKey);
  var analysis = await client.AnalyzeImageInStreamAsync(image, Features);
  return analysis;
}

// Analyzes an image that is reachable through a URL.
public async Task<ImageAnalysis> AnalyzeUrl(string url)
{
  var client = Authenticate(_computerVisionEndpoint, _computerVisionKey);
  var result = await client.AnalyzeImageWithHttpMessagesAsync(url, Features);
  return result.Body;
}

That is it for the SDK operations. One thing to note is the second parameter, Features, which is passed to both client calls. What is it?

[Image: Cognitivesdk]

It is a list of VisualFeatureTypes enum values accepted by the SDK, naming the visual features to include in the analysis. I copied it from the docs.

private static readonly List<VisualFeatureTypes> Features = new List<VisualFeatureTypes> {
  VisualFeatureTypes.Categories, 
  VisualFeatureTypes.Description,
  VisualFeatureTypes.Faces,   
  VisualFeatureTypes.ImageType,
  VisualFeatureTypes.Tags
};

Interacting with the Bot

The ITurnContext<IMessageActivity> turnContext is the main object here: it contains whatever you share with the bot. Have a look at the code below. I have kept it simple so that it is easy to follow.

In short, the code below does the following: if the message contains an image attachment, it calls AnalyzeImageAsync; otherwise it treats the message text as a URL and calls AnalyzeUrl. It then builds a reply from the result.

  var result = new ImageAnalysis();
  if (turnContext.Activity.Attachments?.Count > 0)
  {
    // The user uploaded a file: download the first attachment and analyze the stream.
    var attachment = turnContext.Activity.Attachments[0];
    var image = await _httpClient.GetStreamAsync(attachment.ContentUrl);

    if (image != null)
    {
      result = await _imageAnalyzer.AnalyzeImageAsync(image);
    }
  }
  else
  {
    // No attachment: treat the message text as an image URL.
    result = await _imageAnalyzer.AnalyzeUrl(turnContext.Activity.Text);
  }

  // Build the reply from the first tag and the first caption.
  var stringResponse = $"I think the image you uploaded is a {result.Tags[0].Name.ToUpperInvariant()} and it is {result.Description.Captions[0].Text.ToUpperInvariant()}";
  return stringResponse;
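The snippet above passes the message text straight to the URL method and indexes the first tag and caption directly, so a plain chat message or an image with no tags or captions would likely end in an exception. Here is a small, self-contained sketch of two guards that could sit in front of that logic. ReplyHelpers, IsHttpUrl, and BuildReply are hypothetical names of mine, not part of the Bot Framework or the Computer Vision SDK, and the fallback messages are only examples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReplyHelpers
{
    // Hypothetical helper: true only for absolute http/https URLs.
    public static bool IsHttpUrl(string text)
    {
        return Uri.TryCreate(text?.Trim(), UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Hypothetical helper: builds the reply without assuming that a tag or caption exists.
    public static string BuildReply(IReadOnlyList<string> tags, IReadOnlyList<string> captions)
    {
        var tag = tags?.FirstOrDefault();
        var caption = captions?.FirstOrDefault();

        if (tag == null && caption == null)
            return "Sorry, I could not describe that image.";
        if (caption == null)
            return $"I think the image you uploaded is a {tag.ToUpperInvariant()}.";
        if (tag == null)
            return $"I think the image you uploaded is {caption.ToUpperInvariant()}.";

        return $"I think the image you uploaded is a {tag.ToUpperInvariant()} and it is {caption.ToUpperInvariant()}.";
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ReplyHelpers.IsHttpUrl("https://example.com/cat.jpg")); // True
        Console.WriteLine(ReplyHelpers.IsHttpUrl("hello bot"));                   // False

        // Made-up tag and caption values, standing in for a real analysis result.
        Console.WriteLine(ReplyHelpers.BuildReply(new[] { "cat" }, new[] { "a cat sitting on a couch" }));

        // Nothing recognized: falls back to the apology message.
        Console.WriteLine(ReplyHelpers.BuildReply(new string[0], null));
    }
}
```

In the bot, the tag names and caption texts from the ImageAnalysis result (for example via result.Tags?.Select(t => t.Name).ToList()) could be passed to BuildReply, and IsHttpUrl could be checked before calling AnalyzeUrl so that ordinary chat text gets a friendly hint instead.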

Demo

Now it's time to see whether the bot actually works. To do so:

  • Build the solution and press F5.

  • As mentioned in the prerequisites, I already have the Bot Framework Emulator installed, so let's open it.

  • When you open it, you will get a page like this:

[Image: demo]

[Image: Bot first interaction]

Uploading an image now.

[Image: bot-framework-emulator]

Hooyah! MISSION ACCOMPLISHED! 🔥

[Image: processed-image]

The full solution can be downloaded from GitHub.

Thanks for reading. Please stay tuned for more blogs.
