Olalekan Oladiran

Posted on Aug 4

Build a Fruit Detection AI with Azure Custom Vision: A Step-by-Step Guide

#ai #machinelearning #azure #python

Introduction

The Azure AI Custom Vision service enables you to create computer vision models that are trained on your own images. You can use it to train image classification and object detection models, which you can then publish and consume from applications.

In this exercise, you will use the Custom Vision service to train an object detection model that can detect and locate three classes of fruit (apple, banana, and orange) in an image.

Create Custom Vision resources

Open the Azure portal at https://portal.azure.com, and sign in using your Azure credentials. Close any welcome messages or tips that are displayed.
Select Create a resource.
In the search bar, search for Custom Vision, select Custom Vision, and create the resource with the following settings:
- Create options: Both
- Subscription: Your Azure subscription
- Resource group: Create or select a resource group
- Region: Choose any available region
- Name: A valid name for your Custom Vision resource
- Training pricing tier: F0
- Prediction pricing tier: F0
Create the resource, wait for deployment to complete, and then view the deployment details. Note that two Custom Vision resources are provisioned: one for training, and another for prediction.

Note: Each resource has its own endpoint and keys, which are used to manage access from your code. To train a model, your code must use the training resource (with its endpoint and key); to use the trained model to generate predictions, your code must use the prediction resource (with its endpoint and key).
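
As a rough illustration of that split, the sketch below keeps the two sets of credentials apart and builds one SDK client per resource. The endpoint and key values are placeholders, and the `ResourceSettings` class and both helper names are hypothetical; the SDK imports sit inside the functions so the file can be loaded even before the package is installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSettings:
    """Endpoint and key for one Custom Vision resource (hypothetical helper)."""
    endpoint: str
    key: str


def make_training_client(settings: ResourceSettings):
    # Lazy imports: only needed when a client is actually created.
    from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
    from msrest.authentication import ApiKeyCredentials

    credentials = ApiKeyCredentials(in_headers={"Training-key": settings.key})
    return CustomVisionTrainingClient(settings.endpoint, credentials)


def make_prediction_client(settings: ResourceSettings):
    from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
    from msrest.authentication import ApiKeyCredentials

    credentials = ApiKeyCredentials(in_headers={"Prediction-key": settings.key})
    return CustomVisionPredictionClient(settings.endpoint, credentials)


# Placeholder values; substitute the ones shown for your own two resources.
training = ResourceSettings("https://<training-resource>.cognitiveservices.azure.com/", "<training-key>")
prediction = ResourceSettings("https://<prediction-resource>.cognitiveservices.azure.com/", "<prediction-key>")
```

Keeping the two settings objects separate makes it harder to pass a training key to the prediction client by accident.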

When the resources have been deployed, go to the resource group to view them. You should see two Custom Vision resources, one of which has the suffix -Prediction.

Create a Custom Vision project in the Custom Vision portal

To train an object detection model, you need to create a Custom Vision project based on your training resource. To do this, you’ll use the Custom Vision portal.

Open a new browser tab (keeping the Azure portal tab open - you’ll return to it later).
In the new browser tab, open the Custom Vision portal at https://customvision.ai. If prompted, sign in using your Azure credentials and agree to the terms of service.
Create a new project with the following settings:
- Name: Detect Fruit
- Description: Object detection for fruit.
- Resource: Your Custom Vision resource
- Project Types: Object Detection
- Domains: General
Wait for the project to be created and opened in the browser.

Upload and tag images

Now that you have an object detection project, you can upload and tag images to train a model.

The Custom Vision portal includes visual tools that you can use to upload images and tag regions within them that contain multiple types of object.
In a new browser tab, download the training images from https://github.com/MicrosoftLearning/mslearn-ai-vision/raw/main/Labfiles/object-detection/training-images.zip and extract the zip folder to view its contents. This folder contains images of fruit.
In the Custom Vision portal, in your object detection project, select Add images and upload all of the images in the extracted folder.
After the images have been uploaded, select the first one to open it.
Hold the mouse over any object in the image until an automatically detected region is displayed. Then select the object, and if necessary resize the region to surround it. Alternatively, you can simply drag around the object to create a region.
When the region surrounds the object, add a new tag with the appropriate object type (apple, banana, or orange).
Select and tag each other object in the image, resizing the regions and adding new tags as required.
Use the > link on the right to go to the next image, and tag its objects. Then just keep working through the entire image collection, tagging each apple, banana, and orange.
When you have finished tagging the last image, close the Image Detail editor. On the Training Images page, under Tags, select Tagged to see all of your tagged images.

Use the Custom Vision SDK to upload images

You can use the UI in the Custom Vision portal to tag your images, but many AI development teams use other tools that generate files containing information about tags and object regions in images. In scenarios like this, you can use the Custom Vision training API to upload tagged images to the project.
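
Many labelling tools export bounding boxes in pixels, whereas Custom Vision regions are expressed as fractions of the image width and height (values between 0 and 1). The sketch below shows one way to convert between the two; the function name and the sample numbers are made up for illustration.

```python
def to_normalized_region(left_px, top_px, width_px, height_px, image_width, image_height):
    """Convert a pixel bounding box into fractions of the image size."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    return {
        "left": left_px / image_width,
        "top": top_px / image_height,
        "width": width_px / image_width,
        "height": height_px / image_height,
    }


# A 400x300 pixel box whose top-left corner is at (200, 150) in an 800x600 image.
region = to_normalized_region(200, 150, 400, 300, image_width=800, image_height=600)
print(region)
```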

Click the settings (⚙) icon at the top right of the Training Images page in the Custom Vision portal to view the project settings.
Under General (on the left), note the Project Id that uniquely identifies this project.
On the right, under Resources note that the Key and Endpoint are shown. These are the details for the training resource (you can also obtain this information by viewing the resource in the Azure portal).
Keep the Custom Vision portal tab open (you’ll return to it later) and open VS Code.
In the VS Code terminal, enter the following command to clone the GitHub repo containing the code files for this exercise:

git clone https://github.com/MicrosoftLearning/mslearn-ai-vision

After the repo has been cloned, use the following command to navigate to the application code files:

cd mslearn-ai-vision/Labfiles/object-detection/python/train-detector

The folder contains application configuration and code files for your app. It also contains a tagged-images.json file which contains bounding box coordinates for objects in multiple images, and an /images subfolder, which contains the images.
Install the Azure AI Custom Vision SDK package for training and any other required packages by running the following command:

pip install -r requirements.txt azure-cognitiveservices-vision-customvision

Open the .env file in VS Code and update the configuration values it contains to reflect the Endpoint and an authentication Key for your Custom Vision training resource, and the Project ID for the Custom Vision project you created previously.
After you’ve replaced the placeholders, use CTRL+S to save your changes.
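
Before running the script, it can help to check that no placeholder was left behind in the configuration. The sketch below is a minimal stand-in for a .env reader using only the standard library; the key names `TrainingEndpoint`, `TrainingKey`, and `ProjectID` are assumptions, so match them to whatever your .env file actually contains.

```python
REQUIRED = ("TrainingEndpoint", "TrainingKey", "ProjectID")  # assumed key names


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_settings(settings):
    """Return required keys that are absent, empty, or still an <angle-bracket> placeholder."""
    return [key for key in REQUIRED
            if not settings.get(key) or settings[key].startswith("<")]


# Hypothetical file contents: the key is still a placeholder and the project ID is absent.
sample = """
TrainingEndpoint=https://example.cognitiveservices.azure.com/
TrainingKey=<your-key>
"""
print(missing_settings(parse_env(sample)))
```

To check a real file, pass `open(".env").read()` to `parse_env` instead of the sample string.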
Open the tagged-images.json file to see the tagging information for the image files in the /images subfolder. The JSON defines a list of images, each containing one or more tagged regions. Each tagged region includes a tag name, the top and left coordinates, and the width and height of the bounding box containing the tagged object.
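
To make that structure concrete, here is a small sketch that reads a document of this shape and checks that every box stays within the image. The field names (`files`, `filename`, `tags`, `tag`) and the sample values are assumptions for illustration; compare them with the real tagged-images.json before relying on them.

```python
import json

# Hypothetical sample in the assumed layout, with coordinates as fractions of the image size.
SAMPLE = """
{
  "files": [
    {
      "filename": "image11.jpg",
      "tags": [
        {"tag": "apple", "left": 0.1, "top": 0.2, "width": 0.3, "height": 0.4},
        {"tag": "banana", "left": 0.5, "top": 0.5, "width": 0.4, "height": 0.3}
      ]
    }
  ]
}
"""


def iter_regions(document):
    """Yield (filename, tag, left, top, width, height) for every tagged region."""
    for image in document["files"]:
        for region in image["tags"]:
            yield (image["filename"], region["tag"], region["left"],
                   region["top"], region["width"], region["height"])


def is_inside_image(left, top, width, height):
    """True when a normalized box lies fully within the unit square."""
    return (left >= 0 and top >= 0 and width > 0 and height > 0
            and left + width <= 1 and top + height <= 1)


regions = list(iter_regions(json.loads(SAMPLE)))
for filename, tag, *box in regions:
    print(filename, tag, box, "ok" if is_inside_image(*box) else "out of bounds")
```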
Open add-tagged-images.py
Note the following details in the code file:
- The namespaces for the Azure AI Custom Vision SDK are imported.
- The Main function retrieves the configuration settings, and uses the key and endpoint to create an authenticated CustomVisionTrainingClient, which is then used with the project ID to create a Project reference to your project.
- The Upload_Images function extracts the tagged region information from the JSON file and uses it to create a batch of images with regions, which it then uploads to the project.
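
The upload step described above could look roughly like the sketch below. It is not the lab's script: the function names are hypothetical, the input dictionaries follow an assumed layout (`filename`, plus `tags` entries with `tag`, `left`, `top`, `width`, `height`), and the batches are capped at 64 images because the service documents that limit per upload call. It also assumes the apple, banana, and orange tags already exist in the project.

```python
import os


def chunked(items, size=64):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def upload_tagged_images(training_client, project_id, tagged_images, image_folder):
    """Upload images with their regions, one batch at a time (requires the SDK)."""
    from azure.cognitiveservices.vision.customvision.training.models import (
        ImageFileCreateBatch, ImageFileCreateEntry, Region)

    # Map tag names to the IDs of tags that already exist in the project.
    tag_ids = {tag.name: tag.id for tag in training_client.get_tags(project_id)}

    entries = []
    for image in tagged_images:
        regions = [Region(tag_id=tag_ids[r["tag"]], left=r["left"], top=r["top"],
                          width=r["width"], height=r["height"])
                   for r in image["tags"]]
        with open(os.path.join(image_folder, image["filename"]), "rb") as image_data:
            entries.append(ImageFileCreateEntry(
                name=image["filename"], contents=image_data.read(), regions=regions))

    for batch in chunked(entries):
        result = training_client.create_images_from_files(
            project_id, ImageFileCreateBatch(images=batch))
        if not result.is_batch_successful:
            for item in result.images:
                print("Image status:", item.status)


print([len(batch) for batch in chunked(list(range(130)))])
```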
Enter the following command to run the program:

python3 add-tagged-images.py

Wait for the program to end.
Switch back to the browser tab containing the Custom Vision portal, and view the Training Images page for your project (refreshing the browser if necessary).
Verify that some new tagged images have been added to the project.

Train and test a model

Now that you’ve tagged the images in your project, you’re ready to train a model.

In the Custom Vision project, click Train (⚙⚙) to train an object detection model using the tagged images. Select the Quick Training option.
Wait for training to complete (it might take ten minutes or so).
In the Custom Vision portal, when training has finished, review the Precision, Recall, and mAP performance metrics - these measure the prediction accuracy of the object detection model, and should all be high.
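
For intuition about these metrics, the sketch below computes precision and recall from raw counts, plus the intersection over union (IoU) that object detectors commonly use to decide whether a predicted box matches a labelled one. mAP then averages precision across recall levels and tags. This uses the textbook definitions with made-up numbers; the article does not describe the thresholds the portal applies.

```python
def precision(true_positives, false_positives):
    """Fraction of predicted objects that were correct."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def recall(true_positives, false_negatives):
    """Fraction of real objects that the model found."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


def iou(a, b):
    """Intersection over union of two (left, top, width, height) boxes."""
    overlap_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = a[2] * a[3] + b[2] * b[3] - intersection
    return intersection / union if union else 0.0


# Made-up counts: 8 correct detections, 2 false alarms, 8 missed objects.
print(precision(8, 2))
print(recall(8, 8))
print(iou((0, 0, 2, 2), (1, 1, 2, 2)))
```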
At the top right of the page, click Quick Test, and then in the Image URL box, type https://aka.ms/test-fruit and click the quick test image (➔) button.
View the prediction that is generated.
Close the Quick Test window.
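
The training step in this section can also be started from code with the training client. The sketch below is one way to do that and wait for the result; `wait_for_status` and `train_and_wait` are hypothetical helpers, `train_project` is called with the SDK's default options (which may not match the portal's Quick Training choice exactly), and the "Failed" status string is an assumption.

```python
import time


def wait_for_status(get_status, done="Completed", failed="Failed",
                    interval=10.0, timeout=1800.0, sleep=time.sleep):
    """Poll get_status() until it returns `done`; raise on failure or timeout."""
    waited = 0.0
    while True:
        status = get_status()
        if status == done:
            return status
        if status == failed:
            raise RuntimeError("training reported a failure")
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval


def train_and_wait(training_client, project_id):
    """Start training and block until the iteration completes (requires the SDK client)."""
    iteration = training_client.train_project(project_id)
    wait_for_status(lambda: training_client.get_iteration(project_id, iteration.id).status)
    return iteration
```

The `sleep` parameter exists so the polling loop can be exercised with a fake clock instead of real delays.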

Use the object detector in a client application

Now you’re ready to publish your trained model and use it in a client application.

In the Custom Vision portal, on the Performance page, click 🗸 Publish to publish the trained model with the following settings:
- Model name: fruit-detector
- Prediction Resource: The prediction resource you created previously which ends with “-Prediction” (not the training resource).
At the top left of the Project Settings page, click the Projects Gallery (👁) icon to return to the Custom Vision portal home page, where your project is now listed.
On the Custom Vision portal home page, at the top right, click the settings (⚙) icon to view the settings for your Custom Vision service. Then, under Resources, find your prediction resource which ends with “-Prediction” (not the training resource) to determine its Key and Endpoint values (you can also obtain this information by viewing the resource in the Azure portal).
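
With the prediction Key and Endpoint in hand, a published model can also be called over plain HTTP. The sketch below only builds the request for an image URL and does not send it. The v3.0 path used here is an assumption based on the pattern the portal shows for published object detection models, so treat the Prediction URL displayed in the portal as authoritative. All argument values are placeholders.

```python
import json
import urllib.request


def build_detect_request(endpoint, project_id, model_name, prediction_key, image_url):
    """Build (but do not send) a POST request that asks the model to analyze an image URL."""
    url = (f"{endpoint.rstrip('/')}/customvision/v3.0/Prediction/"
           f"{project_id}/detect/iterations/{model_name}/url")
    body = json.dumps({"Url": image_url}).encode("utf-8")
    headers = {"Prediction-Key": prediction_key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_detect_request(
    endpoint="https://example.cognitiveservices.azure.com/",  # placeholder endpoint
    project_id="<project-id>",
    model_name="fruit-detector",
    prediction_key="<prediction-key>",
    image_url="https://aka.ms/test-fruit",
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON response.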

Use the object detector from a client application

Return to VS Code.
Run the following command to switch to the folder for your client application:

cd ../test-detector

The folder contains application configuration and code files for your app. It also contains the produce.jpg image file, which you’ll use to test your model.
Open the .env file.
Update the configuration values to reflect the Endpoint and Key for your Custom Vision prediction resource, the Project ID for the object detection project, and the name of your published model (which should be fruit-detector). Save your changes (CTRL+S).
Open test-detector.py
Review the code, noting the following details:
- The namespaces for the Azure AI Custom Vision SDK are imported.
- The Main function retrieves the configuration settings, and uses the key and endpoint to create an authenticated CustomVisionPredictionClient.
- The prediction client object is used to get object detection predictions for the produce.jpg image, specifying the project ID and model name in the request. The predicted tagged regions are then drawn on the image, and the result is saved as output.jpg.
Enter the following command to run the program:

python3 test-detector.py

Review the program output, which lists each object detected in the image.
Note that an image file named output.jpg is generated.
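
The core of such a client can be sketched in a few lines: call the prediction client, keep only the confident detections, and scale each normalized bounding box to pixels before drawing it. The helper names, the 0.5 threshold, and the stand-in prediction objects below are assumptions for illustration rather than the lab's code.

```python
from collections import namedtuple


def to_pixel_box(box, image_width, image_height):
    """Scale a normalized (left, top, width, height) box to whole pixels."""
    left, top, width, height = box
    return (round(left * image_width), round(top * image_height),
            round(width * image_width), round(height * image_height))


def confident_detections(predictions, threshold=0.5):
    """Keep predictions whose probability exceeds the threshold, most confident first."""
    kept = [p for p in predictions if p.probability > threshold]
    return sorted(kept, key=lambda p: p.probability, reverse=True)


def detect_fruit(prediction_client, project_id, model_name, image_path):
    """Run the published model on a local image (requires the SDK prediction client)."""
    with open(image_path, "rb") as image_data:
        results = prediction_client.detect_image(project_id, model_name, image_data)
    return confident_detections(results.predictions)


# Stand-in objects with the same attribute names, so the helpers can be tried offline.
FakePrediction = namedtuple("FakePrediction", "tag_name probability")
sample = [FakePrediction("apple", 0.92), FakePrediction("banana", 0.31),
          FakePrediction("orange", 0.77)]
print([p.tag_name for p in confident_detections(sample)])
print(to_pixel_box((0.25, 0.5, 0.5, 0.25), image_width=800, image_height=600))
```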

You’ve just transformed raw images into an intelligent fruit-detection system, which shows how accessible computer vision has become with Azure Custom Vision. Whether you're building a smart grocery scanner, an industrial quality checker, or just exploring AI, the pattern remains the same: tag, train, and deploy.

Project guide link: https://microsoftlearning.github.io/mslearn-ai-vision/Instructions/Labs/05-custom-vision-object-detection.html