Follow me on Twitter, happy to take your suggestions on topics or improvements /Chris
An AI villain against humanity! As a kid growing up in the 80s, some of the coolest movies to come out were Terminator and Terminator 2, starring Arnold Schwarzenegger. ICYMI - an AI robot (Arnold) was sent back from the future to destroy any chance of a human resistance.
Back then it felt like the distant future before we humans could construct a robot that moved like that, until this clip made the rounds on the internet https://www.youtube.com/watch?v=LikxFZZO2sk
It shows a robot constructed by Boston Dynamics. A lot of people choked on their coffee that day.
If that thing ever becomes smart and hostile to humans we need to join Elon Musk's Tesla in space 😉
One really cutting-edge scene in Terminator etched itself into my mind. The Terminator enters the motorcycle bar, scans the people and objects around the room, correctly classifying what the objects are, their color, their size and whether they are his target! https://www.youtube.com/watch?v=zzcdPA6qYAU
Back then it was amazing, science fiction at its best.
Here's the thing though, it's not science fiction anymore. So much has happened in the area of Machine Learning. The Machine Learning industry employs an army of data scientists who construct algorithms that, given training data, are able to correctly identify what they're looking at.
A quite famous example is the pug-or-muffin training data, which gives us a peek at how these algorithms are trained on countless images like this one:
I know some of you are probably chuckling by now, thinking we don't need to worry about machines overtaking us any time soon 😉.
I mentioned it wasn't science fiction anymore and it isn't. Microsoft offers a whole suite of services called Azure Cognitive Services, centering on:
- vision, image-processing algorithms that can identify, caption, index and moderate pictures and videos
- speech, convert spoken audio into text, use voice for verification or add speech recognition to your app
- language, allow your apps to process natural language with pre-built scripts, evaluate sentiment and learn how to recognize what users want
- knowledge, map complex information and data in order to solve tasks such as intelligent recommendations and semantic search
- search, enable apps and services to harness the power of a web-scale, ad-free search engine. Use search services to find exactly what you're looking for across billions of web pages, images, videos and news search results
As you'll notice, were you to click on any of the above categories, each area leads to a ton of services and they are free to try. I don't know about you but I feel like a kid in a candy store when someone tells me there are a ton of APIs for me to use, especially if it makes Machine Learning usable for me as a developer.
To stick with our narrative, let's dive into the vision category, cause we want to see like a Terminator, right? ;)
Let's click on Celebrity and landmark recognition in images. Oh cool, we get a demo page where we can see the algorithms at work, try it before you buy it :)
Above we can see it requires us to input a URL for an image and it seems to respond with JSON. Ok, let's give it something easy, a picture of Abe Lincoln:
And the winner is…. Abe Lincoln. Ok, that was easy, let's try something else:
I have to admit, I'm a bit nervous about this one ;). Ok, let's see the results:
Ok, it recognized Arnold Schwarzenegger from the movie Terminator 2, good. I swear if it had mentioned John Connor I would have run for the hills, just kidding :)
Using Azure Cognitive Services
To start using the Cognitive Services API we need an API key. It takes a few steps to acquire said key, but it really isn't that much work. Cognitive Services lives on Azure. To get a free Azure account head to this link:
Once you are signed up you can use either the Azure portal or the Azure CLI. The Azure CLI enables us to talk to Azure from the command line, which is usually way quicker (and cooler) than clicking around in a UI.
Once we have come this far there are only four steps left, so stay with me and we will soon see the world like Arnold 😃
What remains is the following:
- create a resource group, this is like a directory where you put all the things that belong together like accounts, databases, apps, it takes only a second to create
- create a cognitive services account, that's also just a one-liner of code, creating this will give us our API key
- make a POST call to the API, it's a very simple REST API call given the API key we get from creating our cognitive services account
- parse the JSON response, we will get a JSON back and we will have a look at the different parts it gives us to see what we can show to our user
Create a resource group
First thing we will need to do is to log in to Azure using the Azure CLI. To use the Azure CLI we first need to install it. Head over to this link for installation instructions; they differ between operating systems so make sure you pick the right ones:
https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest
Let's login to Azure using the Azure CLI:
az login
This will open up a window in the browser where we log in to our Azure account. Thereafter the terminal will have access to Azure.
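If you want to double-check that the CLI is now pointed at the subscription you expect, a quick optional sanity check (assuming the login above succeeded) is:
# Show the subscription the Azure CLI is currently using, in a compact table.
az account show --output table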
Let's now create the resource group:
az group create \
--name resourceforcogservices \
--location westeurope
The command here is az group create and we are giving it the following arguments:
- name, this is a name we choose
- location, we can select between a number of locations here depending on where we are in the world
For location we have chosen westeurope, cause that's where I am writing this article. So choose a region depending on where you are located. Here is the full list of supported regions:
- westus2
- southcentralus
- centralus
- eastus
- westeurope
- southeastasia
- japaneast
- brazilsouth
- australiasoutheast
- centralindia
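If you want to verify that the resource group actually got created before moving on, a small optional check could look like this (same group name as above):
# Show the resource group we just created, in a compact table view.
az group show \
  --name resourceforcogservices \
  --output table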
Create an Azure Cognitive Services account
It's quite easy to create this account. It's done with the following command:
az cognitiveservices account create \
--kind ComputerVision \
--name ComputerVisionService \
--sku S1 \
--resource-group resourceforcogservices \
--location westeurope
Ok, our basic command is az cognitiveservices account create, and we have added some arguments to said command:
- kind, here we need to type what kind of Cognitive Service we will use, our value here needs to be ComputerVision
- name, the name is simply the name of the service, which is ComputerVisionService
- sku, this is the pricing tier and is fixed for the lifetime of the service, we choose S1, which is really cheap
- resource-group, we have created this one previously and as stated before this is like a folder where everything that is related should be organized under
- location, we keep going with westeurope here cause that's what we started with, you are welcome to continue with the location you went with
You can read more about pricing tiers here: https://docs.microsoft.com/en-us/azure/search/search-sku-tier
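If you're curious which kinds and pricing tiers your subscription offers, the CLI can list them; this is just an optional sketch and the exact arguments may vary slightly between CLI versions:
# List all the kinds of Cognitive Services accounts that can be created.
az cognitiveservices account list-kinds

# List the SKUs (pricing tiers) available for Computer Vision in our region.
az cognitiveservices account list-skus \
  --kind ComputerVision \
  --location westeurope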
Once the Cognitive Services account is created we can have a closer look at it. The following command will show our cognitive services account (we'll grab the actual API key in the next step):
az cognitiveservices account show \
--name ComputerVisionService \
--resource-group resourceforcogservices
Our command here is az cognitiveservices account show, and we need to give said command some arguments:
- name, this is the name of our service
- resource-group, we keep using the resource group resourceforcogservices that we created initially
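The show command prints a fair amount of JSON; if you'd rather have a compact overview, table output works here too (optional, same arguments as above):
# Same command, but with a compact table view instead of the full JSON.
az cognitiveservices account show \
  --name ComputerVisionService \
  --resource-group resourceforcogservices \
  --output table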
Make a POST call to the API
Now, to make things easy when doing our REST call, we will assign the API key to a shell variable that we can refer to later. Let's do the assignment:
key=$(az cognitiveservices account keys list \
--name ComputerVisionService \
--resource-group resourceforcogservices \
--query key1 -o tsv)
The above lists all the keys on the account, picks out the key called key1 and assigns it to the variable key. Now we are all set up and ready to make our REST call.
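If you want to make sure the variable actually got populated before calling the API, without echoing the secret itself into your terminal history, a tiny bash check is enough:
# Print only the length of the key so the secret itself doesn't end up in the scrollback.
echo "key1 has ${#key} characters"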
Let's have a look at our API and see what the URL looks like generally:
https://[region].api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=<...>&details=<...>&language=<...>
We see that we need to replace [region] with whatever region we created our resource group and account in, in our case that is westeurope. Furthermore, we see the API exposes a method called analyze and the parameters visualFeatures, details and language.
- details, this can have the value Landmarks or Celebrities
- visualFeatures, this is about what kind of information you want back. The Categories option will categorize the content of the images, like trees, buildings and more. Faces will identify people's faces and give you their gender and age
Ok, let's see what the actual call looks like:
curl "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Landmarks" \
-H "Ocp-Apim-Subscription-Key: $key" \
-H "Content-Type: application/json" \
-d "{'url' : 'https://raw.githubusercontent.com/MicrosoftDocs/mslearn-process-images-with-the-computer-vision-service/master/images/mountains.jpg'}" \
| jq '.'
Above we call cURL and set the header Ocp-Apim-Subscription-Key to our API key, or more specifically to our variable key that contains our API key. We see that we create a BODY with the property url and set that to the image we want to analyze.
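As a side note before we look at the response: to tie this back to the celebrity demo earlier, here is a sketch of the same call but asking for Celebrities details instead of Landmarks. The <IMAGE_URL> is just a placeholder, swap in any publicly reachable image of a well-known person:
# Same analyze call, but asking the service for celebrity details.
# <IMAGE_URL> is a placeholder - replace it with a real, publicly reachable image URL.
curl "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Celebrities" \
  -H "Ocp-Apim-Subscription-Key: $key" \
  -H "Content-Type: application/json" \
  -d "{'url' : '<IMAGE_URL>'}" \
  | jq '.'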
Looking at the response
Ok, we make the call, we were told there would be JSON. And there is, a whole lot of it :)
{
"categories": [{
"name": "outdoor_mountain",
"score": 0.99609375,
"detail": {
"landmarks": []
}
}],
"description": {
"tags": [
"snow",
"outdoor",
"mountain",
"nature",
"covered",
"skiing",
"man",
"flying",
"standing",
"wearing",
"side",
"air",
"slope",
"jumping",
"plane",
"red",
"hill",
"riding",
"people",
"group",
"yellow",
"board",
"doing",
"airplane"
],
"captions": [{
"text": "a snow covered mountain",
"confidence": 0.956279380622841
}]
},
"requestId": "<undisclosed>",
"metadata": {
"width": 600,
"height": 462,
"format": "Jpeg"
}
}
The score is an indication of how certain it is of the results. With a value of 0.99609375 (max is 1.0) I would say it's pretty darn certain. The captions are the algorithm trying to give us a normal sentence describing what this is. It says it is: a snow covered mountain. Let's see for ourselves with the URL we provided to the service call:
Yep, looks like a mountain to me, good Skynet ;)
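By the way, since we are already piping through jq, we can let it pull out just the caption instead of wading through the whole response. Here is a small sketch using the same image URL as before:
# Run the same analyze call silently and keep only the first caption (text + confidence).
curl -s "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Description" \
  -H "Ocp-Apim-Subscription-Key: $key" \
  -H "Content-Type: application/json" \
  -d "{'url' : 'https://raw.githubusercontent.com/MicrosoftDocs/mslearn-process-images-with-the-computer-vision-service/master/images/mountains.jpg'}" \
  | jq '.description.captions[0]'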
Summary
I've taken you through my childhood and by now you know I'm a movie nerd and a bit of a skeptic about where all this AI and Machine Learning research is taking us. At the same time I'm excited about all the cool apps I can build with Cognitive Services.
Here is also some food for thought. It's easy to joke about killer robots, especially when they come from the world of movies. With all great tech we have a responsibility to do something useful with it, to serve mankind. Imagine algorithms like this mounted on drones or helicopters. Imagine further that a catastrophe has happened, you are looking for survivors, and you have some great algorithms that can quickly help you find people. That can make a real difference, save lives.
I hope you are as excited as me and give it a try. The best way to get started is hopefully this blog post, but it's worth checking out the LEARN platform and especially this course. Good luck :)
If you found this article useful / hilarious / amusing / anything, please give me a clap :)