Sample code to use Cloud Vision API with Ruby

Junko T. ・3 min read

Vision AI is a set of pre-trained machine learning models provided by Google. With the Cloud Vision API, you can easily integrate vision detection features into your applications.
If you are unfamiliar with the API, you might assume it is too complicated to use in your application and give up before reading the documentation to the end. Don't worry: I have prepared sample code that lets you easily call the Cloud Vision API from a Ruby application.

Before you begin, you need to create an API key and enable the Cloud Vision API. If you haven't done so yet, please follow the official guide.
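Since the sample below hard-codes the key for simplicity, a safer habit is to read it from an environment variable. Here is a minimal sketch; note that `GOOGLE_API_KEY` is a name chosen for this example, not an official variable name:

```ruby
# A minimal sketch: read the API key from an environment variable instead of
# hard-coding it. 'GOOGLE_API_KEY' is this example's own choice of name.
def vision_api_key
  ENV.fetch('GOOGLE_API_KEY') { raise 'GOOGLE_API_KEY is not set' }
end
```

This keeps the key out of your source code and version control.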

Here is the sample code. Note that the image file must be stored locally on your computer.

require 'base64'
require 'json'
require 'net/https'
# Step 1 - Set path to the image file, API key, and API URL.
IMAGE_FILE = './sample.jpg'
API_KEY = 'XXXXXXXXXX' # Don't forget to protect your API key.
API_URL = "https://vision.googleapis.com/v1/images:annotate?key=#{API_KEY}"
# Step 2 - Convert the image to base64 format.
base64_image = Base64.strict_encode64(File.binread(IMAGE_FILE))
# Step 3 - Set request JSON body.
body = {
  requests: [{
    image: {
      content: base64_image
    },
    features: [
      {
        type: 'LABEL_DETECTION', # Details are below.
        maxResults: 3 # The number of results you would like to get
      }
    ]
  }]
}
# Step 4 - Send request.
uri = URI.parse(API_URL)
https = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
request = Net::HTTP::Post.new(uri.request_uri)
request["Content-Type"] = "application/json"
response = https.request(request, body.to_json)
# Step 5 - Print the response in the console.
puts response.body
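The response body is JSON, so instead of just printing it you can parse it and pull out the parts you need. A minimal sketch that summarizes label annotations (the field names follow the LABEL_DETECTION response shown later in this post):

```ruby
require 'json'

# Parse the JSON response body and summarize each label annotation as
# "description (score)". Returns an empty array if there are no labels.
def label_summaries(response_body)
  data = JSON.parse(response_body)
  annotations = data.dig('responses', 0, 'labelAnnotations') || []
  annotations.map { |a| format('%s (%.2f)', a['description'], a['score']) }
end
```

For example, `label_summaries(response.body)` on the cat photo below would yield entries like `"Cat (1.00)"`.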

The main features that can be used with Cloud Vision API are as follows:

  • TEXT_DETECTION
  • CROP_HINTS
  • FACE_DETECTION
  • IMAGE_PROPERTIES
  • LABEL_DETECTION
  • LANDMARK_DETECTION
  • LOGO_DETECTION
  • OBJECT_LOCALIZATION
  • SAFE_SEARCH_DETECTION

See the official documentation for details on each feature type.
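You can also request several features in a single call by adding more entries to the `features` array from Step 3. A small helper sketch (the `build_features` name and the default of 5 results are this example's own choices):

```ruby
# Build a features array requesting several detection types in one call.
# The type strings come from the feature list above; max_results applies
# to each feature individually.
def build_features(types, max_results: 5)
  types.map { |type| { type: type, maxResults: max_results } }
end
```

You would then use `features: build_features(%w[LABEL_DETECTION FACE_DETECTION])` in the request body.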

Let’s try it out. Here are the results of LABEL_DETECTION, FACE_DETECTION, and OBJECT_LOCALIZATION as examples.

LABEL_DETECTION


{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/01yrx",
          "description": "Cat",
          "score": 0.99598557,
          "topicality": 0.99598557
        },
        {
          "mid": "/m/04rky",
          "description": "Mammal",
          "score": 0.9890478,
          "topicality": 0.9890478
        },
        {
          "mid": "/m/09686",
          "description": "Vertebrate",
          "score": 0.9851104,
          "topicality": 0.9851104
        }
      ]
    }
  ]
}

FACE_DETECTION


{
  "responses": [
    {
      "faceAnnotations": [
        {
          "boundingPoly": {
            "vertices": [
              {
                "x": 85,
                "y": 32
              },
              {
                "x": 228,
                "y": 32
              },
              {
                "x": 228,
                "y": 198
              },
              {
                "x": 85,
                "y": 198
              }
            ]
          },
          "fdBoundingPoly": {
            "vertices": [
              {
                "x": 96,
                "y": 53
              },
              {
                "x": 220,
                "y": 53
              },
              {
                "x": 220,
                "y": 183
              },
              {
                "x": 96,
                "y": 183
              }
            ]
          },
          "landmarks": [
            {
              "type": "LEFT_EYE",
              "position": {
                "x": 137.98528,
                "y": 107.89613,
                "z": -0.0003142357
              }
            },
            {
              "type": "RIGHT_EYE",
              "position": {
                "x": 182.86436,
                "y": 108.30501,
                "z": 3.7897882
              }
            },
            {
              "type": "LEFT_OF_LEFT_EYEBROW",
              "position": {
                "x": 124.37836,
                "y": 98.03804,
                "z": 3.9364848
              }
            },
            {
              "type": "RIGHT_OF_LEFT_EYEBROW",
              "position": {
                "x": 151.22585,
                "y": 97.4466,
                "z": -5.1410694
              }
            },
            ...
          ],
          "rollAngle": 0.5700898,
          "panAngle": 5.027816,
          "tiltAngle": 7.44839,
          "detectionConfidence": 0.89301175,
          "landmarkingConfidence": 0.7306799,
          "joyLikelihood": "LIKELY", # The face seems happy!
          "sorrowLikelihood": "VERY_UNLIKELY",
          "angerLikelihood": "VERY_UNLIKELY",
          "surpriseLikelihood": "VERY_UNLIKELY",
          "underExposedLikelihood": "VERY_UNLIKELY",
          "blurredLikelihood": "VERY_UNLIKELY",
          "headwearLikelihood": "VERY_UNLIKELY"
        }
      ]
    }
  ]
}
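The likelihood fields (`joyLikelihood`, `sorrowLikelihood`, and so on) are enum strings rather than numbers. If you want to threshold on them, one approach is to map the enum values to weights; the numeric weights below are this example's own choice, not part of the API:

```ruby
# Likelihood enum strings returned by the API, mapped to illustrative weights.
LIKELIHOOD_WEIGHT = {
  'UNKNOWN' => 0, 'VERY_UNLIKELY' => 1, 'UNLIKELY' => 2,
  'POSSIBLE' => 3, 'LIKELY' => 4, 'VERY_LIKELY' => 5
}.freeze

# Returns true when a face annotation is at least LIKELY to be joyful.
def joyful?(face_annotation)
  LIKELIHOOD_WEIGHT.fetch(face_annotation['joyLikelihood'], 0) >= 4
end
```

With the response above, `joyful?` would return true, since the face was rated `LIKELY`.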

OBJECT_LOCALIZATION


{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/0199g",
          "name": "Bicycle",
          "score": 0.86724836,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.6728187,
                "y": 0.673309
              },
              {
                "x": 0.9256048,
                "y": 0.673309
              },
              {
                "x": 0.9256048,
                "y": 0.92176706
              },
              {
                "x": 0.6728187,
                "y": 0.92176706
              }
            ]
          }
        },
        {
          "mid": "/m/02dgv",
          "name": "Door",
          "score": 0.84919137,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.40092757,
                "y": 0.31736398
              },
              {
                "x": 0.51375383,
                "y": 0.31736398
              },
              {
                "x": 0.51375383,
                "y": 0.7087951
              },
              {
                "x": 0.40092757,
                "y": 0.7087951
              }
            ]
          }
        },
        {
          "mid": "/m/0d4v4",
          "name": "Window",
          "score": 0.8491046,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.11334894,
                "y": 0.2133823
              },
              {
                "x": 0.28181157,
                "y": 0.2133823
              },
              {
                "x": 0.28181157,
                "y": 0.5806629
              },
              {
                "x": 0.11334894,
                "y": 0.5806629
              }
            ]
          }
        }
      ]
    }
  ]
}
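Unlike FACE_DETECTION, OBJECT_LOCALIZATION returns `normalizedVertices` with coordinates in the 0..1 range, so you need to multiply by the image dimensions to get pixel coordinates. A small sketch:

```ruby
# Convert normalized vertices (values in 0..1) into pixel coordinates
# for an image of the given width and height in pixels.
def to_pixels(normalized_vertices, width, height)
  normalized_vertices.map do |v|
    { x: (v['x'].to_f * width).round, y: (v['y'].to_f * height).round }
  end
end
```

For instance, the bicycle's first vertex above (x: 0.6728, y: 0.6733) on a 1000x800 image would map to roughly (673, 539).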

Voilà! You are now ready to build a Ruby application with Vision AI. I am excited to see the innovative products and services you will build.
