Sample code to use Cloud Vision API with Ruby

Junko T. ・3 min read

Vision AI is a set of pre-trained machine learning models provided by Google. With the Cloud Vision API, you can easily integrate vision detection features into your applications.
If you are unfamiliar with the API, you might assume it is too complicated to use in your application and give up before reading the documentation to the end. Don't worry: I have prepared sample code that lets you easily call the Cloud Vision API from a Ruby application.

Before you begin, you need to create an API key and enable the Cloud Vision API. If you haven't done so yet, please follow the official guide.
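Since the sample below hard-codes the key for simplicity, a safer habit is to read it from an environment variable. Here is a minimal sketch; note that `GOOGLE_API_KEY` is a name chosen for this example, not an official variable name:

```ruby
# A minimal sketch: read the API key from an environment variable instead of
# hard-coding it. 'GOOGLE_API_KEY' is this example's own choice of name.
def vision_api_key
  ENV.fetch('GOOGLE_API_KEY') { raise 'GOOGLE_API_KEY is not set' }
end
```

This keeps the key out of your source code and version control.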

Here is the sample code. Note that the image file must be stored locally on your computer.

require 'base64'
require 'json'
require 'net/https'
# Step 1 - Set path to the image file, API key, and API URL.
IMAGE_FILE = './sample.jpg'
API_KEY = 'XXXXXXXXXX' # Don't forget to protect your API key.
API_URL = "https://vision.googleapis.com/v1/images:annotate?key=#{API_KEY}"
# Step 2 - Convert the image to base64 format.
base64_image = Base64.strict_encode64(File.binread(IMAGE_FILE))
# Step 3 - Set request JSON body.
body = {
  requests: [{
    image: {
      content: base64_image
    },
    features: [
      {
        type: 'LABEL_DETECTION', # Details are below.
        maxResults: 3 # The number of results you would like to get
      }
    ]
  }]
}
# Step 4 - Send request.
uri = URI.parse(API_URL)
https = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
request = Net::HTTP::Post.new(uri.request_uri)
request["Content-Type"] = "application/json"
response = https.request(request, body.to_json)
# Step 5 - Print the response in the console.
puts response.body
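The response body is JSON, so instead of just printing it you can parse it and pull out the parts you need. A minimal sketch that summarizes label annotations (the field names follow the LABEL_DETECTION response shown later in this post):

```ruby
require 'json'

# Parse the JSON response body and summarize each label annotation as
# "description (score)". Returns an empty array if there are no labels.
def label_summaries(response_body)
  data = JSON.parse(response_body)
  annotations = data.dig('responses', 0, 'labelAnnotations') || []
  annotations.map { |a| format('%s (%.2f)', a['description'], a['score']) }
end
```

For example, `label_summaries(response.body)` on the cat photo below would yield entries like `"Cat (1.00)"`.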

The main features that can be used with Cloud Vision API are as follows:

  • TEXT_DETECTION
  • CROP_HINTS
  • FACE_DETECTION
  • IMAGE_PROPERTIES
  • LABEL_DETECTION
  • LANDMARK_DETECTION
  • LOGO_DETECTION
  • OBJECT_LOCALIZATION
  • SAFE_SEARCH_DETECTION

See the official documentation for details on each feature type.
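You can also request several features in a single call by adding more entries to the `features` array from Step 3. A small helper sketch (the `build_features` name and the default of 5 results are this example's own choices):

```ruby
# Build a features array requesting several detection types in one call.
# The type strings come from the feature list above; max_results applies
# to each feature individually.
def build_features(types, max_results: 5)
  types.map { |type| { type: type, maxResults: max_results } }
end
```

You would then use `features: build_features(%w[LABEL_DETECTION FACE_DETECTION])` in the request body.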

Let’s try it out. Here are the results of LABEL_DETECTION, FACE_DETECTION, and OBJECT_LOCALIZATION as examples.

LABEL_DETECTION


{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/01yrx",
          "description": "Cat",
          "score": 0.99598557,
          "topicality": 0.99598557
        },
        {
          "mid": "/m/04rky",
          "description": "Mammal",
          "score": 0.9890478,
          "topicality": 0.9890478
        },
        {
          "mid": "/m/09686",
          "description": "Vertebrate",
          "score": 0.9851104,
          "topicality": 0.9851104
        }
      ]
    }
  ]
}

FACE_DETECTION


{
  "responses": [
    {
      "faceAnnotations": [
        {
          "boundingPoly": {
            "vertices": [
              {
                "x": 85,
                "y": 32
              },
              {
                "x": 228,
                "y": 32
              },
              {
                "x": 228,
                "y": 198
              },
              {
                "x": 85,
                "y": 198
              }
            ]
          },
          "fdBoundingPoly": {
            "vertices": [
              {
                "x": 96,
                "y": 53
              },
              {
                "x": 220,
                "y": 53
              },
              {
                "x": 220,
                "y": 183
              },
              {
                "x": 96,
                "y": 183
              }
            ]
          },
          "landmarks": [
            {
              "type": "LEFT_EYE",
              "position": {
                "x": 137.98528,
                "y": 107.89613,
                "z": -0.0003142357
              }
            },
            {
              "type": "RIGHT_EYE",
              "position": {
                "x": 182.86436,
                "y": 108.30501,
                "z": 3.7897882
              }
            },
            {
              "type": "LEFT_OF_LEFT_EYEBROW",
              "position": {
                "x": 124.37836,
                "y": 98.03804,
                "z": 3.9364848
              }
            },
            {
              "type": "RIGHT_OF_LEFT_EYEBROW",
              "position": {
                "x": 151.22585,
                "y": 97.4466,
                "z": -5.1410694
              }
            },
            ...
          ],
          "rollAngle": 0.5700898,
          "panAngle": 5.027816,
          "tiltAngle": 7.44839,
          "detectionConfidence": 0.89301175,
          "landmarkingConfidence": 0.7306799,
          "joyLikelihood": "LIKELY", # The face seems happy!
          "sorrowLikelihood": "VERY_UNLIKELY",
          "angerLikelihood": "VERY_UNLIKELY",
          "surpriseLikelihood": "VERY_UNLIKELY",
          "underExposedLikelihood": "VERY_UNLIKELY",
          "blurredLikelihood": "VERY_UNLIKELY",
          "headwearLikelihood": "VERY_UNLIKELY"
        }
      ]
    }
  ]
}
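The likelihood fields (`joyLikelihood`, `sorrowLikelihood`, and so on) are enum strings rather than numbers. If you want to threshold on them, one approach is to map the enum values to weights; the numeric weights below are this example's own choice, not part of the API:

```ruby
# Likelihood enum strings returned by the API, mapped to illustrative weights.
LIKELIHOOD_WEIGHT = {
  'UNKNOWN' => 0, 'VERY_UNLIKELY' => 1, 'UNLIKELY' => 2,
  'POSSIBLE' => 3, 'LIKELY' => 4, 'VERY_LIKELY' => 5
}.freeze

# Returns true when a face annotation is at least LIKELY to be joyful.
def joyful?(face_annotation)
  LIKELIHOOD_WEIGHT.fetch(face_annotation['joyLikelihood'], 0) >= 4
end
```

With the response above, `joyful?` would return true, since the face was rated `LIKELY`.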

OBJECT_LOCALIZATION


{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/0199g",
          "name": "Bicycle",
          "score": 0.86724836,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.6728187,
                "y": 0.673309
              },
              {
                "x": 0.9256048,
                "y": 0.673309
              },
              {
                "x": 0.9256048,
                "y": 0.92176706
              },
              {
                "x": 0.6728187,
                "y": 0.92176706
              }
            ]
          }
        },
        {
          "mid": "/m/02dgv",
          "name": "Door",
          "score": 0.84919137,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.40092757,
                "y": 0.31736398
              },
              {
                "x": 0.51375383,
                "y": 0.31736398
              },
              {
                "x": 0.51375383,
                "y": 0.7087951
              },
              {
                "x": 0.40092757,
                "y": 0.7087951
              }
            ]
          }
        },
        {
          "mid": "/m/0d4v4",
          "name": "Window",
          "score": 0.8491046,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.11334894,
                "y": 0.2133823
              },
              {
                "x": 0.28181157,
                "y": 0.2133823
              },
              {
                "x": 0.28181157,
                "y": 0.5806629
              },
              {
                "x": 0.11334894,
                "y": 0.5806629
              }
            ]
          }
        }
      ]
    }
  ]
}
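Unlike FACE_DETECTION, OBJECT_LOCALIZATION returns `normalizedVertices` with coordinates in the 0..1 range, so you need to multiply by the image dimensions to get pixel coordinates. A small sketch:

```ruby
# Convert normalized vertices (values in 0..1) into pixel coordinates
# for an image of the given width and height in pixels.
def to_pixels(normalized_vertices, width, height)
  normalized_vertices.map do |v|
    { x: (v['x'].to_f * width).round, y: (v['y'].to_f * height).round }
  end
end
```

For instance, the bicycle's first vertex above (x: 0.6728, y: 0.6733) on a 1000x800 image would map to roughly (673, 539).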

Voilà! You are now ready to build a Ruby application with Vision AI. I am excited to see the innovative products and services you will build.
