This article shows how to run inference with the TensorFlow Object Detection API. Everything can be done in Colab.
Colab sample: run the cells in order to try the TensorFlow Object Detection API. Change
image_path in the last cell to your own image to detect objects in it.
The official TensorFlow Model Zoo has many kinds of models.
(For training a model: Train an object detection model with the TensorFlow Object Detection API)
(For quick training with just a few images: Quick-train an object detection model with the TensorFlow Object Detection API)
Steps
0. Install TensorFlow 2
!pip install -U --pre tensorflow=="2.2.0"
1. Clone the official TensorFlow Models from GitHub
import os
import pathlib

# If "models" is part of the current directory path, move up until we are
# outside it. Otherwise, clone the repository if it is not there yet.
if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir('..')
elif not pathlib.Path('models').exists():
    !git clone --depth 1 https://github.com/tensorflow/models
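The loop in step 1 climbs out of any existing models checkout before deciding whether to clone. Here is a minimal sketch of that path logic on its own. The helper name directory_outside and the example paths are made up for illustration, and the function works on a path value instead of changing the real working directory:

```python
from pathlib import PurePosixPath


def directory_outside(cwd, name="models"):
    """Return the nearest ancestor of `cwd` with no path component called `name`.

    Hypothetical helper: it mirrors the `while ... os.chdir('..')` loop in
    step 1, but returns a path instead of changing the process directory.
    """
    path = PurePosixPath(cwd)
    while name in path.parts:
        path = path.parent
    return path


# Example paths are made up for illustration.
print(directory_outside("/content/models/research"))  # /content
print(directory_outside("/content"))                  # /content
```

This is why rerunning the cell from inside models/research still leaves you in the directory that contains models, which is what the relative paths in the later steps expect.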
2. Install the Object Detection API and required modules
%%bash
# The %%bash line above must stand alone; it runs this cell as a bash script.
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .
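Before moving on, it can help to confirm that the install in step 2 actually made the package importable. This is a small sketch using only the standard library. The helper name module_available is hypothetical, and it only handles top-level module names:

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be found for import.

    Hypothetical helper. Pass top-level names only (e.g. "object_detection",
    not "object_detection.utils"), since dotted names import the parent first.
    """
    return importlib.util.find_spec(name) is not None


# "object_detection" is the package installed by step 2.
for name in ("tensorflow", "object_detection"):
    status = "found" if module_available(name) else "missing"
    print(f"{name}: {status}")
```

If object_detection is reported as missing, rerunning step 2 and reading its output for protoc or pip errors is a reasonable first check.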
3. Import modules
import matplotlib
import matplotlib.pyplot as plt
import io
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder
%matplotlib inline
4. Image-loading function
def load_image_into_numpy_array(path):
    """Load an image into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
      path: the file path to the image

    Returns:
      uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)


def get_keypoint_tuples(eval_config):
    """Return a tuple list of keypoint edges from the eval config.

    Args:
      eval_config: an eval config containing the keypoint edges

    Returns:
      a list of edge tuples, each in the format (start, end)
    """
    tuple_list = []
    kp_list = eval_config.keypoint_edge
    for edge in kp_list:
        tuple_list.append((edge.start, edge.end))
    return tuple_list
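The loader in step 4 reshapes the pixel data to (height, width, 3), so it only works for 3-channel images. Grayscale or RGBA input makes the reshape fail. One possible workaround, sketched here with a hypothetical helper ensure_three_channels, is to normalize the channel count on an array that was built directly from the image (for example with np.array(image)) rather than through the reshape:

```python
import numpy as np


def ensure_three_channels(image_np):
    """Return a uint8 array of shape (height, width, 3).

    Hypothetical helper, not part of the Object Detection API.
    - (H, W) or (H, W, 1) grayscale is repeated across three channels.
    - (H, W, 4) RGBA has its alpha channel dropped.
    - (H, W, 3) is returned unchanged apart from the dtype cast.
    """
    arr = np.asarray(image_np)
    if arr.ndim == 2:
        arr = arr[..., np.newaxis]
    if arr.ndim != 3 or arr.shape[2] not in (1, 3, 4):
        raise ValueError(f"unsupported image shape: {arr.shape}")
    if arr.shape[2] == 1:
        arr = np.repeat(arr, 3, axis=2)
    elif arr.shape[2] == 4:
        arr = arr[..., :3]
    return arr.astype(np.uint8)


demo_rgba = np.zeros((2, 4, 4), dtype=np.uint8)  # synthetic RGBA image
print(ensure_three_channels(demo_rgba).shape)    # (2, 4, 3)
```

Dropping the alpha channel ignores transparency. Calling Pillow's image.convert('RGB') before building the array is another option that stays closer to the original loader.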
5. Download a model
!wget http://download.tensorflow.org/models/object_detection/tf2/20200713/centernet_hg104_512x512_coco17_tpu-8.tar.gz
!tar -xf centernet_hg104_512x512_coco17_tpu-8.tar.gz
Download any model you like from the official Model Zoo. Hover over a model name there to see its download URL.
The zoo's table also lists each model's speed and COCO mAP, which is worth browsing on its own. Once download and extraction finish, you get a folder containing checkpoint, saved_model, and pipeline.config.
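Since the later steps depend on those three entries, a quick check after extraction can catch an incomplete download early. This sketch uses a hypothetical helper, missing_model_files, and takes the entry names from the sentence above:

```python
from pathlib import Path

# Entry names taken from the description of the extracted folder above.
EXPECTED_ENTRIES = ("checkpoint", "saved_model", "pipeline.config")


def missing_model_files(model_root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from `model_root`.

    Hypothetical helper. A non-existent `model_root` reports every
    entry as missing rather than raising.
    """
    root = Path(model_root)
    return [name for name in expected if not (root / name).exists()]


missing = missing_model_files("./centernet_hg104_512x512_coco17_tpu-8")
print("missing entries:", missing or "none")
```

An empty list means the folder has the layout that steps 6 and later assume.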
6. Read the pipeline config and build the model
# Path to the config file. The repo has a folder of config files, but the model
# names are slightly abbreviated, so the downloaded one is more reliable.
pipeline_config = "./centernet_hg104_512x512_coco17_tpu-8/pipeline.config"
# Path to the checkpoint
model_dir = "./centernet_hg104_512x512_coco17_tpu-8/checkpoint"
# Load the model config
configs = config_util.get_configs_from_pipeline_file(pipeline_config)
model_config = configs['model']
# Build the model from the loaded config
detection_model = model_builder.build(
    model_config=model_config, is_training=False)
# Restore weights from the checkpoint
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(os.path.join(model_dir, 'ckpt-0')).expect_partial()
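Step 6 hard-codes the checkpoint prefix ckpt-0. If you switch to another model and are unsure what its prefix is, you could list the prefixes found in the checkpoint folder. This sketch assumes the usual TensorFlow layout, in which each checkpoint has a prefix.index file next to its data shards. The helper name checkpoint_prefixes is hypothetical:

```python
from pathlib import Path


def checkpoint_prefixes(checkpoint_dir):
    """List checkpoint prefixes (e.g. 'ckpt-0') found in `checkpoint_dir`.

    Hypothetical helper. Assumes each checkpoint is stored as a
    '<prefix>.index' file alongside its '<prefix>.data-*' shards.
    """
    directory = Path(checkpoint_dir)
    return sorted(p.stem for p in directory.glob("*.index"))


print(checkpoint_prefixes("./centernet_hg104_512x512_coco17_tpu-8/checkpoint"))
```

Whatever prefix is printed is the value to join onto model_dir in ckpt.restore. tf.train.latest_checkpoint(model_dir) may also work when the folder contains a checkpoint state file, though that is not verified here.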
7. Prepare the inference function
def get_model_detection_function(model):
    """Get a tf.function for detection."""

    @tf.function
    def detect_fn(image):
        """Detect objects in image."""
        image, shapes = model.preprocess(image)
        prediction_dict = model.predict(image, shapes)
        detections = model.postprocess(prediction_dict, shapes)
        return detections, prediction_dict, tf.reshape(shapes, [-1])

    return detect_fn


detect_fn = get_model_detection_function(detection_model)
8. Prepare labels
Inference needs the object labels the model was trained on. The labels are in the official repo at models/research/object_detection/data/. This model was trained on COCO, so we use mscoco_label_map.pbtxt.
label_map_path = './models/research/object_detection/data/mscoco_label_map.pbtxt'
label_map = label_map_util.load_labelmap(label_map_path)
categories = label_map_util.convert_label_map_to_categories(
    label_map,
    max_num_classes=label_map_util.get_max_label_map_index(label_map),
    use_display_name=True)
category_index = label_map_util.create_category_index(categories)
label_map_dict = label_map_util.get_label_map_dict(label_map, use_display_name=True)
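To make the shape of category_index concrete, here is a plain-Python sketch of the same kind of structure: a dict that maps each category id to a dict holding its id and name. The helper build_category_index and the three hand-written entries are for illustration only and do not replace the label_map_util calls above:

```python
def build_category_index(categories):
    """Map each category id to its category dict.

    Illustrative stand-in for the structure that step 8 produces,
    e.g. {1: {'id': 1, 'name': 'person'}, ...}.
    """
    return {category["id"]: category for category in categories}


# The first three COCO entries, written out by hand for illustration.
demo_categories = [
    {"id": 1, "name": "person"},
    {"id": 2, "name": "bicycle"},
    {"id": 3, "name": "car"},
]
demo_index = build_category_index(demo_categories)

# Label map ids start at 1. The detection step adds an offset of 1 to the
# class output, which suggests that output is zero-based.
demo_offset = 1
raw_class = 0
print(demo_index[raw_class + demo_offset]["name"])  # person
```

The demo_ names are used so that running this cell does not overwrite the real category_index.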
9. Run detection on your image
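Upload any image to Colab and set image_path to its path. Note that load_image_into_numpy_array reshapes the pixel data to (height, width, 3), so an image with an alpha channel (for example an RGBA PNG) has to be converted to 3 channels first, or the reshape fails.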
image_dir = 'models/research/object_detection/test_images/'
image_path = os.path.join(image_dir, 'image2.jpg')
image_np = load_image_into_numpy_array(image_path)
# Things to try:
# Flip horizontally
# image_np = np.fliplr(image_np).copy()
# Convert image to grayscale
# image_np = np.tile(
#     np.mean(image_np, 2, keepdims=True), (1, 1, 3)).astype(np.uint8)

input_tensor = tf.convert_to_tensor(
    np.expand_dims(image_np, 0), dtype=tf.float32)
detections, predictions_dict, shapes = detect_fn(input_tensor)

label_id_offset = 1
image_np_with_detections = image_np.copy()

# Use keypoints if available in detections
keypoints, keypoint_scores = None, None
if 'detection_keypoints' in detections:
    keypoints = detections['detection_keypoints'][0].numpy()
    keypoint_scores = detections['detection_keypoint_scores'][0].numpy()

viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'][0].numpy(),
    (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=.30,
    agnostic_mode=False,
    keypoints=keypoints,
    keypoint_scores=keypoint_scores,
    keypoint_edges=get_keypoint_tuples(configs['eval_config']))
plt.figure(figsize=(12,16))
plt.imshow(image_np_with_detections)
plt.show()
Boxes, labels, and confidence scores are displayed.
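If you want the results as data rather than as a drawn image, the same arrays can be filtered by score and converted to pixel coordinates. The sketch below uses synthetic arrays in place of detections[...][0].numpy() and a hypothetical helper, summarize_detections. It assumes boxes are normalized [ymin, xmin, ymax, xmax] rows, which matches use_normalized_coordinates=True in step 9:

```python
import numpy as np


def summarize_detections(boxes, classes, scores, height, width,
                         min_score=0.3, label_id_offset=1):
    """Return (class_id, score, (top, left, bottom, right)) for confident detections.

    Hypothetical helper. `boxes` holds normalized [ymin, xmin, ymax, xmax]
    rows; pixel coordinates come from scaling by the image height and width.
    """
    keep = scores > min_score
    scale = np.array([height, width, height, width], dtype=np.float32)
    pixel_boxes = np.rint(boxes[keep] * scale).astype(int)
    class_ids = classes[keep].astype(int) + label_id_offset
    return [
        (int(c), float(s), tuple(int(v) for v in b))
        for c, s, b in zip(class_ids, scores[keep], pixel_boxes)
    ]


# Synthetic values standing in for detections[...][0].numpy().
demo_boxes = np.array([[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]], dtype=np.float32)
demo_classes = np.array([0.0, 17.0], dtype=np.float32)
demo_scores = np.array([0.9, 0.1], dtype=np.float32)

for item in summarize_detections(demo_boxes, demo_classes, demo_scores,
                                 height=100, width=200):
    print(item)
```

With real output you would pass image_np.shape[0] and image_np.shape[1] as height and width, and look up each returned class id in category_index to get its name.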
Originally published in Japanese on Qiita. I build apps with Core ML and write about machine learning. GitHub / X



