Instance segmentation, a challenging task in computer vision that involves detecting and delineating individual objects within an image or video, has seen significant advancements in recent years. One such advancement is Detectron2, a flexible and efficient framework developed by Facebook AI Research. In this guide, we'll explore how to leverage the power of Detectron2 within the Google Colab environment to perform instance segmentation on videos.
Step 1: Check GPU availability
First, connect to a GPU runtime: open the Runtime tab, choose "Change runtime type", and select GPU as the hardware accelerator.
Then confirm that the GPU is actually accessible by running:
!nvidia-smi
If the output shows a table with your GPU's name, driver version, and memory usage, you are all set to go.
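As a quick cross-check from Python, you can also ask PyTorch (which comes preinstalled in Colab) whether it sees the GPU. This is just a sanity check; if it prints False, Detectron2 will fall back to the CPU and inference will be very slow.
import torch

# True if a CUDA-capable GPU is visible to PyTorch
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Name of the active GPU, e.g. "Tesla T4"
    print("GPU:", torch.cuda.get_device_name(0))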
Step 2: Install detectron2
Run this single command to install Detectron2 directly from its GitHub repository.
!python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
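After the install finishes (restart the runtime if Colab prompts you to), verify that the package imports cleanly:
import detectron2

# Print the installed version to confirm the build succeeded
print(detectron2.__version__)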
Step 3: Import libraries
Import the required libraries.
# COMMON LIBRARIES
import os
import cv2
from google.colab.patches import cv2_imshow
# VISUALIZATION
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer
from detectron2.utils.visualizer import ColorMode
# CONFIGURATION
from detectron2 import model_zoo
from detectron2.config import get_cfg
# EVALUATION
from detectron2.engine import DefaultPredictor
Step 4: Initialize the predictor
Choose a model as per your requirement from the model zoo. You can browse the list of available models in the Detectron2 Model Zoo (https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md).
cfg = get_cfg()
# Use the model_zoo helpers so the config and weights resolve correctly with a pip install
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # minimum confidence for a detection to be kept
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
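Before moving on to video, it is worth sanity-checking the predictor on a single image. The sketch below assumes a test image at /content/sample.jpg; the path is a placeholder, so swap in any image you have uploaded to the session.
im = cv2.imread("/content/sample.jpg")  # placeholder path; upload any test image
outputs = predictor(im)
# Visualizer expects RGB, while OpenCV reads images as BGR, hence the channel flips
v = Visualizer(im[:, :, ::-1], metadata=MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=0.8)
cv2_imshow(v.draw_instance_predictions(outputs["instances"].to("cpu")).get_image()[:, :, ::-1])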
Step 5: Inference on Video
Set the path to your video in the following code and run it. The output is a new video with the segmentation overlaid on every frame.
import imageio

# Load the input video
video_path = "path_to_your_video.mp4"
cap = cv2.VideoCapture(video_path)

# Initialize the video writer with the same frame rate as the input
fps = cap.get(cv2.CAP_PROP_FPS)
output_path = "/content/output.mp4"
writer = imageio.get_writer(output_path, fps=fps)

# Perform instance segmentation on each frame
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    outputs = predictor(frame)
    instances = outputs["instances"].to("cpu")
    # Inspect the predicted class IDs (optional)
    pred_classes = instances.pred_classes.numpy()
    # Inspect the binary segmentation masks (optional)
    pred_masks = instances.pred_masks.numpy()
    # Visualizer expects RGB, while OpenCV reads frames as BGR
    v = Visualizer(frame[:, :, ::-1], metadata=MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=0.8)
    vis_frame = v.draw_instance_predictions(instances).get_image()
    # imageio expects RGB, so the visualized frame can be written as-is
    writer.append_data(vis_frame)

# Release video resources
cap.release()
writer.close()
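Once the writer is closed, the result lives at /content/output.mp4. You can preview it from the Files sidebar, or pull it down to your machine with Colab's files helper:
from google.colab import files

# Trigger a browser download of the segmented video
files.download(output_path)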