Jimmy Guerrero for Voxel51

Posted on Mar 22, 2024 • Edited on Apr 10, 2024 • Originally published at voxel51.com

FiftyOne Computer Vision Tips and Tricks - March 22, 2024

#computervision #ai #machinelearning #datascience

Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on Slack, GitHub, Stack Overflow, and Reddit.

The FiftyOne community is open source and open to all. This means everyone is welcome to ask questions, and everyone is welcome to answer them. Continue reading to see the latest questions asked and answers provided!

Wait, what’s FiftyOne?

FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.

If you like what you see on GitHub, give the project a star.
Get started! We’ve made it easy to get up and running in a few minutes.
Join the FiftyOne Slack community. We’re always happy to help.

Ok, let’s dive into this week’s tips and tricks!

Loading a view saved in the App, using the Python SDK

Community Slack member Kaisbedioui asked:

During my FiftyOne app session, I explored data and saved certain views using the UI. How can I use the Python SDK to load and perform certain operations on those saved views?

You can retrieve a view you saved in the UI using the SDK as follows:

import fiftyone as fo

dataset = fo.load_dataset("quickstart")

# Retrieve a saved view
cats_view = dataset.load_saved_view("cats-view")
print(cats_view)

You can find all the Python SDK methods for the same operations you performed in the UI in the Saving Views section of the FiftyOne Docs.
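As a rough sketch of working with saved views from Python, the helper below lists every saved view on a dataset and reports how many samples each one contains. The helper name `summarize_saved_views` is hypothetical; `list_saved_views()` and `load_saved_view()` are the SDK calls it relies on, and the "quickstart" dataset is assumed to already exist in your database.

```python
def summarize_saved_views(dataset_name="quickstart"):
    """Return {saved view name: number of samples} for a dataset.

    Hypothetical helper for illustration. FiftyOne is imported lazily so
    nothing heavy loads until the helper is actually called.
    """
    import fiftyone as fo

    dataset = fo.load_dataset(dataset_name)

    summary = {}
    for name in dataset.list_saved_views():
        # Each saved view is reconstructed against the current dataset
        view = dataset.load_saved_view(name)
        summary[name] = len(view)

    return summary
```

A loaded view behaves like any other view, so you can keep chaining stages on it, for example `dataset.load_saved_view("cats-view").take(5)`.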

Troubleshooting MongoDB installation issues

Community Slack member Jaydeep asked:

I am getting the following error at the import line. What’s the issue?

raise FiftyOneConfigError(
fiftyone.core.config.FiftyOneConfigError: MongoDB could not be installed on your system. Please define a `database_uri` in your `fiftyone.core.config.FiftyOneConfig` to connect to your own MongoDB instance or cluster

This error usually means that FiftyOne cannot determine what version of MongoDB to install based on your operating system. For background, FiftyOne uses /etc/os-release to determine which MongoDB binary to install on your system. You can work around this issue by installing a MongoDB instance that is compatible with your Linux distribution and providing its connection URI to FiftyOne, for example through the FIFTYONE_DATABASE_URI environment variable or the database_uri config setting named in the error message.
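The workaround can be sketched roughly as follows. The `parse_os_release` helper is hypothetical and only illustrates the kind of information stored in /etc/os-release; it is not FiftyOne's own detection logic. The URI is a placeholder that assumes a MongoDB instance listening locally on the default port.

```python
import os


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict (illustrative only)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes
        info[key] = value.strip().strip('"').strip("'")
    return info


# Example content; on a real machine, read the text of /etc/os-release instead
sample = 'NAME="Ubuntu"\nVERSION_ID="22.04"\nID=ubuntu\n'
info = parse_os_release(sample)
print(info["ID"], info["VERSION_ID"])

# Point FiftyOne at your own MongoDB instance. Set this before running
# `import fiftyone`, since the config is read when the package is imported.
# The URI below is a placeholder, not a value from the original question.
os.environ["FIFTYONE_DATABASE_URI"] = "mongodb://localhost:27017"
```

If the ID or VERSION_ID values look unusual for your distribution, that is a hint as to why no matching MongoDB binary could be selected.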

Additional installation troubleshooting tips can be found in the FiftyOne Docs.

Computing area of each instance in COCO format

Community Slack member Luiz asked:

I'm using a dataset for object detection and instance segmentation. I need to do some analysis based on the area of the segmentation masks. The data is stored in the COCO format, so every annotation has the following structure:

annotation{
    "id": int, 
    "image_id": int,
    "category_id": int, 
    "segmentation": RLE or [polygon], 
    "area": float, 
    "bbox": [x,y,width,height], 
    "iscrowd": 0 or 1
}

On loading, FiftyOne builds a Detection object for each annotation, but I noticed that the area is not included in any field. Can I keep the "area" information when I load the dataset?
I'm using the following method to load the data:

dataset = fo.Dataset.from_dir(
   dataset_type=fo.types.COCODetectionDataset,
   data_path=data_path,
   labels_path=labels_path,
   label_field="ground_truth",
   label_types="segmentations",
)

Although the COCO importer doesn’t currently load the area attribute specifically, you could recompute the areas in FiftyOne quite easily with something like this:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types="segmentations",
    max_samples=25,
)

dataset.compute_metadata()

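# Compute the area (in pixels) of each instance
for sample in dataset:
    # Skip images that have no annotations
    if sample.ground_truth is None:
        continue

    frame_size = (sample.metadata.width, sample.metadata.height)
    for detection in sample.ground_truth.detections:
        # Convert the instance mask to a polyline, then to a shapely shape
        instance = detection.to_polyline()
        detection["area"] = instance.to_shapely(frame_size=frame_size).area

    sample.save()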
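If you want to sanity-check the recomputed values against the "area" stored in the original COCO JSON, the polygon case can be reproduced with the shoelace formula in plain Python. This is only a minimal sketch: the function names are hypothetical, it handles polygon segmentations only (not RLE), and it assumes the parts of a multi-part polygon do not overlap.

```python
def polygon_area(flat_xy):
    """Area of a simple polygon given as [x1, y1, x2, y2, ...] in pixels."""
    xs = flat_xy[0::2]
    ys = flat_xy[1::2]
    n = len(xs)

    # Shoelace formula: sum the cross products of consecutive vertices
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]

    return abs(twice_area) / 2.0


def annotation_area(segmentation):
    """Total area of a COCO polygon segmentation (a list of flat polygons)."""
    return sum(polygon_area(part) for part in segmentation)


# A 40 x 20 pixel rectangle in COCO polygon format
rect = [[10, 10, 50, 10, 50, 30, 10, 30]]
print(annotation_area(rect))  # 800.0
```

Small differences from the stored COCO area are expected, since converting a mask to a polyline involves some simplification.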

Exporting COCO predictions

Community Slack member Michel asked:

I want to add COCO predictions to an existing dataset, and then export only the COCO predictions. Is this possible?

Yes! The FiftyOne Model Zoo includes a variety of models trained on COCO, including CenterNet, DeepLab, EfficientDet, R-CNN variants, SSD, and YOLO. (Click on the “COCO” tag to see the complete list of supported models.) Once the predictions are stored in their own label field, you can export just that field; see the docs on exporting datasets in supported and custom formats.
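One possible way to do this is sketched below. The helper name `add_and_export_predictions` and the "predictions" field name are hypothetical choices, and the zoo model shown is just one COCO-trained option that assumes PyTorch is installed. The key point is that passing `label_field` to `export()` restricts the export to that one field.

```python
def add_and_export_predictions(dataset, export_dir, label_field="predictions"):
    """Apply a COCO-trained zoo model and export only its predictions.

    Hypothetical helper for illustration. FiftyOne is imported lazily so
    nothing heavy loads until the helper is actually called.
    """
    import fiftyone as fo
    import fiftyone.zoo as foz

    # Any detection model from the zoo could be swapped in here
    model = foz.load_zoo_model("faster-rcnn-resnet50-fpn-coco-torch")

    # Store the model's output in its own field, separate from ground truth
    dataset.apply_model(model, label_field=label_field)

    # Export only the predictions field in COCO format
    dataset.export(
        export_dir=export_dir,
        dataset_type=fo.types.COCODetectionDataset,
        label_field=label_field,
    )
```

If the predictions already exist on your dataset, skip the `apply_model()` step and call `export()` directly with the name of the field that holds them.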

Returning samples that have or don’t have non-None values for a given field

Community Slack member Nadav asked:

I have a classification field in my data. What is the best way to filter all unlabeled samples through code? Currently, I'm using ctx.dataset.match(F(source_field) != None), but PyCharm raises a warning.

Two things to consider here:

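PyCharm's warning is a generic style check (PEP 8 recommends is not for comparisons with None). In this case != is intentional: F(source_field) != None builds a filter expression that FiftyOne evaluates in the database, whereas is not would be evaluated immediately by Python and always return True. The warning can safely be ignored or suppressed for that line.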

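A cleaner way to get the unlabeled samples, without comparing against None at all, is to use:

dataset.exists(source_field, bool=False)

Omit bool=False to get the samples that do have a label instead.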

Learn more in the Docs about returning a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field.
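To see why the choice of comparison operator matters for expression objects, here is a toy sketch in plain Python. The `Field` class is a hypothetical stand-in and is not FiftyOne's ViewExpression; it only demonstrates that != can be overloaded to build a filter description, while is not is an identity check that Python always evaluates itself.

```python
class Field:
    """Toy expression object; a stand-in for illustration only."""

    def __init__(self, name):
        self.name = name

    def __ne__(self, other):
        # Instead of returning a bool, describe the comparison to run later
        return {"$ne": ["$" + self.name, other]}


# `!=` calls Field.__ne__, so the result is a filter description
expr_ne = Field("label") != None  # noqa: E711

# `is not` cannot be overloaded; it just checks object identity
expr_is = Field("label") is not None

print(expr_ne)  # {'$ne': ['$label', None]}
print(expr_is)  # True
```

Passing the second result to a filtering method would hand it a plain True rather than a condition on the field, which is why exists() is the safer spelling when linters complain.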