Super Kai (Kazuya Ito)

Datasets for Computer Vision (3)

(1) Oxford-IIIT Pet(2012):

  • has 7,349 cat and dog images (3,680 for train and validation, 3,669 for test), each labeled with one of 37 classes: *Memos:
    • Each class has roughly 200 images.
  • is used for Image Classification and Fine-Grained Image Classification.
  • is OxfordIIITPet() in PyTorch. *My post explains OxfordIIITPet().
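
As a minimal sketch of loading this dataset, assuming torchvision is installed and `data` is a writable folder, something like the following should work. The helper name `load_oxford_pets` and the `PET_SPLIT_SIZES` table are illustrative only and not part of torchvision; the split names `"trainval"` and `"test"` are the ones `OxfordIIITPet()` accepts.

```python
# Split sizes quoted in the list above; this table is illustrative only.
PET_SPLIT_SIZES = {"trainval": 3680, "test": 3669}


def load_oxford_pets(root="data", split="trainval", download=False):
    """Return torchvision's OxfordIIITPet dataset for one split (hypothetical helper)."""
    if split not in PET_SPLIT_SIZES:
        raise ValueError(
            f"split must be one of {sorted(PET_SPLIT_SIZES)}, got {split!r}"
        )
    # Imported lazily so the size table stays usable without torchvision.
    from torchvision.datasets import OxfordIIITPet

    return OxfordIIITPet(
        root=root,
        split=split,
        target_types="category",  # one integer class label per image
        download=download,
    )


# Example use (downloads the archives on first run, so it is left commented out):
# pets = load_oxford_pets(download=True)
# image, label = pets[0]  # PIL image and integer class index
```

The split check runs before torchvision is imported, so a typo such as `split="val"` fails fast instead of after a long download.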

(2) Oxford 102 Flower(2008):

  • has 8,189 flower images (1,020 for train, 1,020 for validation and 6,149 for test) with 102 categories (classes). *Each class has 40 to 258 images.
  • is used for Fine-Grained Flower Classification.
  • is Flowers102() in PyTorch. *My post explains Flowers102().

(3) Stanford Cars(2013):

  • has 16,185 car images (8,144 for train and 8,041 for test) with 196 classes.
  • is used for Fine-Grained Car Classification.
  • is StanfordCars() in PyTorch. *My post explains StanfordCars().

(4) Places365(2017):

  • has scene images with 365 scene categories (classes) out of the 434 scene categories (classes) in the Places Database, and comes in three variants, Places365-Standard, Places365-Challenge and Places-Extra69: *Memos:
    • Places365-Standard has 2,168,460 images (1,803,460 for train, 36,500 for validation and 328,500 for test) with the 365 categories (classes) out of the 434 categories (classes) in the Places Database. *There are 100 images per category (class) in the validation set (36,500 / 365) and 900 images per category (class) in the test set (328,500 / 365).
    • Places365-Challenge has 8,391,628 images (8,026,628 for train, 36,500 for validation and 328,500 for test), adding 6,223,168 extra images to the train set of Places365-Standard.
    • Places-Extra69 has 105,321 images (98,721 for train and 6,600 for test) with the extra 69 categories (classes) out of the 434 categories (classes) in the Places Database. *Currently, it cannot be downloaded.
  • is used for Scene Classification.
  • is Places365() in PyTorch. *My post explains Places365().
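
The image counts above can be cross-checked with a little arithmetic. The sketch below only uses the numbers quoted in the list; the variable names are illustrative, and it assumes (as the counts suggest) that Places365-Challenge shares its validation and test sets with Places365-Standard.

```python
NUM_CLASSES = 365

# Image counts quoted in the list above.
STANDARD = {"train": 1_803_460, "val": 36_500, "test": 328_500}
CHALLENGE_EXTRA_TRAIN = 6_223_168
EXTRA69 = {"train": 98_721, "test": 6_600}

# Assumption: Challenge keeps the Standard val/test sets and only enlarges train.
CHALLENGE = dict(STANDARD, train=STANDARD["train"] + CHALLENGE_EXTRA_TRAIN)


def total(splits):
    """Sum the image counts over all splits of one variant."""
    return sum(splits.values())


print(total(STANDARD))                  # 2168460
print(CHALLENGE["train"])               # 8026628
print(total(CHALLENGE))                 # 8391628
print(total(EXTRA69))                   # 105321
print(STANDARD["test"] // NUM_CLASSES)  # 900 test images per class
```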

(5) Flickr8k(2013):

  • has 8,091 images obtained from Flickr, each with five different captions.
  • is used for Image Captioning.
  • is Flickr8k() in PyTorch, but the documentation doesn't explain how to set up the dataset for it, so I don't know how to load the dataset with it.

(6) Flickr30k(2015):

  • has 31,783 images obtained from Flickr, each with five different captions.
  • is used for Image Captioning.
  • is Flickr30k() in PyTorch, but the documentation doesn't explain how to set up the dataset for it, so I don't know how to load the dataset with it.
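
Both Flickr8k() and Flickr30k() in torchvision take a `root` (the image folder) and an `ann_file` (the path to a caption file), and neither downloads anything. As a rough sketch that does not depend on torchvision, the captions can also be grouped by hand. This assumes the caption file is a token file where each line is `<image file>#<caption index>`, a tab, then the caption; that layout is an assumption about the downloaded files, and the function name and sample lines below are made up for illustration.

```python
from collections import defaultdict


def parse_flickr_captions(lines):
    """Group captions by image file name.

    Assumes each line is '<image file>#<index><TAB><caption>' (hypothetical layout).
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, caption = line.split("\t", 1)
        image_name = key.rsplit("#", 1)[0]  # drop the '#<index>' suffix
        captions[image_name].append(caption)
    return dict(captions)


# Made-up sample lines in the assumed layout (not real dataset entries).
SAMPLE_LINES = [
    "example_1.jpg#0\tA dog runs on grass.",
    "example_1.jpg#1\tA brown dog is running outside.",
    "example_2.jpg#0\tTwo people sit on a bench.",
]

parsed = parse_flickr_captions(SAMPLE_LINES)
print(len(parsed["example_1.jpg"]))  # 2
```

With a real file, the same function can be fed an open file object, for example `parse_flickr_captions(open(path, encoding="utf-8"))`, where `path` points to wherever the caption file was saved.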
