DEV Community

Super Kai (Kazuya Ito)


Datasets for Computer Vision (5)



(1) PASCAL VOC (Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes) (2005):

  • has object images and annotations in 4, 10 or 20 classes, released as 8 yearly datasets from VOC2005 to VOC2012:
    • VOC2005 has 2,232 images and annotations (some for train, some for validation and some for test) with 4 classes.
    • VOC2006 has 5,304 images and annotations (1,277 for train, 1,341 for validation and 2,686 for test) with 10 classes.
    • VOC2007 has 9,963 images and annotations (2,501 for train, 2,510 for validation and 4,952 for test) with 20 classes.
    • VOC2008 has 5,096 images and annotations (2,111 for train, 2,221 for validation and 764 as extra) with 20 classes. *It also ships 4,133 test images, which are not counted here.
    • VOC2009 has 7,818 images and annotations (3,473 for train, 3,581 for validation and 764 as extra) with 20 classes.
    • VOC2010 has 11,321 images and annotations (4,998 for train, 5,105 for validation and 1,218 as extra) with 20 classes.
    • VOC2011 has 14,961 images and annotations (5,717 for train, 5,823 for validation and 3,421 as extra) with 20 classes.
    • VOC2012 has 17,125 images and annotations (5,717 for train, 5,823 for validation and 5,585 as extra) with 20 classes.
  • is available as VOCSegmentation() and VOCDetection() in torchvision (PyTorch).
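
The split sizes listed above can be sanity-checked with a few lines of Python, and the same table gives the size of a combined train + validation set. The sketch below is only an illustration: `VOC_SPLITS`, `trainval_size`, `splits_add_up` and `load_voc_detection` are hypothetical names, the numbers are copied from the list above, and the loader assumes torchvision is installed and that `VOCDetection` accepts `root`, `year`, `image_set` and `download` as in its documentation.

```python
# Split sizes copied from the list above: (train, validation, third split, total).
# The third split is "test" for 2006-2007 and "extra" for 2008-2012.
# VOC2005 is left out because the list gives no per-split counts for it.
VOC_SPLITS = {
    "2006": (1277, 1341, 2686, 5304),
    "2007": (2501, 2510, 4952, 9963),
    "2008": (2111, 2221, 764, 5096),
    "2009": (3473, 3581, 764, 7818),
    "2010": (4998, 5105, 1218, 11321),
    "2011": (5717, 5823, 3421, 14961),
    "2012": (5717, 5823, 5585, 17125),
}


def trainval_size(year):
    """Return the number of train + validation images for a year."""
    train, val, _third, _total = VOC_SPLITS[year]
    return train + val


def splits_add_up(year):
    """Check that the three splits sum to the stated total for a year."""
    train, val, third, total = VOC_SPLITS[year]
    return train + val + third == total


def load_voc_detection(root, year="2012", image_set="train", download=False):
    """Hypothetical wrapper around torchvision.datasets.VOCDetection."""
    # Imported here so the helpers above work without torchvision installed.
    from torchvision.datasets import VOCDetection

    return VOCDetection(root=root, year=year, image_set=image_set,
                        download=download)


if __name__ == "__main__":
    for year in VOC_SPLITS:
        print(year, trainval_size(year), splits_add_up(year))
```

Calling `load_voc_detection("data", download=True)` would fetch a large archive, so it is kept out of the `__main__` block here.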


(2) SUN Database (Scene UNderstanding database) (2010):

  • has 108,754 scene images in 397 classes.
  • is also called SUN397.
  • is available as SUN397() in torchvision (PyTorch).
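
As far as I can tell, `SUN397()` in recent torchvision versions returns the whole dataset without a train/test argument, so a split has to be made by hand. The sketch below shows one way to do that with `torch.utils.data.random_split`. The helper names `split_lengths` and `load_sun397_split`, the 80/20 ratio and the seed are all assumptions made for illustration.

```python
def split_lengths(total, train_fraction=0.8):
    """Return (train, test) sizes that add up to `total`."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    train = int(total * train_fraction)
    return train, total - train


def load_sun397_split(root, train_fraction=0.8, seed=0, download=False):
    """Hypothetical helper: load SUN397 and split it reproducibly."""
    # Imported here so split_lengths() works without PyTorch installed.
    import torch
    from torch.utils.data import random_split
    from torchvision.datasets import SUN397

    dataset = SUN397(root=root, download=download)
    lengths = list(split_lengths(len(dataset), train_fraction))
    # A seeded generator keeps the split identical across runs.
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, lengths, generator=generator)


if __name__ == "__main__":
    # 108,754 is the image count stated above.
    print(split_lengths(108754))
```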


(3) Kinetics Dataset (2017):

  • has short video clips of human actions, released as 3 datasets, Kinetics-400, Kinetics-600 and Kinetics-700:
    • Each video clip lasts around 10 seconds.
    • Kinetics-400 (2017) has 306,245 video clips, each labeled with one of 400 categories (classes).
    • Kinetics-600 (2018) has 495,547 video clips, each labeled with one of 600 categories.
    • Kinetics-700 (2019) has 545,317 video clips, each labeled with one of 700 categories.
  • is used for Video Classification.
  • is available as Kinetics() in torchvision (PyTorch).
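
`Kinetics()` in torchvision cuts each video into fixed-length clips using `frames_per_clip` and `step_between_clips`, so one 10-second video can yield many samples. The sketch below estimates that number with a plain sliding-window count. `clips_per_video` and `load_kinetics` are hypothetical helpers, the 25 fps figure is an assumption (real Kinetics videos vary), and the loader assumes the videos are already on disk under `root`.

```python
def clips_per_video(num_frames, frames_per_clip, step_between_clips=1):
    """Count sliding-window clips of a fixed length in one video."""
    if frames_per_clip <= 0 or step_between_clips <= 0:
        raise ValueError("frames_per_clip and step_between_clips must be > 0")
    if num_frames < frames_per_clip:
        return 0  # the video is too short for even one clip
    return (num_frames - frames_per_clip) // step_between_clips + 1


def load_kinetics(root, frames_per_clip=16, num_classes="400", split="train"):
    """Hypothetical wrapper around torchvision.datasets.Kinetics."""
    # Imported here so clips_per_video() works without torchvision installed.
    from torchvision.datasets import Kinetics

    # num_classes is passed as a string ("400", "600" or "700").
    return Kinetics(root=root, frames_per_clip=frames_per_clip,
                    num_classes=num_classes, split=split)


if __name__ == "__main__":
    frames = 10 * 25  # a 10-second video at an assumed 25 fps
    print(clips_per_video(frames, frames_per_clip=16, step_between_clips=16))
    print(clips_per_video(frames, frames_per_clip=16, step_between_clips=1))
```

A larger `step_between_clips` gives fewer, less overlapping clips per video, which keeps the number of samples per epoch manageable.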


(4) Cityscapes (2016):

  • has 25,000 annotated urban street scene images for semantic understanding, with 30 classes grouped into 8 categories. *5,000 images are finely annotated and 20,000 images are coarsely annotated.
  • is used for Image Segmentation.
  • is available as Cityscapes() in torchvision (PyTorch). *How to set up the dataset isn't explained.
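
Since the setup isn't explained, here is a rough sketch of what seems to be needed. `Cityscapes()` has no `download` argument, so the archives must be downloaded manually (registration on the Cityscapes site is required) and extracted under `root`. The folder layout below reflects my understanding of the torchvision loader, not an official guide, and `VALID_SPLITS`, `expected_dirs` and `load_cityscapes` are hypothetical names.

```python
from pathlib import Path

# Split names the loader is assumed to accept for each annotation mode.
VALID_SPLITS = {
    "fine": ("train", "val", "test"),
    "coarse": ("train", "train_extra", "val"),
}


def expected_dirs(root, split="train", mode="fine"):
    """Return the (images, annotations) folders expected under `root`."""
    if mode not in VALID_SPLITS:
        raise ValueError(f"mode must be one of {sorted(VALID_SPLITS)}")
    if split not in VALID_SPLITS[mode]:
        raise ValueError(f"split must be one of {VALID_SPLITS[mode]}")
    annotation_dir = "gtFine" if mode == "fine" else "gtCoarse"
    root = Path(root)
    return root / "leftImg8bit" / split, root / annotation_dir / split


def load_cityscapes(root, split="train", mode="fine", target_type="semantic"):
    """Hypothetical wrapper that checks the layout before loading."""
    # Imported here so expected_dirs() works without torchvision installed.
    from torchvision.datasets import Cityscapes

    missing = [str(p) for p in expected_dirs(root, split, mode)
               if not p.is_dir()]
    if missing:
        raise FileNotFoundError(f"extract the archives first: {missing}")
    return Cityscapes(root=str(root), split=split, mode=mode,
                      target_type=target_type)


if __name__ == "__main__":
    for folder in expected_dirs("data/cityscapes", "train", "fine"):
        print(folder)
```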

(Figures: examples of fine-annotated and coarse-annotated images.)
