Datasets for Computer Vision (4)

#python #pytorch #dataset #computervision

*Memos:

My post explains MNIST, EMNIST, QMNIST, ETLCDB, Kuzushiji and Moving MNIST.
My post explains Fashion-MNIST, Caltech 101, Caltech 256, CelebA, CIFAR-10 and CIFAR-100.
My post explains Oxford-IIIT Pet, Oxford 102 Flower, Stanford Cars, Places365, Flickr8k and Flickr30k.
My post explains PASCAL VOC, SUN Database, Kinetics Dataset and Cityscapes.
My post explains Image Classification(Recognition), Object Localization, Object Detection and Image Segmentation.
My post explains Keypoint Detection(Landmark Detection), Image Matching, Object Tracking, Stereo Matching, Video Prediction, Optical Flow and Image Captioning.

(1) ImageNet(2009):

has 1,431,167 object images(1,281,167 for train, 50,000 for validation and 100,000 for test), each connected to a label from 1,000 classes: *Memos:
- Each class has one or more names that refer to the same thing.
- You can download the dataset from Kaggle. *You can also download ILSVRC2012_devkit_t12.tar.gz, ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar.
is ImageNet() in PyTorch. *My post explains ImageNet().
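
As a minimal sketch of how the splits above map onto `ImageNet()`, the helper below wraps torchvision's `ImageNet` class. The function name `load_imagenet`, the `SPLIT_SIZES` table and the `root` folder are assumptions made for illustration. It also assumes the three archives named above have already been placed in `root`. torchvision is imported lazily, so the split arithmetic runs without it.

```python
from pathlib import Path

# Split sizes as listed in this post.
SPLIT_SIZES = {"train": 1_281_167, "val": 50_000, "test": 100_000}


def load_imagenet(root, split="train"):
    """Return torchvision's ImageNet dataset for `split` ("train" or "val")."""
    # torchvision's ImageNet() accepts only "train" and "val"; the test split is not exposed.
    if split not in ("train", "val"):
        raise ValueError("ImageNet() supports only 'train' and 'val'")
    # Imported here so the rest of the sketch works without torchvision installed.
    from torchvision.datasets import ImageNet
    return ImageNet(root=str(Path(root)), split=split)


# Sanity check: the three splits add up to the total number of images.
total = sum(SPLIT_SIZES.values())
print(total)  # 1431167
```

With the archives in place, `ds = load_imagenet("data", "val")` should give `(image, class_index)` pairs, and `ds.classes[class_index]` should hold the one or more names of that class.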

(2) LSUN(Large-scale Scene Understanding)(2015):

has scene images split into the 10 datasets Bedroom, Bridge, Church Outdoor, Classroom, Conference Room, Dining Room, Kitchen, Living Room, Restaurant and Tower:
- Bedroom has 3,033,342 bedroom images(3,033,042 for train and 300 for validation).
- Bridge has 818,987 bridge images(818,687 for train and 300 for validation).
- Church Outdoor has 126,527 church outdoor images(126,227 for train and 300 for validation).
- Classroom has 168,403 classroom images(168,103 for train and 300 for validation).
- Conference Room has 229,369 conference room images(229,069 for train and 300 for validation).
- Dining Room has 657,871 dining room images(657,571 for train and 300 for validation).
- Kitchen has 2,212,577 kitchen images(2,212,277 for train and 300 for validation).
- Living Room has 1,316,102 living room images(1,315,802 for train and 300 for validation).
- Restaurant has 626,631 restaurant images(626,331 for train and 300 for validation).
- Tower has 708,564 tower images(708,264 for train and 300 for validation).
is LSUN() in PyTorch, but it has a bug.
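
As a rough sketch, the 10 dataset names above can be turned into the category strings that torchvision's `LSUN()` takes in its `classes` argument, such as `bedroom_train` or `church_outdoor_val`. The helper names `lsun_class_names` and `load_lsun` and the `root` folder are assumptions made for illustration. Since `LSUN()` is reported above to have a bug, the loader may not run cleanly. It also needs the `lmdb` package and the already-downloaded LMDB folders.

```python
# The 10 scene datasets listed above.
SCENES = [
    "Bedroom", "Bridge", "Church Outdoor", "Classroom", "Conference Room",
    "Dining Room", "Kitchen", "Living Room", "Restaurant", "Tower",
]


def lsun_class_names(split="train"):
    """Build category strings such as 'church_outdoor_train' for every scene."""
    return [f"{name.lower().replace(' ', '_')}_{split}" for name in SCENES]


def load_lsun(root, split="train"):
    """Return torchvision's LSUN dataset covering all 10 scenes for `split`."""
    # Imported here so the name helper works without torchvision installed.
    from torchvision.datasets import LSUN
    return LSUN(root=root, classes=lsun_class_names(split))


print(lsun_class_names("val")[:3])  # ['bedroom_val', 'bridge_val', 'church_outdoor_val']
```

Passing a shorter list such as `["bedroom_train"]` to `classes` should load just one scene, which is useful given how large Bedroom and Kitchen are.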

(3) MS COCO(Microsoft Common Objects in Context)(2014):

has object images with annotations, split into 16 datasets: 2014 Train images and 2014 Val images with 2014 Train/Val annotations; 2014 Test images with 2014 Testing Image info; 2015 Test images with 2015 Testing Image info; 2017 Train images and 2017 Val images with 2017 Train/Val annotations, 2017 Stuff Train/Val annotations and 2017 Panoptic Train/Val annotations; 2017 Test images with 2017 Testing Image info; and 2017 Unlabeled images with 2017 Unlabeled Image info: *Memos:
- 2014 Train images has 82,783 images.
- 2014 Val images has 40,504 images.
- 2014 Train/Val annotations has 123,287 annotations(82,783 for train and 40,504 for validation) for 2014 Train images and 2014 Val images.
- 2014 Test images has 40,775 images.
- 2014 Testing Image info has 40,775 annotations for 2014 Test images.
- 2015 Test images has 81,434 images.
- 2015 Testing Image info has 101,722 annotations(81,434 annotations and 20,288 dev-annotations) for 2015 Test images.
- 2017 Train images has 118,287 images.
- 2017 Val images has 5,000 images.
- 2017 Train/Val annotations has 123,287 annotations(118,287 for train and 5,000 for validation) for 2017 Train images and 2017 Val images.
- 2017 Stuff Train/Val annotations has 123,287 annotations(118,287 for train and 5,000 for validation) for 2017 Train images and 2017 Val images.
- 2017 Panoptic Train/Val annotations has 123,287 annotations(118,287 for train and 5,000 for validation) for 2017 Train images and 2017 Val images.
- 2017 Test images has 40,670 images.
- 2017 Testing Image info has 40,670 annotations for 2017 Test images.
- 2017 Unlabeled images has 123,403 images.
- 2017 Unlabeled Image info has 123,403 annotations for 2017 Unlabeled images.
is also called simply COCO.
is CocoDetection() and CocoCaptions() in PyTorch: *Memos:
- My post explains CocoDetection().
- My post explains CocoCaptions().
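
Because the images and the annotations ship as separate downloads, `CocoDetection()` and `CocoCaptions()` each need an image folder and an annotation file. The sketch below pairs them up. It assumes the official archives were extracted into one `root` folder, giving folders like `val2017` and files like `annotations/instances_val2017.json`. The helper names `coco_paths` and `load_coco` are made up for illustration, and the loaders also need the `pycocotools` package.

```python
from pathlib import Path


def coco_paths(root, split="train", year="2017", kind="instances"):
    """Return (image_dir, annotation_file) for one split, e.g. kind='captions'."""
    root = Path(root)
    image_dir = root / f"{split}{year}"
    ann_file = root / "annotations" / f"{kind}_{split}{year}.json"
    return image_dir, ann_file


def load_coco(root, split="val", year="2017", captions=False):
    """Return CocoCaptions if `captions` is True, otherwise CocoDetection."""
    # Imported here so the path helper works without torchvision installed.
    from torchvision.datasets import CocoCaptions, CocoDetection
    kind = "captions" if captions else "instances"
    image_dir, ann_file = coco_paths(root, split, year, kind)
    dataset_cls = CocoCaptions if captions else CocoDetection
    return dataset_cls(root=str(image_dir), annFile=str(ann_file))


image_dir, ann_file = coco_paths("coco", "val", "2017", "captions")
print(image_dir.as_posix())  # coco/val2017
print(ann_file.as_posix())   # coco/annotations/captions_val2017.json
```

Each item from `load_coco("coco")` should be an `(image, target)` pair, where `target` is a list of annotation dicts for detection or a list of caption strings for captions.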
