DEV Community

Super Kai (Kazuya Ito)

Datasets for Computer Vision (4)

(1) ImageNet(2009):

(2) LSUN(Large-scale Scene Understanding)(2015):

  • has scene images in 10 datasets: Bedroom, Bridge, Church Outdoor, Classroom, Conference Room, Dining Room, Kitchen, Living Room, Restaurant and Tower:
    • Bedroom has 3,033,342 bedroom images(3,033,042 for train and 300 for validation).
    • Bridge has 818,987 bridge images(818,687 for train and 300 for validation).
    • Church Outdoor has 126,527 church outdoor images(126,227 for train and 300 for validation).
    • Classroom has 168,403 classroom images(168,103 for train and 300 for validation).
    • Conference Room has 229,369 conference room images(229,069 for train and 300 for validation).
    • Dining Room has 657,871 dining room images(657,571 for train and 300 for validation).
    • Kitchen has 2,212,577 kitchen images(2,212,277 for train and 300 for validation).
    • Living Room has 1,316,102 living room images(1,315,802 for train and 300 for validation).
    • Restaurant has 626,631 restaurant images(626,331 for train and 300 for validation).
    • Tower has 708,564 tower images(708,264 for train and 300 for validation).
  • is available as LSUN() in PyTorch(torchvision), but it has a bug.
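
As a rough sketch, the category and split names above can be mapped onto the `classes` argument of torchvision's `LSUN()`, which accepts names such as `bedroom_train`. The helper names `lsun_class_names` and `load_lsun` are hypothetical. `load_lsun` assumes torchvision, the `lmdb` package and the downloaded LSUN LMDB files are available, and because `LSUN()` is reported above to have a bug, it may not run as-is in every setup.

```python
# The 10 LSUN scene categories listed above.
CATEGORIES = [
    "Bedroom", "Bridge", "Church Outdoor", "Classroom", "Conference Room",
    "Dining Room", "Kitchen", "Living Room", "Restaurant", "Tower",
]


def lsun_class_names(categories, split="train"):
    """Turn names such as "Church Outdoor" into "church_outdoor_train"."""
    if split not in ("train", "val"):
        raise ValueError(f"split must be 'train' or 'val', got {split!r}")
    return [f"{name.lower().replace(' ', '_')}_{split}" for name in categories]


def load_lsun(root, categories, split="train"):
    """Open the chosen LSUN categories from the LMDB files under `root`."""
    # Imported lazily so lsun_class_names() works without torchvision installed.
    from torchvision.datasets import LSUN

    return LSUN(root=root, classes=lsun_class_names(categories, split))
```

For example, `load_lsun("data/lsun", ["Bedroom", "Tower"], split="val")` would request the two 300-image validation sets, where `data/lsun` is a placeholder path.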

(3) MS COCO(Microsoft Common Objects in Context)(2014):

  • has object images with annotations in 16 datasets: 2014 Train images and 2014 Val images(with 2014 Train/Val annotations), 2014 Test images(with 2014 Testing Image info), 2015 Test images(with 2015 Testing Image info), 2017 Train images and 2017 Val images(with 2017 Train/Val annotations, 2017 Stuff Train/Val annotations or 2017 Panoptic Train/Val annotations), 2017 Test images(with 2017 Testing Image info) and 2017 Unlabeled images(with 2017 Unlabeled Image info):
    • 2014 Train images has 82,783 images.
    • 2014 Val images has 40,504 images.
    • 2014 Train/Val annotations has 123,287 annotations(82,783 for train and 40,504 for validation) for 2014 Train images and 2014 Val images.
    • 2014 Test images has 40,775 images.
    • 2014 Testing Image info has 40,775 annotations for 2014 Test images.
    • 2015 Test images has 81,434 images.
    • 2015 Testing Image info has 101,722 annotations(81,434 annotations and 20,288 dev-annotations) for 2015 Test images.
    • 2017 Train images has 118,287 images.
    • 2017 Val images has 5,000 images.
    • 2017 Train/Val annotations has 123,287 annotations(118,287 for train and 5,000 for validation) for 2017 Train images and 2017 Val images.
    • 2017 Stuff Train/Val annotations has 123,287 annotations(118,287 for train and 5,000 for validation) for 2017 Train images and 2017 Val images.
    • 2017 Panoptic Train/Val annotations has 123,287 annotations(118,287 for train and 5,000 for validation) for 2017 Train images and 2017 Val images.
    • 2017 Test images has 40,670 images.
    • 2017 Testing Image info has 40,670 annotations for 2017 Test images.
    • 2017 Unlabeled images has 123,403 images.
    • 2017 Unlabeled Image info has 123,403 annotations for 2017 Unlabeled images.
  • is also called just COCO.
  • is available as CocoDetection() and CocoCaptions() in PyTorch(torchvision).
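
A minimal sketch of how the image sets and annotation files above might be paired when calling torchvision's `CocoDetection()` or `CocoCaptions()`. It assumes the archives were extracted into the usual layout (`train2017/`, `val2017/`, `annotations/instances_val2017.json` and so on) and that torchvision and `pycocotools` are installed. The helper names `coco_paths` and `load_coco` are hypothetical.

```python
from pathlib import Path


def coco_paths(root, split="train", year="2017", kind="instances"):
    """Return (image_dir, annotation_file) for an assumed COCO folder layout."""
    if kind not in ("instances", "captions"):
        raise ValueError(f"kind must be 'instances' or 'captions', got {kind!r}")
    root = Path(root)
    image_dir = root / f"{split}{year}"
    ann_file = root / "annotations" / f"{kind}_{split}{year}.json"
    return image_dir, ann_file


def load_coco(root, split="val", year="2017", captions=False):
    """Open one split with CocoCaptions (captions=True) or CocoDetection."""
    # Imported lazily so coco_paths() works without torchvision installed.
    from torchvision.datasets import CocoCaptions, CocoDetection

    kind = "captions" if captions else "instances"
    image_dir, ann_file = coco_paths(root, split, year, kind)
    dataset_cls = CocoCaptions if captions else CocoDetection
    return dataset_cls(root=str(image_dir), annFile=str(ann_file))
```

For example, `load_coco("data/coco", split="val", year="2017")` would pair 2017 Val images with the instance annotations from 2017 Train/Val annotations, where `data/coco` is a placeholder path.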
