DEV Community

Super Kai (Kazuya Ito)


Datasets for Computer Vision (1)



(1) MNIST(Modified National Institute of Standards and Technology)(1998):

  • has 70,000 handwritten digit images[0~9], each paired with a label from 10 classes:
    • 60,000 for train and 10,000 for test.
    • Each image has 28x28 pixels.
  • is used for Image Classification.
  • is MNIST() in PyTorch. *My post explains MNIST().
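
The split above can be sketched in code. This is only a minimal sketch, assuming `torchvision` is installed; the `data` directory and the helper names `load_mnist` and `expected_size` are illustrative choices, not something from this post:

```python
# Split sizes and image size as stated above.
MNIST_SPLITS = {"train": 60_000, "test": 10_000}
IMAGE_SIZE = (28, 28)  # height x width in pixels


def expected_size(train=True):
    """Number of images the chosen split should hold, per the memo above."""
    return MNIST_SPLITS["train" if train else "test"]


def load_mnist(root="data", train=True):
    """Return the MNIST train or test split, downloading it on first use.

    `root` is an assumed local directory. torchvision is imported lazily so
    the constants above stay usable without it.
    """
    from torchvision.datasets import MNIST

    return MNIST(root=root, train=train, download=True)
```

With the data downloaded, `len(load_mnist(train=True))` should match `expected_size(train=True)`, and each item is an `(image, label)` pair.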


(2) EMNIST(Extended MNIST)(2017):

  • has handwritten character images(digits[0~9] and alphabet letters[A~Z][a~z]) split into 6 datasets(ByClass, ByMerge, Balanced, Letters, Digits and MNIST):
    • Each image has 28x28 pixels.
    • ByClass has 814,255 character images(digits[0~9] and alphabet letters[A~Z][a~z]), each paired with a label from 62 classes. *697,932 for train and 116,323 for test.
    • ByMerge has 814,255 character images(digits[0~9] and alphabet letters[A~Z][a, b, d~h, n, q, r, t]), each paired with a label from 47 classes. *697,932 for train and 116,323 for test.
    • Balanced has 131,600 character images(digits[0~9] and alphabet letters[A~Z][a, b, d~h, n, q, r, t]), each paired with a label from 47 classes. *112,800 for train and 18,800 for test.
    • Letters has 145,600 alphabet letter images[a~z], each paired with a label from 27 classes. *124,800 for train and 20,800 for test. *In EMNIST(), the 27 classes are N/A and a~z.
    • Digits has 280,000 digit images[0~9], each paired with a label from 10 classes. *240,000 for train and 40,000 for test.
    • MNIST has 70,000 digit images[0~9], each paired with a label from 10 classes. *60,000 for train and 10,000 for test.
  • is used for Image Classification.
  • is EMNIST() in PyTorch. *My post explains EMNIST().
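
The six splits can be put in a small table and cross-checked against the totals above. This is a rough sketch, assuming `torchvision` is installed; the lowercase keys are the names that `EMNIST()` takes for its `split` argument, while `total_images`, `load_emnist` and the `data` directory are illustrative names, not something from this post:

```python
# (train, test, classes) per split, copied from the memo above.
EMNIST_SPLITS = {
    "byclass": (697_932, 116_323, 62),
    "bymerge": (697_932, 116_323, 47),
    "balanced": (112_800, 18_800, 47),
    "letters": (124_800, 20_800, 27),
    "digits": (240_000, 40_000, 10),
    "mnist": (60_000, 10_000, 10),
}


def total_images(split):
    """Train plus test image count for one split."""
    train, test, _classes = EMNIST_SPLITS[split]
    return train + test


def load_emnist(split, root="data", train=True):
    """Return one EMNIST split, downloading it on first use.

    The split name is validated first; torchvision is imported lazily.
    `root` is an assumed local directory.
    """
    if split not in EMNIST_SPLITS:
        raise ValueError(f"unknown split {split!r}; use one of {sorted(EMNIST_SPLITS)}")
    from torchvision.datasets import EMNIST

    return EMNIST(root=root, split=split, train=train, download=True)
```

For example, `total_images("balanced")` gives the 131,600 images listed for Balanced.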


(3) QMNIST(2019):

  • has 522,953 handwritten digit images[0~9], each paired with a label from 10 classes.
  • is an extended version of MNIST. *I don't know what the Q in QMNIST stands for.
  • is used for Image Classification.
  • is QMNIST() in PyTorch. *My post explains QMNIST().
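
As a small sketch of how `QMNIST()` is selected by subset, assuming `torchvision` is installed: the subset names below are the values torchvision documents for the `what` argument, and the helper name `load_qmnist` and the `data` directory are illustrative, not something from this post:

```python
# Values that torchvision documents for the `what` argument of QMNIST().
QMNIST_SUBSETS = ("train", "test", "test10k", "test50k", "nist")


def load_qmnist(what="train", root="data"):
    """Return one QMNIST subset, downloading it on first use.

    The subset name is validated before torchvision is imported, so a typo
    fails fast. `root` is an assumed local directory.
    """
    if what not in QMNIST_SUBSETS:
        raise ValueError(f"unknown subset {what!r}; use one of {QMNIST_SUBSETS}")
    from torchvision.datasets import QMNIST

    return QMNIST(root=root, what=what, download=True)
```

Each item is again an `(image, label)` pair, so the result can be used like the MNIST dataset.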


(4) ETLCDB(ETL Character Database, where ETL is the Electrotechnical Laboratory)(2011):

  • has handwritten or machine-printed character images(digits, symbols, alphabet letters and Japanese characters) split into 9 datasets(ETL1, ETL2, ETL3, ETL4, ETL5, ETL6, ETL7, ETL8 and ETL9):
    • ETL1 has 141,319 character images(digits[0~9], alphabet letters[A~Z], symbols[+-*/=()・,?’] and Katakana letters[ア~ン]), each paired with a label from 99 classes. *Each image has 64x63 pixels.
    • ETL2 has 52,796 character images(digits[0~9], alphabet letters[A~Z], symbols, Katakana letters[ア~ン], Hiragana letters[あ~ん] and Kanji characters), each paired with a label from 2,184 classes. *Each image has 60x60 pixels.
    • ETL3 has 9,600 character images(digits[0~9], alphabet letters[A~Z] and symbols[¥+-*/=()・,_▾]), each paired with a label from 48 classes. *Each image has 72x76 pixels.
    • ETL4 has 6,120 Hiragana letter images[あ~ん], each paired with a label from 51 classes. *Each image has 72x76 pixels.
    • ETL5 has 10,608 Katakana letter images[ア~ン], each paired with a label from 51 classes. *Each image has 72x76 pixels.
    • ETL6 has 157,662 character images(digits[0~9], alphabet letters[A~Z][a~z], symbols and Katakana letters[ア~ン]), each paired with a label from 114 classes. *Each image has 64x63 pixels.
    • ETL7(ETL7L and ETL7S) has 16,800 character images(Hiragana letters[あ~ん], Dakuten[゛] and Handakuten[゜]), each paired with a label from 48 classes. *Each image has 64x63 pixels.
    • ETL8(ETL8G and ETL8B2) has 152,960 character images(Hiragana letters[あ~ん] and Kanji characters), each paired with a label from 956 classes. *Each image has 128x127 pixels.
    • ETL9(ETL9G and ETL9B) has 607,200 character images(Hiragana letters[あ~ん] and JIS first level Kanji characters), each paired with a label from 3,036 classes. *Each image has 128x127 pixels.
  • is used for Image Classification.
  • isn't in PyTorch, so we need to download it from the etlcdb website.
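
Because there is no PyTorch loader, the image data has to be decoded by hand. The sketch below only covers unpacking the pixel payload of one 64x63 image, and it rests on assumptions that should be checked against the etlcdb format description: 4 bits per pixel, two pixels per byte with the high nibble first, and 64 as the width. Record headers are not handled, and `unpack_4bit_image` is an illustrative name:

```python
WIDTH, HEIGHT = 64, 63   # image size stated above (assumed width x height)
BITS_PER_PIXEL = 4       # assumption: 16 gray levels per pixel
IMAGE_BYTES = WIDTH * HEIGHT * BITS_PER_PIXEL // 8  # packed size of one image


def unpack_4bit_image(payload, width=WIDTH, height=HEIGHT):
    """Turn packed 4-bit pixels into a list of rows of ints in 0..15.

    Assumes two pixels per byte, high nibble first.
    """
    expected = width * height // 2
    if len(payload) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(payload)}")
    pixels = []
    for byte in payload:
        pixels.append(byte >> 4)    # first pixel: high nibble
        pixels.append(byte & 0x0F)  # second pixel: low nibble
    # Cut the flat pixel list into `height` rows of `width` pixels each.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]
```

The rows can then be scaled to 0~255 and handed to an image library or converted to a tensor for classification.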


(5) Kuzushiji(2018):

  • has cursive-style Japanese character images split into 3 datasets(Kuzushiji-MNIST, Kuzushiji-49 and Kuzushiji-Kanji):
    • Kuzushiji-MNIST has 70,000 Hiragana letter images, each paired with a label from 10 classes. *Each image has 28x28 pixels.
    • Kuzushiji-49 has 270,912 character images(Hiragana letters and Hiragana iteration marks), each paired with a label from 49 imbalanced classes. *Each image has 28x28 pixels.
    • Kuzushiji-Kanji has 140,424 Kanji character images across 3,832 imbalanced classes. *Each image has 64x64 pixels.
  • is used for Image Classification.
  • is KMNIST() in PyTorch, but it only covers Kuzushiji-MNIST, so we need to download Kuzushiji-49 and Kuzushiji-Kanji from GitHub. *My post explains KMNIST().
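
Since Kuzushiji-49 and Kuzushiji-Kanji are described as imbalanced, it can help to measure how uneven the classes are before training. A minimal sketch using only the standard library; `imbalance_ratio` is an illustrative helper and the labels are assumed to be a plain list of class ids:

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size.

    1.0 means perfectly balanced; larger values mean more imbalance.
    Classes that never appear in `labels` are not counted.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    return max(counts.values()) / min(counts.values())
```

For `KMNIST()`, the labels should be available as `dataset.targets.tolist()`, which is expected to give a ratio of 1.0 because Kuzushiji-MNIST is balanced.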


(6) Moving MNIST(2015):

  • has 10,000 videos:
    • Each video has 20 frames(images) with 2 moving digits.
    • Each frame(image) has 64x64 pixels.
  • is used for Video Prediction.
  • is MovingMNIST() in PyTorch. *My post explains MovingMNIST().
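
The numbers above give a quick estimate of the dataset size and a simple way to frame Video Prediction. A rough sketch, assuming 8-bit grayscale frames and the common setup of predicting the second half of a video from the first half; the helper names are illustrative, not something from this post:

```python
NUM_VIDEOS = 10_000
FRAMES_PER_VIDEO = 20
FRAME_HEIGHT, FRAME_WIDTH = 64, 64
BYTES_PER_PIXEL = 1  # assumption: 8-bit grayscale


def total_frames():
    """Number of frames across all videos."""
    return NUM_VIDEOS * FRAMES_PER_VIDEO


def raw_size_bytes():
    """Uncompressed size of all frames under the 8-bit assumption."""
    return total_frames() * FRAME_HEIGHT * FRAME_WIDTH * BYTES_PER_PIXEL


def split_video(frames, num_input=10):
    """Split one video into (input frames, frames to predict)."""
    return frames[:num_input], frames[num_input:]
```

As far as I know, `MovingMNIST()` in torchvision has `split` and `split_ratio` arguments that make a similar cut, so `split_video` is only needed when working with the full 20-frame videos.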
