DEV Community

Super Kai (Kazuya Ito)

QMNIST in PyTorch

*My post explains QMNIST.

QMNIST() can load the QMNIST dataset as shown below:

*Memos:

  • The 1st argument is root(Required-Type:str or pathlib.Path). *An absolute or relative path is possible.
  • The 2nd argument is what(Optional-Default:None-Type:str). *"train"(60,000 images), "test"(60,000 images), "test10k"(10,000 images), "test50k"(50,000 images) or "nist"(402,953 images) can be set to it.
  • The 3rd argument is compat(Optional-Default:True-Type:bool). *If it's True, the class number of each image is returned(for compatibility with the MNIST dataloader) while if it's False, a 1D tensor of the full QMNIST information is returned.
  • The 4th argument is train(Optional-Default:True-Type:bool): *Memos:
    • It's ignored if what isn't None.
    • If it's True, train data(60,000 images) is used while if it's False, test data(60,000 images) is used.
  • There is transform argument(Optional-Default:None-Type:callable). *transform= must be used.
  • There is target_transform argument(Optional-Default:None-Type:callable). *target_transform= must be used.
  • There is download argument(Optional-Default:False-Type:bool): *Memos:
    • download= must be used.
    • If it's True, the dataset is downloaded from the internet and extracted(unzipped) to root.
    • If it's True and the dataset is already downloaded, it's extracted.
    • If it's True and the dataset is already downloaded and extracted, nothing happens.
    • It should be False if the dataset is already downloaded and extracted because it's faster.
    • You can manually download and extract the dataset(qmnist-train-images-idx3-ubyte.gz, qmnist-train-labels-idx2-int.gz, qmnist-test-labels-idx2-int.gz, qmnist-test-images-idx3-ubyte.gz, xnist-images-idx3-ubyte.xz and xnist-labels-idx2-int.xz) from here to data/QMNIST/raw/.
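
For the manual download case above, a quick check of which archives are still missing can save a failed load. The following is a minimal sketch, not part of torchvision: the helper name `missing_qmnist_files` is hypothetical, and it only looks for the archive names listed in the memo under `root/QMNIST/raw/`. It does not reproduce torchvision's own checks.

```python
from pathlib import Path

# Archive names exactly as listed in the memo above.
QMNIST_FILES = (
    "qmnist-train-images-idx3-ubyte.gz",
    "qmnist-train-labels-idx2-int.gz",
    "qmnist-test-images-idx3-ubyte.gz",
    "qmnist-test-labels-idx2-int.gz",
    "xnist-images-idx3-ubyte.xz",
    "xnist-labels-idx2-int.xz",
)

def missing_qmnist_files(root="data"):
    """Hypothetical helper: list the archives not found in root/QMNIST/raw."""
    raw_dir = Path(root) / "QMNIST" / "raw"
    return [name for name in QMNIST_FILES if not (raw_dir / name).exists()]

missing = missing_qmnist_files("data")
print(missing if missing else "All listed archives are present.")
```

If the list is empty, download=False is likely sufficient. Otherwise download=True or a manual copy of the missing files is needed.
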
from torchvision.datasets import QMNIST

train_data = QMNIST(
    root="data"
)

train_data = QMNIST(
    root="data",
    what=None,
    compat=True,
    train=True,
    transform=None,
    target_transform=None,
    download=False
)

train_data = QMNIST(
    root="data",
    what="train",
    train=False
)

test_data = QMNIST(
    root="data",
    train=False
)

test_data = QMNIST(
    root="data",
    what="test",
    train=True
)

test10k_data = QMNIST(
    root="data",
    what="test10k"
)

test50k_data = QMNIST(
    root="data",
    what="test50k",
    compat=False
)

nist_data = QMNIST(
    root="data",
    what="nist"
)

l = len
l(train_data), l(test_data), l(test10k_data), l(test50k_data), l(nist_data)
# (60000, 60000, 10000, 50000, 402953)

train_data
# Dataset QMNIST
#     Number of datapoints: 60000
#     Root location: data
#     Split: train

train_data.root
# 'data'

train_data.what
# 'train'

train_data.compat
# True

train_data.train
# True

print(train_data.transform)
# None

print(train_data.target_transform)
# None

train_data.download
# <bound method QMNIST.download of Dataset QMNIST
#     Number of datapoints: 60000
#     Root location: data
#     Split: train>

len(train_data.classes)
# 10

train_data.classes
# ['0 - zero', '1 - one', '2 - two', '3 - three', '4 - four',
#  '5 - five', '6 - six', '7 - seven', '8 - eight', '9 - nine']

train_data[0]
# (<PIL.Image.Image image mode=L size=28x28>, 5)

test50k_data[0]
# (<PIL.Image.Image image mode=L size=28x28>,
#  tensor([3, 4, 2424, 51, 33, 261051, 0, 0]))

train_data[1]
# (<PIL.Image.Image image mode=L size=28x28>, 0)

test50k_data[1]
# (<PIL.Image.Image image mode=L size=28x28>,
#  tensor([8, 1, 522, 60, 38, 55979, 0, 0]))

train_data[2]
# (<PIL.Image.Image image mode=L size=28x28>, 4)

test50k_data[2]
# (<PIL.Image.Image image mode=L size=28x28>,
#  tensor([9, 4, 2496, 115, 39, 269531, 0, 0]))

train_data[3]
# (<PIL.Image.Image image mode=L size=28x28>, 1)

test50k_data[3]
# (<PIL.Image.Image image mode=L size=28x28>,
#  tensor([5, 4, 2427, 77, 35, 261428, 0, 0]))

train_data[4]
# (<PIL.Image.Image image mode=L size=28x28>, 9)

test50k_data[4]
# (<PIL.Image.Image image mode=L size=28x28>,
#  tensor([7, 4, 2524, 69, 37, 272828, 0, 0]))
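
With compat=False, each target is a 1D tensor of 8 integers, as in the outputs above. The sketch below gives those positions names so they are easier to read. The field names are an assumption based on my reading of the QMNIST repository's description of the extended labels, and `QMNISTLabel` and `decode_label` are hypothetical names, not torchvision APIs.

```python
from typing import NamedTuple, Sequence

class QMNISTLabel(NamedTuple):
    # Field names are assumed from the QMNIST repository's label description.
    digit_class: int
    hsf_series: int
    writer_id: int
    writer_digit_index: int
    nist_class_code: int
    global_digit_index: int
    duplicate: int
    unused: int

def decode_label(target: Sequence[int]) -> QMNISTLabel:
    """Convert an 8-element target (list or 1D tensor) into named fields."""
    values = [int(v) for v in target]
    if len(values) != 8:
        raise ValueError(f"expected 8 values, got {len(values)}")
    return QMNISTLabel(*values)

# Values copied from the test50k_data[0] output above.
label = decode_label([3, 4, 2424, 51, 33, 261051, 0, 0])
print(label.digit_class, label.writer_id)
# 3 2424
```

The same function can be passed a target taken directly from a dataset created with compat=False, since iterating a 1D tensor yields elements that int() can convert.
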

The first five images of each split can be displayed with matplotlib as shown below:

from torchvision.datasets import QMNIST

train_data = QMNIST(
    root="data",
    what="train"
)

test_data = QMNIST(
    root="data",
    what="test"
)

test10k_data = QMNIST(
    root="data",
    what="test10k"
)

test50k_data = QMNIST(
    root="data",
    what="test50k"
)

nist_data = QMNIST(
    root="data",
    what="nist"
)

import matplotlib.pyplot as plt

def show_images(data, main_title=None):
    plt.figure(figsize=(10, 4))
    plt.suptitle(t=main_title, y=0.8, fontsize=14)
    for i, (im, lab) in enumerate(data, start=1):
        plt.subplot(1, 5, i)
        plt.title(label=lab)
        plt.imshow(X=im)
        if i == 5:
            break
    plt.tight_layout()
    plt.show()

show_images(data=train_data, main_title="train_data")
show_images(data=test_data, main_title="test_data")
show_images(data=test10k_data, main_title="test10k_data")
show_images(data=test50k_data, main_title="test50k_data")
show_images(data=nist_data, main_title="nist_data")