DEV Community

Super Kai (Kazuya Ito)


CIFAR10 in PyTorch


CIFAR10() loads the CIFAR-10 dataset, as shown below:

*Memos:

  • The 1st argument is root (Required-Type: str or pathlib.Path). *It can be an absolute or relative path.
  • The 2nd argument is train (Optional-Default: True-Type: bool). *If it's True, the training data (50,000 images) is used; if it's False, the test data (10,000 images) is used.
  • The 3rd argument is transform (Optional-Default: None-Type: callable). *It's applied to each image.
  • The 4th argument is target_transform (Optional-Default: None-Type: callable). *It's applied to each label.
  • The 5th argument is download (Optional-Default: False-Type: bool): *Memos:
    • If it's True, the dataset is downloaded from the internet and extracted (unzipped) to root.
    • If it's True and the dataset is already downloaded but not yet extracted, it's only extracted.
    • If it's True and the dataset is already downloaded and extracted, nothing happens.
    • It should be False if the dataset is already downloaded and extracted, because that's faster.
    • You can also download the dataset (cifar-10-python.tar.gz) manually and extract it to data/cifar-10-batches-py/.
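
The advice on download above can be wrapped in a small check. This is only a minimal sketch: needs_download() and load_train_data() are hypothetical helpers, not part of torchvision. It assumes that a cifar-10-batches-py folder under root means the dataset is already downloaded and extracted, and it does not check file integrity.

```python
from pathlib import Path


def needs_download(root):
    """Return True if `root` has no extracted cifar-10-batches-py folder.

    Assumption: the folder's presence means the dataset is already
    downloaded and extracted. File integrity is not checked here.
    """
    return not (Path(root) / "cifar-10-batches-py").is_dir()


def load_train_data(root="data"):
    """Load the training split, downloading only when the folder is missing."""
    # Imported here so that needs_download() works without torchvision.
    from torchvision.datasets import CIFAR10

    return CIFAR10(root=root, train=True, download=needs_download(root))
```

With this, load_train_data("data") passes download=True only on the first run, and download=False once data/cifar-10-batches-py/ exists.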
from torchvision.datasets import CIFAR10

train_data = CIFAR10(
    root="data"
)

train_data = CIFAR10(
    root="data",
    train=True,
    transform=None,
    target_transform=None,
    download=False
)

test_data = CIFAR10(
    root="data",
    train=False
)

len(train_data), len(test_data)
# (50000, 10000)

train_data
# Dataset CIFAR10
#     Number of datapoints: 50000
#     Root location: data
#     Split: Train

train_data.root
# 'data'

train_data.train
# True

print(train_data.transform)
# None

print(train_data.target_transform)
# None

train_data.download
# <bound method CIFAR10.download of Dataset CIFAR10
#     Number of datapoints: 50000
#     Root location: data
#     Split: Train>

len(train_data.classes), train_data.classes
# (10,
#  ['airplane', 'automobile', 'bird', 'cat', 'deer',
#   'dog', 'frog', 'horse', 'ship', 'truck'])

train_data[0]
# (<PIL.Image.Image image mode=RGB size=32x32>, 6)

train_data[1]
# (<PIL.Image.Image image mode=RGB size=32x32>, 9)

train_data[2]
# (<PIL.Image.Image image mode=RGB size=32x32>, 9)

train_data[3]
# (<PIL.Image.Image image mode=RGB size=32x32>, 4)

train_data[4]
# (<PIL.Image.Image image mode=RGB size=32x32>, 1)

import matplotlib.pyplot as plt

def show_images(data, main_title=None):
    plt.figure(figsize=(10, 5))
    plt.suptitle(t=main_title, y=1.0, fontsize=14)
    for i, (im, lab) in zip(range(1, 11), data):
        plt.subplot(2, 5, i)
        plt.imshow(X=im)
        plt.title(label=lab)
    plt.tight_layout()
    plt.show()

show_images(data=train_data, main_title="train_data")
show_images(data=test_data, main_title="test_data")
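
The titles drawn by show_images() are numeric labels such as 6 or 9. Below is a minimal sketch of replacing them with class names. with_class_names() is a hypothetical helper, and it assumes only that data yields (image, label) pairs and that classes is a list like train_data.classes.

```python
from itertools import islice


def with_class_names(data, classes, n=10):
    """Return the first n (image, class_name) pairs from data.

    Each numeric label is used as an index into classes.
    """
    return [(im, classes[lab]) for im, lab in islice(data, n)]


# Usage with the objects defined above (requires the downloaded dataset):
# named = with_class_names(train_data, train_data.classes)
# show_images(data=named, main_title="train_data")
```

Because the helper returns a plain list of pairs, it can be passed to show_images() in place of the dataset without changing that function.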