Caltech101 in PyTorch

Super Kai (Kazuya Ito)

Caltech101() can use the Caltech 101 dataset as shown below:

*Memos:

  • The 1st argument is root(Required-Type:str or pathlib.Path). *An absolute or relative path is possible.
  • The 2nd argument is target_type(Optional-Default:"category"-Type:str or tuple or list of str): *Memos:
    • "category" and/or "annotation" can be set to it.
    • An empty tuple or list can also be set to it.
    • The same value can be set to it multiple times.
    • If the order of the values is different, the order of the returned targets is also different. *See the sketch after the constructor examples below.
    • The 8,677 images with the labels from 101 categories(classes) and/or with the annotations are returned.
  • The 3rd argument is transform(Optional-Default:None-Type:callable). *A sketch with ToTensor() is shown after the indexing examples below.
  • The 4th argument is target_transform(Optional-Default:None-Type:callable).
  • The 5th argument is download(Optional-Default:False-Type:bool): *Memos:
    • If it's True, the dataset is downloaded from the internet and extracted(unzipped) to root.
    • If it's True and the dataset is already downloaded, it's extracted.
    • If it's True and the dataset is already downloaded and extracted, nothing happens.
    • It should be False if the dataset is already downloaded and extracted because it's faster.
    • scipy may be required to load target files from .mat files.
    • gdown may be required to download the dataset.
    • You can manually download and extract the dataset(101_ObjectCategories.tar.gz and Annotations.tar) from here to data/caltech101/.
  • About the labels from the categories(classes) for the image indices: Faces(0) is 0~434, Faces_easy(1) is 435~869, Leopards(2) is 870~1069, Motorbikes(3) is 1070~1867, accordion(4) is 1868~1922, airplanes(5) is 1923~2722, anchor(6) is 2723~2764, ant(7) is 2765~2806, barrel(8) is 2807~2853, bass(9) is 2854~2907, etc.
from torchvision.datasets import Caltech101

category_data = Caltech101(
    root="data"
)

category_data = Caltech101(
    root="data",
    target_type="category",
    transform=None,
    target_transform=None,
    download=False
)

annotation_data = Caltech101(
    root="data",
    target_type="annotation"
)

all_data = Caltech101(
    root="data",
    target_type=["category", "annotation"]
)
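
The order and repetition of the target_type values are reflected in the returned targets. A minimal sketch, assuming the dataset is already downloaded and extracted to data/caltech101/ (reversed_data and doubled_data are hypothetical names):

reversed_data = Caltech101(
    root="data",
    target_type=["annotation", "category"]
)

doubled_data = Caltech101(
    root="data",
    target_type=["category", "category"]
)

img, (ann, lab) = reversed_data[0] # The annotation comes first, then the label.
img, (lab1, lab2) = doubled_data[0] # The same label is returned twice.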

len(category_data), len(annotation_data), len(all_data)
# (8677, 8677, 8677)

category_data
# Dataset Caltech101
#     Number of datapoints: 8677
#     Root location: data\caltech101
#     Target type: ['category']

category_data.root
# 'data/caltech101'

category_data.target_type
# ['category']

print(category_data.transform)
# None

print(category_data.target_transform)
# None

category_data.download
# <bound method Caltech101.download of Dataset Caltech101
#     Number of datapoints: 8677
#     Root location: data\caltech101
#     Target type: ['category']>

len(category_data.categories), category_data.categories
# (101,
#  ['Faces', 'Faces_easy', 'Leopards', 'Motorbikes', 'accordion', 
#   'airplanes', 'anchor', 'ant', 'barrel', 'bass', 'beaver',
#   'binocular', 'bonsai', 'brain', 'brontosaurus', 'buddha',
#   'butterfly', 'camera', 'cannon', 'car_side', 'ceiling_fan',
#   'cellphone', 'chair', ..., 'windsor_chair', 'wrench', 'yin_yang'])

len(category_data.annotation_categories), category_data.annotation_categories
# (101,
#  ['Faces_2', 'Faces_3', 'Leopards', 'Motorbikes_16', 'accordion',
#   'Airplanes_Side_2', 'anchor', 'ant', 'barrel', 'bass', 'beaver',
#   'binocular', 'bonsai', 'brain', 'brontosaurus', 'buddha',
#   'butterfly', 'camera', 'cannon', 'car_side', 'ceiling_fan',
#   'cellphone', 'chair', ..., 'windsor_chair', 'wrench', 'yin_yang'])
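
The returned label is an index into categories, so it can be mapped back to a category name. A small check, consistent with the outputs above:

img, label = category_data[870]
label, category_data.categories[label]
# (2, 'Leopards')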

category_data[0]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=510x337>, 0)

category_data[1]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=519x343>, 0)

category_data[2]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=492x325>, 0)

category_data[435]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=290x334>, 1)

category_data[870]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=192x128>, 2)

annotation_data[0]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=510x337>,
#  array([[10.00958466, 8.18210863, 8.18210863, 10.92332268, ...],
#         [132.30670927, 120.42811502, 103.52396166, 90.73162939, ...]]))

annotation_data[1]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=519x343>,
#  array([[15.19298246, 13.71929825, 15.19298246, 19.61403509, ...],
#         [121.5877193, 103.90350877, 80.81578947, 64.11403509, ...]]))

annotation_data[2]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=492x325>,
#  array([[10.40789474, 7.17807018, 5.79385965, 9.02368421, ...],
#         [131.30789474, 120.69561404, 102.23947368, 86.09035088, ...]]))

annotation_data[435]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=290x334>,
#  array([[64.52631579, 95.31578947, 123.26315789, 149.31578947, ...],
#         [15.42105263, 8.31578947, 10.21052632, 28.21052632, ...]]))

annotation_data[870]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=192x128>,
#  array([[2.96536524, 7.55604534, 19.45780856, 33.73992443, ...],
#         [23.63413098, 32.13539043, 33.83564232, 8.84193955, ...]]))

all_data[0]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=510x337>,
#  (0, array([[10.00958466, 8.18210863, 8.18210863, 10.92332268, ...],
#             [132.30670927, 120.42811502, 103.52396166, 90.73162939, ...]])))

all_data[1]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=519x343>,
#  (0, array([[15.19298246, 13.71929825, 15.19298246, 19.61403509, ...],
#             [121.5877193, 103.90350877, 80.81578947, 64.11403509, ...]])))

all_data[2]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=492x325>,
#  (0, array([[10.40789474, 7.17807018, 5.79385965, 9.02368421, ...],
#             [131.30789474, 120.69561404, 102.23947368, 86.09035088, ...]])))

all_data[435]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=290x334>,
#  (1, array([[64.52631579, 95.31578947, 123.26315789, 149.31578947, ...],
#             [15.42105263, 8.31578947, 10.21052632, 28.21052632, ...]])))

all_data[870]
# (<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=192x128>,
#  (2, array([[2.96536524, 7.55604534, 19.45780856, 33.73992443, ...],
#             [23.63413098, 32.13539043, 33.83564232, 8.84193955, ...]])))
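
By setting the 3rd argument transform, tensors can be returned instead of PIL images. A minimal sketch with ToTensor() (tensor_data is a hypothetical name):

from torchvision.transforms import ToTensor

tensor_data = Caltech101(
    root="data",
    target_type="category",
    transform=ToTensor() # Converts each PIL image to a [0.0, 1.0] float tensor.
)

img, label = tensor_data[0]
img.shape # torch.Size([3, 337, 510])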

import matplotlib.pyplot as plt

def show_images(data, main_title=None):
    plt.figure(figsize=(10, 5))
    plt.suptitle(t=main_title, y=1.0, fontsize=14)
    # Sample indices: the first 3 Faces images, then the first image of
    # each of the next 7 categories(classes).
    ims = (0, 1, 2, 435, 870, 1070, 1868, 1923, 2723, 2765)
    if len(data.target_type) == 1:
        if data.target_type[0] == "category":
            for i, j in enumerate(iterable=ims, start=1):
                plt.subplot(2, 5, i)
                im, lab = data[j]
                plt.imshow(X=im)
                plt.title(label=lab)
        elif data.target_type[0] == "annotation":
            for i, j in enumerate(iterable=ims, start=1):
                plt.subplot(2, 5, i)
                im, (px, py) = data[j] # px, py: the annotation point coordinates.
                plt.imshow(X=im)
                plt.scatter(x=px, y=py)
    elif len(data.target_type) == 2:
        if data.target_type[0] == "category":
            for i, j in enumerate(iterable=ims, start=1): 
                plt.subplot(2, 5, i)
                im, (lab, (px, py)) = data[j]
                plt.imshow(X=im)
                plt.title(label=lab)
                plt.scatter(x=px, y=py)
        elif data.target_type[0] == "annotation":
            for i, j in enumerate(iterable=ims, start=1): 
                plt.subplot(2, 5, i)
                im, ((px, py), lab) = data[j]
                plt.imshow(X=im)
                plt.scatter(x=px, y=py)
                plt.title(label=lab)
    plt.tight_layout()
    plt.show()

show_images(data=category_data, main_title="category_data")
show_images(data=annotation_data, main_title="annotation_data")
show_images(data=all_data, main_title="all_data")

[Figure: category_data sample images titled with their labels]

[Figure: annotation_data sample images with their annotation points]

[Figure: all_data sample images with both labels and annotation points]
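
To batch the dataset with a DataLoader, the images need a fixed size and a fixed number of channels. A minimal sketch, assuming some Caltech 101 images are grayscale (train_data and train_loader are hypothetical names):

from torch.utils.data import DataLoader
from torchvision.transforms import Compose, Lambda, Resize, ToTensor

train_data = Caltech101(
    root="data",
    target_type="category",
    transform=Compose([
        Lambda(lambda im: im.convert("RGB")), # Some images are grayscale.
        Resize(size=(224, 224)),
        ToTensor()
    ])
)

train_loader = DataLoader(dataset=train_data, batch_size=32, shuffle=True)

imgs, labels = next(iter(train_loader))
imgs.shape, labels.shape
# (torch.Size([32, 3, 224, 224]), torch.Size([32]))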
