DEV Community

Super Kai (Kazuya Ito)
Super Kai (Kazuya Ito)

Posted on

1 1 1 1 1

CocoCaptions in PyTorch (3)

Buy Me a Coffee

*Memos:

  • My post explains CocoCaptions() using train2014 with captions_train2014.json, instances_train2014.json and person_keypoints_train2014.json, val2014 with captions_val2014.json, instances_val2014.json and person_keypoints_val2014.json and test2017 with image_info_test2014.json, image_info_test2015.json and image_info_test-dev2015.json.
  • My post explains CocoCaptions() using train2017 with captions_train2017.json, instances_train2017.json and person_keypoints_train2017.json, val2017 with captions_val2017.json, instances_val2017.json and person_keypoints_val2017.json and test2017 with image_info_test2017.json and image_info_test-dev2017.json.
  • My post explains CocoDetection() using train2014 with captions_train2014.json, instances_train2014.json and person_keypoints_train2014.json, val2014 with captions_val2014.json, instances_val2014.json and person_keypoints_val2014.json and test2017 with image_info_test2014.json, image_info_test2015.json and image_info_test-dev2015.json.
  • My post explains CocoDetection() using train2017 with captions_train2017.json, instances_train2017.json and person_keypoints_train2017.json, val2017 with captions_val2017.json, instances_val2017.json and person_keypoints_val2017.json and test2017 with image_info_test2017.json and image_info_test-dev2017.json.
  • My post explains CocoDetection() using train2017 with stuff_train2017.json, val2017 with stuff_val2017.json, stuff_train2017_pixelmaps with stuff_train2017.json, stuff_val2017_pixelmaps with stuff_val2017.json, panoptic_train2017 with panoptic_train2017.json, panoptic_val2017 with panoptic_val2017.json and unlabeled2017 with image_info_unlabeled2017.json.
  • My post explains MS COCO.

CocoCaptions() can use MS COCO dataset as shown below. *This is for train2017 with stuff_train2017.json, val2017 with stuff_val2017.json, stuff_train2017_pixelmaps with stuff_train2017.json, stuff_val2017_pixelmaps with stuff_val2017.json, panoptic_train2017 with panoptic_train2017.json, panoptic_val2017 with panoptic_val2017.json and unlabeled2017 with image_info_unlabeled2017.json:

from torchvision.datasets import CocoCaptions

stf_train2017_data = CocoCaptions(
    root="data/coco/imgs/train2017",
    annFile="data/coco/anns/stuff_trainval2017/stuff_train2017.json"
)

stf_val2017_data = CocoCaptions(
    root="data/coco/imgs/val2017",
    annFile="data/coco/anns/stuff_trainval2017/stuff_val2017.json"
)

len(stf_train2017_data), len(stf_val2017_data)
# (118287, 5000)

pms_stf_train2017_data = CocoCaptions(
    root="data/coco/anns/stuff_trainval2017/stuff_train2017_pixelmaps",
    annFile="data/coco/anns/stuff_trainval2017/stuff_train2017.json"
)

pms_stf_val2017_data = CocoCaptions(
    root="data/coco/anns/stuff_trainval2017/stuff_val2017_pixelmaps",
    annFile="data/coco/anns/stuff_trainval2017/stuff_val2017.json"
)

len(pms_stf_train2017_data), len(pms_stf_val2017_data)
# (118287, 5000)

# pan_train2017_data = CocoCaptions(
#     root="data/coco/anns/panoptic_trainval2017/panoptic_train2017",
#     annFile="data/coco/anns/panoptic_trainval2017/panoptic_train2017.json"
# ) # Error

# pan_val2017_data = CocoCaptions(
#     root="data/coco/anns/panoptic_trainval2017/panoptic_val2017",
#     annFile="data/coco/anns/panoptic_trainval2017/panoptic_val2017.json"
# ) # Error

unlabeled2017_data = CocoCaptions(
    root="data/coco/imgs/unlabeled2017",
    annFile="data/coco/anns/unlabeled2017/image_info_unlabeled2017.json"
)

len(unlabeled2017_data)
# 123403

stf_train2017_data[2] # Error

stf_train2017_data[47] # Error

stf_train2017_data[64] # Error

stf_val2017_data[2] # Error

stf_val2017_data[47] # Error

stf_val2017_data[64] # Error

pms_stf_train2017_data[2] # Error

pms_stf_train2017_data[47] # Error

pms_stf_train2017_data[64] # Error

pms_stf_val2017_data[2] # Error

pms_stf_val2017_data[47] # Error

pms_stf_val2017_data[64] # Error

unlabeled2017_data[2]
# (<PIL.Image.Image image mode=RGB size=640x427>, [])

unlabeled2017_data[47]
# (<PIL.Image.Image image mode=RGB size=428x640>, [])

unlabeled2017_data[64]
# (<PIL.Image.Image image mode=RGB size=640x480>, [])

import matplotlib.pyplot as plt

def show_images(data, ims, main_title=None):
    file = data.root.split('/')[-1]
    fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(14, 8))
    fig.suptitle(t=main_title, y=0.9, fontsize=14)
    for i, axis in zip(ims, axes.ravel()):
        if not data[i][1]:
            im, _ = data[i]
            axis.imshow(X=im)
    fig.tight_layout()
    plt.show()

ims = (2, 47, 64)

show_images(data=unlabeled2017_data, ims=ims,
            main_title="unlabeled2017_data")
Enter fullscreen mode Exit fullscreen mode

Image description

Image of Docusign

Bring your solution into Docusign. Reach over 1.6M customers.

Docusign is now extensible. Overcome challenges with disconnected products and inaccessible data by bringing your solutions into Docusign and publishing to 1.6M customers in the App Center.

Learn more

Top comments (0)

A Workflow Copilot. Tailored to You.

Pieces.app image

Our desktop app, with its intelligent copilot, streamlines coding by generating snippets, extracting code from screenshots, and accelerating problem-solving.

Read the docs

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay