First Machine Learning Project
Created for an introduction to image understanding course.
Simple player detection using PyTorch's Faster R-CNN model with a ResNet-50-FPN backbone and in-place horizontal-flip data augmentation.
Demo Link
https://github.com/ryanro97/player-detector/blob/master/predicted-1.gif
https://github.com/ryanro97/player-detector/blob/master/predicted-2.gif
https://drive.google.com/drive/folders/1oLPPxYUrAYwq40zxfc5OilLNLeyW4kct?usp=sharing
Link to Code
Player Detector
Player detection using PyTorch's Faster R-CNN model with a ResNet-50-FPN backbone.
Training and validation images were extracted using FFmpeg and labeled using Tzuta Lin's LabelImg tool.
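As a rough illustration of what the labeling step produces, the sketch below reads bounding boxes from a Pascal VOC style XML annotation, which is LabelImg's default output format. Whether this project used that format is an assumption, and the sample annotation, the file name `frame_0001.jpg`, and the helper `parse_voc` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the Pascal VOC layout that LabelImg writes by default.
SAMPLE_XML = """<annotation>
  <filename>frame_0001.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>player</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>180</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Hypothetical helper: return ([xmin, ymin, xmax, ymax] boxes, labels)."""
    root = ET.fromstring(xml_text)
    boxes, labels = [], []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append(
            [int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        )
        labels.append(obj.findtext("name"))
    return boxes, labels


boxes, labels = parse_voc(SAMPLE_XML)
print(boxes, labels)
```

The `[xmin, ymin, xmax, ymax]` ordering matches the box format that torchvision's detection models expect in their training targets.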
Both the Trainer and Predictor can be used to train on and predict other objects. However, the only data augmentation implemented is a horizontal flip, because other augmentations did not make much sense for player detection.
Further details on how to set up object detection are in the notebooks.
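The part of a horizontal flip that is easy to get wrong is that the bounding boxes have to be mirrored together with the pixels. The library-free sketch below shows the arithmetic. It is not the project's implementation: the helper `hflip_sample` is hypothetical, the image is a plain list of pixel rows, and it returns flipped copies rather than flipping in place.

```python
def hflip_sample(image, boxes):
    """Hypothetical helper: mirror an image and its boxes left to right.

    image: list of rows, each row a list of pixel values.
    boxes: list of [xmin, ymin, xmax, ymax] in pixel coordinates.
    """
    width = len(image[0])
    # Reverse every row to mirror the pixels.
    flipped_image = [row[::-1] for row in image]
    # The old right edge becomes the new left edge, and vice versa;
    # y coordinates are unchanged.
    flipped_boxes = [
        [width - xmax, ymin, width - xmin, ymax]
        for xmin, ymin, xmax, ymax in boxes
    ]
    return flipped_image, flipped_boxes


image = [[0, 1, 2, 3], [4, 5, 6, 7]]
boxes = [[0, 0, 1, 2]]
flipped_image, flipped_boxes = hflip_sample(image, boxes)
print(flipped_boxes)
```

Applying the flip twice returns the original sample, which makes for a quick sanity check when wiring an augmentation like this into a dataset class.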
Examples (animations not reproduced here): Predicted Play 1, Predicted Play 2.
How I built it (what's the stack? did I run into issues or discover something new along the way?)
Built in Jupyter Notebook with Python 3, mainly using the PyTorch library, plus the cv2 and skvideo packages. Trained on a Tesla K80 GPU in Google Colab.
Additional Thoughts / Feelings / Stories
Though the outcome was satisfactory, tuning parameters for a model that takes roughly 12 hours to train on this dataset is extremely time-consuming and at times quite frustrating. It made me think twice about whether I wanted to go further into this area.