My machine-learning pet project. Part 4. Feeding the dataset to Retinanet

#machinelearning #imagerecognition

My pet project is about food recognition. More info here.

Meme from csteinmetz1 on reddit

Today I want to feed my dataset to Retinanet in CSV format. For now, I am going to use only 2 labels and about 80 images (half labelled Salad, half labelled Pastry), just to try Retinanet and get an idea of how it works.

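According to the Retinanet README, the annotations file has one bounding box per line in the form path,x1,y1,x2,y2,class_name, and an image with no objects gets a line with empty fields. It should look like this: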

pic-037.jpg,80,20,500,120,Salad
pic-025.jpg,520,250,1152,953,Pastry
pic-with-nothing.jpg,,,,,
pic-004.jpg,0,0,1600,1113,Salad
...

But Scrapy made me a csv file like this:

Apple Cake,"Some apple cake description...",https://www.some-recipes-website.ru/binfiles/images/20200109/m12b509e.jpg,"[{'url': 'https://www.some-recipes-website.ru/binfiles/images/20200109/m12b509e.jpg', 'path': 'full/ae00a78059ad08506aa4767ed925bef5dccabf63.jpg', 'checksum': '55088c744a564af5ed8d4e5ea6478d20', 'status': 'downloaded'}]"

So the first thing I did was write a small Python script that converts the Scrapy CSV into a Retinanet CSV.
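The conversion could look roughly like the sketch below. It is not the actual script, just a minimal illustration that assumes the fourth Scrapy column is a Python-literal list of dicts (as in the row above). The helper names `image_path` and `retinanet_row` are hypothetical, and the box and the Pastry label for the apple cake are made up for the example, since the Scrapy file itself contains neither.

```python
import ast
import csv
import io

# One row as Scrapy wrote it: title, description, image URL, images list.
SCRAPY_CSV = '''Apple Cake,"Some apple cake description...",https://www.some-recipes-website.ru/binfiles/images/20200109/m12b509e.jpg,"[{'url': 'https://www.some-recipes-website.ru/binfiles/images/20200109/m12b509e.jpg', 'path': 'full/ae00a78059ad08506aa4767ed925bef5dccabf63.jpg', 'checksum': '55088c744a564af5ed8d4e5ea6478d20', 'status': 'downloaded'}]"
'''


def image_path(row):
    """Pull the local file path out of the fourth column of a Scrapy row."""
    images = ast.literal_eval(row[3])  # the column holds a Python literal, not JSON
    return images[0]["path"] if images else None


def retinanet_row(path, box=None, label=""):
    """Build one annotation row; box=None means an image with no objects."""
    if box is None:
        return [path, "", "", "", "", ""]
    x1, y1, x2, y2 = (int(round(v)) for v in box)  # Retinanet expects integers
    return [path, x1, y1, x2, y2, label]


rows = list(csv.reader(io.StringIO(SCRAPY_CSV)))
path = image_path(rows[0])

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
# Hypothetical box and label, only to show the output shape.
writer.writerow(retinanet_row(path, (0, 0, 1600, 1113), "Pastry"))
print(out.getvalue(), end="")
```

Rounding to integers at this stage would also avoid the float-coordinate error that shows up later in this post.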

I spent some time wondering what the requirements are for the images in the dataset. I found one opinion here:

No need to rescale your image, because RetinaNet resize the image to get the appropriate size by default.
In general I would advise to keep your settings the same during inference and training.

OK, so I won't do anything about image resolution for now.

Then I needed to install Retinanet on my computer. I cloned the repo and went to its root.

I'm on macOS Monterey, arm64. The biggest challenge was installing TensorFlow, which is one of Retinanet's dependencies.

I followed this article. I installed conda, then ran:

conda create --name mlp python=3.8
conda activate mlp
conda install -c apple tensorflow-deps
pip install tensorflow-macos
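A quick way to check that the install landed in the active conda environment is to ask Python whether it can find the package, without importing it. This is only a sketch and the helper name `missing_packages` is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level packages from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means tensorflow is visible to this interpreter.
print(missing_packages(["tensorflow"]))
```

If the list is not empty, the most likely cause is that the `mlp` environment is not the one currently activated.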

I copied the dataset files (CSV and images) into a dataset folder in the root of the Retinanet repo. Then I made labels.csv, which maps each class name to an ID:

Salad,0
Pastry,1
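Since the class names in the annotations have to match the ones in labels.csv, a small sanity check before training can save a failed run. The sketch below is an assumption about how one might do it, not part of Retinanet. The function name `unknown_classes` is hypothetical and the Soup row is invented to show a mismatch.

```python
import csv
import io

LABELS = "Salad,0\nPastry,1\n"

ANNOTATIONS = (
    "pic-037.jpg,80,20,500,120,Salad\n"
    "pic-025.jpg,520,250,1152,953,Pastry\n"
    "pic-with-nothing.jpg,,,,,\n"
    "pic-099.jpg,0,0,10,10,Soup\n"  # invented row with a class missing from LABELS
)


def unknown_classes(annotations_text, labels_text):
    """Return class names used in the annotations but absent from the labels."""
    labels = {row[0] for row in csv.reader(io.StringIO(labels_text)) if row}
    used = {
        row[5]
        for row in csv.reader(io.StringIO(annotations_text))
        if len(row) == 6 and row[5]  # skip rows for images with no objects
    }
    return sorted(used - labels)


print(unknown_classes(ANNOTATIONS, LABELS))
```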

Then I followed the Retinanet README:

pip install . --user
python setup.py build_ext --inplace

Then I tried to run the training script:

keras_retinanet/bin/train.py csv dataset/annotations.csv dataset/labels.csv

but Retinanet said:

ValueError: invalid CSV annotations file: dataset/annotations.csv: line 1: malformed x1: invalid literal for int() with base 10: '43.0'

I fixed the annotations file so that it contained integer coordinates instead of floats and tried again.
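A fix like this can be scripted. The sketch below assumes the six-column layout from the README and rounds each coordinate to the nearest integer. The function name `integer_coordinates` is hypothetical and the sample rows are invented.

```python
import csv
import io


def integer_coordinates(annotations_text):
    """Rewrite x1,y1,x2,y2 as integers, leaving empty fields untouched."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(annotations_text)):
        if len(row) == 6:
            # "43.0" -> "43"; rows for images with no objects keep their empty fields
            row[1:5] = [str(int(round(float(v)))) if v else "" for v in row[1:5]]
        writer.writerow(row)
    return out.getvalue()


sample = "pic-037.jpg,43.0,20.4,500.6,120,Salad\npic-with-nothing.jpg,,,,,\n"
print(integer_coordinates(sample), end="")
```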

Retinanet ran for about 3 minutes and then said:

Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 2 batches). You may need to use the repeat() function when building your dataset.

Well, that wasn't unexpected (because I had only about 80 images in my dataset, remember?). I don't know what epochs are at the moment, but I'll find out while writing the next post.
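For a rough sense of the arithmetic in that message: Keras asks the data source for steps_per_epoch * epochs batches in total, while one pass over a dataset yields only as many batches as the images allow. The sketch below uses illustrative numbers (batch size 1, 100 steps, 5 epochs), not the values train.py actually used, and both helper names are hypothetical.

```python
import math


def batches_per_pass(num_images, batch_size):
    """Batches produced by one pass over the dataset."""
    return math.ceil(num_images / batch_size)


def batches_requested(steps_per_epoch, epochs):
    """Total batches Keras will ask for over the whole training run."""
    return steps_per_epoch * epochs


available = batches_per_pass(num_images=80, batch_size=1)
requested = batches_requested(steps_per_epoch=100, epochs=5)

# If the data source does not repeat, training stops once `available` is used up.
print(available, requested, requested > available)
```

With about 80 images, either the requested number of steps has to shrink or the data has to be reused across steps.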