DEV Community

Natalia D
Natalia D

Posted on

5 1

My machine-learning pet project. Part 5. Exploring epochs and batches

My pet-project is about food recognition. More info here.

Image description

The last thing I did was to feed a part of my dataset to Retinanet. This part consisted of 71 images. Retinanet basically complained that my dataset was too small for training.

It said, "Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches".

I looked up epochs and batch sizes. One training epoch means that the learning algorithm has made one pass through the training dataset, where examples were separated into randomly selected “batch size” groups. But I've just remembered that 1 picture is worth a 1000 words, so I've drawn a picture for you, check this out:

Image description

steps_per_epoch (or steps) is number of batches to run per training epoch. So if we have batch_size = 32, with 71 images we can break dataset into 2 batches (=2 steps). 64 images will be used for training, and 7 images will be redundant.

I excluded 7 images from my little dataset and ran command:

keras_retinanet/bin/train.py csv dataset/annotations3.csv dataset/labels.csv --epochs 50 --steps 2
Enter fullscreen mode Exit fullscreen mode

but I got a message:

Failed to load the native TensorFlow runtime.
Enter fullscreen mode Exit fullscreen mode

It means I forgot to load the environment where Tensorflow had been installed. I ran command:

conda activate mlp
Enter fullscreen mode Exit fullscreen mode

repeated the train.py command, and ran into a different error message:

train.py: error: unrecognized arguments: --epochs 50 --steps 2
Enter fullscreen mode Exit fullscreen mode

After reading this issue I realised that I needed to change my command to this:

keras_retinanet/bin/train.py --epochs=50 --steps=2 csv dataset/annotations3.csv dataset/labels.csv
Enter fullscreen mode Exit fullscreen mode

After that Retinanet started working and finished its calculations without errors:

Image description

Next time I'll try to plan my real dataset size, I need to find a compromise between what I can label and what amount needs to be labelled. I'll research the methods for making the most of small sized datasets.

Image of Timescale

🚀 pgai Vectorizer: SQLAlchemy and LiteLLM Make Vector Search Simple

We built pgai Vectorizer to simplify embedding management for AI applications—without needing a separate database or complex infrastructure. Since launch, developers have created over 3,000 vectorizers on Timescale Cloud, with many more self-hosted.

Read full post →

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay