DEV Community

Natalia D

My machine-learning pet project. Part 3. Brushing up and labelling my dataset.

My pet project is about food recognition. More info here.

Scrapy made me a folder with images and a .csv with rows like:

Apple Cake,"Some apple cake description...",,"[{'url': '', 'path': 'full/ae00a78059ad08506aa4767ed925bef5dccabf63.jpg', 'checksum': '55088c744a564af5ed8d4e5ea6478d20', 'status': 'downloaded'}]"
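The last column of that row is a Python-style list of dicts (single quotes, so it is not valid JSON). Here is a minimal sketch of how the image path could be pulled out of it. It assumes the image info always sits in the last column and that the file has no header row; the helper name `image_paths` is made up for illustration.

```python
import ast
import csv
import io

# One row copied from the Scrapy output shown above.
SAMPLE_ROW = '''Apple Cake,"Some apple cake description...",,"[{'url': '', 'path': 'full/ae00a78059ad08506aa4767ed925bef5dccabf63.jpg', 'checksum': '55088c744a564af5ed8d4e5ea6478d20', 'status': 'downloaded'}]"'''


def image_paths(row):
    """Return the paths of successfully downloaded images for one CSV row.

    Assumption: the image info is the last column of the row.
    """
    # The column uses single quotes, so json.loads would fail on it;
    # ast.literal_eval safely parses Python literals instead.
    images = ast.literal_eval(row[-1])
    return [img["path"] for img in images if img.get("status") == "downloaded"]


row = next(csv.reader(io.StringIO(SAMPLE_ROW)))
paths = image_paths(row)
print(row[0], paths)
```

For a real file, `io.StringIO(SAMPLE_ROW)` would be replaced with an opened file handle.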

Now, I needed to create .csv files like this:


To do that, I googled something like "best machine learning label tools 2022" and found Label-studio. I followed these steps from their docs:

python3 -m venv env
source env/bin/activate
python -m pip install label-studio

But I couldn't launch label-studio until I did some of these things:

pip install wheel
pip install spacy
pip install cymem
brew install postgresql

(link1, link2 might be helpful)

I didn't have any problems importing my data into Label-studio. The only setting I had to change was the labeling interface code:

<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image" showInLine="true">
    <Choice value="Salad" background="blue"/>
    <Choice value="Soup" background="green" />
    <Choice value="Pastry" background="orange" />
    <Choice value="Nothing" background="orange" />
  </Choices>
  <RectangleLabels name="label" toName="image">
    <Label value="Salad" background="green"/>
    <Label value="Soup" background="blue"/>
    <Label value="Pastry" background="orange"/>
    <Label value="Nothing" background="black"/>
  </RectangleLabels>
</View>

After that, the labeling interface looked like this:

Image description

I used the label "Nothing" for confusing images that I decided to exclude from the dataset:

Image description

After I finished labeling a portion of images, I chose to export them in .csv format. I got a file with rows like this:

/data/upload/1/83d8ce57-7478f053119ca2a85c4932870ef3e1833eb3eeb5.jpg,14,Pastry,"[{""x"": 13.157894736842104, ""y"": 6.5625, ""width"": 41.578947368421055, ""height"": 74.375, ""rotation"": 0, ""rectanglelabels"": [""Pastry""], ""original_width"": 570, ""original_height"": 320}]",1,20,2022-06-06T07:11:56.377261Z,2022-06-10T21:02:35.758255Z,1209.599
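A row like this can be read back with the standard library. The sketch below is only an illustration: the column positions (image path first, choice third, regions fourth) are guessed from the single row above, and `parse_export_row` is a made-up helper name, not something Label-studio provides.

```python
import csv
import io
import json

# One row copied from the export above.
EXPORTED_ROW = (
    '/data/upload/1/83d8ce57-7478f053119ca2a85c4932870ef3e1833eb3eeb5.jpg,14,Pastry,'
    '"[{""x"": 13.157894736842104, ""y"": 6.5625, ""width"": 41.578947368421055, '
    '""height"": 74.375, ""rotation"": 0, ""rectanglelabels"": [""Pastry""], '
    '""original_width"": 570, ""original_height"": 320}]",'
    '1,20,2022-06-06T07:11:56.377261Z,2022-06-10T21:02:35.758255Z,1209.599'
)


def parse_export_row(row):
    """Split one exported row into image path, choice and a list of regions.

    Assumption: columns 0, 2 and 3 hold the path, the choice and the regions.
    """
    image_path, choice, regions_json = row[0], row[2], row[3]
    # csv.reader has already turned the doubled quotes back into plain ones,
    # so the regions column is ordinary JSON at this point.
    regions = json.loads(regions_json) if regions_json else []
    return image_path, choice, regions


row = next(csv.reader(io.StringIO(EXPORTED_ROW)))
image_path, choice, regions = parse_export_row(row)
print(choice, regions[0]["rectanglelabels"], regions[0]["x"])
```

Rows where the choice is "Nothing" could be skipped at this point to keep them out of the dataset.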

I was surprised when I saw the x, y, width and height values, because they didn't look like pixel coordinates. Then I read in the docs that "Image annotations exported in JSON format use percentages of overall image size, not pixels, to describe the size and location of the bounding boxes."

I wrote a small Python script to check the exported regions:

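from PIL import Image

# Open the original image that the exported annotation refers to
img = Image.open('../../images/full/7478f053119ca2a85c4932870ef3e1833eb3eeb5.jpg')

# Values copied from the exported row (percentages of the image size)
x = 13.157894736842104
y = 6.5625
width = 41.578947368421055
height = 74.375
original_width = 570
original_height = 320

# Convert percentages to pixels
pixel_x = x / 100.0 * original_width
pixel_y = y / 100.0 * original_height
pixel_width = width / 100.0 * original_width
pixel_height = height / 100.0 * original_height

# PIL expects the crop box as (left, upper, right, lower)
left = pixel_x
upper = pixel_y
right = pixel_x + pixel_width
lower = pixel_y + pixel_height

box = (left, upper, right, lower)
region = img.crop(box)

# Display the cropped region to check it by eye
region.show()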

When I launched the script, it showed me the correct region cropped out of the original image:

Image description

So now I understand how to get pixel coordinates from the annotations if I need them.
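The same arithmetic can be wrapped in a small function, so it works on any exported region and not just on hard-coded numbers. This is a sketch: the function name `percent_region_to_pixel_box` is made up, it rounds to whole pixels (my choice, not something the docs require), and it ignores the `rotation` field, which is 0 in my export.

```python
def percent_region_to_pixel_box(region):
    """Convert a percent-based region into a (left, upper, right, lower) pixel box.

    `region` is one dict from the exported annotation, with the keys
    x, y, width, height, original_width and original_height.
    Assumption: rotation is 0, so the box is axis-aligned.
    """
    # One percent of the image width / height, in pixels.
    scale_x = region["original_width"] / 100.0
    scale_y = region["original_height"] / 100.0

    left = region["x"] * scale_x
    upper = region["y"] * scale_y
    right = left + region["width"] * scale_x
    lower = upper + region["height"] * scale_y

    # Round to whole pixels so the box can go straight into tools
    # that expect integer coordinates.
    return tuple(round(value) for value in (left, upper, right, lower))


# The region from the exported row above.
region = {
    "x": 13.157894736842104,
    "y": 6.5625,
    "width": 41.578947368421055,
    "height": 74.375,
    "rotation": 0,
    "rectanglelabels": ["Pastry"],
    "original_width": 570,
    "original_height": 320,
}
box = percent_region_to_pixel_box(region)
print(box)
```

The resulting tuple has the same order as the box that `img.crop` takes in the script above.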

The next post is going to be about trying to feed the dataset to RetinaNet. I am going to use only the 48 images scraped so far, just to see what input format it really needs.