<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Abhishek Annamraju</title>
    <description>The latest articles on DEV Community by Abhishek Annamraju (@abhishek4273).</description>
    <link>https://dev.to/abhishek4273</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F127552%2Fc61c0448-e866-4f07-8976-6c0f26d32a00.jpg</url>
      <title>DEV Community: Abhishek Annamraju</title>
      <link>https://dev.to/abhishek4273</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/abhishek4273"/>
    <language>en</language>
    <item>
      <title>Ensemble of classifiers using Monk: Creating a food classifier</title>
      <dc:creator>Abhishek Annamraju</dc:creator>
      <pubDate>Tue, 03 Dec 2019 11:23:39 +0000</pubDate>
      <link>https://dev.to/abhishek4273/ensemble-of-classifiers-using-monk-creating-a-food-classifier-lco</link>
      <guid>https://dev.to/abhishek4273/ensemble-of-classifiers-using-monk-creating-a-food-classifier-lco</guid>
      <description>&lt;p&gt;TLDR;&lt;br&gt;
&lt;a href="https://drive.google.com/file/d/1ELNkFaMqEU_xOYG1V80vqc0WQJeSaGQO/view?usp=sharing&amp;amp;source=post_page-----ccc215b283ff----------------------"&gt;Colab Notebook&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  What will you build!
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GIij0zgG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://miro.medium.com/max/300/1%2AykiHyGErizGRhBGxileQKA.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GIij0zgG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://miro.medium.com/max/300/1%2AykiHyGErizGRhBGxileQKA.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this post we will build three levels of food classification models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Classify input image as Food vs Non Food&lt;/li&gt;
&lt;li&gt;Classify an input image of Food into 11 super categories&lt;/li&gt;
&lt;li&gt;Classify an input image of Food as one of 101 dishes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And finally combine the 3 projects into one Food Classification application.&lt;/p&gt;

&lt;p&gt;Let’s begin!&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;

&lt;p&gt;We start by setting up Monk and its dependencies on Colab. For setup instructions on other platforms, check out the &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/setup/setup"&gt;DOCS&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/Tessellate-Imaging/monk_v1
$ cd monk_v1/installation &amp;amp;&amp;amp; pip install -r requirements_cu10.txt
$ cd ../..
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Setup the dependencies based on the platform you are working with. Monk is compatible with Ubuntu, MacOS, Windows and online Jupyter environments like Kaggle and Colab.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h2&gt;
  
  
  Project 1 : Food vs Non Food Classification
&lt;/h2&gt;

&lt;p&gt;Let’s create our first project, which classifies an input image as food or non-food. Its sole goal is to determine whether a food item is present in the input image, so that the image can then be passed on for super- and sub-category classification.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Start Experiment 1
&lt;/h3&gt;

&lt;p&gt;Import the Monk library&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import sys
sys.path.append("./monk_v1/monk/");
import psutil
from pytorch_prototype import prototype
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;and create a new experiment&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("food_nonfood", "exp1");
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;For this project we are using the Food-5K dataset.&lt;/p&gt;

&lt;p&gt;“This dataset contains 2500 food and 2500 non-food images, for the task of food/non-food classification from the paper “Food/Non-food Image Classification and Food Categorisation using Pre-Trained GoogLeNet Model”. The whole dataset is divided in three parts: training, validation and evaluation. The naming convention is as follows:&lt;/p&gt;

&lt;p&gt;{ClassID}_{ImageID}.jpg&lt;/p&gt;

&lt;p&gt;ClassID: 0 or 1; 0 means non-food and 1 means food.”&lt;/p&gt;
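&lt;p&gt;To make the convention concrete, here is a small illustrative helper (the function name is hypothetical, not part of the dataset or Monk):&lt;/p&gt;

```python
def parse_food5k_name(filename):
    # Food-5K names look like '{ClassID}_{ImageID}.jpg',
    # where ClassID '1' means food and '0' means non-food.
    class_id = filename.split('_')[0]
    return 'food' if class_id == '1' else 'non_food'

print(parse_food5k_name('1_123.jpg'))  # food
```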

&lt;p&gt;For our project we will combine the training, validation and evaluation sets into one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from glob import glob
import os
import shutil
from tqdm import tqdm

folders = glob("./food-5k/*")

food_dir = './food-5k/food'
non_food_dir = './food-5k/non-food'
if not os.path.exists(food_dir):
    os.makedirs(food_dir)
if not os.path.exists(non_food_dir):
    os.makedirs(non_food_dir)

c = 1
n_c = 1
for i in folders:
    imageList = glob(i + '/*.jpg')
    print(len(imageList))
    for j in tqdm(imageList):
        imgName = j.split('/')[-1]
        label = imgName.split('_')[0]
        if label == '0':
            outPath = non_food_dir + '/' + str(n_c) + '.jpg'
            n_c += 1
        elif label == '1':
            outPath = food_dir + '/' + str(c) + '.jpg'
            c += 1
        shutil.move(j,outPath)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now we can load our dataset, select our CNN architecture to train and set the number of epochs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Default(dataset_path="./food-5k/", model_name="resnet18", freeze_base_network=True, num_epochs=5)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And begin training :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After training finishes we can observe the loss and accuracy plots stored inside &lt;strong&gt;workspace/"project_name"/"exp_name"/output/logs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1yNGUk_K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2Akwa1nfPzgCGvat85YhLk5g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1yNGUk_K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2Akwa1nfPzgCGvat85YhLk5g.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;accuracy plot for food_nonfood experiment 1&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let’s create a new experiment with a different CNN architecture and see if we can achieve better performance.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Start Experiment 2
&lt;/h3&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("food_nonfood", "exp2");
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can check the available models using :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.List_Models()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Our experiment 1 was created using &lt;em&gt;‘resnet18’&lt;/em&gt;. For experiment 2 we will select &lt;em&gt;‘resnet101’&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Residual networks address several problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ResNets are easy to optimize, whereas “plain” networks (that simply stack layers) show higher training error as depth increases.&lt;/li&gt;
&lt;li&gt;ResNets readily gain accuracy from greatly increased depth, producing better results than previous networks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To know more about ResNet architectures check out this &lt;a href="https://neurohive.io/en/popular-networks/resnet/"&gt;post&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Default(dataset_path=”./food-5k/”, 
            model_name=”resnet101", 
            freeze_base_network=True, num_epochs=5);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;For experiment 2 we will unfreeze a few more layers from our pre-trained model to make them available for training. We also increase the number of epochs and reload the experiment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.update_freeze_layers(100);
ptf.update_num_epochs(30);
ptf.Reload()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And finally train the model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Once again we can observe the loss and accuracy plots for this experiment:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eeeC8RTl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2A5taQKqeZKCNwKPNlD-aoYw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eeeC8RTl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2A5taQKqeZKCNwKPNlD-aoYw.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;accuracy plot for food_nonfood classifier experiment 2&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparison
&lt;/h3&gt;

&lt;p&gt;We can compare the two experiments for training and validation accuracies and losses. (Check out &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/compare_experiment"&gt;DOCS&lt;/a&gt;)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from compare_prototype import compare
ctf = compare(verbose=1);
ctf.Comparison("food_nonfood");
ctf.Add_Experiment("food_nonfood", "exp1");
ctf.Add_Experiment("food_nonfood", "exp2");
ctf.Generate_Statistics();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HZgci65Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AkvuIbfkg-xL1nn8EiyFogg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HZgci65Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AkvuIbfkg-xL1nn8EiyFogg.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s test our model on food and non_food images.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Inference
&lt;/h3&gt;

&lt;p&gt;First we load the experiment in evaluation mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Prototype(“food_nonfood”, “exp2”, eval_infer=True);
img_name = “./test.jpg”;
predictions = ptf.Infer(img_name=img_name, return_raw=False);
print(predictions);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hzh1vkO3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/5760/1%2Akd7pJ9qzH48zZ3F9Vr6WMg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hzh1vkO3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/5760/1%2Akd7pJ9qzH48zZ3F9Vr6WMg.jpeg" alt=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Prediction
    Image name:         ./test.jpg
    Predicted class:      food
    Predicted score:      4.16873025894165

{'img_name': './test.jpg', 'predicted_class': 'food', 'score': 4.1687303}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
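&lt;p&gt;Note that the score printed above appears to be a raw network output rather than a probability. If you retrieve raw per-class scores (for example via return_raw=True), they can be normalised with a standard softmax; this is a generic sketch, independent of Monk:&lt;/p&gt;

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the two classes (food, non-food).
probs = softmax([4.17, 1.0])
```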



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h2&gt;
  
  
  Project 2 : Food-11 Classification
&lt;/h2&gt;

&lt;p&gt;Now that we can classify an input image as food or non-food, we can move on to classifying food images into different categories of dishes.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;For this project we will utilise the &lt;a href="https://mmspg.epfl.ch/downloads/food-image-datasets/"&gt;Food-11&lt;/a&gt; dataset.&lt;/p&gt;

&lt;p&gt;“This dataset contains 16643 food images grouped in 11 major food categories. The 11 categories are Bread, Dairy product, Dessert, Egg, Fried food, Meat, Noodles/Pasta, Rice, Seafood, Soup, and Vegetable/Fruit. Similar as Food-5K dataset, the whole dataset is divided in three parts: training, validation and evaluation. The same naming convention is used, where ID 0–10 refers to the 11 food categories respectively.”&lt;/p&gt;

&lt;p&gt;We will combine the training, evaluation and validation folders into one set of images and split them into respective class folders.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;classes = {'0':'Bread','1':'Dairy_Product','2':'Dessert','3':'Egg','4':'Fried_Food','5':'Meat','6':'Noodles_Pasta','7':'Rice','8':'Seafood','9':'Soup','10':'Vegetable_Fruit'}from glob import globfolders = glob("./food-11/*")
print(folders)import osfor k,item in classes.items():
    directory = './food-11/' + item
    if not os.path.exists(directory):
        os.makedirs(directory)import shutil
from tqdm import tqdmc = 1
for i in folders:
    imageList = glob(i + '/*.jpg')
    #print(len(imageList))
    for j in tqdm(imageList):
        imgName = j.split('/')[-1]
        label = imgName.split('_')[0]
        outPath = './food-11/' + classes[label] + '/' + str(c) + '.jpg'
        c += 1
        shutil.move(j,outPath)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Start Experiment 1
&lt;/h3&gt;

&lt;p&gt;We can now create a new project, start with our experiment and load our dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("food-11", "exp1");
ptf.Default(dataset_path="./food-11/", 
            model_name="resnet101", 
            freeze_base_network=True, num_epochs=10)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;For this experiment we are using Resnet101 as our pre-trained model. Finally we can begin training :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---DyJUJCX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2A37s-DSnnEHGo0wbTs3CBTw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---DyJUJCX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2A37s-DSnnEHGo0wbTs3CBTw.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Loss curves&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We can observe that after 10 epochs the model achieves an accuracy of ~75%. The validation loss was still decreasing, so we can likely achieve better accuracy by training for more epochs. Let’s try that out using &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/copy_experiment"&gt;Copy_Experiment&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Start Experiment 2
&lt;/h3&gt;

&lt;p&gt;We create the new experiment using the previous experiment 1 as a template and make a few updates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1)
ptf.Prototype("food-11", "exp2", 
              copy_from=["food-11", "exp1"]);

ptf.update_freeze_layers(100);
ptf.update_num_epochs(30);

ptf.Reload();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note : Don’t forget to reload the experiment with ‘ptf.Reload()’ after making &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/update_mode/update_dataset"&gt;updates to your experiment&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And now we can begin our training. Once training is complete we will compare both experiments and choose one for inference.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Compare
&lt;/h3&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctf = compare(verbose=1);
ctf.Comparison("food-11");

ctf.Add_Experiment("food-11", "exp1");
ctf.Add_Experiment("food-11", "exp2");

ctf.Generate_Statistics();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JByGmuqU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AVoabo64c5AL6j6uzTLB_kA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JByGmuqU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AVoabo64c5AL6j6uzTLB_kA.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since we used Copy_Experiment, the epoch-1 accuracy for experiment 2 starts from a better value than experiment 1. We can also observe that accuracy saturates around 10 epochs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--S1PH8KQr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AK5Qw8F_c2oerulQA4B4qUw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--S1PH8KQr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AK5Qw8F_c2oerulQA4B4qUw.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can also observe that the validation accuracy keeps fluctuating. Still, with its considerably better training and validation performance, the experiment 2 model is the clear choice.&lt;/p&gt;
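&lt;p&gt;When a validation curve fluctuates like this, smoothing it before comparing experiments can help; a simple moving average (a generic utility, not part of Monk's comparison tooling) is enough:&lt;/p&gt;

```python
def moving_average(values, window=3):
    # Average each point with up to (window - 1) preceding points.
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical per-epoch validation accuracies.
print(moving_average([70, 74, 71, 78, 76], window=3))
```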

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Inference
&lt;/h3&gt;

&lt;p&gt;Our food-11 project model can classify an input food image into one of 11 top level food categories.&lt;/p&gt;

&lt;p&gt;Let’s test it on our waffles image!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hzh1vkO3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/5760/1%2Akd7pJ9qzH48zZ3F9Vr6WMg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hzh1vkO3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/5760/1%2Akd7pJ9qzH48zZ3F9Vr6WMg.jpeg" alt=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{'img_name': './test.jpg', 'predicted_class': 'Dessert', 'score': 27.722319}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The prediction comes out to ‘Dessert’!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h2&gt;
  
  
  Project 3 : Food-101 Classification
&lt;/h2&gt;

&lt;p&gt;Finally we move on to identifying exactly which dish we are looking at. In this project we will classify the input image into one of 101 classes of food items.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;For this experiment we will be utilising the dataset gathered by these awesome researchers: &lt;a href="https://www.vision.ee.ethz.ch/datasets_extra/food-101/"&gt;LINK&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Citations at the bottom of the post.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;“This dataset contains 101 food categories, with 101'000 images. For each class, 250 manually reviewed test images are provided as well as 750 training images. On purpose, the training images were not cleaned, and thus still contain some amount of noise. This comes mostly in the form of intense colors and sometimes wrong labels. All images were rescaled to have a maximum side length of 512 pixels.”&lt;/p&gt;
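&lt;p&gt;The quoted numbers are consistent: 101 classes with 750 training and 250 test images each gives the full 101,000 images:&lt;/p&gt;

```python
num_classes = 101
train_per_class, test_per_class = 750, 250
total_images = num_classes * (train_per_class + test_per_class)
print(total_images)  # 101000
```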

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Start Experiment 1
&lt;/h3&gt;

&lt;p&gt;For this experiment we will utilise a densenet169 pretrained model.&lt;/p&gt;

&lt;p&gt;We can create the experiment, load the dataset, select the pretrained architecture, set the number of epochs and begin training all in just 4 lines of code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1)
ptf.Prototype("food-101", "exp1")

ptf.Default(dataset_path="./food-101/images/", 
            model_name="densenet169", freeze_base_network=False,
            num_epochs=10)

ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After 10 epochs the training accuracy comes out to ~87%, which is not bad to start with.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QkCzQIuD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2AkrPnXfmWmq5c6cPSf40WWw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QkCzQIuD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2AkrPnXfmWmq5c6cPSf40WWw.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s finally test our waffles input image and see the results.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Inference
&lt;/h3&gt;

&lt;p&gt;Let’s load up the experiment in evaluation mode and test on a single image. Check out our Docs for running tests on a batch of images.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hzh1vkO3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/5760/1%2Akd7pJ9qzH48zZ3F9Vr6WMg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hzh1vkO3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/5760/1%2Akd7pJ9qzH48zZ3F9Vr6WMg.jpeg" alt=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{'img_name': './test.jpg', 'predicted_class': 'waffles', 'score': 74.87055}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;em&gt;And guess what! The model predicts that the input image is of “Waffles”.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h2&gt;
  
  
  Food Classification Application
&lt;/h2&gt;

&lt;p&gt;Finally we will combine the 3 projects to create a Food Classification application, which detects whether an input image contains a food item and, if so, predicts the super (out of 11) and sub (out of 101) category of the food.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
import os
import sys
sys.path.append("./monk_v1/monk/");
import psutil
from pytorch_prototype import prototype

def saveImg(img_url):
    # URL of the image to be downloaded is passed as img_url
    r = requests.get(img_url) # create HTTP response object
    with open('test.jpg','wb') as f:
        f.write(r.content)

def classify(img_url):
    saveImg(img_url)
    img_name = './test.jpg'

    ptf1 = prototype(verbose=0)
    ptf1.Prototype("food_nonfood", "exp2", eval_infer=True);
    predictions = ptf1.Infer(img_name=img_name, return_raw=False);

    if predictions['predicted_class'] == "non_food":
        return "Input image does not contain food"
    else:
        ptf2 = prototype(verbose=0)
        ptf2.Prototype("food-11", "exp2", eval_infer=True);
        predictions = ptf2.Infer(img_name=img_name, return_raw=False);
        superLabel = predictions['predicted_class']

        ptf3 = prototype(verbose=0)
        ptf3.Prototype("food-101", "exp1", eval_infer=True);
        predictions = ptf3.Infer(img_name=img_name, return_raw=False);
        subLabel = predictions['predicted_class']

        return "Input image is of category {}, and is actually {}.".format(superLabel, subLabel)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Finally to test the classification app run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;test_url = "https://hips.hearstapps.com/hmg-prod.s3.amazonaws.com/images/delish-keto-waffle-horizontal-034-1543784709.jpg"
output = classify(test_url)
print(output)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To test with a new image, update the test_url.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XSs6YJkE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/750/1%2A6piu-HjpGlponFwrjVHLtA.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XSs6YJkE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/750/1%2A6piu-HjpGlponFwrjVHLtA.jpeg" alt=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input image is of category Bread, and is actually pizza.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;That’s all for this post folks. Hope you enjoyed going through this.&lt;/p&gt;

&lt;p&gt;The trained models are available to download as a workspace directory &lt;a href="https://docs.google.com/uc?export=download&amp;amp;id=1Bn3rAY5MmRiXCeY-qJnAxs_b7h-qAdfY"&gt;HERE&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The workspace folder is cross-platform compatible. All you need to do is set up Monk and paste the workspace into your working directory to run inference.&lt;/p&gt;

&lt;p&gt;Do give us a star on &lt;a href="https://github.com/Tessellate-Imaging/monk_v1/"&gt;Github&lt;/a&gt; if you like what you see (All the haters of food get out!)&lt;/p&gt;

&lt;p&gt;Happy Coding!&lt;/p&gt;

&lt;p&gt;References&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://neurohive.io/en/popular-networks/resnet/"&gt;https://neurohive.io/en/popular-networks/resnet/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.vision.ee.ethz.ch/datasets_extra/food-101/"&gt;https://www.vision.ee.ethz.ch/datasets_extra/food-101/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://mmspg.epfl.ch/downloads/food-image-datasets/"&gt;https://mmspg.epfl.ch/downloads/food-image-datasets/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@inproceedings{bossard14,
  title = {Food-101 -- Mining Discriminative Components with Random Forests},
  author = {Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
  booktitle = {European Conference on Computer Vision},
  year = {2014}
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



</description>
      <category>deeplearning</category>
      <category>ensemble</category>
      <category>classification</category>
    </item>
    <item>
      <title>Quick Prototyping with Monk: Creating a sign language classifier</title>
      <dc:creator>Abhishek Annamraju</dc:creator>
      <pubDate>Tue, 03 Dec 2019 11:23:05 +0000</pubDate>
      <link>https://dev.to/abhishek4273/quick-prototyping-with-monk-creating-a-sign-language-classifier-3bg7</link>
      <guid>https://dev.to/abhishek4273/quick-prototyping-with-monk-creating-a-sign-language-classifier-3bg7</guid>
      <description>&lt;p&gt;TLDR;&lt;br&gt;
&lt;a href="https://colab.research.google.com/drive/1FWiathcbPx_DbBiWhH9WSC92WAylrhAG?source=post_page-----6850bc6fc4ea----------------------"&gt;Colab Notebook&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;Communication is the essence of human life. From ancient hieroglyphics to the 6,500 languages spoken across the world today, all of these signify the importance of our ability to reach out to our fellow human beings.&lt;/p&gt;

&lt;p&gt;Even though the global literacy rate for all people aged 15 and above is 86.3%, there are only about 250 certified sign language interpreters in India, translating for a deaf population of between 1.8 million and 7 million. The wide disparity in population estimates exists because the Indian census doesn’t track the number of deaf people — instead, it documents an aggregate number of people with disabilities.&lt;br&gt;
(Source : &lt;a href="https://www.pri.org/stories/2017-01-04/deaf-community-millions-hearing-india-only-just-beginning-sign"&gt;https://www.pri.org/stories/2017-01-04/deaf-community-millions-hearing-india-only-just-beginning-sign&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  What will you build!
&lt;/h3&gt;

&lt;p&gt;In this blog post we build an American Sign Language classifier and try to automate the process of sign language translation.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;

&lt;p&gt;We start by setting up Monk and its dependencies on Colab. For setup instructions on other platforms, check out the &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/setup/setup"&gt;DOCS&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/Tessellate-Imaging/monk_v1
$ cd monk_v1/installation &amp;amp;&amp;amp; pip install -r requirements_cu10.txt
$ cd ../..
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;We will utilise the ASL image dataset from Kaggle — &lt;a href="https://www.kaggle.com/grassknoted/asl-alphabet"&gt;LINK&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Training
&lt;/h3&gt;

&lt;p&gt;Next we will use Pytorch as our backend to create a new Project and use Resnet50 as our pre-trained model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("asl", "exp1");
ptf.Default(dataset_path="./dataset/train", 
            model_name="resnet50", 
            freeze_base_network=True, num_epochs=10);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We train our model for 10 epochs and check the training and validation accuracy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yolzg23N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2A2Cmj4404cRadG_ut3Wj6lg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yolzg23N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2A2Cmj4404cRadG_ut3Wj6lg.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Accuracy curves&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The plot shows that even though our training accuracy is lower, the model performs quite well on the validation set.&lt;/p&gt;

&lt;p&gt;The dataset is not diverse enough to create a generalised model.&lt;/p&gt;

&lt;p&gt;However further improvement can be achieved by using image augmentation strategies.&lt;/p&gt;

&lt;p&gt;Let’s put the model to test with a realtime video classification.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;: A process automation pipeline for image or video processing requires much more than just a classification model. In our case we would require a hand detector to localise the position of hands within the frame, a hand tracker to reduce jitter from the hand detector, and finally the sign language classifier. Even after setting up all the modules, the heuristics and dataset requirements would differ based on the application and deployment scenario. For example, the current classifier is built with a webcam dataset and will surely not work with AR headsets, which have a different point of view.&lt;/p&gt;

&lt;p&gt;For this exercise we will set a region of interest within the frame where our hand must be placed for the model to determine the gestures.&lt;/p&gt;
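Extracting such a region of interest is plain array slicing. A minimal numpy sketch, using a dummy frame and the same coordinates as the OpenCV snippet further down (note OpenCV rectangles take (x, y) corners while numpy slices in row, column order):

```python
import numpy as np

# Dummy 480x640 BGR frame standing in for a webcam capture.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# Rectangle corners in (x, y); numpy slicing is [y1:y2, x1:x2].
x1, y1, x2, y2 = 350, 50, 600, 300
roi = frame[y1:y2, x1:x2]

print(roi.shape)  # (250, 250, 3)
```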

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Realtime Gesture Classification
&lt;/h3&gt;

&lt;p&gt;We begin by loading our experiment in evaluation mode.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import sys
sys.path.append(“./monk_v1/monk/”);
import psutil
from pytorch_prototype import prototype
from imutils.video import VideoStream
import cv2ptf = prototype(verbose=1)
ptf.Prototype(“asl”, “exp1”, eval_infer=True);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We capture a video stream from our webcam, set a region of interest within the frame and store it in a file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if __name__ == ‘__main__’:
  vs = VideoStream().start()
  im_height, im_width = (None, None)
  while True:
    # Read Frame and process
    frame = vs.read()
    frame = cv2.resize(frame, (640, 480))
    frame = cv2.flip( frame, 1 )    #Set ROI
    cv2.rectangle(frame, (350,50), (600,300), (255,0,0) , 3, 1)    roi = frame[50:300,350:600]    roi = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB)
    cv2.imwrite(“roi.jpg”,roi)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Next we load the ROI image and infer using our classification model generated with Monk and display the predicted class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    predictions = ptf.Infer(img_name=”roi.jpg”, return_raw=False);

    cv2.putText(frame, predictions[‘predicted_class’],(350,40),cv2.FONT_HERSHEY_SIMPLEX, 1, (0,255,0), 2)

    cv2.imshow(‘ASL’, cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))

    if cv2.waitKey(25) &amp;amp; 0xFF == ord(‘q’):
      cv2.destroyAllWindows()
      vs.stop()
      break
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The model performs fairly well in proper lighting conditions. In the following video I try to write the word “MONK” using hand gestures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vcvZ0O-f--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://tessellate-pytorch-serverless.s3.amazonaws.com/asl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vcvZ0O-f--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://tessellate-pytorch-serverless.s3.amazonaws.com/asl.gif" alt="Sign language Classification"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;

&lt;h3&gt;
  
  
  Coming Soon!
&lt;/h3&gt;

&lt;p&gt;We are constantly striving to improve the features available in Monk. Following are some developments that we will be releasing soon:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding custom layers and creating custom Deep Neural architectures&lt;/li&gt;
&lt;li&gt;More optimisers, regularisers and loss functions&lt;/li&gt;
&lt;li&gt;Multi-label classification for image tagging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are creating a community of collaborators for Monk.&lt;/p&gt;

&lt;p&gt;Sign up here to become a beta tester: &lt;a href="https://monkai-42.firebaseapp.com/"&gt;LINK&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mention in comments which features and applications you would like us to build next.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Do give us a star on &lt;a href="https://github.com/Tessellate-Imaging/monk_v1/"&gt;Github&lt;/a&gt; if you like what you see!&lt;/p&gt;

&lt;p&gt;Happy Coding!&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>prototyping</category>
      <category>classification</category>
    </item>
    <item>
      <title>Find hyper-parameters using Monk - Creating a Plant Disease Classifier</title>
      <dc:creator>Abhishek Annamraju</dc:creator>
      <pubDate>Tue, 03 Dec 2019 11:22:25 +0000</pubDate>
      <link>https://dev.to/abhishek4273/find-hyper-parameters-using-monk-creating-a-plant-disease-classifier-1ep9</link>
      <guid>https://dev.to/abhishek4273/find-hyper-parameters-using-monk-creating-a-plant-disease-classifier-1ep9</guid>
      <description>&lt;p&gt;TLDR;&lt;br&gt;
&lt;a href="https://drive.google.com/file/d/1BrWdjgXREy6VWYlyn9wF4iaqSGf8yvEa/view?usp=sharing&amp;amp;source=post_page-----3a4d7ba419b9----------------------"&gt;Colab Notebook&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In agriculture, leaf diseases cause a major decrease in both quality and quantity of yields. Automating plant disease detection using Computer Vision could play a role in early detection and prevention of diseases.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  What will you build!
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nA5ODKM9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/say2x4wkvea0kctxz01r.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nA5ODKM9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/say2x4wkvea0kctxz01r.gif" alt="Plant disease classification"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this exercise we will explore how to build a plant leaf disease classifier using Monk’s quick &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/hp_finder/model_finder"&gt;Hyper-Parameter finding feature&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Monk provides a syntax invariant transfer learning framework that supports Keras, Pytorch and Mxnet in the backend. (Read — &lt;a href="https://clever-noyce-f9d43f.netlify.com/"&gt;Documentation&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Computer Vision developers have to select suitable learning rates, a fitting CNN architecture and the right optimisers, and fine-tune many more parameters to get the best performing models.&lt;/p&gt;

&lt;p&gt;The Hyper-Parameter finding feature assists in analysing multiple options for a selected hyper-parameter before proceeding with the actual experiment. This not only saves a lot of prototyping time but also helps you quickly explore how well a selected set of parameters performs on the dataset in use and the final application.&lt;/p&gt;
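Conceptually, each finder is a small grid search: briefly evaluate every candidate on a subset of the data and keep the best. A minimal sketch in plain Python, where `evaluate` is a hypothetical stand-in for a short training run:

```python
def find_best(candidates, evaluate):
    """Score every candidate and return the best one plus all scores."""
    scores = {c: evaluate(c) for c in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical stand-in for "train a few epochs on 10% of the data
# and report validation accuracy".
def evaluate(lr):
    return {0.01: 0.91, 0.005: 0.94, 0.001: 0.89}[lr]

best_lr, all_scores = find_best([0.01, 0.005, 0.001], evaluate)
print(best_lr)  # 0.005
```

Monk's `Analyse_*` functions automate exactly this loop, spawning one sub-experiment per candidate.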

&lt;p&gt;Let’s begin!&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;

&lt;p&gt;We start by setting up Monk and its dependencies on colab. For further setup instructions on different platforms check out the &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/setup/setup"&gt;DOCS&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/Tessellate-Imaging/monk_v1
$ cd monk_v1/installation &amp;amp;&amp;amp; pip install -r requirements_cu10.txt
$ cd ../..
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;For this exercise we will use the dataset gathered by the awesome folks at &lt;a href="https://plantvillage.psu.edu/"&gt;PlantVillage&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Experimentation
&lt;/h3&gt;

&lt;p&gt;Before setting up our analysis, we have to start by creating a new project and experiment&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Step 1 - Create experimentptf = prototype(verbose=1);ptf.Prototype("plant_disease", "exp1");
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;and set up the ‘Default’ dataset paths&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Default(dataset_path=["./dataset/train", "./dataset/val"], 
            model_name="resnet18", 
            freeze_base_network=True, num_epochs=5);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now we are ready to run some analysis and find the best Hyper-Parameters.&lt;/p&gt;

&lt;p&gt;Currently we can analyse the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Find the best CNN architecture — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/hp_finder/model_finder"&gt;DOCS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Find the right batch size — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/hp_finder/batch_size_finder"&gt;DOCS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Find a good input shape — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/hp_finder/input_dimension_finder"&gt;DOCS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Select a good starting learning rate — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/hp_finder/LR_finder"&gt;DOCS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Select the best performing Optimiser — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/hp_finder/optimiser_finder"&gt;DOCS&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We will analyse each of the above parameters, select the best values and finally train our model to build the plant leaf disease classification application.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Model Finder
&lt;/h3&gt;

&lt;p&gt;Start by giving a name to the analysis. For every analysis a new project is created with multiple experiments inside.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;analysis_name = “Model_Finder”;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now we pass in the list of models to analyse:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First element in the list — Model Name&lt;/li&gt;
&lt;li&gt;Second element in the list — Boolean value to freeze base network or not&lt;/li&gt;
&lt;li&gt;Third element in the list — Boolean value to use pretrained model as the starting point or not
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;models = [[“resnet34”, True, True], [“resnet50”, False, True],[“densenet121”, False, True], [“densenet169”, True, True], [“densenet201”, True, True]];
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Set the number of epochs for each experiment to run&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;epochs=5;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Select the percentage of the original dataset to use for experimentation&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;percent_data=10;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
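The idea behind `percent_data` can be sketched as drawing a reproducible random subset of the training files (the file names here are hypothetical):

```python
import random

def take_percent(items, percent, seed=0):
    """Return a reproducible random subset holding `percent` of the items."""
    k = max(1, len(items) * percent // 100)
    return random.Random(seed).sample(items, k)

files = [f"img_{i}.jpg" for i in range(200)]
subset = take_percent(files, 10)
print(len(subset))  # 20
```

Training each trial on a small slice like this is what keeps the whole analysis fast.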



&lt;p&gt;Finally we run the analysis function to search for the best performing models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“keep_all” — Keeps all the experiments created&lt;/li&gt;
&lt;li&gt;“keep_none” — Deletes all experiments created
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Analyse_Models(analysis_name, models, 
                   percent_data, num_epochs=epochs, 
                   state="keep_none");
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;When the analysis is running, the estimated time is displayed for every experiment&lt;/p&gt;

&lt;p&gt;Running Model analysis&lt;br&gt;
Analysis Name : Model_Finder Running experiment : 1/5 &lt;br&gt;
Experiment name : Model_resnet34_freeze_base_pretrained &lt;br&gt;
Estimated time : 2 min&lt;/p&gt;

&lt;p&gt;Finally after the experiment is completed we receive the following output on training and validation accuracies and losses:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uite0sLm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/851/1%2AVPXPpgLOyhSvlFVgZq8WwA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uite0sLm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/851/1%2AVPXPpgLOyhSvlFVgZq8WwA.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Experiment Output&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Select the best performing CNN architecture, update your experiment and continue with further analysis. Don’t forget to reload the experiment after updating.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Update Model Architecture
ptf.update_model_name("densenet121");
ptf.update_freeze_base_network(False);
ptf.update_use_pretrained(True);
ptf.Reload();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;For further instructions on updating experiment parameters, check out the &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/update_mode/update_dataset"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Batch Size Finder
&lt;/h3&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Analysis Project Name
analysis_name = “Batch_Size_Finder”;# Batch sizes to explore
batch_sizes = [4, 8, 16, 32];# Num epochs for each experiment to run
epochs = 10;# Percentage of original dataset to take in for experimentation
percent_data = 10;
ptf.Analyse_Batch_Sizes(analysis_name, batch_sizes, 
                        percent_data, 
                        num_epochs=epochs, state=”keep_none”);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Generated output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5rb7MWBd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/644/1%2AM3_aakYsou85v9FGjr_xfQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5rb7MWBd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/644/1%2AM3_aakYsou85v9FGjr_xfQ.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Experiment Output&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Update the experiment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Update Batch Size
ptf.update_batch_size(8);
ptf.Reload();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Input Shape Finder
&lt;/h3&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Analysis Project Name
analysis_name = “Input_Size_Finder”;# Input sizes to explore
input_sizes = [224, 256, 512];# Num epochs for each experiment to run
epochs=5;# Percentage of original dataset to take in for experimentation
percent_data=10;
ptf.Analyse_Input_Sizes(analysis_name, 
                        input_sizes, percent_data, 
                        num_epochs=epochs, state=”keep_none”);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Generated output:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Rj0UdTKZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/644/1%2AFhEz5PoB9WxaBZ2SUzO33w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Rj0UdTKZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/644/1%2AFhEz5PoB9WxaBZ2SUzO33w.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Experiment Output&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Update the experiment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Update Input Sizeptf.update_input_size(224);
ptf.Reload();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Learning Rate Analysis
&lt;/h3&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Analysis Project Name
analysis_name = “Learning_Rate_Finder”# Learning rates to explore
lrs = [0.01, 0.005, 0.001, 0.0001];# Num epochs for each experiment to run
epochs=5# Percentage of original dataset to take in for experimentation
percent_data=10
ptf.Analyse_Learning_Rates(analysis_name, lrs, percent_data, 
                           num_epochs=epochs, state=”keep_none”);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Generated output:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--d_rJHbD5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/659/1%2AHt6x1pf1giaFnXaO7ipjhA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--d_rJHbD5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/659/1%2AHt6x1pf1giaFnXaO7ipjhA.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Experiment Output&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Update the experiment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Update Learning Rateptf.update_learning_rate(0.01);
ptf.Reload();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Optimiser Analysis
&lt;/h3&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Analysis Project Name
analysis_name = “Optimiser_Finder”;# Optimizers to explore
optimizers = [“sgd”, “adam”, “adamax”, “rmsprop”]; #Model name
epochs = 5;# Percentage of original dataset to take in for experimentation
percent_data = 10;
ptf.Analyse_Optimizers(analysis_name, optimizers, percent_data, 
                       num_epochs=epochs, state=”keep_none”);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Generated output:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6FGq7Abs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/659/1%2ARIZhPczLnDrvMcnmgJepbQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6FGq7Abs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/659/1%2ARIZhPczLnDrvMcnmgJepbQ.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Experiment Output&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Update the experiment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Update Optimiserptf.optimizer_adamax(0.001);
ptf.Reload();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Training
&lt;/h3&gt;

&lt;p&gt;Finally after setting the correct hyper-parameters, we can begin training the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Train();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;

&lt;h3&gt;
  
  
  Copy Experiment
&lt;/h3&gt;

&lt;p&gt;We can visualise the accuracy and loss plots, located inside the workspace directory. From the plots we observe that the losses can go further down:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--o8_S0qhI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2AE8ofVh7_ZxaIKIMNNNDLDQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--o8_S0qhI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/432/1%2AE8ofVh7_ZxaIKIMNNNDLDQ.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Loss curves&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To continue training, we copy our previous experiment and resume from that state — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/copy_experiment"&gt;DOCS&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;

&lt;h3&gt;
  
  
  Compare Experiments
&lt;/h3&gt;

&lt;p&gt;After finishing our training we can compare both experiments to check whether we actually improved performance, using the Compare Experiments feature in Monk.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VXJLH-c6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AneuU89tS7bCUCglWdGs97A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VXJLH-c6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AneuU89tS7bCUCglWdGs97A.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Validation accuracy curves&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Our ‘experiment 1’ ran for 5 epochs and ‘experiment 2’ for 10. Although the gain in validation accuracy is minor, an increase from 96% to 97% can be enough to climb the leaderboard in competitions hosted on Kaggle and EvalAI.&lt;/p&gt;

&lt;p&gt;Hope you have fun building niche solutions with our tools.&lt;/p&gt;

&lt;p&gt;Happy Coding!&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>hyperparameter</category>
      <category>classification</category>
    </item>
    <item>
      <title>Project management with Monk: Creating an indoor-scene classifier</title>
      <dc:creator>Abhishek Annamraju</dc:creator>
      <pubDate>Tue, 03 Dec 2019 11:21:09 +0000</pubDate>
      <link>https://dev.to/abhishek4273/project-management-with-monk-creating-an-indoor-scene-classifier-4a07</link>
      <guid>https://dev.to/abhishek4273/project-management-with-monk-creating-an-indoor-scene-classifier-4a07</guid>
      <description>&lt;p&gt;TLDR;&lt;br&gt;
&lt;a href="https://drive.google.com/file/d/1Qd6XRWIsIfge5nbl0jT_RM8_a-iacEuH/view?usp=sharing&amp;amp;source=post_page-----6cec33fd6252----------------------"&gt;Colab Notebook&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  What will you build?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bL5vpj2B--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://tessellate-pytorch-serverless.s3.amazonaws.com/indoor_classification.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bL5vpj2B--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://tessellate-pytorch-serverless.s3.amazonaws.com/indoor_classification.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this exercise we will create and fine-tune an ‘indoor scene classifier’ using Monk.&lt;/p&gt;

&lt;p&gt;We will use Transfer Learning, wherein we pick a ‘Model’ which has already been trained to answer a similar problem and retrain it for our case.&lt;/p&gt;

&lt;p&gt;Transfer Learning is a faster workaround than training a model from scratch.&lt;/p&gt;
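The core idea can be sketched without any framework: keep a pre-trained feature extractor frozen and train only a small head on top. In this illustrative numpy sketch a fixed random projection stands in for the pre-trained network, and the task is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" feature extractor: weights stay frozen during training.
W_frozen = rng.normal(size=(4, 8))

def features(x):
    return np.maximum(x @ W_frozen, 0)  # frozen ReLU layer

# Synthetic task whose labels are recoverable from the frozen features.
X = rng.normal(size=(64, 4))
true_w = rng.normal(size=8)
y = (features(X) @ true_w > 0).astype(float)

# Trainable "head": logistic regression fit with plain gradient descent.
w = np.zeros(8)
F = features(X)  # extract features once; only w is updated
for _ in range(500):
    p = 1 / (1 + np.exp(-F @ w))
    w -= 0.1 * F.T @ (p - y) / len(y)

acc = ((F @ w > 0) == (y > 0.5)).mean()
print("head accuracy:", round(acc, 2))
```

Retraining only the head is why transfer learning needs far less data and compute than training end to end.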

&lt;p&gt;Monk provides a syntax invariant transfer learning framework that supports Keras, Pytorch and Mxnet in the backend. (Read — &lt;a href="https://clever-noyce-f9d43f.netlify.com/"&gt;Documentation&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;

&lt;p&gt;We start by setting up Monk and its dependencies on colab. For further setup instructions on different platforms check out the &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/setup/setup"&gt;DOCS&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/Tessellate-Imaging/monk_v1
$ cd monk_v1/installation &amp;amp;&amp;amp; pip install -r requirements_cu10.txt
$ cd ../..
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;We will utilise the image dataset gathered by these awesome researchers — &lt;a href="https://omidpoursaeed.github.io/publication/vision-based-real-estate-price-estimation/"&gt;LINK&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Optional : &lt;a href="https://medium.com/@paudelanjanchandra/download-google-drive-files-using-wget-3c2c025a8b99"&gt;How to download large files from Google drive&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Training
&lt;/h3&gt;

&lt;p&gt;Next we will use Pytorch as our backend to create a new Project and use Resnet101 as our pre-trained model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("indoor_scene", "exp1");
ptf.Default(dataset_path="./Train/", 
            model_name="resnet101", 
            freeze_base_network=True, num_epochs=10);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;With Monk you can download or upload your workspace onto different machines and resume your work from where you left off.&lt;/p&gt;

&lt;p&gt;To demo this we will run inference by downloading the workspace to our local machine. To keep the download size small, we will disable saving intermediate models.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.update_save_intermediate_models(False);
ptf.Reload();
ptf.Train();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Inference
&lt;/h3&gt;

&lt;p&gt;Let’s zip our workspace and download it locally for inference. Before we can do this we have to set up Monk on our local system. Check the documentation for further instructions — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/setup/setup"&gt;SETUP&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ zip -r workspace.zip workspace
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After importing ‘Monk’ in our local workspace, we can test our model on a single image and check the predicted label. (&lt;a href="https://clever-noyce-f9d43f.netlify.com/#/quick_mode/quickmode_pytorch"&gt;DOCS&lt;/a&gt;)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Prototype("indoor_scene", "exp1", eval_infer=True);
img_name = "test.jpg";
predictions = ptf.Infer(img_name=img_name, return_raw=False);
print(predictions);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Hyper-Parameter Tuning
&lt;/h3&gt;

&lt;p&gt;Now that we have achieved a baseline validation accuracy of ‘INSERT VALUE’, we can resume fine-tuning the generated model from the previous exercise. To do this we use the ‘copy_from’ feature in Monk.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Prototype("indoor_scene", "exp2", 
              copy_from=["indoor_scene", "exp1"]);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can now update our hyper-parameters,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;###########Update hyperparameters################
ptf.optimizer_asgd(0.001, weight_decay=0.00001);
ptf.update_num_epochs(10);
################################################
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Add randomised cropping&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;########Update Transforms#####################
ptf.apply_random_resized_crop(224, train=True, val=True);
###############################################
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
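Random resized cropping picks a random region and rescales it to a fixed size. A rough numpy sketch of the idea (nearest-neighbour resize, illustrative only; Monk's actual transform is handled by the backend framework):

```python
import numpy as np

def random_resized_crop(img, out_size, rng):
    """Crop a random square region, then resize it to out_size x out_size."""
    h, w = img.shape[:2]
    side = rng.integers(out_size // 2, min(h, w) + 1)  # random crop size
    top = rng.integers(0, h - side + 1)
    left = rng.integers(0, w - side + 1)
    crop = img[top:top + side, left:left + side]
    # Nearest-neighbour resize via integer index mapping.
    idx = np.arange(out_size) * side // out_size
    return crop[idx][:, idx]

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8)
out = random_resized_crop(img, 224, rng)
print(out.shape)  # (224, 224, 3)
```

Every epoch the model then sees a slightly different view of each image, which helps it generalise.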



&lt;p&gt;and freeze more layers of our pre-trained model so that only the weights and biases in the last few layers are tuned.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;###########For freeze layers#####################
ptf.update_freeze_layers(100);
################################################
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can now retrain our model to see if we can achieve a better validation accuracy.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;NOTE : Don’t forget to reload the experiment after adding updates.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Reload();
ptf.Train();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Ohh and don’t forget to take a break, stretch your legs. You can always resume training from the last break point using Monk’s resume training feature.&lt;/p&gt;

&lt;p&gt;All you have to do is load the experiment with resume state and begin. Read the docs for more information — (&lt;a href="https://clever-noyce-f9d43f.netlify.com/"&gt;DOCS&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Compare
&lt;/h3&gt;

&lt;p&gt;Finally, to check if the second experiment was fruitful and whether further improvement is possible, we can &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/compare_experiment"&gt;compare our experiments&lt;/a&gt; and visualise the losses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from compare_prototype import compare

ctf = compare(verbose=1);
ctf.Comparison("indoor_scene");
ctf.Add_Experiment("indoor_scene", "exp1");
ctf.Add_Experiment("indoor_scene", "exp2");
ctf.Generate_Statistics()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The final training accuracy plots show that we definitely improved from our baseline model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HLpmeeyN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AM2RwUD8pzu6cq81AdqeZCw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HLpmeeyN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AM2RwUD8pzu6cq81AdqeZCw.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Training Accuracies&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We don’t have to stop here. We can utilise augmentation strategies on our dataset and further improve the model performance on newer data.&lt;/p&gt;

&lt;p&gt;Happy Coding!&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>prototyping</category>
      <category>projectmanagement</category>
    </item>
    <item>
      <title>Quick prototyping with Monk - Creating a fashion classifier</title>
      <dc:creator>Abhishek Annamraju</dc:creator>
      <pubDate>Tue, 03 Dec 2019 11:20:32 +0000</pubDate>
      <link>https://dev.to/abhishek4273/quick-prototyping-with-monk-creating-a-fashion-classifier-3n8l</link>
      <guid>https://dev.to/abhishek4273/quick-prototyping-with-monk-creating-a-fashion-classifier-3n8l</guid>
      <description>&lt;p&gt;TLDR;&lt;br&gt;
&lt;a href="https://colab.research.google.com/gist/li8bot/73cd78dd420588cae7bcd597ca0521a5/fashion_myntra.ipynb?source=post_page-----66f601117e61----------------------"&gt;Colab Notebook&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Creating a fashion classifier with Monk and Densenets
&lt;/h3&gt;

&lt;p&gt;In this exercise we will take a look at some of Monk’s auxiliary functions while switching our backend framework between Pytorch, Keras and Mxnet.&lt;/p&gt;

&lt;p&gt;We will use Densenets as our CNN architecture. To get a deeper understanding of Dense blocks and Densenets do checkout this awesome blog post — &lt;a href="https://towardsdatascience.com/review-densenet-image-classification-b6631a8ef803"&gt;LINK&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As our dataset source we will be using the Myntra Fashion dataset kindly shared by Param Aggarwal — &lt;a href="https://www.kaggle.com/paramaggarwal/fashion-product-images-small"&gt;LINK&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s begin.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  The Setup
&lt;/h3&gt;

&lt;p&gt;We start by setting up Monk and its dependencies on colab. For further setup instructions on different platforms check out the DOCS.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/Tessellate-Imaging/monk_v1
$ cd monk_v1/installation &amp;amp;&amp;amp; pip install -r requirements_cu10.txt
$ cd ../..
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Next we grab the data. You can download the dataset from Kaggle, but to skip setting up the Kaggle API on Colab, we’ve created a Dropbox link to the dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ wget https://www.dropbox.com/s/wzgyr1dx4sejo5u/dataset.zip
$ unzip dataset.zip
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Checking out the ground truth csv file we find different feature columns.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hIwPKGgk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/972/1%2A-ZlQkP5wSY5aHT2sU1iNAw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hIwPKGgk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/972/1%2A-ZlQkP5wSY5aHT2sU1iNAw.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this exercise we will classify fashion items into their sub category labels. To do this we extract the image ‘id’ column along with ‘subCategory’ to create a new labels file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
gt = pd.read_csv("./dataset/styles.csv",error_bad_lines=False)
label_gt = gt[['id','subCategory']]
label_gt['id'] = label_gt['id'].astype(str) + '.jpg'
label_gt.to_csv('./dataset/subCategory.csv',index=False)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now that we have the images and label files ready, we can begin with creating experiments.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Monk with Pytorch
&lt;/h3&gt;

&lt;p&gt;After importing Monk, creating an experiment is simple. We load the Pytorch backend and create a new project and experiment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import sys
sys.path.append("./monk_v1/monk/");
import psutilfrom pytorch_prototype import prototype
ptf = prototype(verbose=1);
ptf.Prototype("fashion", "exp1");
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Next we create the ‘Dataset’ object, setting the dataset and label paths along with the CNN architecture and the number of epochs (DOCS).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Default(dataset_path="./mod_dataset/images", 
             path_to_csv="./mod_dataset/subCategory.csv", 
             model_name="densenet121", 
             freeze_base_network=True, num_epochs=5);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can check for missing or corrupt images and take a look at class imbalances using Monk’s EDA function (DOCS)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.EDA(check_missing=True, check_corrupt=True);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3JsWyym_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/907/1%2AYDRNyjdi7fjLfN6sNvZYVg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3JsWyym_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/907/1%2AYDRNyjdi7fjLfN6sNvZYVg.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Find missing images in your dataset&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We have to update our labels file and remove rows containing missing or corrupt images. The notebook has a function ‘cleanCSV’ for this purpose.&lt;/p&gt;
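&lt;p&gt;The ‘cleanCSV’ helper itself lives in the notebook; a minimal sketch of what such a function might do (the name ‘clean_csv’ and its signature are illustrative, assuming pandas and labels that reference files inside an images directory):&lt;/p&gt;

```python
import os
import pandas as pd

def clean_csv(csv_path, image_dir, out_path):
    """Drop label rows whose image file is missing on disk (illustrative)."""
    gt = pd.read_csv(csv_path)
    present = gt["id"].apply(lambda f: os.path.isfile(os.path.join(image_dir, f)))
    cleaned = gt[present]
    cleaned.to_csv(out_path, index=False)
    return cleaned
```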

&lt;p&gt;If cleaning our labels file generated a new CSV, we have to point our ‘Dataset’ object at its location. This can be done by using the update functions (&lt;a href="https://clever-noyce-f9d43f.netlify.com/#/update_mode/update_dataset"&gt;DOCS&lt;/a&gt;). Remember to run ‘Reload()’ after making any update.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.update_dataset(dataset_path="./dataset/images",
                   path_to_csv="./dataset/subCategory_cleaned.csv");
ptf.Reload()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And now we can start our training with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;For this experiment we are using the densenet121 architecture. In the next experiments we will use densenet169 and densenet201.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--T9MFq4SW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/811/1%2AH3HjgnC-iqS2TTbFpp4iLw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--T9MFq4SW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/811/1%2AH3HjgnC-iqS2TTbFpp4iLw.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Different Densenet architectures&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After training is complete we get our final model accuracies and losses saved in our workspace folder. Now we can continue with the experiment and test out the other two densenet architectures.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("fashion", "exp2");
ptf.Default(dataset_path="./dataset/images", 
            path_to_csv="./dataset/subCategory_cleaned.csv", 
            model_name="densenet169", freeze_base_network=True, 
            num_epochs=5);
ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;AND&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ptf = prototype(verbose=1);
ptf.Prototype("fashion", "exp3");
ptf.Default(dataset_path="./dataset/images", 
            path_to_csv="./dataset/subCategory_cleaned.csv", 
            model_name="densenet201", 
            freeze_base_network=True, num_epochs=5);
ptf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After training is complete for the 3 experiments, we can quickly compare them to examine differences in losses (both training and validation), compare training time and resource utilisation, and select the best model.&lt;/p&gt;

&lt;p&gt;Since we have trained for only 5 epochs, it still will not be clear which of the 3 architectures to choose; however, you can experiment with more epochs or even a different CNN architecture. To update the model check out our &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/update_mode/update_model"&gt;DOCUMENTATION&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h3&gt;
  
  
  Compare experiments
&lt;/h3&gt;

&lt;p&gt;To run a comparison we have to import the ‘compare’ module and add the 3 experiments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from compare_prototype import compare
ctf = compare(verbose=1);
ctf.Comparison("Fashion_Pytorch_Densenet");
ctf.Add_Experiment("fashion", "exp1");
ctf.Add_Experiment("fashion", "exp2");
ctf.Add_Experiment("fashion", "exp3");
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After adding the experiments to compare, we can generate the statistics by calling ‘Generate_Statistics()’. To find out more about the compare module check out the &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/compare_experiment"&gt;DOCS&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6xFjMAM8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AH9zxlxRvfYCXNi8A0_PkKQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6xFjMAM8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AH9zxlxRvfYCXNi8A0_PkKQ.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Training accuracies over time&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The training accuracy performance shows that ‘densenet169’ performs marginally better than the other two. However, looking at the validation accuracy shows us a different picture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ChsAP-KZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2ALqYtRoeYYCM4t92UpyfxeA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ChsAP-KZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2ALqYtRoeYYCM4t92UpyfxeA.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Validation accuracies over time&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Experiment 3 with ‘densenet201’ performs better than the others. If we compare the best validation accuracies for the 3 models, they are pretty close:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aDIS7Vwb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AWNyBLhbXd4i4s907uR66Cw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aDIS7Vwb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AWNyBLhbXd4i4s907uR66Cw.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Best validation accuracies&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;However, one more metric to check before going ahead is training time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6WsHQuaV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AS4UcK7UHVlXtsNND6BusMQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6WsHQuaV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2AS4UcK7UHVlXtsNND6BusMQ.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Training times&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This shows that the more complex our CNN architecture, the more time it takes to train per epoch.&lt;/p&gt;

&lt;p&gt;Next we’ll create similar experiments but with Keras and Mxnet.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h4&gt;
  
  
  Monk with Mxnet
&lt;/h4&gt;

&lt;p&gt;Monk is a &lt;strong&gt;syntax invariant library&lt;/strong&gt;, so working with any of the available backend deep learning frameworks does not change the program you have to write.&lt;/p&gt;

&lt;p&gt;The only change is in the import:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from gluon_prototype import prototype
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
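&lt;p&gt;Since the rest of the code stays identical, the backend choice reduces to that single import line. A small plain-Python illustration of this (the helper itself is hypothetical; the module names are the ones used in this post):&lt;/p&gt;

```python
# Monk backend modules as used in this post (the Mxnet backend ships as the gluon prototype)
BACKEND_MODULES = {
    "pytorch": "pytorch_prototype",
    "keras": "keras_prototype",
    "mxnet": "gluon_prototype",
}

def prototype_import(backend):
    """Return the import line that selects a Monk backend."""
    return "from {} import prototype".format(BACKEND_MODULES[backend])
```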



&lt;p&gt;Next we create experiments the same way with different densenet architectures and carry out training.&lt;/p&gt;

&lt;p&gt;Gluon experiments —&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gtf = prototype(verbose=1);
gtf.Prototype("fashion", "exp4");
gtf.Default(dataset_path="./dataset/images", 
            path_to_csv="./dataset/subCategory_cleaned.csv",
            model_name="densenet121", 
            freeze_base_network=True, num_epochs=5);
gtf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And so on…&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
 &lt;/p&gt;
&lt;h4&gt;
  
  
  Monk with Keras
&lt;/h4&gt;

&lt;p&gt;As mentioned in the previous section, the only thing that changes is the import. So to use Keras as the backend we import:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from keras_prototype import prototype
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And then create some more experiments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ktf = prototype(verbose=1);
ktf.Prototype("fashion", "exp7");
ktf.Default(dataset_path="./dataset/images", 
            path_to_csv="./dataset/subCategory_cleaned.csv", 
            model_name="densenet121", 
            freeze_base_network=True, num_epochs=5);
ktf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And so on…&lt;/p&gt;

&lt;p&gt;Finally, when we have trained all 9 experiments, we can compare which framework and which architecture performed best using Monk’s compare utility (&lt;a href="https://clever-noyce-f9d43f.netlify.com/#/compare_experiment"&gt;DOCS&lt;/a&gt;) and fine-tune the better-performing ones.&lt;/p&gt;

&lt;p&gt;Comparison plot for training accuracy generated from our exercise:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VvgKlE67--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2A2UVlliOGzirXOaUIeH7m0A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VvgKlE67--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2A2UVlliOGzirXOaUIeH7m0A.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Training accuracy comparisons&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To achieve better results, try changing the learning rate schedulers and training for more epochs. You can easily copy an experiment and modify its hyperparameters to generate better results. Check out our &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/copy_experiment"&gt;documentation&lt;/a&gt; for further instructions.&lt;/p&gt;

&lt;p&gt;Happy Coding!&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>prototyping</category>
      <category>classification</category>
    </item>
    <item>
      <title>Transfer learning with Monk</title>
      <dc:creator>Abhishek Annamraju</dc:creator>
      <pubDate>Tue, 03 Dec 2019 11:16:23 +0000</pubDate>
      <link>https://dev.to/abhishek4273/transfer-learning-with-monk-2m14</link>
      <guid>https://dev.to/abhishek4273/transfer-learning-with-monk-2m14</guid>
      <description>&lt;p&gt;&lt;strong&gt;Monk provides a syntax invariant transfer learning framework that supports Keras, Pytorch and Mxnet in the backend. (Read — &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/introduction"&gt;Documentation&lt;/a&gt;)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://monkai-42.firebaseapp.com/"&gt;Website&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/Tessellate-Imaging/monk_v1"&gt;Github&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Transfer Learning
&lt;/h3&gt;

&lt;p&gt;Transfer learning is one of the most used techniques in training computer vision models. To put it simply, an already trained model is picked and retrained for a different use-case.&lt;/p&gt;

&lt;p&gt;The key steps involved are&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data ingestion&lt;/li&gt;
&lt;li&gt;Model selection&lt;/li&gt;
&lt;li&gt;Setting parameters&lt;/li&gt;
&lt;li&gt;Training&lt;/li&gt;
&lt;li&gt;Evaluating results&lt;/li&gt;
&lt;li&gt;Comparing results with previous training sessions&lt;/li&gt;
&lt;li&gt;Changing the parameters and retraining until we find the best fit model&lt;/li&gt;
&lt;li&gt;…and iterating over these steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Computer vision developers have to explore strategies for selecting the correct learning rates and a fitting CNN architecture, use the right optimisers, and fine-tune many more parameters to get the best performing models.&lt;/p&gt;
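&lt;p&gt;That search space multiplies quickly. A plain-Python sketch (the candidate values are illustrative) of enumerating the combinations a developer would otherwise try by hand:&lt;/p&gt;

```python
from itertools import product

# Illustrative candidate values, not recommendations
learning_rates = [0.01, 0.001]
architectures = ["densenet121", "densenet169"]
optimizers = ["sgd", "adam"]

# Every combination is one training run to schedule and compare
experiments = [
    {"lr": lr, "model": model, "optimizer": opt}
    for lr, model, opt in product(learning_rates, architectures, optimizers)
]
print(len(experiments))  # 2 * 2 * 2 = 8 runs
```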
&lt;h3&gt;
  
  
  Let’s compare steps carried out traditionally vs Monk
&lt;/h3&gt;
&lt;h4&gt;
  
  
  1. Write less code to begin the prototyping process
&lt;/h4&gt;

&lt;p&gt;Typically, to tackle transfer learning, frameworks like Keras, Pytorch or Mxnet are used directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sNGwKbGD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1366/1%2AQcjk2NjfBaECHWR52AjWsg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sNGwKbGD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1366/1%2AQcjk2NjfBaECHWR52AjWsg.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Transfer learning using pytorch — Ref: &lt;a href="https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html"&gt;https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The traditional way&lt;/strong&gt;: The very first step to transfer learning is to understand Pytorch and write all of these lines of code!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Monk way&lt;/strong&gt;: Start with 5 lines of code instead of 40 — Use Monk’s Quick Mode&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from monk.pytorch_prototype import prototype
gtf = prototype(verbose=1);
gtf.Prototype("Project-1", "Experiment-1");
gtf.Default(dataset_path="train",
              model_name="resnet18_v1",num_epochs=10);
gtf.Train();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
  &lt;/p&gt;
&lt;h4&gt;
  
  
  2. Seamlessly switch back-end frameworks
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hvzr6hWu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1366/1%2AIjdzO1Dmjz0cS_ockeAaZA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hvzr6hWu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1366/1%2AIjdzO1Dmjz0cS_ockeAaZA.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Transfer learning using keras — Ref: &lt;a href="https://medium.com/@14prakash/transfer-learning-using-keras-d804b2e04ef8"&gt;https://medium.com/@14prakash/transfer-learning-using-keras-d804b2e04ef8&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The traditional way&lt;/strong&gt;: Learn keras and again write many lines of code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Monk way&lt;/strong&gt;: Import Keras utilities from Monk and write the same code&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from monk.keras_prototype import prototype
gtf = prototype(verbose=1);
gtf.Prototype("Project-1", "Experiment-1");
gtf.Default(dataset_path="train",model_name="resnet18_v1",
           num_epochs=10);
gtf.Train()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Monk is &lt;strong&gt;Syntax Invariant up-to a certain level of abstraction&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now, one may ask whether, with just these 5 lines of code, one loses the ability to manipulate parameters.&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
  &lt;/p&gt;
&lt;h4&gt;
  
  
  3. Manipulate parameters in a standardized way
&lt;/h4&gt;

&lt;p&gt;Load your experiment in quick mode and make changes to parameters at every stage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset updates&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gtf.update_input_size(256); - Change input shape
gtf.update_trainval_split(0.6); - Change splits
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;and many more….. &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/update_mode/update_layers"&gt;Check out Monk’s Update Mode&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model updates&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gtf.update_freeze_layers(10); - freeze layers
gtf.append_linear(final_layer=True); - append layers
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;and much more. &lt;a href="https://clever-noyce-f9d43f.netlify.com/#/expert_mode"&gt;Check out Monk’s Expert Mode&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
  &lt;/p&gt;
&lt;h4&gt;
  
  
  4. Compare all the experiments executed using Monk
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;The traditional way&lt;/strong&gt;: Write extra code that creates comparisons, and make a lot of changes to training code in order to generate these metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Monk way&lt;/strong&gt;: Invoke comparison capabilities in simple functional format&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctf = compare(verbose=1);
ctf.Comparison("Sample-Comparison-1");

# Step 1 - Add experiments
ctf.Add_Experiment("Project-Testing-Optimizers", "SGD");
ctf.Add_Experiment("Project-Testing-Optimizers", "ADAM");
ctf.Add_Experiment("Project-Testing-Optimizers", "ADAGRAD");
ctf.Add_Experiment("Project-Testing-Optimizers", "NAG");
ctf.Add_Experiment("Project-Testing-Optimizers", "NADAM");
ctf.Add_Experiment("Project-Testing-Optimizers", "ADAMAX");

# Step 2 - Compare
ctf.Generate_Statistics();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;And generate Results&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jqQzEq7k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2Am7BicvQ_zm8ffAsL2zOKfw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jqQzEq7k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1440/1%2Am7BicvQ_zm8ffAsL2zOKfw.png" alt=""&gt;&lt;/a&gt;&lt;em&gt;Compare Accuracies&lt;/em&gt;&lt;/p&gt;

&lt;p&gt; &lt;br&gt;
 &lt;br&gt;
  &lt;/p&gt;

&lt;h4&gt;
  
  
  5. Other benefits of using Monk
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resume training sessions&lt;/strong&gt; when interrupted from the last epoch&lt;/li&gt;
&lt;li&gt;Run &lt;strong&gt;Exploratory data analysis&lt;/strong&gt; — discover class imbalance, missing data, corrupt data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copy experiment from one system to another&lt;/strong&gt;, be it local or cloud&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Estimate training times&lt;/strong&gt; before actual runs&lt;/li&gt;
&lt;li&gt;Semi-automatically &lt;strong&gt;find hyper-parameters&lt;/strong&gt; by running mini-experiments&lt;/li&gt;
&lt;/ul&gt;
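&lt;p&gt;The training-time estimate in the list above boils down to timing a probe run and extrapolating; a generic sketch of the idea (not Monk’s actual implementation):&lt;/p&gt;

```python
import time

def estimate_training_time(train_one_epoch, num_epochs):
    """Time a single epoch and extrapolate to the full run (rough estimate)."""
    start = time.time()
    train_one_epoch()
    per_epoch = time.time() - start
    return per_epoch * num_epochs
```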

&lt;p&gt;Upcoming stories: Tutorials on using MONK&lt;/p&gt;

&lt;p&gt;Happy Coding!!!&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>computervision</category>
      <category>classification</category>
    </item>
  </channel>
</rss>
