DEV Community

Anurag Verma
Anurag Verma

Posted on

Fruit 360 CNN Classification Transfer Learning

The fruits dataset used in this model training is from Kaggle : fruits dataset

Importing library and modeules

import numpy as np
import tensorflow as tf
Enter fullscreen mode Exit fullscreen mode

Train and Test data

learn more about pathlib

import pathlib
train_dir = pathlib.Path("../input/fruits/fruits-360_dataset/fruits-360/Training")
test_dir = pathlib.Path("../input/fruits/fruits-360_dataset/fruits-360/Test")
Enter fullscreen mode Exit fullscreen mode

New function/Concept 'glog()'

  • The glob module is a useful part of the Python standard library. glob (short for global) is used to return all file paths that match a specific pattern.

  • We can use glob to search for a specific file pattern, or perhaps more usefully, search for files where the filename matches a certain pattern by using wildcard characters.

  • Learn more about glob

  • https://towardsdatascience.com/the-python-glob-module-47d82f4cbd2d

# Total number of images in training data-set

image_count = len(list(train_dir.glob('*/*.jpg')))
image_count
Enter fullscreen mode Exit fullscreen mode

67692

Showing / Visualize Image

  • here we are using matplotlib for visualizing our data

  • we have open our image and converted to digits using Pillow

import matplotlib.pyplot as plt
import PIL
fruits = list(train_dir.glob('Banana/*.jpg'))

plt.figure(figsize=(10, 10))

for i in range(3):
    plt.subplot(3, 3, i + 1)
    img = PIL.Image.open(str(fruits[i]))
    plt.imshow(img)
    plt.axis('off')

plt.show()
Enter fullscreen mode Exit fullscreen mode

png

Setting Up variables

batch_size = 32
img_height = 100
img_width = 100
Enter fullscreen mode Exit fullscreen mode

Collecting Data

Used keras 'image_dataset_from_directory' API for collrcting data from directories

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,
    validation_split=0.2,
    subset='training',
    seed=42,
    image_size=(img_height, img_width),
    batch_size=batch_size
)
Enter fullscreen mode Exit fullscreen mode

Found 67692 files belonging to 131 classes. Using 54154 files for training.

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,
    validation_split=0.2,
    subset='validation',
    seed=42,
    image_size=(img_height, img_width),
    batch_size=batch_size
)
Enter fullscreen mode Exit fullscreen mode

Found 67692 files belonging to 131 classes. Using 13538 files for validation.

Visualizing friuts by classes

class_names = train_ds.class_names
num_classes = len(class_names)
Enter fullscreen mode Exit fullscreen mode
plt.figure(figsize=(10, 10))

for images, labels in train_ds.take(1):
    for i in range(25):
        plt.subplot(5, 5, i + 1)
        plt.imshow(images[i].numpy().astype('uint8'))
        plt.title(class_names[labels[i]])
        plt.axis('off')
Enter fullscreen mode Exit fullscreen mode

Preprocessing/Setting Up Base Model

prefetch the data for faster training while model is trained

AUTOTUNE = tf.data.experimental.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
Enter fullscreen mode Exit fullscreen mode

Data Augumentation

  • Data augmentation is a set of techniques to artificially increase the amount of data by generating new data points from existing data. This includes making small changes to data or using deep learning models to generate new data points.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.RandomFlip('horizontal'),
    tf.keras.layers.experimental.preprocessing.RandomRotation(0.2)
])
Enter fullscreen mode Exit fullscreen mode

Using ResNEt Model for Transfer Learning

preprocess_input = tf.keras.applications.resnet.preprocess_input
Enter fullscreen mode Exit fullscreen mode
base_model = tf.keras.applications.resnet.ResNet50(
    input_shape=(img_height, img_width, 3),
    include_top=False,
    weights='imagenet'
)
Enter fullscreen mode Exit fullscreen mode

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5 94773248/94765736 [==============================] - 0s 0us/step 94781440/94765736 [==============================] - 0s 0us/step

  • setting base model trainable to False so model take less time
base_model.trainable = False
Enter fullscreen mode Exit fullscreen mode
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(num_classes)
Enter fullscreen mode Exit fullscreen mode

Building Model

inputs = tf.keras.Input(shape=(100, 100, 3))
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=False)
x = global_average_layer(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = prediction_layer(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
Enter fullscreen mode Exit fullscreen mode
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)
Enter fullscreen mode Exit fullscreen mode
model.summary()
Enter fullscreen mode Exit fullscreen mode

Model: "model" _________________________________________________________________ Layer (type) Output Shape Param #

================================================================= input_2 (InputLayer) [(None, 100, 100, 3)] 0

_________________________________________________________________ sequential (Sequential) (None, 100, 100, 3) 0

_________________________________________________________________ tf.operators.getitem (Sl (None, 100, 100, 3) 0

_________________________________________________________________ tf.nn.bias_add (TFOpLambda) (None, 100, 100, 3) 0

_________________________________________________________________ resnet50 (Functional) (None, 4, 4, 2048) 23587712

_________________________________________________________________ global_average_pooling2d (Gl (None, 2048) 0

_________________________________________________________________ dropout (Dropout) (None, 2048) 0

_________________________________________________________________ dense (Dense) (None, 131) 268419

================================================================= Total params: 23,856,131 Trainable params: 268,419 Non-trainable params: 23,587,712 _________________________________________________________________

Training the model

model.evaluate(val_ds)
Enter fullscreen mode Exit fullscreen mode

2022-11-21 23:17:58.112151: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005

424/424 [==============================] - 40s 73ms/step - loss: 6.1083 - accuracy: 0.0146

[6.10833215713501, 0.014551632106304169]

  • evolution accuracy is very bad but wait for traning
epochs = 15

history = model.fit(
    train_ds,
    epochs=epochs,
    validation_data=val_ds
)
Enter fullscreen mode Exit fullscreen mode

Epoch 1/15

2022-11-21 23:18:54.026118: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 143 of 1000 2022-11-21 23:19:04.017250: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 288 of 1000 2022-11-21 23:19:14.002639: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 431 of 1000 2022-11-21 23:19:24.049675: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 579 of 1000 2022-11-21 23:19:34.018446: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 720 of 1000 2022-11-21 23:19:44.006225: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 865 of 1000

3/1693 [..............................] - ETA: 1:41 - loss: 6.8432 - accuracy: 0.0208

2022-11-21 23:19:53.219212: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:228] Shuffle buffer filled.

1693/1693 [==============================] - 160s 51ms/step - loss: 1.3016 - accuracy: 0.7131 - val_loss: 0.2886 - val_accuracy: 0.9617 Epoch 2/15 1693/1693 [==============================] - 54s 32ms/step - loss: 0.2196 - accuracy: 0.9602 - val_loss: 0.1175 - val_accuracy: 0.9866 Epoch 3/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.1076 - accuracy: 0.9821 - val_loss: 0.0637 - val_accuracy: 0.9947 Epoch 4/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0632 - accuracy: 0.9905 - val_loss: 0.0405 - val_accuracy: 0.9968 Epoch 5/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0416 - accuracy: 0.9941 - val_loss: 0.0277 - val_accuracy: 0.9983 Epoch 6/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0307 - accuracy: 0.9956 - val_loss: 0.0214 - val_accuracy: 0.9988 Epoch 7/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0224 - accuracy: 0.9972 - val_loss: 0.0155 - val_accuracy: 0.9990 Epoch 8/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0180 - accuracy: 0.9975 - val_loss: 0.0125 - val_accuracy: 0.9993 Epoch 9/15 1693/1693 [==============================] - 53s 31ms/step - loss: 0.0143 - accuracy: 0.9982 - val_loss: 0.0116 - val_accuracy: 0.9990 Epoch 10/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0119 - accuracy: 0.9985 - val_loss: 0.0107 - val_accuracy: 0.9993 Epoch 11/15 1693/1693 [==============================] - 53s 31ms/step - loss: 0.0102 - accuracy: 0.9985 - val_loss: 0.0089 - val_accuracy: 0.9990 Epoch 12/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0087 - accuracy: 0.9989 - val_loss: 0.0093 - val_accuracy: 0.9994 Epoch 13/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0079 - accuracy: 0.9988 - val_loss: 0.0077 - val_accuracy: 0.9993 Epoch 14/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0066 - accuracy: 0.9991 - val_loss: 0.0072 - val_accuracy: 0.9996 Epoch 15/15 1693/1693 [==============================] - 52s 31ms/step - loss: 0.0061 - accuracy: 0.9992 - val_loss: 0.0060 - val_accuracy: 0.9999

Visualization of Accuracy and loss

Loss

train_loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(12, 10))
plt.plot(epochs_range, train_loss, label="Training Loss")
plt.plot(epochs_range, val_loss, label="Validation Loss")
plt.legend(loc='upper left')
plt.title('Training and Validation Loss')

plt.show()
Enter fullscreen mode Exit fullscreen mode

Accuracy

train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

epochs_range = range(epochs)

plt.figure(figsize=(12, 10))
plt.plot(epochs_range, train_acc, label="Training Accuracy")
plt.plot(epochs_range, val_acc, label="Validation Accuracy")
plt.legend(loc='upper left')
plt.title('Training and Validation Accuracy')

plt.show()
Enter fullscreen mode Exit fullscreen mode

I hope you like this ;)

Top comments (0)