As a student, I've witnessed firsthand the frustration caused by our university's inefficient lost and found system. The current process, reliant on individual emails for each found item, often leads to delays and missed connections between lost belongings and their owners.
Driven by a desire to improve this experience for myself and my fellow students, I've embarked on a project to explore the potential of deep learning in revolutionizing our lost and found system. In this blog post, I'll share my journey of evaluating pretrained models - Inception-ResNet V2, EfficientNet, VGG, and NasNet - to automate the identification and categorization of lost items.
Through a comparative analysis, I aim to pinpoint the most suitable model for integrating into our system, ultimately creating a faster, more accurate, and user-friendly lost and found experience for everyone on campus.
Inception-ResNet V2
Inception-ResNet V2 is a powerful convolutional neural network architecture available in Keras, combining the Inception architecture's strengths with residual connections from ResNet. This hybrid model aims to achieve high accuracy in image classification tasks while maintaining computational efficiency.
Training Dataset: ImageNet
Image Format: 299 x 299
Preprocessing function
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input as preprocess_input_resnet

def readyForResNet(fileName):
    # Resize to the 299x299 input size Inception-ResNet V2 expects
    pic = load_img(fileName, target_size=(299, 299))
    pic_array = img_to_array(pic)
    # Add a batch dimension: (299, 299, 3) -> (1, 299, 299, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_resnet(expanded)
Predicting
from tensorflow.keras.applications.inception_resnet_v2 import InceptionResNetV2, decode_predictions as decode_predictions_resnet

inception_model_resnet = InceptionResNetV2(weights="imagenet")
data1 = readyForResNet(test_file)  # test_file: path to the query image
prediction = inception_model_resnet.predict(data1)
res1 = decode_predictions_resnet(prediction, top=2)  # top-2 ImageNet labels
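For reference, Keras' Inception-family `preprocess_input` applies "tf"-mode scaling: raw pixel values go from [0, 255] to [-1, 1]. A minimal NumPy sketch of just that scaling step (illustrative, not the Keras implementation itself):

```python
import numpy as np

def inception_style_preprocess(x):
    """Scale pixel values from [0, 255] to [-1, 1],
    mirroring Keras' 'tf'-mode preprocessing."""
    x = np.asarray(x, dtype=np.float32)
    return x / 127.5 - 1.0

batch = np.array([[[[0, 127.5, 255]]]])     # one pixel, three channels
print(inception_style_preprocess(batch))    # black -> -1, mid-grey -> 0, white -> 1
```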
VGG (Visual Geometry Group)
VGG (Visual Geometry Group) is a family of deep convolutional neural network architectures known for their simplicity and effectiveness in image classification tasks. These models, particularly VGG16 and VGG19, gained popularity due to their strong performance in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014.
Training Dataset: ImageNet
Image Format: 224 x 224
Preprocessing function
from tensorflow.keras.applications.vgg19 import preprocess_input as preprocess_input_vgg19

def readyForVGG(fileName):
    # VGG19 expects 224x224 inputs
    pic = load_img(fileName, target_size=(224, 224))
    pic_array = img_to_array(pic)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_vgg19(expanded)
Predicting
from tensorflow.keras.applications.vgg19 import VGG19, decode_predictions as decode_predictions_vgg19

inception_model_vgg19 = VGG19(weights="imagenet")
data2 = readyForVGG(test_file)
prediction = inception_model_vgg19.predict(data2)
res2 = decode_predictions_vgg19(prediction, top=2)
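VGG's preprocessing is different in kind: it uses "caffe"-style preprocessing, converting RGB to BGR and subtracting the per-channel ImageNet means, with no rescaling into [-1, 1]. A NumPy sketch of that transform (illustrative, not the Keras source):

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by 'caffe'-style preprocessing
IMAGENET_MEANS_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg_style_preprocess(x):
    """Convert RGB to BGR and subtract the ImageNet channel means,
    mirroring Keras' VGG preprocess_input (note: no [0, 1] scaling)."""
    x = np.asarray(x, dtype=np.float32)
    x = x[..., ::-1]  # flip channel order: RGB -> BGR
    return x - IMAGENET_MEANS_BGR

# An RGB pixel exactly equal to the channel means becomes all zeros
pixel = np.array([[[[123.68, 116.779, 103.939]]]])
print(vgg_style_preprocess(pixel))
```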
EfficientNet
EfficientNet is a family of convolutional neural network architectures that achieve state-of-the-art accuracy on image classification tasks while being significantly smaller and faster than previous models. This efficiency is achieved through a novel compound scaling method that balances network depth, width, and resolution.
Training Dataset: ImageNet
Image Format: 480 x 480
Preprocessing function
from tensorflow.keras.applications.efficientnet import preprocess_input as preprocess_input_EF

def readyForEF(fileName):
    # 480x480 is in the range used by the larger EfficientNet variants
    pic = load_img(fileName, target_size=(480, 480))
    pic_array = img_to_array(pic)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_EF(expanded)
Predicting
from tensorflow.keras.applications.efficientnet import decode_predictions as decode_predictions_EF

# Assumes an EfficientNet model with ImageNet weights has been loaded as inception_model_EF
data3 = readyForEF(test_file)
prediction = inception_model_EF.predict(data3)
res3 = decode_predictions_EF(prediction, top=2)
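One quirk worth knowing: in recent tf.keras versions, EfficientNet's `preprocess_input` is a pass-through, because input normalization is built into the model itself. That leaves resizing and batching as the real work in `readyForEF`; a tiny sketch of the batching step:

```python
import numpy as np

# EfficientNet in Keras normalizes pixels inside the model, so the
# preprocessing function's main job is the shape change:
img = np.zeros((480, 480, 3), dtype=np.float32)  # one 480x480 RGB image
batch = np.expand_dims(img, axis=0)              # add batch axis for predict()
print(batch.shape)  # -> (1, 480, 480, 3)
```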
NasNet
NasNet (Neural Architecture Search Network) represents a groundbreaking approach in deep learning where the architecture of the neural network itself is discovered through an automated search process. This search process aims to find the optimal combination of layers and connections to achieve high performance on a given task.
Training Dataset: ImageNet
Image Format: 224 x 224
Preprocessing function
from tensorflow.keras.applications.nasnet import preprocess_input as preprocess_input_NN

def readyForNN(fileName):
    # 224x224 matches NASNetMobile; NASNetLarge expects 331x331
    pic = load_img(fileName, target_size=(224, 224))
    pic_array = img_to_array(pic)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_NN(expanded)
Predicting
from tensorflow.keras.applications.nasnet import NASNetMobile, decode_predictions as decode_predictions_NN

inception_model_NN = NASNetMobile(weights="imagenet")  # matches the 224x224 input above
data4 = readyForNN(test_file)
prediction = inception_model_NN.predict(data4)
res4 = decode_predictions_NN(prediction, top=2)
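In every case above, `decode_predictions` returns one list per image in the batch, each holding `(wordnet_id, human_readable_label, probability)` tuples. Picking the best guess for a lost-item photo might look like this (the tuple values below are made-up placeholders, not real model output):

```python
# Illustrative decode_predictions-style output for a single image, top=2.
# These values are placeholders, not actual predictions.
res = [[("n02774152", "bag", 0.91), ("n04026417", "purse", 0.05)]]

best_id, best_label, best_score = res[0][0]  # first image, highest-scoring tuple
print(f"{best_label}: {best_score:.0%}")     # -> bag: 91%
```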
Showdown
Accuracy
The table summarizes the claimed accuracy scores of the models above. EfficientNet B7 leads with the highest accuracy, followed closely by NasNet-Large and Inception-ResNet V2. The VGG models exhibit lower accuracies. For my application, I want a model that balances processing time and accuracy.
Time
As we can see, EfficientNetB0 delivers the fastest results, but InceptionResNetV2 is the better overall package once accuracy is taken into account.
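To produce timing numbers like these yourself, a simple wall-clock harness is enough. A sketch using `time.perf_counter` (the model and input names in the usage comment are the ones defined earlier; any callable works):

```python
import time

def time_predict(predict_fn, data, runs=5):
    """Average wall-clock seconds per call, after one warm-up run
    (the first predict call often pays one-off graph/setup costs)."""
    predict_fn(data)  # warm-up, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(data)
    return (time.perf_counter() - start) / runs

# Usage with any of the models above, e.g.:
# avg_seconds = time_predict(inception_model_resnet.predict, data1)
```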
Summary
For my smart lost and found system, I decided to go with InceptionResNetV2. While EfficientNet B7 looked tempting with its top-notch accuracy, I was concerned about its computational demands. In a university setting, where resources might be limited and real-time performance is often desirable, I felt it was important to strike a balance between accuracy and efficiency. InceptionResNetV2 seemed like the perfect fit - it offers strong performance without being overly computationally intensive.
Plus, the fact that it's pretrained on ImageNet gives me confidence that it can handle the diverse range of objects people might lose. And let's not forget how easy it is to work with in Keras! That definitely made my decision easier.
Overall, I believe InceptionResNetV2 provides the right mix of accuracy, efficiency, and practicality for my project. I'm excited to see how it performs in helping reunite lost items with their owners!