Battle of the CNNs: ResNet vs. MobileNet vs. EfficientNet for Fruit Disease Detection

So here's the thing: I've always been fascinated by how deep learning can solve real-world problems, and fruit disease detection seemed like the perfect challenge. Not too simple, not impossibly complex, and actually useful for farmers dealing with crop losses.

I ended up building FruitScan-AI and testing three different neural network architectures to see which one actually works best. Spoiler: they each have their strengths, and the "best" one totally depends on what you're building.

Why Even Bother With Fruit Disease Detection?
Look, I know what you're thinking: "why fruits?" But hear me out. Farmers lose something like 20 to 40% of their crops every year to diseases and pests. That's HUGE. And the traditional way of checking? Walking through fields, manually inspecting every plant, hoping you catch problems early. It's slow, inconsistent, and requires expertise that not everyone has access to.

So I thought: what if we could just snap a photo and get an instant diagnosis? That's where FruitScan-AI comes in.

What I Built
FruitScan-AI is basically a deep learning system that looks at fruit images and tells you two things:

  • What kind of fruit it is
  • Whether it's healthy or diseased

But here's where it gets interesting... I didn't just build one model. I built three different versions using EfficientNet, MobileNetV2, and ResNet50 to compare them side by side.

The Dataset
I worked with images of over 15 different fruits: apples, bananas, grapes, mangoes, tomatoes, peppers... you name it. Each category has both healthy specimens and diseased ones (bacterial spots, fungal infections, rot, all the nasty stuff).

The images are high-resolution enough to catch the subtle details that matter for accurate classification.

The Three Architectures (And Why I Picked Them)

EfficientNet: The Balanced One
EfficientNet is like that friend who's good at everything without trying too hard. It scales network depth, width, and resolution together using this "compound scaling" approach. Translation? You get great accuracy without your model becoming a computational monster.

I went with EfficientNet because it's efficient and actually performs really well on image classification tasks.
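
If you want to see what that scaling actually does, here's a quick sketch of my own (assuming TensorFlow's Keras applications are available, not code from the repo) that just compares parameter counts between the baseline B0 and the scaled-up B3:

from tensorflow.keras.applications import EfficientNetB0, EfficientNetB3

# B0 is the baseline; B3 is the same architecture with depth, width, and
# input resolution scaled up together via the compound coefficient
b0 = EfficientNetB0(weights=None, include_top=False)
b3 = EfficientNetB3(weights=None, include_top=False)

print(f"EfficientNetB0 params: {b0.count_params():,}")
print(f"EfficientNetB3 params: {b3.count_params():,}")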

MobileNetV2: The Lightweight Champion
This one's designed for mobile devices. It uses these clever "depthwise separable convolutions" that basically do more with less. Perfect if you want to deploy your model on a phone or a Raspberry Pi in the middle of a farm.

If I were building an app for farmers in the field, MobileNetV2 would be my go-to.
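
If you're curious what "depthwise separable" means in code, here's a minimal sketch (mine, not pulled from MobileNetV2 itself, which also adds expansion layers and residual connections): filter each channel spatially, then mix channels with a 1x1 pointwise conv, and compare the parameter count against a regular Conv2D doing the same job.

from tensorflow.keras import Sequential, layers

# Depthwise separable block: per-channel spatial filtering, then 1x1 channel mixing
separable = Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.DepthwiseConv2D(kernel_size=3, padding='same', activation='relu'),
    layers.Conv2D(64, kernel_size=1, activation='relu'),
])

# A standard convolution doing the same 3 -> 64 channel expansion
standard = Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(64, kernel_size=3, padding='same', activation='relu'),
])

print(separable.count_params(), "params vs", standard.count_params())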

ResNet50: The Heavyweight
ResNet50 is the veteran here. It introduced "skip connections" that let you train really deep networks without everything exploding (vanishing gradients are fun like that). It's deeper, it's powerful, and it can learn some seriously complex patterns.

I included ResNet50 because it's battle tested and gives you a solid baseline for comparison.
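
A skip connection is really just "add the input back to the output." Here's a bare-bones identity block in Keras functional style, a sketch for intuition rather than ResNet50's actual bottleneck block:

from tensorflow.keras import layers

def identity_block(x, filters):
    # Assumes x already has `filters` channels so the Add lines up
    shortcut = x
    y = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.Add()([shortcut, y])  # the skip connection: gradients flow straight through
    return layers.Activation('relu')(y)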

How It Actually Works
All three models use transfer learning... basically, I'm not starting from scratch. These models were pretrained on ImageNet (millions of images), so they already know what edges, textures, and shapes look like.

Here's the general approach:

from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Sequential

# I freeze the pretrained layers at first
base_model = EfficientNetB0(weights='imagenet', include_top=False,
                            input_shape=(224, 224, 3))
base_model.trainable = False

# Then add my own classification layers on top
model = Sequential([
    base_model,
    GlobalAveragePooling2D(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(num_classes, activation='softmax')  # num_classes = number of fruit/disease categories
])
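
Since the base is only frozen "at first," you'd typically follow up with a fine-tuning pass once the new head has converged. Something along these lines; the number of unfrozen layers and the learning rate here are my own placeholder choices, not the repo's settings:

# Later: unfreeze the top of the backbone and keep training with a much smaller learning rate
from tensorflow.keras.optimizers import Adam

base_model.trainable = True
for layer in base_model.layers[:-20]:  # -20 is an arbitrary cut-off, tune it
    layer.trainable = False

model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])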

Why transfer learning? Three reasons:

  • Training is way faster
  • You need less data
  • Better results

The Training Pipeline
Nothing fancy here, just good practices:

  • Resize images to what each model expects (224x224 for most)
  • Normalize using ImageNet statistics
  • Add data augmentation (flips, rotations, brightness tweaks) so the model doesn't just memorize training data
  • Train in batches to keep my GPU from crying

I tracked the usual suspects: accuracy, precision, recall, F1 score, and confusion matrices to see where each model struggled.
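
In code, that pipeline might look something like this. It's a sketch rather than the exact notebook code, and 'data/train' is a placeholder for wherever your dataset lives:

from tensorflow.keras.applications.efficientnet import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation + ImageNet-style preprocessing
train_gen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    horizontal_flip=True,
    rotation_range=20,
    brightness_range=(0.8, 1.2),
).flow_from_directory('data/train', target_size=(224, 224), batch_size=32)

# 'model' is the transfer-learning model from the snippet above
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_gen, epochs=10)

# Precision, recall, F1, and confusion matrices come from scikit-learn
# (classification_report / confusion_matrix) on the held-out predictions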

Getting Started (If You Want to Try This)
What You Need

pip install tensorflow keras numpy pandas matplotlib scikit-learn

Grab the code

git clone https://github.com/sisodiajatin/FruitScan-AI.git
cd FruitScan-AI

The repo's organized by architecture:

FruitScan-AI/
├── EfficiencyNet/    # EfficientNet notebooks
├── MobileNetV2/      # MobileNetV2 experiments  
├── ResNet50/         # ResNet50 implementation

Running it

cd EfficiencyNet  # or whichever model you want
jupyter notebook
# Open the training notebook and go through it cell by cell

Making Predictions
Once you've got a trained model, using it is straightforward:

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.applications.efficientnet import preprocess_input  # match this to your backbone

# Load your model
model = load_model('fruit_disease_model.h5')

# Prep your image
img = load_img('suspicious_apple.jpg', target_size=(224, 224))
img_array = img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)
img_array = preprocess_input(img_array)

# Get prediction (class_labels is the list of class names saved from training)
predictions = model.predict(img_array)
class_idx = np.argmax(predictions[0])

print(f"This looks like: {class_labels[class_idx]}")
print(f"Confidence: {predictions[0][class_idx]*100:.2f}%")

Here's a quick Flask wrapper if you want to turn this into an API:

import numpy as np
import tensorflow as tf
from flask import Flask, request, jsonify

app = Flask(__name__)
model = tf.keras.models.load_model('best_model.h5')

@app.route('/predict', methods=['POST'])
def predict():
    # preprocess_image, get_fruit_name, and is_healthy are your own helpers:
    # resize/normalize the upload, then map the output vector back to labels
    file = request.files['image']
    img = preprocess_image(file)
    prediction = model.predict(img)

    return jsonify({
        'fruit': get_fruit_name(prediction),
        'status': 'healthy' if is_healthy(prediction) else 'diseased',
        'confidence': float(np.max(prediction))
    })

if __name__ == '__main__':
    app.run()
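
Assuming you start it with app.run() (Flask's default port 5000), hitting the endpoint is one curl command away; the field name 'image' has to match the route above:

curl -X POST -F "image=@suspicious_apple.jpg" http://localhost:5000/predict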

So... Which Model Won?
Honestly? It depends on what you're optimizing for.
From what I've seen:

  • EfficientNet hits around 92 to 95% accuracy with decent speed (about 30 to 50ms per image). Good all-rounder.
  • MobileNetV2 gets 88 to 91% accuracy but is FAST (10 to 20ms) and tiny. Perfect for mobile apps.
  • ResNet50 lands at 90 to 93% accuracy but is slower (50 to 80ms) and bigger. Great for research or when accuracy is everything.

Try It Out!
If this sounds interesting:
⭐ Star the repo - https://github.com/sisodiajatin/FruitScan-AI
