Day 1: Tried building my first CNN (and learned why 8 validation images are useless)

Accidentally Public: My First CNN for Pneumonia Detection

Three days ago, I made my first repo public on GitHub by accident. Panic mode: activated. But instead of hiding it, I decided to share the journey.

I’m a final-year medical student diving into the tech side of healthcare, and I just built my first Convolutional Neural Network (CNN) to detect pneumonia from chest X-rays. Spoiler: it works… but my data setup was very questionable.


What I Built

  • Goal: Classify chest X-rays as normal or pneumonia
  • Dataset: Chest X-Ray Images (Pneumonia) – 5,863 pediatric X-rays
  • Tech: Python, TensorFlow/Keras, Google Colab, LLMs to support my pathetic coding skills
  • Model: Simple CNN (~5.4M parameters, 3 conv blocks + pooling + dense head)

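Here's roughly what that "3 conv blocks + pooling + dense head" model looks like in Keras. This is a minimal sketch: the input size and filter counts are my assumptions for illustration, not the exact layers from the repo.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Minimal sketch of a 3-conv-block CNN for binary chest X-ray classification.
# Input shape and filter counts are illustrative assumptions, not the exact repo architecture.
model = models.Sequential([
    layers.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),   # almost all of the parameters live here
    layers.Dense(1, activation="sigmoid"),  # probability of pneumonia
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.AUC(name="auc")],
)
model.summary()
```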
In short: teaching the model to tell a normal chest X-ray apart from a pneumonia one.

This was my way of bridging medicine (I see more than 50 shades of grey in an X-ray by now, though I’m not a radiologist) with my growing interest in AI.

I didn’t know what a CNN really was at first. For anyone new, this video is a helpful, very basic introduction:

Watch on YouTube: "What is a Convolutional Neural Network? (CNNs)"

For a deeper dive, here’s a great in-depth playlist provided by the channel Futurology - An Optimistic Future:

Deep Learning Playlist


Lessons That Humbled Me

1. Tiny validation set = chaos 🎢

The dataset’s “validation” split? Just 8 normal + 8 pneumonia images: 16 in total, roughly 0.27% of the data.

My metrics looked wild:

  • Epoch 1 → val_acc: 0.56, val_auc: 0.95
  • Epoch 3 → val_acc: 0.75, val_auc: 0.98

Lesson: Data splitting isn’t just about having folders—it’s about meaningful sample sizes.
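To put those swings in perspective, here's a quick back-of-the-envelope check (my arithmetic, not anything from the repo):

```python
# With a 16-image validation set, one flipped prediction moves val_accuracy
# by 1/16 = 6.25 percentage points, so a jump like 0.56 -> 0.75 between
# epochs can come down to roughly three images changing side.
n_val = 8 + 8
step = 1 / n_val
print(f"accuracy granularity: {step:.4f}")                       # 0.0625
print(f"images behind the jump: {round((0.75 - 0.56) / step)}")  # ~3
```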


2. The lazy pneumonia-predicting machine 🤖

Class imbalance:

  • Normal → 1,583
  • Pneumonia → 4,273
  • Ratio ~1:2.7

My model started predicting “pneumonia” most of the time, scoring decent accuracy but missing normals. The only purpose of the confusion matrix was to confuse me (ngl, ChatGPT had to interpret it), but it showed the truth: strong on pneumonia, weak on normals.

Accuracy ≠ whole story—precision/recall matter.
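For anyone who wants to run the same check, here's a minimal sketch using scikit-learn. The y_true / y_pred arrays below are toy placeholders, not my actual predictions:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Toy labels/predictions just to show the shape of the check;
# 0 = normal, 1 = pneumonia. Swap in your real arrays from model.predict().
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 1])
y_pred = np.array([1, 0, 1, 1, 1, 1, 1, 1])   # almost everything called "pneumonia"

print(confusion_matrix(y_true, y_pred))        # rows = truth, columns = prediction
print(classification_report(y_true, y_pred, target_names=["normal", "pneumonia"]))
```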


3. Model summaries aren’t magic 📋

The model.summary() output, with its ~5.4M parameters, intimidated me until I broke it down:

  • Conv2D → edge/pattern detectors
  • Dense layers → where parameter count explodes (~5.3M of 5.4M)
  • Each “neuron” = input → math → output
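The "parameter explosion" in the dense layers is just multiplication. Here's the rough arithmetic, assuming a feature map shaped like the one in my sketch above (illustrative numbers, the real shape may differ):

```python
# Why the dense head dominates: if the last pooling layer outputs a
# 17x17x128 feature map, flattening it and feeding a Dense(128) layer
# already costs millions of weights.
flattened = 17 * 17 * 128              # 36,992 values
dense_params = flattened * 128 + 128   # one weight per connection + one bias per neuron
print(f"{dense_params:,}")             # 4,735,104
```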

From a medical lens, it felt like tracing how biological neurons fire.


4. Data quality > fancy architecture 💎

My biggest realization: no clever CNN can fix broken training splits or imbalance. I spent more time fixing the pipeline than tweaking hyperparameters—and that was the right call.


What’s Next

  • Proper 70/15/15 train/val/test splits
  • Class weights to balance mistakes
  • Better metrics (precision/recall per class)
  • Augmentation for underrepresented normal cases
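Here's a rough sketch of how I plan to tackle the class-weight and augmentation items. None of this is in the repo yet; the counts come from the dataset above, but the weights and layer choices are assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.utils.class_weight import compute_class_weight

# Class weights from the training counts, so missed "normal" cases cost more.
counts = {"normal": 1583, "pneumonia": 4273}
labels = np.array([0] * counts["normal"] + [1] * counts["pneumonia"])
weights = compute_class_weight(class_weight="balanced", classes=np.array([0, 1]), y=labels)
class_weight = {0: weights[0], 1: weights[1]}
print(class_weight)   # roughly {0: 1.85, 1: 0.69}

# Light augmentation, mainly to stretch the underrepresented normal class.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.05),
    tf.keras.layers.RandomZoom(0.1),
    tf.keras.layers.RandomTranslation(0.05, 0.05),
])

# Later, once the new 70/15/15 splits exist:
# model.fit(train_ds, validation_data=val_ds, epochs=10, class_weight=class_weight)
```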

Follow Along

The full learning process (project research and notes, confusions, breakthroughs, even “I don’t fully get this yet” moments) is on GitHub:

👉 chest-xray-cnn

Because learning isn’t just about polished results—it’s about the messy middle too.


I’ll keep sharing as I go, because this is just the start of my journey into AI + healthcare. Any input or feedback is very welcome 🚀

Acknowledgements

  • The Kaggle dataset authors for making this resource freely available
  • TensorFlow/Keras docs + tutorials that kept me afloat
  • The open-source community on GitHub
  • ChatGPT (yes, I used it heavily!) for debugging, explaining confusion matrices, emotional support, and not judging my messy code
