Maxwell Ororho
🧠 Understanding CNN Generalisation with Data Augmentation (TensorFlow – CIFAR-10)

📘 Data Augmentation in CNNs and the Impact on Generalisation (Using CIFAR-10 Experiments)

Data augmentation is widely used when training convolutional neural networks, especially for image classification tasks.

The idea is simple: by transforming training images (rotating, flipping, or shifting them), we introduce more variation and help the model generalise better.

However, one question that is often overlooked is:

👉 Does more augmentation always improve performance?

In this post, I investigate how different levels of data augmentation affect a CNN trained on the CIFAR-10 dataset.

All experiments, code, and plots shown here are taken directly from my notebook.

📂 Dataset Overview: CIFAR-10

The CIFAR-10 dataset contains:

  • 60,000 colour images
  • 10 output classes
  • 32×32 resolution
  • A balanced distribution across classes

One key detail is the image resolution.

At 32×32 pixels, fine details are limited, and some classes (like cats and dogs) can look very similar. This becomes important when analysing model performance later.

โš™๏ธ Data Preparation

Before training, the dataset was preprocessed to ensure stable learning.

# Imports used in this step
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Load the CIFAR-10 dataset
(x_train_full, y_train_full), (x_test, y_test) = cifar10.load_data()

# Define the class names
class_names = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck"
]

# Scale pixel values to the range [0, 1]
x_train_full = x_train_full.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Convert class labels to one-hot encoded format
y_train_full_cat = to_categorical(y_train_full, 10)
y_test_cat = to_categorical(y_test, 10)

# Split the training data into training and validation sets
x_train, x_val, y_train_cat, y_val_cat, y_train, y_val = train_test_split(
    x_train_full,
    y_train_full_cat,
    y_train_full,
    test_size=0.2,
    random_state=42,
    stratify=y_train_full
)

# Print dataset shapes
print("Training set shape:", x_train.shape)
print("Validation set shape:", x_val.shape)
print("Test set shape:", x_test.shape)
  1. Pixel values are scaled to [0, 1]
  2. Labels are converted into one-hot encoding
  3. Data is later split into training and validation sets

These steps ensure that the model trains efficiently and can be evaluated properly.
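As a quick illustration of what the first two steps do, here is a small NumPy sketch of the scaling and the one-hot encoding (the values are made up; `to_categorical` produces the same kind of matrix):

```python
import numpy as np

# Made-up 8-bit "pixel" values standing in for CIFAR-10 image data
pixels = np.array([0, 128, 255], dtype="uint8")
scaled = pixels.astype("float32") / 255.0  # values now lie in [0, 1]

# One-hot encode class indices 0..9, as to_categorical(y, 10) does
labels = np.array([3, 0, 9])
one_hot = np.eye(10, dtype="float32")[labels]

print(scaled)         # roughly [0.0, 0.502, 1.0]
print(one_hot.shape)  # (3, 10): one row per label, a single 1 per row
```

Each row of `one_hot` pairs with the model's softmax output, which is why the final Dense layer has 10 units.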

🧠 Model Architecture Used

All experiments use the same CNN architecture to ensure a fair comparison.

# Imports used by the model definition
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Conv2D, BatchNormalization, MaxPooling2D,
    Flatten, Dense, Dropout
)

def build_cnn_model():
    model = Sequential([
        Conv2D(
            32, (3, 3), activation="relu",
            padding="same", input_shape=(32, 32, 3)
        ),
        BatchNormalization(),
        MaxPooling2D((2, 2)),

        Conv2D(64, (3, 3), activation="relu", padding="same"),
        BatchNormalization(),
        MaxPooling2D((2, 2)),

        Conv2D(128, (3, 3), activation="relu", padding="same"),
        BatchNormalization(),
        MaxPooling2D((2, 2)),

        Flatten(),
        Dense(128, activation="relu"),
        Dropout(0.5),
        Dense(10, activation="softmax")
    ])

    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )

    return model

epochs = 15
batch_size = 64
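With the 80/20 split above, 40,000 images remain for training, so these settings imply a fixed number of batches per epoch. A quick sanity check:

```python
import math

train_images = 40_000  # 80% of CIFAR-10's 50,000 training images
batch_size = 64

steps_per_epoch = math.ceil(train_images / batch_size)
print(steps_per_epoch)  # 625 batches per epoch
```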

๐Ÿ” Experiment Setup

Three models were trained:

  • Baseline → no augmentation
  • Light augmentation → small transformations
  • Strong augmentation → larger transformations

This setup allows us to isolate the effect of augmentation.

🛠 Augmentation Setup

# Import used for augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Create a light augmentation generator
light_datagen = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True
)

# Create a stronger augmentation generator
strong_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.15,
    height_shift_range=0.15,
    zoom_range=0.2,
    horizontal_flip=True
)

These parameters control the transformation strength.

  • Smaller values → subtle variation
  • Larger values → stronger distortion
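Under the hood these transforms are array operations. A minimal NumPy sketch of a horizontal flip and a one-pixel width shift (the real generators sample the amount randomly and interpolate, but the principle is the same, and crucially the label is left untouched):

```python
import numpy as np

# A toy 4x4 single-channel "image"
img = np.arange(16, dtype="float32").reshape(4, 4)

# Horizontal flip: mirror along the width axis
flipped = img[:, ::-1]

# Width shift by one pixel, filling the vacated column with zeros
shifted = np.zeros_like(img)
shifted[:, 1:] = img[:, :-1]

print(flipped[0])  # first row reversed: [3. 2. 1. 0.]
print(shifted[0])  # first row shifted right: [0. 0. 1. 2.]
```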

📊 Results

Test accuracy comparison across models

Observations

  • Baseline → 0.752
  • Light augmentation → 0.750
  • Strong augmentation → 0.692

Interpretation

  • Light augmentation had almost no effect
  • Strong augmentation reduced performance

โžก More augmentation does not always mean better performance.

๐Ÿ” Model Behaviour (Confusion Matrix)

Confusion matrix showing class-level performance

Observations

  1. Strong performance:
    • airplane, ship, truck
  2. Weak performance:
    • cat vs dog
    • automobile vs truck

Insight

Errors are often due to:

  • low resolution
  • visual similarity between classes

โžก Some limitations come from the dataset itself.

🧾 Conclusion

The experiments highlight the following:

  • The baseline model already performs well
  • Light augmentation has minimal impact
  • Strong augmentation reduces performance
  • Augmentation must be applied carefully
