It was a rainy afternoon in September. The view from my window was all gray and blurry, and it reminded me of the first images my GAN made: just fuzzy chaos on the screen.
I'd been learning Deep Learning for months. I wanted to create something new with code. Not just numbers, but something that looks real.
So I tried Generative Adversarial Networks (GANs). I built one from scratch with Keras and TensorFlow, with some help from ChatGPT.
The dataset? MNIST. It has 70,000 handwritten digits. They look like quick notes from people long ago: scribbled 7s and curvy 8s.
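To make the data step concrete, here is a minimal sketch of how the full 70,000-image MNIST set can be loaded and prepared in TensorFlow. The batch size and the [-1, 1] pixel scaling are assumptions on my part; the scaling simply matches the Tanh output the Generator uses later.

```python
# A minimal sketch: load all 70,000 MNIST digits and scale pixels to [-1, 1]
# to match a Tanh-output Generator. Batch size is an assumption.
import numpy as np
import tensorflow as tf

(train_images, _), (test_images, _) = tf.keras.datasets.mnist.load_data()

# Combine both splits to get the full 70,000 images and add a channel axis.
images = np.concatenate([train_images, test_images]).astype("float32")
images = images.reshape(-1, 28, 28, 1)
images = (images - 127.5) / 127.5  # [0, 255] -> [-1, 1]

# Shuffle and batch for training.
BATCH_SIZE = 256
dataset = (
    tf.data.Dataset.from_tensor_slices(images)
    .shuffle(70_000)
    .batch(BATCH_SIZE)
)
```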
Think of two neural networks working against each other:
- The Generator takes random noise, just a bunch of random numbers, and tries to turn it into a digit image. At first, it makes blurry shapes. You might guess it's a 3... or something else.
- The Discriminator checks the images. It has seen real MNIST images, so it says "fake" to the generated ones and "real" to the genuine ones.
Source: DZone Article on GAN Principles
Think of GANs like a counterfeiter (Generator) trying to make fake money that fools the police (Discriminator). The bank provides real money for training. Over time, the fakes get better!
Architecture of a GAN network:
Source: Jonathan Hui on Medium
The cool part? They learn together. The Generator gets better at tricking the Discriminator. The Discriminator gets better at spotting fakes. After many training steps, the fake digits look real, like they were drawn by hand.
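To show what "learning together" looks like in code, here is a minimal sketch of one adversarial training step using tf.GradientTape. It assumes `generator` and `discriminator` are the Keras models described later, with the Discriminator outputting a probability; the optimizers and learning rates are placeholder assumptions.

```python
# A minimal sketch of one GAN training step (generator and discriminator are
# assumed to be the Keras models built later in this post).
import tensorflow as tf

NOISE_DIM = 100
cross_entropy = tf.keras.losses.BinaryCrossentropy()
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(real_images):
    batch_size = tf.shape(real_images)[0]
    noise = tf.random.normal([batch_size, NOISE_DIM])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)

        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)

        # The Generator wants its fakes to be labeled "real" (1).
        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        # The Discriminator wants real -> 1 and fake -> 0.
        disc_loss = (cross_entropy(tf.ones_like(real_output), real_output)
                     + cross_entropy(tf.zeros_like(fake_output), fake_output))

    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss
```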
I did this project as an experiment. I wanted to understand how GANs work. It was fun to see it come together. Next, I plan to use this to make realistic human faces.
Building the Networks
I built it in a Jupyter notebook. The Generator starts with 100 random numbers. It uses dense layers and LeakyReLU to shape them, then turns them into a 28x28 image. It's like building a picture from nothing.
The Discriminator takes 28x28 images. It uses Conv2D layers to look for patterns. It ends with a yes or no: real or fake.
Parts | Generator | Discriminator |
---|---|---|
Input | 100 random numbers | 28x28 image |
Main Layers | Dense then reshape | Conv2D then flatten |
Activations | LeakyReLU, Tanh | LeakyReLU, Sigmoid |
Output | Fake image | Real (1) or Fake (0) |
At the end of this post, I explain in more detail the Discriminator and Generator networks used in my code.
Training took time on my GPU. I saved the models so I could restart if needed. I also saved images every few steps to track progress. Later, I made a GIF from them and zipped all the images together.
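Here is a minimal sketch of that bookkeeping: saving both models so training can resume, and writing a 4x4 grid of sample digits generated from a fixed noise batch. The file names and the fixed seed are assumptions, not necessarily what my notebook uses.

```python
# A minimal sketch: save model checkpoints and a 4x4 grid of sample digits.
import matplotlib.pyplot as plt
import tensorflow as tf

seed = tf.random.normal([16, 100])  # fixed noise so progress is comparable across epochs

def save_progress(epoch, generator, discriminator):
    # Save both models so training can be restarted from this point.
    generator.save(f"generator_epoch_{epoch:03d}.keras")
    discriminator.save(f"discriminator_epoch_{epoch:03d}.keras")

    # Generate 16 digits from the same noise and save them as a 4x4 grid.
    images = generator(seed, training=False)
    fig = plt.figure(figsize=(4, 4))
    for i in range(16):
        plt.subplot(4, 4, i + 1)
        plt.imshow(images[i, :, :, 0] * 127.5 + 127.5, cmap="gray")
        plt.axis("off")
    fig.savefig(f"digits_epoch_{epoch:03d}.png")
    plt.close(fig)
```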
Seeing the Digits Improve
Around epoch 5, I saw a shaky 4 appear. That was exciting. By epoch 25, the digits looked good: 1s had straight lines, 9s had curves. At epoch 100, the Discriminator could no longer tell fakes from real very well. That meant success.
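A rough sketch of the check behind that last point: if the Discriminator scores a mixed batch of real and generated digits at only around 50% accuracy, it is basically guessing. The helper below is hypothetical and assumes the Discriminator outputs probabilities.

```python
# Hypothetical helper: measure how well the Discriminator separates real from fake.
import tensorflow as tf

def discriminator_accuracy(generator, discriminator, real_batch):
    noise = tf.random.normal([real_batch.shape[0], 100])
    fake_batch = generator(noise, training=False)

    real_preds = discriminator(real_batch, training=False).numpy() > 0.5
    fake_preds = discriminator(fake_batch, training=False).numpy() > 0.5

    # Correct = real predicted real, plus fake predicted fake.
    correct = real_preds.sum() + (~fake_preds).sum()
    return correct / (2 * real_batch.shape[0])
```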
Here are some images from training:
[Image grid: Pure Noise vs. Epoch 1]
From total randomness to the first blurry hints of digits, like fog lifting just a bit.
[Image grid: Epoch 2 vs. Epoch 3]
Faint shapes emerging... is that a 6? The Generator is starting to get the idea.
[Image grid: Epoch 9 vs. Epoch 17]
Things are sharpening up! Each digit feels a little more like real handwriting, with its own quirks.
[Image grid: Epoch 25 vs. Epoch 50]
Confidence building: straight lines for 1s, smooth curves for 9s. The duel is heating up.
[Image grid: Epoch 75 vs. Epoch 100]
Nearly there! These fakes could pass for real MNIST scribbles. Magic in the making.
Here is a GIF of 16 digits improving over time.
It shows how random noise turns into clear digits in 10 seconds.
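The GIF can be stitched together from the saved progress images. Here is a small sketch using the imageio package, assuming the frames were written as digits_epoch_*.png (as in the saving sketch above); the per-frame duration is an assumption.

```python
# A minimal sketch: assemble the saved progress frames into a GIF.
import glob
import imageio.v2 as imageio

frames = [imageio.imread(path) for path in sorted(glob.glob("digits_epoch_*.png"))]
imageio.mimsave("training_progress.gif", frames, duration=0.1)  # ~0.1 s per frame
```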
GAN Architecture in My Project
Discriminator Architecture
The Discriminator network processes 28x28 grayscale images and outputs a probability (0 for fake, 1 for real). Here's the layer-by-layer breakdown:
Layer Name | Type | Input Shape | Output Shape |
---|---|---|---|
conv2d_6 | Conv2D | (None, 28, 28, 1) | (None, 14, 14, 64) |
leaky_re_lu_30 | LeakyReLU | (None, 14, 14, 64) | (None, 14, 14, 64) |
dropout_6 | Dropout | (None, 14, 14, 64) | (None, 14, 14, 64) |
conv2d_7 | Conv2D | (None, 14, 14, 64) | (None, 7, 7, 128) |
leaky_re_lu_31 | LeakyReLU | (None, 7, 7, 128) | (None, 7, 7, 128) |
dropout_7 | Dropout | (None, 7, 7, 128) | (None, 7, 7, 128) |
flatten_3 | Flatten | (None, 7, 7, 128) | (None, 6272) |
dense_11 | Dense | (None, 6272) | (None, 1) |
This architecture uses convolutional layers for feature extraction, dropout for regularization to prevent overfitting, and LeakyReLU activations to maintain gradient flow.
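Here is a sketch of a Keras Discriminator that reproduces the shapes in the table above. The kernel sizes, strides, dropout rates, and LeakyReLU slope are assumptions, since the table only fixes each layer's input and output shapes.

```python
# A sketch of the Discriminator matching the layer table above.
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator():
    return tf.keras.Sequential([
        layers.Conv2D(64, kernel_size=5, strides=2, padding="same",
                      input_shape=(28, 28, 1)),                        # (28, 28, 1) -> (14, 14, 64)
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),
        layers.Conv2D(128, kernel_size=5, strides=2, padding="same"),  # -> (7, 7, 128)
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),
        layers.Flatten(),                                               # -> (6272,)
        layers.Dense(1, activation="sigmoid"),                          # probability: real vs. fake
    ])
```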
Generator Architecture
The Generator network starts with random noise input and progressively upsamples it to produce 28x28 grayscale digit images. Here's the layer-by-layer breakdown:
Layer Name | Type | Input Shape | Output Shape |
---|---|---|---|
dense_9 | Dense | (None, 100) | (None, 12544) |
batch_normalization_18 | BatchNormalization | (None, 12544) | (None, 12544) |
leaky_re_lu_24 | LeakyReLU | (None, 12544) | (None, 12544) |
reshape_6 | Reshape | (None, 12544) | (None, 7, 7, 256) |
conv2d_transpose_18 | Conv2DTranspose | (None, 7, 7, 256) | (None, 7, 7, 128) |
batch_normalization_19 | BatchNormalization | (None, 7, 7, 128) | (None, 7, 7, 128) |
leaky_re_lu_25 | LeakyReLU | (None, 7, 7, 128) | (None, 7, 7, 128) |
conv2d_transpose_19 | Conv2DTranspose | (None, 7, 7, 128) | (None, 14, 14, 64) |
batch_normalization_20 | BatchNormalization | (None, 14, 14, 64) | (None, 14, 14, 64) |
leaky_re_lu_26 | LeakyReLU | (None, 14, 14, 64) | (None, 14, 14, 64) |
conv2d_transpose_20 | Conv2DTranspose | (None, 14, 14, 64) | (None, 28, 28, 1) |
This architecture uses transposed convolutions for upsampling, batch normalization for stability, and LeakyReLU activations to prevent vanishing gradients.
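And a matching sketch of the Generator. The Dense layer has to output 7 x 7 x 256 = 12544 units so it can be reshaped to (7, 7, 256); kernel sizes and strides are again assumptions, and the final Tanh matches the activations listed in the summary table earlier.

```python
# A sketch of the Generator matching the layer table above.
import tensorflow as tf
from tensorflow.keras import layers

def build_generator():
    return tf.keras.Sequential([
        layers.Dense(7 * 7 * 256, use_bias=False, input_shape=(100,)),  # -> (12544,)
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Reshape((7, 7, 256)),                                     # -> (7, 7, 256)
        layers.Conv2DTranspose(128, kernel_size=5, strides=1,
                               padding="same", use_bias=False),          # -> (7, 7, 128)
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(64, kernel_size=5, strides=2,
                               padding="same", use_bias=False),          # -> (14, 14, 64)
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(1, kernel_size=5, strides=2,
                               padding="same", activation="tanh"),       # -> (28, 28, 1)
    ])
```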
When the rain stopped, I looked at all the images. This project taught me a lot about GANs. It showed how trial and error leads to something new. Now, I want to try better ways to check quality, like FID scores. Maybe use more layers. Or try it on color images from CIFAR-10.
I also want to make Conditional GANs. That way, I can make a specific digit, like just 7s.
Check the code on GitHub. Try it yourself. What GAN project have you done? Tell me in the comments.