DEV Community

Seenivasa Ramadurai
Exploring Autoencoders: Anomaly Detection with TensorFlow and Keras Using the MNIST Dataset (Part 1)

What Are Autoencoders?

Autoencoders are a class of neural networks in deep learning that operate under unsupervised learning principles, meaning they don't require labeled data for training. Their primary function is to reconstruct input data by learning efficient representations, effectively capturing the essence of the data.

An autoencoder consists of two main components:

Encoder: This part compresses the input data into a latent space representation, often referred to as the bottleneck layer. Here, the input is reduced to a lower-dimensional vector, capturing the most critical features.

Decoder: This component reconstructs the original data from the compressed latent representation, aiming to produce an output as close as possible to the original input.

While it might seem counterintuitive to design a model that replicates its input, the compression process forces the network to learn the most salient features of the data, which can be highly beneficial for various applications.
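The encoder/decoder description above can be written compactly. This is a sketch in notation introduced here (the symbols $f_\theta$, $g_\phi$, $D$, and $d$ are not from the original post): the encoder $f_\theta$ maps an input $x \in \mathbb{R}^D$ to a latent vector $z \in \mathbb{R}^d$ with $d < D$, the decoder $g_\phi$ maps it back, and training minimizes the average reconstruction error over the $N$ training samples.

```latex
z = f_\theta(x), \qquad \hat{x} = g_\phi(z), \qquad
\mathcal{L}(\theta, \phi) = \frac{1}{N} \sum_{i=1}^{N}
  \left\lVert x_i - g_\phi\big(f_\theta(x_i)\big) \right\rVert_2^2
```

For flattened 28 × 28 MNIST images, $D = 784$. A mean-squared-error loss corresponds to this objective up to a constant factor.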

Applications of Autoencoders:

Data Compression: Autoencoders can compress data by learning efficient encodings, which is particularly useful for reducing storage requirements.

Noise Reduction: By training on noisy inputs paired with their clean originals, autoencoders can learn to reconstruct noise-free versions of inputs, making them effective for denoising tasks.

Anomaly Detection: Autoencoders trained on normal data can identify anomalies by measuring reconstruction errors; significant deviations suggest the presence of anomalies.
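To make the noise-reduction case concrete: a common denoising setup (a sketch only; this post does not train one) corrupts each clean image and asks the network to recover the original. The snippet below only prepares such input/target pairs with NumPy. The helper name `add_gaussian_noise` and the value `noise_factor=0.3` are illustrative assumptions, and the random array stands in for real images.

```python
import numpy as np

def add_gaussian_noise(clean, noise_factor=0.3, seed=0):
    """Return a noisy copy of `clean` (values in [0, 1]), clipped back to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = clean + noise_factor * rng.normal(size=clean.shape)
    return np.clip(noisy, 0.0, 1.0).astype("float32")

# Stand-in for flattened, normalized images (8 samples, 784 pixels each)
x_clean = np.random.default_rng(1).random((8, 784)).astype("float32")
x_noisy = add_gaussian_noise(x_clean)

# A denoising autoencoder would take x_noisy as input and x_clean as target,
# for example: model.fit(x_noisy, x_clean, ...)
print(x_noisy.shape, float(x_noisy.min()), float(x_noisy.max()))
```

Clipping keeps the corrupted pixels in the same [0, 1] range as the normalized originals, which matters when the output layer uses a sigmoid.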

In this post (Part 1), we'll focus on anomaly detection using TensorFlow and Keras, applied to the MNIST dataset, a collection of handwritten digits commonly used for training image processing systems. The MNIST dataset is readily available within the Keras library, which makes experimentation straightforward.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt

# Load and preprocess data
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype("float32") / 255  # Normalize pixel values to [0, 1]
x_test = x_test.astype("float32") / 255    # Normalize pixel values to [0, 1]
x_train = x_train.reshape((-1, 28 * 28))   # Flatten images to 1D vectors
x_test = x_test.reshape((-1, 28 * 28))     # Flatten images to 1D vectors

# Define encoder
input_img = layers.Input(shape=(x_train.shape[1],))  # Input layer
encoded = layers.Dense(200, activation="relu")(input_img)  # First hidden layer
encoded = layers.Dense(100, activation="relu")(encoded)    # Second hidden layer
encoded = layers.Dense(50, activation="relu")(encoded)      # Third hidden layer
bottleneck = layers.Dense(25, activation="relu")(encoded)   # Bottleneck layer (latent space)
encoder = models.Model(input_img, bottleneck, name="encoder")  # Encoder model

# Define decoder
decoded = layers.Dense(50, activation="relu")(bottleneck)    # First hidden layer
decoded = layers.Dense(100, activation="relu")(decoded)      # Second hidden layer
decoded = layers.Dense(200, activation="relu")(decoded)      # Third hidden layer
decoded = layers.Dense(x_train.shape[1], activation="sigmoid")(decoded)  # Output layer
decoder = models.Model(bottleneck, decoded, name="decoder")   # Decoder model

# Define autoencoder
autoencoder_input = layers.Input(shape=(x_train.shape[1],))  # Input layer
encoded_repr = encoder(autoencoder_input)  # Encoder output
decoded_output = decoder(encoded_repr)     # Decoder output
autoencoder = models.Model(autoencoder_input, decoded_output, name="autoencoder")  # Autoencoder model
autoencoder.compile(optimizer="adam", loss="mse")  # Compile model with Adam optimizer and MSE loss

# Train autoencoder
autoencoder.fit(x_train, x_train, epochs=40, batch_size=256, validation_data=(x_test, x_test))  # Train on x_train to reconstruct x_train

# Calculate reconstruction loss
def calculate_reconstruction_loss(data, model):
    reconstructions = model.predict(data)  # Get model predictions
    reconstruction_errors = np.mean(np.abs(data - reconstructions), axis=1)  # Compute mean absolute error
    return reconstruction_errors

# Evaluate the model
normal_pic = x_test[0]  # A normal test image (a real handwritten digit)
anomaly_pic = np.random.rand(28 * 28)  # A random-noise image that serves as the anomaly
reconstruction_loss_normal = calculate_reconstruction_loss(x_test, autoencoder)  # Per-image loss over the whole test set
reconstruction_loss_samples = calculate_reconstruction_loss(np.array([normal_pic, anomaly_pic]), autoencoder)  # Loss for the two sample images: [normal, anomalous]

# Print reconstruction losses
print(f"Average Reconstruction Loss for Normal Data: {np.mean(reconstruction_loss_normal)}")
print(f"Reconstruction Loss for Normal Data: {reconstruction_loss_samples[0]}")
print(f"Reconstruction Loss for Anomalous Data: {reconstruction_loss_samples[1]}")

# Visualization of reconstruction error distribution
plt.figure(figsize=(6, 4))
plt.hist(reconstruction_loss_normal, bins=50, alpha=0.6, color='g', label='Normal')  # Histogram of test-set errors
plt.axvline(x=reconstruction_loss_samples[0], color='b', linestyle='dashed', linewidth=2, label='Normal Test Data')  # Error of the normal sample
plt.axvline(x=reconstruction_loss_samples[1], color='r', linestyle='dashed', linewidth=2, label='Anomalous Test Data')  # Error of the random-noise sample
plt.title('Reconstruction Error Distribution')
plt.xlabel('Reconstruction Error')
plt.ylabel('Frequency')
plt.legend()
plt.show()

The table below summarizes the weights, biases, and total parameters for each layer in the autoencoder model.

[Image: table of per-layer weights, biases, and total parameters]
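The per-layer counts follow directly from the layer sizes in the model code above: a Dense layer with n_in inputs and n_out units has n_in × n_out weights plus n_out biases. A small sketch that reproduces the arithmetic (the helper name `dense_params` is introduced here for illustration):

```python
def dense_params(n_in, n_out):
    """Weights, biases, and total trainable parameters of one Dense layer."""
    weights = n_in * n_out
    biases = n_out
    return weights, biases, weights + biases

# Layer widths from the model definition:
# 784 -> 200 -> 100 -> 50 -> 25 (bottleneck) -> 50 -> 100 -> 200 -> 784
sizes = [784, 200, 100, 50, 25, 50, 100, 200, 784]

total = 0
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    w, b, p = dense_params(n_in, n_out)
    total += p
    print(f"Dense {n_in:>3} -> {n_out:>3}: weights={w:>7,} biases={b:>3} params={p:>7,}")

print(f"Total trainable parameters: {total:,}")
```

The total should agree with the figure that `autoencoder.summary()` reports for the combined model.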

Epoch 1/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - loss: 0.0958 - val_loss: 0.0382
Epoch 2/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0348 - val_loss: 0.0277
Epoch 3/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0266 - val_loss: 0.0231
Epoch 4/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0225 - val_loss: 0.0203
Epoch 5/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0202 - val_loss: 0.0187
Epoch 6/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 8ms/step - loss: 0.0187 - val_loss: 0.0175
Epoch 7/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0177 - val_loss: 0.0168
Epoch 8/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0167 - val_loss: 0.0158
Epoch 9/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0159 - val_loss: 0.0150
Epoch 10/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0152 - val_loss: 0.0145
Epoch 11/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0146 - val_loss: 0.0141
Epoch 12/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0141 - val_loss: 0.0138
Epoch 13/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0137 - val_loss: 0.0133
Epoch 14/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0133 - val_loss: 0.0131
Epoch 15/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0129 - val_loss: 0.0126
Epoch 16/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0125 - val_loss: 0.0122
Epoch 17/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0123 - val_loss: 0.0121
Epoch 18/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0121 - val_loss: 0.0120
Epoch 19/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0119 - val_loss: 0.0117
Epoch 20/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0117 - val_loss: 0.0116
Epoch 21/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0115 - val_loss: 0.0114
Epoch 22/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - loss: 0.0113 - val_loss: 0.0114
Epoch 23/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0112 - val_loss: 0.0111
Epoch 24/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0110 - val_loss: 0.0108
Epoch 25/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0107 - val_loss: 0.0107
Epoch 26/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0106 - val_loss: 0.0106
Epoch 27/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0104 - val_loss: 0.0105
Epoch 28/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0103 - val_loss: 0.0102
Epoch 29/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0100 - val_loss: 0.0102
Epoch 30/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - loss: 0.0099 - val_loss: 0.0100
Epoch 31/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0098 - val_loss: 0.0099
Epoch 32/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0097 - val_loss: 0.0098
Epoch 33/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0096 - val_loss: 0.0097
Epoch 34/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0095 - val_loss: 0.0095
Epoch 35/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - loss: 0.0095 - val_loss: 0.0096
Epoch 36/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - loss: 0.0093 - val_loss: 0.0096
Epoch 37/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - loss: 0.0093 - val_loss: 0.0095
Epoch 38/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0092 - val_loss: 0.0094
Epoch 39/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0092 - val_loss: 0.0092
Epoch 40/40
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.0090 - val_loss: 0.0092
313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
Average Reconstruction Loss for Normal Data: 0.031503044068813324
Reconstruction Loss for Normal Data: 0.01813212214797447
Reconstruction Loss for Anomalous Data: 0.44787996631145593

In anomaly detection tasks, autoencoders are trained exclusively on normal data, enabling them to learn and reconstruct typical patterns effectively. When presented with anomalous data, the autoencoder struggles to reconstruct it accurately, resulting in higher reconstruction errors. This discrepancy serves as a key indicator for identifying anomalies.
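The same effect can be reproduced in a few lines without a neural network. The sketch below swaps in a linear stand-in for the autoencoder (projection onto the top principal directions, computed with NumPy's SVD) and uses purely synthetic data; the dimensions, noise levels, and function names are assumptions chosen for illustration, not part of the MNIST experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: 500 points in 20-D lying close to a 3-D subspace
basis = rng.normal(size=(3, 20))
normal = rng.normal(size=(500, 3)) @ basis + 0.05 * rng.normal(size=(500, 20))
# Synthetic "anomalies": points without that low-dimensional structure
anomalous = 3.0 * rng.normal(size=(10, 20))

# Fit the linear stand-in on normal data only: top-3 principal directions
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:3]  # shape (3, 20)

def reconstruct(x):
    z = (x - mean) @ components.T  # "encode" to 3 dimensions
    return z @ components + mean   # "decode" back to 20 dimensions

def reconstruction_error(x):
    # Per-sample mean absolute error, as in calculate_reconstruction_loss
    return np.mean(np.abs(x - reconstruct(x)), axis=1)

err_normal = reconstruction_error(normal)
err_anomalous = reconstruction_error(anomalous)
print(f"mean error, normal:    {err_normal.mean():.4f}")
print(f"mean error, anomalous: {err_anomalous.mean():.4f}")
```

Because the projection was fitted only to the structured data, points that do not share that structure lose most of their information in the bottleneck and come back with a much larger error.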

For instance, consider the following reconstruction losses:

Average Reconstruction Loss for Normal Data: 0.0315
Reconstruction Loss for Anomalous Data: 0.4479

The substantial difference between these values highlights the autoencoder's proficiency in reconstructing normal data and its difficulty with anomalous data. By setting a threshold based on the reconstruction error distribution of normal data, we can classify data points with errors exceeding this threshold as anomalies.

This approach leverages the autoencoder's ability to model normal data distributions, making it a powerful tool for anomaly detection in various applications.
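One simple way to turn the error distribution into a decision rule, sketched here under assumptions, is to take a high percentile of the errors on normal data as the cut-off. The 99th percentile, the helper names, and the synthetic error values below are all illustrative; in this post's setting, `reconstruction_loss_normal` from the code above would play the role of `normal_errors`.

```python
import numpy as np

def fit_threshold(normal_errors, percentile=99.0):
    """Cut-off below which `percentile` percent of normal-data errors fall."""
    return float(np.percentile(normal_errors, percentile))

def is_anomaly(errors, threshold):
    """Boolean mask: True where the reconstruction error exceeds the threshold."""
    return np.asarray(errors) > threshold

# Synthetic stand-in for per-sample errors on normal data (not real results)
rng = np.random.default_rng(0)
normal_errors = rng.gamma(shape=4.0, scale=0.008, size=10_000)

threshold = fit_threshold(normal_errors)

# Roughly the two sample losses printed in the output above
new_errors = np.array([0.018, 0.448])
flags = is_anomaly(new_errors, threshold)
print(f"threshold={threshold:.4f}, flags={flags.tolist()}")
```

The percentile sets the expected false-positive rate on normal data (about 1% here), so it is a tuning knob rather than a fixed rule.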

Anomaly detection using autoencoders is widely applied across various industries to identify unusual patterns that may indicate critical incidents, such as fraud, structural defects, or network intrusions. Key applications include:

Finance: Detecting fraudulent transactions by identifying deviations from typical spending behaviors.

Healthcare: Identifying rare diseases or unusual patient conditions that may require special attention.

Manufacturing: Monitoring equipment for signs of failure or defects in production processes.

Cybersecurity: Detecting unauthorized access or malicious activities within networks.

By training autoencoders on normal operational data, these systems can effectively flag anomalies, enabling timely interventions and maintaining system integrity.

PCA and Autoencoders: A CEO's Strategic Tools for Workforce Optimization

In the fast-paced world of IT, CEOs often face the daunting task of downsizing their workforce while maintaining operational efficiency and innovation. Just as data scientists employ Principal Component Analysis (PCA) and Autoencoders to distill complex data into its most essential components, CEOs can leverage these techniques to make informed decisions about their teams.

Here is Sreeni's take on where the ideas behind PCA and Autoencoders can be applied.

PCA: Identifying Key Contributions

Principal Component Analysis serves as a powerful tool for simplifying large datasets by identifying the directions (principal components) that account for the most variance. For an IT CEO, this process is akin to analyzing the skill sets and contributions of each employee. By determining which roles or skills provide the most value to the organization, leaders can make targeted decisions about whom to retain and whom to let go. This not only preserves critical talent but also ensures that the organization can adapt quickly to changing market demands.

Autoencoders: Streamlining Operations

Similarly, Autoencoders can be viewed as a means of compressing and reconstructing data, enabling a more efficient representation of information. In the context of workforce management, this reflects the need to streamline operations without sacrificing quality. CEOs can use this analogy to identify overlapping roles, redundant processes, and areas where automation can enhance productivity. By focusing on essential functions and removing excess, they can create a leaner, more agile organization that is better equipped to thrive in a competitive landscape.

Balancing Cost and Innovation

Ultimately, both PCA and Autoencoders illustrate the importance of strategic reduction in complexity, whether in data analysis or workforce management. For IT CEOs, the goal is to balance cost efficiency with the need for innovation. By applying these analytical frameworks, leaders can navigate the difficult waters of downsizing, ensuring that their organization remains robust and forward-thinking, even in challenging times.

Thanks
Sreeni Ramadorai.
