We have developed and released a Python library that makes it easy to experiment with a Restricted Boltzmann Machine (RBM), a classic machine learning model. The main developer of this library is Mr. Kobayashi, and the project is available under the MIT License.
https://github.com/watanabe-appi/simple_rbm
The library can be used easily both in a local Python environment and in Google Colab. If CuPy is available, it also supports GPGPU acceleration.
In the following sections, I will introduce the basics of RBMs and provide a step-by-step guide to using this library.
What Is a Restricted Boltzmann Machine?
Originally, Hinton and Sejnowski proposed the Boltzmann Machine as an associative memory network. This model corresponds to a physical system in which spins are arranged on a network, and system states appear according to a defined energy function and the Boltzmann distribution. By properly learning the network weights, the model can memorize and represent various types of data.
Although the Boltzmann Machine is theoretically fascinating, its training cost is extremely high, making it impractical for many real-world applications.
To address this issue, the Restricted Boltzmann Machine (RBM) was introduced. In this model, units are divided into two groups: a visible layer and a hidden layer. By prohibiting connections between units within the same group, the model becomes much more efficient to train.
While RBMs often do not match the performance of similarly sized deep neural networks, they remain theoretically interesting. Their strong connections to statistical physics make them an attractive subject for research.
An RBM can be viewed as a network in which spins are placed on nodes. Each node has a bias parameter that controls how likely the corresponding spin is to point “up,” and each edge has a weight parameter that determines whether the two connected spins tend to align in the same direction.
Training an RBM means optimizing these bias and weight parameters so that the model exhibits the desired behavior.
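To make this concrete, the standard energy function of an RBM with 0/1 units can be written as follows (the sign and spin conventions in simple_rbm may differ, so treat this as the textbook form rather than the library's exact definition). Here \(a_i\) and \(b_j\) are the visible and hidden biases, and \(W_{ij}\) are the edge weights:

```latex
E(\mathbf{v}, \mathbf{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j,
\qquad
P(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z}
```

States with lower energy are more probable under the Boltzmann distribution, so training shapes the energy landscape so that the training data sit in low-energy valleys.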
An RBM can memorize given data in advance—for example, images. Once trained, it can reconstruct an input image from its internal representation.
This is somewhat similar to how humans recall information: imagine you are shown a handwritten character, then it is hidden, and you are asked to write down the character that was on the paper. You may not reproduce the exact same strokes, but you can reproduce the same character. In a similar way, a trained RBM reconstructs inputs based on what it has learned.

For example, when we see an image of the digit “9,” we compress the visual information and recognize it as the abstract concept “the number 9.” From that abstract representation, we can then reconstruct the character “9” again.
An RBM can perform a similar process. In the following sections, we will use the MNIST handwritten digit dataset as an example to train an RBM and demonstrate image reconstruction.
Using the RBM Library in Google Colab
Although the library can also be used in a local environment, using Google Colab is the easiest way to get started. Let’s try it there. First, open a new notebook in Google Colab.
Installing the Library
In the first cell, install the RBM library using pip:
!pip install "git+https://github.com/watanabe-appi/simple_rbm.git"
Importing the Required Libraries
Next, import all the necessary libraries:
import tensorflow as tf
from PIL import Image
from simple_rbm import RBM
import numpy as np
import IPython
import matplotlib.pyplot as plt
Because the library was installed in the previous step, you can now import the RBM class from the simple_rbm package.
Initializing the RBM
Next, initialize the RBM. The required parameters are the number of visible units and the number of hidden units. Since MNIST images are 28×28 pixels, we set the number of visible units accordingly. The hidden layer can in principle be any size; for compression it is usually chosen smaller than the visible layer. Here, let’s use 64 units.
rbm = RBM(visible_num=28 * 28, hidden_num=64)
Preparing the Data
Now, prepare the MNIST dataset. Because an RBM is an unsupervised learning model, we only use the image data and ignore the labels.
We also normalize the pixel values to the range 0.0 to 1.0.
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = np.array(x_train) / 255
x_test = np.array(x_test) / 255
x_train = x_train.reshape(-1, 28 * 28).astype(np.float32)
x_test = x_test.reshape(-1, 28 * 28).astype(np.float32)
Here, x_train is the training dataset and x_test is the test dataset.
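As a quick sanity check, the same preprocessing can be verified on a dummy array standing in for MNIST (the variable names `dummy` and `x` here are ours, purely for illustration):

```python
import numpy as np

# Dummy stand-in for MNIST: 5 grayscale images of 28x28 uint8 pixels.
dummy = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Same preprocessing as above: scale to [0, 1] and flatten each image.
x = (dummy / 255).reshape(-1, 28 * 28).astype(np.float32)

print(x.shape)  # (5, 784)
```

Each image becomes one 784-dimensional row with values between 0.0 and 1.0, which matches the visible layer size chosen earlier.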
Training the RBM
To train the RBM, simply pass the dataset and call fit, as in many similar frameworks. You can also specify the number of epochs and the batch size.
rbm.fit(x_train, epochs=10, batch_size=1000)
You will see output similar to the following:
# Computation will proceed on the CPU.
Epoch [1/10], KL Divergence: 0.3689
Epoch [2/10], KL Divergence: 0.2504
Epoch [3/10], KL Divergence: 0.2144
Epoch [4/10], KL Divergence: 0.1982
Epoch [5/10], KL Divergence: 0.1875
Epoch [6/10], KL Divergence: 0.1797
Epoch [7/10], KL Divergence: 0.1736
Epoch [8/10], KL Divergence: 0.1685
Epoch [9/10], KL Divergence: 0.1645
Epoch [10/10], KL Divergence: 0.1612
Since GPU acceleration was not specified, a message indicates that computation will proceed on the CPU.
As the cost function, we use the Kullback–Leibler (KL) divergence. Training is performed using the Contrastive Divergence (CD) algorithm. The input image is first encoded into the hidden layer, then reconstructed back to the visible layer. The weights are updated so that the reconstructed image becomes closer to the original input.
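To make the training loop concrete, here is a minimal NumPy sketch of one CD-1 update for a Bernoulli–Bernoulli RBM with 0/1 units. This illustrates the general algorithm, not the actual implementation inside simple_rbm; the names `W`, `a`, `b`, `lr`, and `cd1_step` are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr=0.1):
    """One contrastive-divergence (CD-1) update on a batch v0 of shape (n, n_visible)."""
    # Positive phase: hidden activation probabilities given the data, then a sample.
    h0_prob = sigmoid(v0 @ W + b)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(np.float32)
    # Negative phase: reconstruct the visible layer, then re-infer hidden probabilities.
    v1_prob = sigmoid(h0 @ W.T + a)
    h1_prob = sigmoid(v1_prob @ W + b)
    # Move the parameters so the reconstruction statistics approach the data statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    a += lr * (v0 - v1_prob).mean(axis=0)
    b += lr * (h0_prob - h1_prob).mean(axis=0)
    return v1_prob
```

Repeating this update over batches of the training data is what drives the reconstructions closer to the inputs, epoch by epoch.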
A Helper Function for Image Reconstruction
Next, we let the RBM reconstruct input images. To visualize the results, we define a helper function:
def show_restored_image(input, output):
    fig, axes = plt.subplots(1, 2, figsize=(4, 2))
    axes[0].axis('off')
    axes[0].set_title('Input Image')
    axes[0].imshow(input.reshape((28, 28)), cmap='gray')
    axes[1].axis('off')
    axes[1].set_title('Restored Image')
    axes[1].imshow(output.reshape((28, 28)), cmap='gray')
    plt.show()
This function simply takes the original image (input) and the RBM-reconstructed image (output) and displays them side by side using Matplotlib.
Reconstructing Images
Now, let’s reconstruct some images.
We use x_test, the portion of the dataset that was not used for training. Passing this data to rbm.reconstruct returns reconstructed images from the trained RBM.
The reconstruction is computed through the following procedure:
- Fix the visible layer units to the input data and sample the hidden layer units.
- Fix the sampled hidden layer units and compute the expected values of the visible layer units.
The simple_rbm library uses a Bernoulli–Bernoulli model, i.e., binary (Ising-spin-like) units in both the visible and hidden layers. However, RBM.reconstruct returns the expected values of the visible units, so the output consists of real-valued numbers.
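The two-step procedure above can be sketched in NumPy as follows. This is again a generic 0/1-unit illustration with our own variable names, not the library’s internal code:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v, W, a, b):
    """Clamp the visible layer to v, sample the hidden layer, return expected visible values."""
    # Step 1: sample hidden units given the (fixed) visible input.
    h_prob = sigmoid(v @ W + b)
    h = (rng.random(h_prob.shape) < h_prob).astype(np.float32)
    # Step 2: expected value of each visible unit given the sampled hidden state.
    return sigmoid(h @ W.T + a)
```

Because step 2 returns expectations rather than samples, the output is real-valued, which is why the reconstructed images come out as grayscale rather than strictly black and white.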
The following code feeds the first 10 test images into the RBM and displays their reconstructions:
for i in range(10):
    show_restored_image(x_test[i], rbm.reconstruct(x_test[i].reshape(1, 28 * 28))[0])
The output looks like this:
Although the reconstructed images are not identical to the inputs, we can clearly recognize that they represent the same digits.
In this example, the RBM compresses the original 28×28 = 784-unit binary visible representation into a 64-unit binary hidden representation, and then reconstructs it back to 784 units. This demonstrates how the RBM performs information compression and reconstruction.
Other Usage Options
Using GPGPU
If you want to enable GPGPU acceleration, specify use_GPU=True in the constructor:
rbm = RBM(visible_num=28 * 28, hidden_num=64, use_GPU=True)
If CuPy is available in your environment, GPU acceleration will be automatically used when calling RBM.fit.
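A common way to implement this kind of optional CuPy support is to import CuPy as a drop-in NumPy replacement and fall back to NumPy when it is missing. This is a general pattern, not necessarily how simple_rbm detects the GPU internally:

```python
# Try CuPy first; fall back to NumPy on CPU-only machines.
try:
    import cupy as xp
    on_gpu = True
except ImportError:
    import numpy as xp
    on_gpu = False

print("Computation will proceed on the", "GPU." if on_gpu else "CPU.")

# All subsequent array math is written against `xp`,
# so the same code runs on either backend.
z = xp.zeros((2, 3))
```

Because CuPy mirrors most of the NumPy API, code written this way needs no further branching between the two backends.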
# GPU usage has been enabled. Computation will proceed on the GPU.
Epoch [1/10], KL Divergence: 0.3716
Epoch [2/10], KL Divergence: 0.2513
Epoch [3/10], KL Divergence: 0.2144
Epoch [4/10], KL Divergence: 0.1968
Epoch [5/10], KL Divergence: 0.1857
Epoch [6/10], KL Divergence: 0.1780
Epoch [7/10], KL Divergence: 0.1723
Epoch [8/10], KL Divergence: 0.1677
Epoch [9/10], KL Divergence: 0.1639
Epoch [10/10], KL Divergence: 0.1607
In Google Colab, simply selecting a GPU runtime enables the accelerated version of the RBM. The GitHub repository also provides a sample Google Colab notebook demonstrating GPU usage.
Using the Library in Your Own Project
If you want to use the RBM library in your own code, creating a virtual environment is recommended:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install "git+https://github.com/watanabe-appi/simple_rbm.git"
After this setup, the RBM library will be ready to use in your project.
Conclusion
In this article, we introduced how to use the RBM library developed in our laboratory. RBMs have a simple structure, which makes them both analytically tractable and intellectually interesting. We hope this library will contribute to further research and experimentation with RBMs.
