
Trevor Lee


PyTorch Introductory Experiments


In this project, I revisit some of my earlier basic TensorFlow Deep Learning experiments / training exercises.

This time, however, the experiments are reimplemented with some differences:

  • They use PyTorch as the deep learning framework (still with "dense" layers only)
  • They no longer target microcontrollers, but they are still demonstrable with the help of DumbDisplay, using regular Python and PyTorch

Installation with VSCode

This project is developed with VSCode (with the Python extension), so I will assume a VSCode development environment here as well.

To try along, please clone this project -- PyTorchIntroductoryExperiments -- from GitHub to your local machine, and open the folder with VSCode.

Create a virtual environment by selecting the command Python: Select Interpreter from the command palette and choose to create a new virtual environment.

In the process, you will be given the option to also install the required packages specified in the requirements.txt file.
If you missed installing those packages, you can still install them, including the MicroPython DumbDisplay Library, from a terminal (with the virtual environment activated) by running pip like

pip install -r requirements.txt

I have tested the Python code of the project to work with Python 3.12.8

With an older version of Python, you might see an error like the following when installing the packages

ModuleNotFoundError: No module named 'setuptools.config.expand'; 'setuptools.config' is not a package

If so, try upgrading pip like

python -m pip install --upgrade pip

then run

pip install -r requirements.txt

again.

When you open one of the Jupyter Notebooks, the components it needs will be installed when necessary.

"Hello World" Deep Learning Model Training

The "Hello World" of Deep Learning here refers to the training of a DL model for the mathematical sine function -- train_sine.ipynb.

I guess modeling the mathematical sine function is considered the "Hello World" of Deep Learning because
1) The training data is very easy to generate.
2) The architecture of the model can be just a few "dense" layers.
3) Implementing the sine function is not as trivial as it sounds, which makes it a good demonstration of the Deep Learning "magic".

Moreover, just for fun, I extended the architecture of the sine model to output cosine values at the same time -- train_sine_cosine.ipynb.

Highlights of train_sine.ipynb

The target of this "Hello World" model is the mathematical sine function over the input range from 0 to 2π (0° to 360°).

Hence, the data for training the model can be generated randomly like

import math
import numpy as np

x_values = np.random.uniform(low=0, high=2*math.pi, size=SAMPLES)  # random angles in radians
np.random.shuffle(x_values)
y_values = np.sin(x_values)  # the expected outputs (labels)

Since the default datatype for PyTorch is float32, first convert the data to float32, and reshape them as

x_values = x_values.astype(np.float32).reshape(-1, 1)
y_values = y_values.astype(np.float32).reshape(-1, 1)

Notes:

  • x_values is a two-dimensional array with a single column, as the input to the model is a single value (the input angle in radians)
  • y_values is also a two-dimensional array with a single column, as the output of the model is also a single value (the sine value of the input angle)

Then, 70% of the data is used as the "train" data set, and the remaining 30% as the "test" data set

train_split = int(0.7 * SAMPLES)

x_train, x_test = np.split(x_values, [train_split])
y_train, y_test = np.split(y_values, [train_split])
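The data preparation steps above can be put together as one small sketch, to check the resulting shapes and datatype. The value of SAMPLES and the seeded generator are my assumptions here, just to make the run repeatable; the notebook may use different values.

```python
import math
import numpy as np

SAMPLES = 1000  # assumed sample count; the notebook's actual value may differ

rng = np.random.default_rng(seed=42)  # seeded generator (assumption), for repeatable runs
x_values = rng.uniform(low=0, high=2 * math.pi, size=SAMPLES)
y_values = np.sin(x_values)

# float32 to match PyTorch's default datatype; one column per sample
x_values = x_values.astype(np.float32).reshape(-1, 1)
y_values = y_values.astype(np.float32).reshape(-1, 1)

# first 70% for training, the rest for testing
train_split = int(0.7 * SAMPLES)
x_train, x_test = np.split(x_values, [train_split])
y_train, y_test = np.split(y_values, [train_split])

print(x_train.shape, x_test.shape)  # (700, 1) (300, 1)
print(x_train.dtype)                # float32
```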

Now, turn the above data into something suitable for the PyTorch DL framework.

First define the dataset class to capture the input data

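from torch.utils.data import Dataset, DataLoader

class SineDataset(Dataset):
    def __init__(self, x_values, y_values):
        self.x_values = x_values  # inputs, shape (N, 1)
        self.y_values = y_values  # labels, shape (N, 1)

    def __len__(self):
        # number of samples
        return len(self.x_values)

    def __getitem__(self, idx):
        # the (input, label) pair at the given index
        return (self.x_values[idx], self.y_values[idx])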
  • __len__ -- the size of the data
  • __getitem__ -- given an index of the data, returns the (input, label) pair

With SineDataset, instantiate data loaders (DataLoader) for the "train" and "test" data sets

train_dataset = SineDataset(x_train, y_train)
test_dataset = SineDataset(x_test, y_test)

train_dataloader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=BATCH_SIZE)
  • Here, BATCH_SIZE is the number of samples in each batch; every training epoch walks through all the batches.
  • For the "train" data set, shuffle=True reshuffles the data at every epoch, which generally helps training.
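To see what a data loader actually hands to the training code, here is a minimal sketch that pulls one batch out of it. The BATCH_SIZE of 64 and the 200 evenly spaced samples are assumptions for illustration only.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

BATCH_SIZE = 64  # assumed value, for illustration

class SineDataset(Dataset):
    def __init__(self, x_values, y_values):
        self.x_values = x_values
        self.y_values = y_values

    def __len__(self):
        return len(self.x_values)

    def __getitem__(self, idx):
        return (self.x_values[idx], self.y_values[idx])

# 200 stand-in samples, already float32 and shaped (N, 1)
x = np.linspace(0, 2 * np.pi, 200, dtype=np.float32).reshape(-1, 1)
y = np.sin(x)

loader = DataLoader(SineDataset(x, y), batch_size=BATCH_SIZE, shuffle=True)

# the loader converts the NumPy values to tensors and stacks them into batches
X_batch, y_batch = next(iter(loader))
print(X_batch.shape, X_batch.dtype)  # torch.Size([64, 1]) torch.float32
print(len(loader))                   # 4 batches: 64 + 64 + 64 + 8 samples
```

Note that the last batch is smaller than BATCH_SIZE, which is why train() and test() average the loss over the number of batches rather than the number of samples.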

As the core, define the model class

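from torch import nn

class SineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(1, 16),   # input layer: 1 -> 16
            nn.ReLU(),
            nn.Linear(16, 16),  # hidden layer: 16 -> 16
            nn.ReLU(),
            nn.Linear(16, 1)    # output layer: 16 -> 1 (no activation)
        )

    def forward(self, x):
        # forward pass through the stack of layers
        return self.linear_relu_stack(x)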
  • The model basically consists of:
    • an input "dense" layer -- a single input (the input angle in radians); 16 outputs, with ReLU activation
    • another "dense" layer -- 16 inputs; 16 outputs, with ReLU activation
    • an output "dense" layer -- 16 inputs; a single output (the sine value of the input angle)
  • During training / inference, the method forward is invoked whenever the model is called, as in model(X). Here, it simply passes the input through the stack of layers to do the forward pass
    • x is the input -- from the dataset SineDataset during training; from the inference input during inference
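To get a feel for how small this model is, the following sketch counts its trainable parameters and runs a dummy batch through it. The batch of 8 zeros is just a stand-in input.

```python
import torch
from torch import nn

class SineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(1, 16),
            nn.ReLU(),
            nn.Linear(16, 16),
            nn.ReLU(),
            nn.Linear(16, 1)
        )

    def forward(self, x):
        return self.linear_relu_stack(x)

model = SineModel()

# each nn.Linear(i, o) holds i * o weights plus o biases
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

total = sum(p.numel() for p in model.parameters())
print(total)  # 321 = (1*16 + 16) + (16*16 + 16) + (16*1 + 1)

# calling the model invokes forward(); one output per input row
out = model(torch.zeros(8, 1))
print(out.shape)  # torch.Size([8, 1])
```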

It is equally important to define the training and testing functions, which involve
the "loss function" and the "optimizer". Later on, the "loss function" and "optimizer" will be selected from the standard ones that come with the PyTorch framework.

For training:

def train(device, dataloader, model, loss_fn, optimizer) -> float:
    num_batches = len(dataloader)
    model.train()
    train_loss = 0
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)
        train_loss += loss.item()

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    train_loss /= num_batches
    return train_loss
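  • The function train will be called once per epoch during training (as will be apparent in the later training loop)
  • model.train() puts the model in training mode. Gradients ("grad info") are tracked by default, and are accumulated into the parameters by each loss.backward() call.
  • For each batch:
    • prediction is made with the model -- pred = model(X)
    • loss is calculated -- loss = loss_fn(pred, y)
    • back-propagation involves:
      • loss.backward() -- compute the gradients of the loss with respect to the model's parameters
      • optimizer.step() -- update the parameters using those gradients
      • optimizer.zero_grad() -- reset the "grad info", for the next batch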
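The three back-propagation calls can be observed one at a time on a tiny stand-in model. This is only a sketch: the single nn.Linear layer and the two-sample batch are my assumptions, chosen to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(1, 1)  # tiny stand-in for the real model
loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)

X = torch.tensor([[0.5], [1.0]])
y = torch.sin(X)

weight_before = model.weight.detach().clone()

# 1) backward() fills in .grad for every parameter
loss = loss_fn(model(X), y)
loss.backward()
has_grad_after_backward = model.weight.grad is not None

# 2) step() uses those gradients to update the parameters
optimizer.step()
weight_changed = not torch.equal(weight_before, model.weight.detach())

# 3) zero_grad() clears the gradients (set to None or to zeros,
#    depending on the PyTorch version) so they don't leak into the next batch
optimizer.zero_grad()
grad = model.weight.grad
grad_cleared = grad is None or bool((grad == 0).all())

print(has_grad_after_backward, weight_changed, grad_cleared)
```

Without the zero_grad() call, the gradients of the next batch would be added on top of the current ones.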

For testing:

def test(device, dataloader, model, loss_fn) -> float:
    num_batches = len(dataloader)
    model.eval()
    test_loss = 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
    test_loss /= num_batches
    return test_loss
  • The function test will be called during training as well, but only for evaluation: model.eval() puts the model in evaluation mode, and no grad info is recorded inside the with torch.no_grad() block
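The effect of torch.no_grad() can be seen directly on the output tensors. A minimal sketch, with a single nn.Linear layer standing in for the real model:

```python
import torch
from torch import nn

model = nn.Linear(1, 1)  # stand-in for the real model
X = torch.ones(4, 1)

# normal call: the output is attached to the autograd graph
tracked = model(X)

# inside no_grad(): same numbers, but nothing is recorded for back-propagation
with torch.no_grad():
    untracked = model(X)

print(tracked.requires_grad, untracked.requires_grad)  # True False
```

Skipping the graph bookkeeping saves memory and time, which is all that is needed when the loss is only measured and never back-propagated.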

Finally, the actual training loop can be implemented as

EPOCH_COUNT = 600

loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)

for t in range(EPOCH_COUNT):
    train_loss = train(device, train_dataloader, model, loss_fn, optimizer)
    test_loss = test(device, test_dataloader, model, loss_fn)
    print(f"$ Epoch {t+1} / {EPOCH_COUNT} -- Train loss: {train_loss:.6f} / Test loss: {test_loss:.6f}")
  • EPOCH_COUNT -- the number of epochs for the training
  • The "loss function" selection is MSELoss
  • The "optimizer" selection is RMSprop
  • In lockstep, train() and test() are called for each epoch iteration.
    • all batches of the "train" dataset are walked through in train(), with back-propagation to alter the model's parameters
    • all batches of the "test" dataset are walked through in test(), just to find out the average loss over the "test" dataset
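The excerpts above use device and model without showing how they are created. The following condensed sketch is one way to fill in those pieces; the device selection line, the 512 random samples, the manual mini-batch slicing (instead of a DataLoader), and the 100 epochs are all my assumptions, not necessarily what the notebook does.

```python
import math
import torch
from torch import nn

torch.manual_seed(0)

# assumed device selection: use the GPU when available, otherwise the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# same layer stack as SineModel, written inline to keep the sketch short
model = nn.Sequential(
    nn.Linear(1, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1),
).to(device)

X = torch.rand(512, 1, device=device) * 2 * math.pi  # random angles in [0, 2π)
y = torch.sin(X)

loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)

with torch.no_grad():
    initial_loss = loss_fn(model(X), y).item()

for epoch in range(100):
    for i in range(0, len(X), 64):  # slices of 64 samples act as mini-batches
        loss = loss_fn(model(X[i:i + 64]), y[i:i + 64])
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

with torch.no_grad():
    final_loss = loss_fn(model(X), y).item()

print(f"initial loss: {initial_loss:.6f} / final loss: {final_loss:.6f}")
```

The final loss should come out clearly lower than the initial one; how low it gets depends on the number of epochs and the random initialization.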

To evaluate the trained model, you can run it on evenly spaced angles from 0° to 360°, in steps of 5°, like:

x_values = [math.pi * (5 * a) / 180.0 for a in range(0, 73)]
x_values = np.array(x_values).reshape(-1, 1)
x_values = torch.tensor(x_values, device=device, dtype=torch.float32)
model.eval()
pred_y_values = model(x_values).tolist()

Highlights of train_sine_cosine.ipynb

Like the previous "Hello World" model, this extended model (sine + cosine) also has an input range from 0 to 2π (0° to 360°).

Hence, to create the data for training the model, random input values are generated and both outputs are computed from them -- y_sin_values and y_cos_values

x_values = np.random.uniform(low=0, high=2*math.pi, size=SAMPLES)
np.random.shuffle(x_values)
y_sin_values = np.sin(x_values)
y_cos_values = np.cos(x_values)

Again, since the default datatype for PyTorch is float32, the data is converted to float32 and reshaped as

x_values = x_values.astype(np.float32).reshape(-1, 1)         
y_sin_values = y_sin_values.astype(np.float32).reshape(-1, 1) 
y_cos_values = y_cos_values.astype(np.float32).reshape(-1, 1)

Notes:

  • The input data x_values is the same as the previous model -- a two-dimensional array with a single column
  • Both of the output data y_sin_values and y_cos_values are also two-dimensional arrays with a single column -- the sine and cosine values respectively

As there are two sets of output values, it follows that there will also be two sets of values for the train and test data

x_train, x_test = np.split(x_values, [train_split])
y_sin_train, y_sin_test = np.split(y_sin_values, [train_split])
y_cos_train, y_cos_test = np.split(y_cos_values, [train_split])

And the two sets of output data are stacked together for the model like

y_train = np.stack((y_sin_train, y_cos_train), axis=-1).reshape(-1, 2)
y_test = np.stack((y_sin_test, y_cos_test), axis=-1).reshape(-1, 2)

Notes:

  • y_train is two-dimensional with two columns -- 1st column is for sine; 2nd column is for cosine
  • hence, y_train[0] are the sine and cosine values of the first input angle for the "train" data
  • similarly, y_test has the same shape as y_train, but for the "test" data

In other words, training still uses a single pair of output arrays -- y_train for the "train" data and y_test for the "test" data -- each now with two columns

To capture the data set and the model, the following classes -- SineCosineDataset and SineCosineModel -- are defined

class SineCosineDataset(Dataset):
    def __init__(self, x_values, y_values):
        self.x_values = x_values
        self.y_values = y_values

    def __len__(self):
        return len(self.x_values)

    def __getitem__(self, idx):
        return self.x_values[idx], self.y_values[idx]

class SineCosineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(1, 16),
            nn.ReLU(),
            nn.Linear(16, 16),
            nn.ReLU(),
            nn.Linear(16, 2)
        )

    def forward(self, x):
        return self.linear_relu_stack(x)        

Notice that there are 2 output values as the last layer of SineCosineModel -- one for sine and one for cosine
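Since both values come out of the same output tensor, the predictions need to be split by column afterwards. A minimal sketch, using an untrained model and three arbitrary angles (so the predicted numbers themselves are meaningless):

```python
import torch
from torch import nn

class SineCosineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(1, 16),
            nn.ReLU(),
            nn.Linear(16, 16),
            nn.ReLU(),
            nn.Linear(16, 2)
        )

    def forward(self, x):
        return self.linear_relu_stack(x)

model = SineCosineModel()  # untrained, only the shapes matter here
model.eval()

angles = torch.tensor([[0.0], [1.0], [2.0]])  # arbitrary angles in radians
with torch.no_grad():
    pred = model(angles)  # shape (3, 2): one row per angle

# column 0 is the sine prediction, column 1 the cosine prediction,
# matching the column order of y_train
pred_sin, pred_cos = pred[:, 0], pred[:, 1]
print(pred.shape, pred_sin.shape, pred_cos.shape)
```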

The rest of the code for training the model is basically the same as the sine one.

As with the sine model, to evaluate the trained model, you can run it on evenly spaced angles from 0° to 360°, in steps of 5°, like:

x_values = [math.pi * (5 * a) / 180.0 for a in range(0, 73)]
x_values = np.array(x_values).reshape(-1, 1)
x_values = torch.tensor(x_values, device=device, dtype=torch.float32)
model.eval()
pred_y_values = model(x_values).tolist()

MNIST Dataset DL Training and Demo App

The popular MNIST dataset is also frequently used as an introductory demonstration of Deep Learning -- train_mnist.ipynb.

In this project, not only is a simple DL training on the MNIST dataset presented; a demonstration UI is also realized wirelessly on your Android mobile phone with the help of DumbDisplay -- start_dd_mnist.py

The UI is coded with (and driven by) Python using the MicroPython DumbDisplay Library package, and is realized wirelessly on your Android mobile phone with the DumbDisplay Android App

Nowadays, code generation with AI is a common practice. Indeed, I did prompt an LLM to generate a Jupyter Notebook to train a PyTorch DL model for the MNIST dataset -- train_mnist_ai.ipynb.

The AI-generated DL model is more complex (and certainly more standard) than the one presented in train_mnist.ipynb: mine basically only involves "dense" layers, while the AI-generated one involves "convolutional" layers.

The actual model architecture (the simple version) is captured by the class MnistModel defined in the file model_mnist.py

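import torch.nn.functional as F
from torch import nn

class MnistModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(784, 64),  # 784 = 28 x 28 flattened pixels
            nn.ReLU(),
            nn.Dropout(0.3),     # only active in training mode
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, 10)    # one output per digit 0-9
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return F.log_softmax(logits, dim=1)  # log-probabilities over the 10 digits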

Please refer to train_mnist.ipynb for the complete training exercise.
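As a quick illustration of what MnistModel expects and returns, here is a sketch with random stand-in images. Because forward ends with log_softmax, the outputs are log-probabilities, which pair naturally with the negative log-likelihood loss (F.nll_loss); whether the notebook uses exactly that loss is my assumption.

```python
import torch
import torch.nn.functional as F
from torch import nn

class MnistModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(784, 64),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return F.log_softmax(logits, dim=1)

model = MnistModel().eval()  # untrained; eval() switches the dropout off

images = torch.rand(5, 1, 28, 28)  # random stand-ins for 5 grayscale digits
x = images.reshape(-1, 784)        # "dense" layers need each image flattened

with torch.no_grad():
    log_probs = model(x)

probs = log_probs.exp()            # back from log-probabilities to probabilities
print(log_probs.shape)             # torch.Size([5, 10])
print(probs.sum(dim=1))            # each row sums to about 1

# assumed loss: negative log-likelihood against made-up labels 0..4
labels = torch.tensor([0, 1, 2, 3, 4])
loss = F.nll_loss(log_probs, labels)
print(loss.item())
```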

Demo UI for the Trained MNIST DL Model

The Python script start_dd_mnist.py starts the DumbDisplay program which uses MicroPython DumbDisplay Library to drive a wireless UI on your Android mobile phone with DumbDisplay Android App.

After starting the Python script start_dd_mnist.py, you should see in the VSCode terminal that it waits for a connection from the DumbDisplay Android App.

**********
*****
*** reading model output/mnist_model.pth ...
*****
**********



**********
*****
*** starting mnist with model output/mnist_model.pth ...
*****
**********

connecting socket ... listing on 192.168.0.46:10201 ...

Open the DumbDisplay Android App installed on your Android phone and connect to the address shown in the terminal.

When connected, the VSCode terminal should show something like

... connected 192.168.0.46:10201 from ('192.168.0.98', 41578)
Once connected, on the [black] canvas of the UI, draw a digit you want the DL model to recognize, like the digit 8, and press the >>> button to trigger inference on the drawn digit data (stored in the memory of the running Python process)

The inference with PyTorch is actually performed by the following Python function mnist_inference

def mnist_inference(inference_data, model) -> int:
    try:
        x = np.array(inference_data).reshape((1, 784))
        torch_x = torch.tensor(x, dtype=torch.float32).reshape(1, 784)
        output = model(torch_x)
        pred = output.argmax()
        return pred.item()
    except Exception as e:
        print(f"XXX error during inference: {e}")
        raise e

Note that mnist_inference is just a callback function for the DumbDisplay "driver" Python code implemented as an example of MicroPython DumbDisplay Library
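One thing worth keeping in mind for inference: since the model contains a Dropout layer, it should be in evaluation mode, otherwise the same drawing could give different answers on repeated calls. The sketch below illustrates this with an untrained stand-in stack and a made-up drawing; I am assuming the demo script switches the model to eval() when it loads it.

```python
import numpy as np
import torch
from torch import nn

# untrained stand-in with a Dropout layer, like MnistModel
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(), nn.Dropout(0.3), nn.Linear(64, 10)
)

# made-up 28 x 28 drawing: a vertical stroke in the middle
pixels = np.zeros((28, 28))
pixels[4:24, 13:15] = 1.0
inference_data = pixels.flatten().tolist()

x = torch.tensor(np.array(inference_data).reshape(1, 784), dtype=torch.float32)

model.eval()               # dropout off: inference becomes deterministic
with torch.no_grad():      # no grad info needed for inference
    first = model(x)
    second = model(x)

print(torch.equal(first, second))  # True
digit = first.argmax(dim=1).item() # index of the highest score
print(digit)
```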

Now, draw the digit 9 on the canvas ...

If you want to clear what you have drawn, press the clear button.

The center button toggles whether the drawn digit will be auto centered before calling mnist_inference for inference.

It is interesting to see that even without auto-centering, the digit recognition is pretty good, especially with the AI generated model -- start_dd_mnist_ai.py.

Sliding Puzzle DL Training and Demo App

The DL model presented in this project for the classical Sliding Puzzle game is a simple and naive "next move" suggesting model -- train_sliding_puzzle.ipynb

I came up with this simple and naive "next move" suggesting model with reference to the above-mentioned "Hello World" and MNIST DL models, just for fun.

Say, for a 4x4 board.

  • There will be 16 tiles, with the 0th tile being the empty space; e.g. the board "orientation" can be represented as
  |  0 |  1 |  2 |  3 |
  |  4 |  5 |  6 |  7 |
  |  8 |  9 | 10 | 11 |
  | 12 | 13 | 14 | 15 |
  • With respect to the empty space (i.e. the 0th tile), there can be 4 possible moves (but some might be invalid)

    • 0: from left
    • 1: from right
    • 2: from top
    • 3: from bottom
  • For example, the above solved board can be randomized by a single step of the move 1 (from right) to

  |  1 |  0 |  2 |  3 |
  |  4 |  5 |  6 |  7 |
  |  8 |  9 | 10 | 11 |
  | 12 | 13 | 14 | 15 |
  • The "next move" of this randomized board is apparently an "undo" move to undo the randomization step, in this case, the move 0 (from left)

    • 1 => undo with 0
    • 0 => undo with 1
    • 2 => undo with 3
    • 3 => undo with 2
  • In other words, the "undo" moves are the "next moves" toward solving a randomized board

  • The way to capture the "undo" moves can be as easy as

    • during a randomization step, a valid "move" is selected
    • the "undo" move is recorded as the "next move" for the randomized board "orientation"
    • so, for a board randomized by 5 steps, there will be 5 "next moves" recorded -- one for each board "orientation"
  • The board "orientations" are the input to the DL model. The "undo" moves are the output of the DL model.
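The recording of "undo" moves described above can be sketched in plain Python. All the names here (TILE_COUNT, MOVE_OFFSETS, UNDO_MOVE, apply_move, randomize) are hypothetical helpers of mine for illustration; the notebook's actual data generation code is organized differently and is more involved.

```python
import random

TILE_COUNT = 4  # 4x4 board

# where the sliding tile comes from, relative to the empty space: (row, col) offsets
MOVE_OFFSETS = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0)}  # left, right, top, bottom
UNDO_MOVE = {0: 1, 1: 0, 2: 3, 3: 2}

def apply_move(board, move):
    """Return a new board with `move` applied, or None if the move is invalid."""
    empty = board.index(0)
    row, col = divmod(empty, TILE_COUNT)
    d_row, d_col = MOVE_OFFSETS[move]
    src_row, src_col = row + d_row, col + d_col
    if not (0 <= src_row < TILE_COUNT and 0 <= src_col < TILE_COUNT):
        return None  # no tile on that side of the empty space
    src = src_row * TILE_COUNT + src_col
    new_board = list(board)
    new_board[empty], new_board[src] = new_board[src], new_board[empty]
    return new_board

def randomize(steps, rng):
    """Randomize a solved board; return one (board, next_move) sample per step."""
    board = list(range(TILE_COUNT * TILE_COUNT))
    samples = []
    for _ in range(steps):
        candidates = []
        for move in MOVE_OFFSETS:
            moved = apply_move(board, move)
            if moved is not None:
                candidates.append((move, moved))
        move, board = rng.choice(candidates)
        samples.append((board, UNDO_MOVE[move]))  # label = the move that undoes this step
    return samples

solved = list(range(TILE_COUNT * TILE_COUNT))
print(apply_move(solved, 1)[:4])  # [1, 0, 2, 3] -- the example board above
for board, next_move in randomize(5, random.Random(0)):
    print(next_move, board)
```

Each printed line is one training sample: the board "orientation" as the input, and the recorded "undo" move as the label.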

The actual model architecture is captured by the class SlidingPuzzleModel defined in the file model_sliding_puzzle.py

class SlidingPuzzleModel(nn.Module):
    def __init__(self, tile_count):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(tile_count * tile_count, 256),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(256, 256),
            nn.ReLU(),
            nn.Linear(256, 4)
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return F.softmax(logits, dim=1)

Please refer to train_sliding_puzzle.ipynb for the complete training exercise, which is a bit more involved, mostly because of the generation of the training data.

Demo UI for the Trained Sliding Puzzle Model

The demo UI Python script this time is start_dd_sliding_puzzle.py, which is mostly based on the example of MicroPython DumbDisplay Library

After starting the Python script start_dd_sliding_puzzle.py, you should see from VSCode terminal that it waits for connections from the DumbDisplay Android App.

**********
*****
*** reading model output/sp_model_4.pth ...
*****
**********



**********
*****
*** starting sliding puzzle with model output/sp_model_4.pth ...
*****
**********

connecting socket ... listing on 192.168.0.46:10201 ...

You can open the DumbDisplay Android App and connect as before.

Once connected, double-press the sliding puzzle game board to randomize it by 5 steps. You can try to solve the puzzle manually by moving / sliding the appropriate tile to the empty space. If stuck, press the suggest button to have the DL model suggest the "next move" for you.

When "next move" is needed, the following Python function suggest_next_move will be called

def suggest_next_move(board_manager: BoardManager, model, tile_count) -> int:
    x = torch.tensor([board_manager.board_tiles], dtype=torch.float32) / (tile_count * tile_count)
    prediction = model(x)
    move_ans = prediction.argmax(dim=1).item()
    return move_ans
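To make the input encoding in suggest_next_move concrete, here is a sketch of the same steps on a sample board. I am assuming board_tiles is a flat list of the 16 tile numbers; the single-layer model is an untrained stand-in, so the suggested move is meaningless.

```python
import torch
from torch import nn

tile_count = 4

# assumed representation: a flat list of 16 tile numbers, 0 being the empty space
board_tiles = [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

# a batch of one board, scaled down by the number of tiles so values stay below 1
x = torch.tensor([board_tiles], dtype=torch.float32) / (tile_count * tile_count)
print(x.shape)         # torch.Size([1, 16])
print(x.max().item())  # 0.9375, i.e. 15 / 16

# untrained stand-in for SlidingPuzzleModel: 16 inputs -> 4 move probabilities
model = nn.Sequential(nn.Linear(tile_count * tile_count, 4), nn.Softmax(dim=1))
model.eval()

with torch.no_grad():
    prediction = model(x)  # shape (1, 4): one probability per move

move_ans = prediction.argmax(dim=1).item()  # the move with the highest probability
print(move_ans)
```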

Messed up or not, double-press the 🔄 Reset 🔄 button to reset the game; but this time, press the continuous button to have "next move" suggested [and made] continuously.

If things go well, the game should be solved in 5 suggested "next moves".

Once solved, double-press the board to randomize it again; but this time, it will be randomized by 10 steps (5 more than last time).

Once solved again, double-press the board again ...

It is interesting to see how far the board can be randomized with the DL model still able to suggest the correct "next moves" to solve it.
In my experience, the limit is around 15 to 20 randomization steps.

If you are interested, try tuning the model to see if it can handle more randomization steps!

Two Simple DumbDisplay Samples

The Python script test_run_dd.py contains two simple DumbDisplay samples -- run_blink and run_graphical

Hopefully, these two samples can kick-start interested friends into implementing Android apps using Python with DumbDisplay,
for purposes like those of this project.

Have Fun!

Peace be with you!
May God bless you!
Jesus loves you!
Amazing Grace!
