PyTorch Introductory Experiments
In this project, I will revisit some of my previous basic TensorFlow Deep Learning experiments / training exercises
- Trying Out TensorFlow Lite Hello World Model With ESP32 and DumbDisplay
- Mnist Dataset -- From Training to Running With ESP32 / ESP32S3
- Sliding Puzzle 'Next Move' Suggesting Simple DL Model With ESP32 TensorFlow Lite
This time, however, those experiments will be retried (reimplemented) with two differences:
- The experiments will use PyTorch as the deep learning framework (and will still use "dense" layers only)
- They will not target microcontrollers; but they will still be demonstrable with regular Python and PyTorch, with the help of DumbDisplay
Installation with VSCode
This project is developed with VSCode (with the Python extension); here, I will assume a VSCode development environment as well.
To try along, please clone this project -- PyTorchIntroductoryExperiments
-- from GitHub to your local machine, and open the folder with VSCode.
Create a virtual environment by selecting the command Python: Select Interpreter
from the command palette and choosing to create a new virtual environment.
In the process, you will be given an option to also install the required dependent packages specified in the requirements.txt
file.
In case you missed installing those dependent packages, you can still install them, including the MicroPython DumbDisplay Library, from an opened terminal (with the virtual environment activated) by running pip
like
pip install -r requirements.txt
I have tested the Python code of this project with Python 3.12.8.
For older versions of Python, when installing the packages, you might see an error like
ModuleNotFoundError: No module named 'setuptools.config.expand'; 'setuptools.config' is not a package
try upgrading pip
like
python -m pip install --upgrade pip
then run
pip install -r requirements.txt
again.
When you open a Jupyter Notebook, the needed components will be installed as necessary.
"Hello World" Deep Learning Model Training
The "Hello World" of Deep Learning here refers to the training of a DL model for the sine
Mathematical function -- train_sine.ipynb
.
I guess modeling the mathematical function sine is considered the "Hello World" of Deep Learning because
1) The training data is very easy to generate.
2) The architecture of the model can simply be just a few "dense" layers.
3) When it comes to implementation, the mathematical function sine might not be as trivial as it sounds, and is therefore a good demonstration of the Deep Learning "magic".
Moreover, just for fun, I extended the architecture of the sine model to also output cosine values at the same time -- train_sine_cosine.ipynb.
Highlights of train_sine.ipynb
The target of this "Hello World" model is the mathematical function sine for the input range 0 to 2π (0° to 360°).
Hence, to create the data for training the model, one can generate randomized data like
x_values = np.random.uniform(low=0, high=2*math.pi, size=SAMPLES)
np.random.shuffle(x_values)
y_values = np.sin(x_values)
Since the default datatype for PyTorch is float32, first convert the data to float32 and reshape it as
x_values = x_values.astype(np.float32).reshape(-1, 1)
y_values = y_values.astype(np.float32).reshape(-1, 1)
Notes:
- x_values is a two-dimensional matrix, with a single column, as the input to the model is a single value (the input angle in radians)
- y_values is also a two-dimensional matrix, with a single column, as the output of the model is also a single value (the sine value of the input angle)
Then, 70% of the data will be treated as the "train" data set, and the rest as the "test" data set
train_split = int(0.7 * SAMPLES)
x_train, x_test = np.split(x_values, [train_split])
y_train, y_test = np.split(y_values, [train_split])
Now, turn the above data into something suitable for the PyTorch DL framework.
First define the dataset class to capture the input data
class SineDataset(Dataset):
def __init__(self, x_values, y_values):
self.x_values = x_values
self.y_values = y_values
def __len__(self):
return len(self.x_values)
def __getitem__(self, idx):
return (self.x_values[idx], self.y_values[idx])
- __len__ -- the size of the data
- __getitem__ -- given an index of the data, returns the (input, label) pair
With SineDataset, instantiate data loaders (DataLoader) for the "train" and "test" data sets
train_dataset = SineDataset(x_train, y_train)
test_dataset = SineDataset(x_test, y_test)
train_dataloader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=BATCH_SIZE)
- Here, BATCH_SIZE is the size of each training batch for each training epoch.
- For the "train" data set, it is better to shuffle it.
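As a quick sanity check that the data loaders deliver what the model expects, one could peek at a single batch (a minimal sketch, assuming the notebook variables above are in scope):
# fetch one batch from the "train" data loader and inspect the shapes
X, y = next(iter(train_dataloader))
print(X.shape, y.shape)   # both should be torch.Size([BATCH_SIZE, 1]) -- one angle in, one sine value out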
At the core, define the model class
class SineModel(nn.Module):
def __init__(self):
super().__init__()
self.linear_relu_stack = nn.Sequential(
nn.Linear(1, 16),
nn.ReLU(),
nn.Linear(16, 16),
nn.ReLU(),
nn.Linear(16, 1)
)
def forward(self, x):
return self.linear_relu_stack(x)
- The model basically consists of:
  - an input "dense" layer -- a single input (the input angle in radians); 16 outputs, with relu activation
  - another "dense" layer -- 16 inputs; 16 outputs, with relu activation
  - an output "dense" layer -- 16 inputs; a single output (the sine value of the input angle)
- During training / inference, the method forward will be called. Here, it simply calls the stack of layers to do the forward pass
  - x is the input -- from the dataset SineDataset during training; from the inference input during inference
It is equally important to define the training and testing functions, which involve
the "loss function" and the "optimizer". Later on, the "loss function" and "optimizer" will be selected from the standard ones that come with the PyTorch framework.
For training:
def train(device, dataloader, model, loss_fn, optimizer) -> float:
num_batches = len(dataloader)
model.train()
train_loss = 0
for batch, (X, y) in enumerate(dataloader):
X, y = X.to(device), y.to(device)
# Compute prediction error
pred = model(X)
loss = loss_fn(pred, y)
train_loss += loss.item()
# Backpropagation
loss.backward()
optimizer.step()
optimizer.zero_grad()
train_loss /= num_batches
return train_loss
- The function train will be called during training (as will be apparent in the later training loop)
- During training, the model is put into training mode with model.train(), and "grad info" will be accumulated through the forward / backward passes
- For each batch:
  - prediction is made with the model -- pred = model(X)
  - loss is calculated -- loss = loss_fn(pred, y)
  - back-propagation involves: loss.backward() and optimizer.step()
  - optimizer.zero_grad() -- resets the "grad info" for the next batch
For testing:
def test(device, dataloader, model, loss_fn) -> float:
num_batches = len(dataloader)
model.eval()
test_loss = 0
with torch.no_grad():
for X, y in dataloader:
X, y = X.to(device), y.to(device)
pred = model(X)
test_loss += loss_fn(pred, y).item()
test_loss /= num_batches
return test_loss
- The function test will be called during training as well, but for testing, the model will not accumulate "grad info", as torch.no_grad() is called
Finally, the actual training loop can be implemented as
EPOCH_COUNT = 600
loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)
for t in range(EPOCH_COUNT):
train_loss = train(device, train_dataloader, model, loss_fn, optimizer)
test_loss = test(device, test_dataloader, model, loss_fn)
print(f"$ Epoch {t+1} / {EPOCH_COUNT} -- Train loss: {train_loss:.6f} / Test loss: {test_loss:.6f}")
- EPOCH_COUNT -- the number of epochs for the training
- The "loss function" selection is MSELoss
- The "optimizer" selection is RMSprop
- In lock step, train() and test() are called for each epoch iteration
  - all batches of the "train" dataset are walked through in train(), with back-propagation to alter the model's parameters
  - all batches of the "test" dataset are walked through just to find out the average loss of the "test" dataset
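After the training loop finishes, the learned weights would typically be saved so that a separate script can load them later; a minimal sketch of the usual PyTorch pattern (the file name output/sine_model.pth is just illustrative, not necessarily the one used in the notebook):
import torch
# persist only the learned parameters (the state_dict)
torch.save(model.state_dict(), "output/sine_model.pth")
# later, re-create the architecture and load the weights back for inference
model = SineModel()
model.load_state_dict(torch.load("output/sine_model.pth", map_location="cpu"))
model.eval()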
To evaluate the trained model, one can run a fresh set of input angles through it like:
x_values = [math.pi * (5 * a) / 180.0 for a in range(0, 73)]
x_values = np.array(x_values).reshape(-1, 1)
x_values = torch.tensor(x_values, device=device, dtype=torch.float32)
model.eval()
pred_y_values = model(x_values).tolist()
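To visualize how well the predictions track the true sine curve, a simple plot can be drawn (a minimal sketch, assuming matplotlib is available and the variables above are in scope):
import numpy as np
import matplotlib.pyplot as plt
angles = x_values.cpu().numpy().flatten()          # the evaluation angles in radians
predictions = np.array(pred_y_values).flatten()    # the model's predicted sine values
plt.plot(angles, np.sin(angles), label="actual sin(x)")
plt.plot(angles, predictions, "--", label="predicted")
plt.xlabel("angle (radians)")
plt.legend()
plt.show()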
Highlights of train_sine_cosine.ipynb
As previous "Hello World" model, this extended model (sine
+ cosine
) will also have input range from 0 to 2π (0° to 360°).
Hence, to create the data for training the model, randomized data needs to be generated for both outputs -- y_sin_values and y_cos_values
x_values = np.random.uniform(low=0, high=2*math.pi, size=SAMPLES)
np.random.shuffle(x_values)
y_sin_values = np.sin(x_values)
y_cos_values = np.cos(x_values)
Again, since the default datatype for PyTorch is float32, convert the data to float32 and reshape it as
x_values = x_values.astype(np.float32).reshape(-1, 1)
y_sin_values = y_sin_values.astype(np.float32).reshape(-1, 1)
y_cos_values = y_cos_values.astype(np.float32).reshape(-1, 1)
Notes:
- The input data x_values is the same as the previous model -- a two-dimensional array with a single column
- Both of the output data y_sin_values and y_cos_values are also two-dimensional arrays with a single column -- the sine and cosine values respectively
As there are two sets of output values, each of them is split into "train" and "test" parts as well
x_train, x_test = np.split(x_values, [train_split])
y_sin_train, y_sin_test = np.split(y_sin_values, [train_split])
y_cos_train, y_cos_test = np.split(y_cos_values, [train_split])
And the two sets of output data are stacked together for the model like
y_train = np.stack((y_sin_train, y_cos_train), axis=-1).reshape(-1, 2)
y_test = np.stack((y_sin_test, y_cos_test), axis=-1).reshape(-1, 2)
Notes:
- y_train is two-dimensional with two columns -- the 1st column is for sine; the 2nd column is for cosine
- hence, y_train[0] holds the sine and cosine values of the first input angle of the "train" data
- similarly, y_test has the same shape as y_train, but for the "test" data
In other words, for the training, there are still just two output arrays -- the "train" output values y_train and the "test" output values y_test
To capture the data set and the model, the following classes -- SineCosineDataset and SineCosineModel -- are defined
class SineCosineDataset(Dataset):
def __init__(self, x_values, y_values):
self.x_values = x_values
self.y_values = y_values
def __len__(self):
return len(self.x_values)
def __getitem__(self, idx):
return self.x_values[idx], self.y_values[idx]
class SineCosineModel(nn.Module):
def __init__(self):
super().__init__()
self.linear_relu_stack = nn.Sequential(
nn.Linear(1, 16),
nn.ReLU(),
nn.Linear(16, 16),
nn.ReLU(),
nn.Linear(16, 2)
)
def forward(self, x):
return self.linear_relu_stack(x)
Notice that the last layer of SineCosineModel has 2 output values -- one for sine and one for cosine
The rest of the code for training the model is basically the same as for the sine-only one.
To evaluate the trained model, one can again run a set of input angles through it like:
x_values = [math.pi * (5 * a) / 180.0 for a in range(0, 73)]
x_values = np.array(x_values).reshape(-1, 1)
x_values = torch.tensor(x_values, device=device, dtype=torch.float32)
model.eval()
pred_y_values = model(x_values).tolist()
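Since this model outputs two values per input angle, each element of pred_y_values is a [sine, cosine] pair; a minimal sketch to separate and check them (assuming the variables above are in scope):
import numpy as np
pred = np.array(pred_y_values)        # shape (73, 2): column 0 = sine, column 1 = cosine
pred_sin, pred_cos = pred[:, 0], pred[:, 1]
angles = x_values.cpu().numpy().flatten()
print("max sine error:  ", np.max(np.abs(pred_sin - np.sin(angles))))
print("max cosine error:", np.max(np.abs(pred_cos - np.cos(angles))))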
Mnist Dataset DL Training and Demo App
The popular Mnist dataset is also frequently used as an introductory demonstration of Deep Learning -- train_mnist.ipynb.
In this project, not only will a simple DL training with the Mnist dataset be presented, but a demonstration UI will also be realized wirelessly on your Android mobile phone with the help of DumbDisplay -- start_dd_mnist.py
The UI is coded with (and driven by) Python using the MicroPython DumbDisplay Library package, and is realized wirelessly on your Android mobile phone with the DumbDisplay Android App
Nowadays, code generation with AI is a common practice. Indeed, I did prompt an LLM to generate a Jupyter Notebook that trains a PyTorch DL model for the Mnist dataset -- train_mnist_ai.ipynb.
The AI-generated DL model is more complex (and certainly more standard) than the one presented in train_mnist.ipynb.
For one thing, mine basically involves only "dense" layers, while the AI-generated one involves "convolutional" layers.
The actual model architecture (the simple version) is captured by the class MnistModel
defined in the file model_mnist.py
class MnistModel(nn.Module):
def __init__(self):
super().__init__()
self.linear_relu_stack = nn.Sequential(
nn.Linear(784, 64),
nn.ReLU(),
nn.Dropout(0.3),
nn.Linear(64, 64),
nn.ReLU(),
nn.Linear(64, 10)
)
def forward(self, x):
logits = self.linear_relu_stack(x)
return F.log_softmax(logits, dim=1)
Please refer to train_mnist.ipynb
for the complete training exercise.
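Since the model expects a flat 784-value input (a 28x28 image flattened) and outputs log-probabilities via log_softmax, a natural pairing for training is a negative-log-likelihood loss. The following is only an illustrative sketch of that wiring (the optimizer choice and the dummy batch are assumptions, not necessarily the exact code of train_mnist.ipynb):
import torch
from torch import nn
from model_mnist import MnistModel
model = MnistModel()
loss_fn = nn.NLLLoss()                 # pairs naturally with the model's log_softmax output
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# a dummy batch, just for illustration: 32 flattened 28x28 images and their digit labels
X = torch.rand(32, 784)
y = torch.randint(0, 10, (32,))
pred = model(X)                        # log-probabilities for the 10 digits
loss = loss_fn(pred, y)
loss.backward()
optimizer.step()
optimizer.zero_grad()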
Demo UI for the Trained Mnist DL Model
The Python script start_dd_mnist.py
starts the DumbDisplay program, which uses the MicroPython DumbDisplay Library to drive a wireless UI on your Android mobile phone with the DumbDisplay Android App.
After starting the Python script start_dd_mnist.py, you should see from the VSCode terminal that it waits for a connection from the DumbDisplay Android App.
**********
*****
*** reading model output/mnist_model.pth ...
*****
**********
**********
*****
*** starting mnist with model output/mnist_model.pth ...
*****
**********
connecting socket ... listing on 192.168.0.46:10201 ...
You can open the DumbDisplay Android App installed on your Android phone -- DumbDisplay Android App -- and make a connection to the waiting Python script.
When connected, the VSCode terminal should show something like
... connected 192.168.0.46:10201 from ('192.168.0.98', 41578)
Once connected, on the [black] canvas of the UI, draw a digit you want the DL model to recognize, like the digit 8, and press the >>> button to trigger inference on the drawn digit data (stored in the memory of the running Python process).
The inference with PyTorch is actually performed by the following Python function mnist_inference
def mnist_inference(inference_data, model) -> int:
    try:
        # turn the 784 pixel values captured from the canvas into a 1x784 float32 tensor
        x = np.array(inference_data).reshape((1, 784))
        torch_x = torch.tensor(x, dtype=torch.float32).reshape(1, 784)
        # forward pass; the model outputs log-probabilities for the 10 digits
        output = model(torch_x)
        # the predicted digit is the one with the highest log-probability
        pred = output.argmax()
        return pred.item()
    except Exception as e:
        print(f"XXX error during inference: {e}")
        raise e
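For context, the model passed into mnist_inference would typically have been re-created from the saved weights and switched to evaluation mode before the UI starts; a minimal sketch of that setup (the exact loading code in start_dd_mnist.py may differ):
import torch
from model_mnist import MnistModel
model = MnistModel()
model.load_state_dict(torch.load("output/mnist_model.pth", map_location="cpu"))
model.eval()   # disable dropout for inference
# inference_data would be the 784 pixel values captured from the canvas;
# an all-black dummy input is used here just for illustration
digit = mnist_inference([0.0] * 784, model)
print(f"predicted digit: {digit}")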
Note that mnist_inference
is just a callback function for the DumbDisplay "driver" Python code, implemented based on an example of the MicroPython DumbDisplay Library
Now, draw the digit 9
on the canvas ...
If you want to clear what is drawn, press the clear
button.
The center button toggles whether the drawn digit will be auto-centered before calling mnist_inference for inference.
It is interesting to see that even without auto-centering, the digit recognition is pretty good, especially with the AI-generated model -- start_dd_mnist_ai.py.
Sliding Puzzle DL Training and Demo App
The DL model presented in this project for the classical Sliding Puzzle game is a simple and naive "next move" suggesting model -- train_sliding_puzzle.ipynb
I came up with this simple and naive "next move" suggesting model, just for fun, by referencing the above-mentioned "Hello World" and Mnist DL models.
Say, for a 4x4 board:
- There will be 16 tiles, with the 0th tile being the empty space; e.g. the solved board "orientation" can be represented as
| 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
- With respect to the empty space (i.e. the 0th tile), there can be 4 possible moves (but some might be invalid)
  - 0: from left
  - 1: from right
  - 2: from top
  - 3: from bottom
- For example, the above solved board can be randomized by a single step of the move 1 (from right) to
| 1 | 0 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
- The "next move" of this randomized board is apparently an "undo" move to undo the randomization step, in this case, the move 0 (from left)
  - 1 => undo with 0
  - 0 => undo with 1
  - 2 => undo with 3
  - 3 => undo with 2
In other words, the "undo" moves are the "next moves" toward solving a randomized board
-
The way to capture the "undo" moves can be as easy as
- during a randomization step, a valid "move" is select
- the "undo" move is recorded as the "next move" for the randomized board "orientation"
- so, for a board randomized by 5 steps, there will be 5 "next moves" recorded -- one for each board "orientation"
The board "orientations" is the input to the DL model. The "undo" moves is the output of the DL model.
The actual model architecture is captured by the class SlidingPuzzleModel
defined in the file model_sliding_puzzle.py
class SlidingPuzzleModel(nn.Module):
def __init__(self, tile_count):
super().__init__()
self.linear_relu_stack = nn.Sequential(
nn.Linear(tile_count * tile_count, 256),
nn.ReLU(),
nn.Dropout(0.3),
nn.Linear(256, 256),
nn.ReLU(),
nn.Linear(256, 4)
)
def forward(self, x):
logits = self.linear_relu_stack(x)
return F.softmax(logits, dim=1)
Please refer to train_sliding_puzzle.ipynb
for the complete training exercise, which is a bit more involved, mostly because of the generation of the training data.
Demo UI for the Trained Sliding Puzzle Model
The demo UI Python script this time is start_dd_sliding_puzzle.py
, which is mostly based on an example of the MicroPython DumbDisplay Library
After starting the Python script start_dd_sliding_puzzle.py, you should see from the VSCode terminal that it waits for a connection from the DumbDisplay Android App.
**********
*****
*** reading model output/sp_model_4.pth ...
*****
**********
**********
*****
*** starting sliding puzzle with model output/sp_model_4.pth ...
*****
**********
connecting socket ... listing on 192.168.0.46:10201 ...
You can open the DumbDisplay Android App and make a connection as before.
Once connected, double-press the sliding puzzle game board to randomize it by 5 steps. You can try to solve the puzzle manually by moving / sliding the appropriate tile to the empty space. If stuck, press the suggest button to have the DL model suggest the "next move" for you.
When "next move" is needed, the following Python function suggest_next_move
will be called
def suggest_next_move(board_manager: BoardManager, model, tile_count) -> int:
    # scale the board "orientation" (tile numbers) down to the range [0, 1) as a 1-row input tensor
    x = torch.tensor([board_manager.board_tiles], dtype=torch.float32) / (tile_count * tile_count)
    prediction = model(x)
    # the suggested move (0 to 3) is the one with the highest predicted probability
    move_ans = prediction.argmax(dim=1).item()
    return move_ans
Messed up or not, double-press the 🔄 Reset 🔄 button to reset the game; but this time, press the continuous button to have the "next move" suggested [and made] continuously.
If things go well, the game should be solved in 5 suggested "next moves".
Once solved, double-press the board to randomize it again; but this time, it will be randomized by 10 steps (5 more than last time).
Once solved again, double-press the board again ...
It is interesting to see how randomized the board can get while the DL model can still suggest the correct "next moves" to solve it.
In my experience, it is around 15 to 20 randomization steps.
If you are interested, try tuning the model to see if it can handle more randomization steps!
Two Simple DumbDisplay Samples
In the Python script test_run_dd.py
are two simple DumbDisplay samples
- run_blink
- run_graphical
Hopefully, these two samples can kick-start interested friends into implementing Android apps using Python with DumbDisplay,
for purposes like that of this project.
Have Fun!
Peace be with you!
May God bless you!
Jesus loves you!
Amazing Grace!