Deep Learning Workflow in PyTorch

#python #pytorch #deeplearning #workflow

*Memos:

My post explains Linear Regression in PyTorch.
My post explains Batch, Mini-Batch and Stochastic Gradient Descent with DataLoader() in PyTorch.
My post explains Batch Gradient Descent without DataLoader() in PyTorch.
My post explains how to save a model in PyTorch.
My post explains how to load a saved model in PyTorch.
My repo has models.

Prepare a dataset.
Prepare a model.
Train the model.
Test the model.
Save the model.

1. Prepare a dataset.

(1) Get a dataset, such as images, video, sound or text.

(2) Split the dataset into one part for training (train data) and one part for testing (test data). *A common split is 80% train data and 20% test data.

(3) Shuffle the dataset with DataLoader():
*Memos:

Usually, a dataset is shuffled to mitigate overfitting.
Usually, only the train data is shuffled; the test data is not.
My post explains Overfitting and Underfitting.
My post explains DataLoader().
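
The steps above can be sketched as follows. This is a minimal example, not the only way to do it: the toy tensors `X` and `y`, the batch size of 16 and the fixed seed are assumptions made for illustration, and `random_split()` is just one way to get an 80/20 split.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)  # make the split and the shuffling reproducible

# (1) Hypothetical toy dataset: 100 samples with 3 features and 1 target each.
X = torch.randn(100, 3)
y = torch.randn(100, 1)
dataset = TensorDataset(X, y)

# (2) Split into 80% train data and 20% test data.
train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size
train_data, test_data = random_split(dataset, [train_size, test_size])

# (3) Shuffle only the train data; the test data keeps a fixed order.
train_loader = DataLoader(train_data, batch_size=16, shuffle=True)
test_loader = DataLoader(test_data, batch_size=16, shuffle=False)

print(len(train_data), len(test_data))
```

With `shuffle=True`, the train data is reshuffled at the start of every pass over `train_loader`.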

2. Prepare a model.

(1) Select suitable layers for the dataset. *My post explains layers in PyTorch.

(2) Select activation functions if necessary. *My post explains activation functions in PyTorch.
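
As a rough illustration, a small model with two layers and one activation function could look like this. The class name `MyModel`, the layer sizes and the choice of `ReLU()` are assumptions for the sketch; the right layers depend on the dataset.

```python
import torch
from torch import nn

class MyModel(nn.Module):  # hypothetical model name
    def __init__(self, in_features=3, hidden=8, out_features=1):
        super().__init__()
        # (1) Layers: two fully connected layers.
        self.linear1 = nn.Linear(in_features, hidden)
        # (2) Activation function between the layers.
        self.relu = nn.ReLU()
        self.linear2 = nn.Linear(hidden, out_features)

    def forward(self, x):
        # Data flows from the input layer to the output layer.
        return self.linear2(self.relu(self.linear1(x)))

model = MyModel()
out = model(torch.randn(4, 3))  # a batch of 4 samples with 3 features each
print(out.shape)
```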

3. Train the model.

(1) Select a suitable loss function and optimizer for the dataset:
*Memos:

My post explains loss functions in PyTorch.
My post explains optimizers in PyTorch.
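
For example, a regression task might pair `MSELoss()` with `SGD()`. This is only one possible combination: the stand-in model and the learning rate of 0.01 are assumptions, and a classification task would typically use a different loss function such as `CrossEntropyLoss()`.

```python
import torch
from torch import nn

model = nn.Linear(3, 1)  # stand-in for the model prepared in step 2

# Loss function: mean squared error, a common choice for regression.
loss_fn = nn.MSELoss()

# Optimizer: plain stochastic gradient descent over the model's parameters.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# The loss function returns the mean of the per-element losses by default.
example_loss = loss_fn(torch.tensor([1.0, 2.0]), torch.tensor([1.0, 4.0]))
print(example_loss.item())
```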

(2) Calculate the model's predictions from the inputs (train data), working from the input layer to the output layer. *This calculation is called Forward Propagation or Forward Pass.

(3) Calculate the mean (average) of the losses (differences) between the model's predictions and the true values (train data) using a loss function.

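(4) Zero out the gradients of all parameters before each backward pass so that they are calculated properly. *By default, gradients are accumulated in buffers (not overwritten) whenever backward() is called, so without zeroing, the gradients from the previous pass would be added to the new ones.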

(5) Calculate the gradients of the mean loss calculated in (3) with respect to the model's parameters, working from the output layer to the input layer. *This calculation is called Backpropagation or Backward Pass.

(6) Update the model's parameters (weights and biases) by gradient descent with an optimizer, using the gradients calculated in (5), to reduce the mean loss between the model's predictions and the true values (train data).

*Memos:

With Batch Gradient Descent, the tasks from (2) to (6) are one training (epoch). With mini-batches, they are repeated for each batch, and one epoch is one pass over all the train data.
Usually, the training (epoch) is repeated with a for loop to reduce the mean loss between the model's predictions and the true values (train data).
Usually, the model is tested (4. Test the model) after (6), either every epoch or once every n epochs.
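
Tasks (2) to (6) map onto a training loop roughly like the one below. It is a minimal sketch using Batch Gradient Descent (the whole train data in each epoch); the toy data following y = 2x + 1, the learning rate and the 200 epochs are assumptions for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical train data following y = 2x + 1.
X_train = torch.linspace(0, 1, 40).unsqueeze(1)
y_train = 2 * X_train + 1

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    model.train()                    # put the model in training mode
    preds = model(X_train)           # (2) forward pass
    loss = loss_fn(preds, y_train)   # (3) mean loss on the train data
    optimizer.zero_grad()            # (4) zero out the accumulated gradients
    loss.backward()                  # (5) backward pass (backpropagation)
    optimizer.step()                 # (6) update the weight and bias
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

`zero_grad()` only has to run before `backward()`, so it can also be called at the top of the loop body.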

4. Test the model.

(1) Calculate the model's predictions from the inputs (test data).

(2) Calculate the mean (average) of the losses (differences) between the model's predictions and the true values (test data) with a loss function.

(3) Show the mean loss for both the train data and the test data as text or as a graph.
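
A test step could be sketched like this. The untrained `nn.Linear` here only stands in for a trained model, and the toy test data is an assumption; `torch.no_grad()` is used because no gradients are needed when testing.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical test data following y = 2x + 1.
X_test = torch.linspace(0, 1, 10).unsqueeze(1)
y_test = 2 * X_test + 1

model = nn.Linear(1, 1)  # stand-in for the model trained in step 3
loss_fn = nn.MSELoss()

model.eval()              # put the model in evaluation mode
with torch.no_grad():     # no gradient tracking while testing
    test_preds = model(X_test)               # (1) predictions on the test data
    test_loss = loss_fn(test_preds, y_test)  # (2) mean loss on the test data

# (3) Show the result as text.
print(f"Test loss: {test_loss.item():.4f}")
```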

5. Save the model.

Finally, save the model.
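
One common way to do this is to save the model's `state_dict()` with `torch.save()`, as in the sketch below. The file name `my_model.pth`, the temporary directory and the stand-in model are assumptions for illustration; loading the file back is shown only to check that the save worked.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(3, 1)  # stand-in for the trained model

# Hypothetical file path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "my_model.pth")

# Save only the parameters (weight and bias), not the whole object.
torch.save(model.state_dict(), path)

# Load the parameters into a new model with the same architecture.
loaded_model = nn.Linear(3, 1)
loaded_model.load_state_dict(torch.load(path))

print(os.path.exists(path))
```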
