Super Kai (Kazuya Ito)


Activation functions in PyTorch (5)



(1) Tanh:

  • can convert an input value(x) to the output value between -1 and 1. *-1 and 1 are exclusive.
  • 's formula is y = (e^x - e^-x) / (e^x + e^-x).
  • is also called Hyperbolic Tangent Function.
  • is Tanh() in PyTorch.
  • is used in:
    • RNN(Recurrent Neural Network). *RNN() in PyTorch.
    • LSTM(Long Short-Term Memory). *LSTM() in PyTorch.
    • GRU(Gated Recurrent Unit). *GRU() in PyTorch.
    • GAN(Generative Adversarial Network).
  • 's pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It mitigates Dying ReLU Problem. *0 is still produced for the input value 0, so Dying ReLU Problem is not completely avoided.
  • 's cons:
    • It causes Vanishing Gradient Problem.
    • It's computationally expensive because of the exponential operations.
  • 's graph in Desmos:

[Tanh graph in Desmos]
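A quick check of Tanh() in PyTorch. *The input tensor below is just an example:

```python
import torch
import torch.nn as nn

tanh = nn.Tanh()

my_tensor = torch.tensor([8., -3., 0., 1.])

# tanh(0) is exactly 0; large inputs saturate toward -1 or 1.
print(tanh(my_tensor))
# tensor([ 1.0000, -0.9951,  0.0000,  0.7616])
```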

(2) Softsign:

  • can convert an input value(x) to the output value between -1 and 1. *-1 and 1 are exclusive.
  • 's formula is y = x / (1 + |x|).
  • is Softsign() in PyTorch.
  • 's pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It mitigates Dying ReLU Problem. *0 is still produced for the input value 0, so Dying ReLU Problem is not completely avoided.
  • 's cons:
    • It causes Vanishing Gradient Problem.
  • 's graph in Desmos:

[Softsign graph in Desmos]
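A quick check of Softsign() in PyTorch. *The input tensor below is just an example:

```python
import torch
import torch.nn as nn

softsign = nn.Softsign()

my_tensor = torch.tensor([8., -3., 0., 1.])

# x / (1 + |x|): 8/9, -3/4, 0/1 and 1/2.
print(softsign(my_tensor))
# tensor([ 0.8889, -0.7500,  0.0000,  0.5000])
```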

(3) Sigmoid:

  • can convert an input value(x) to the output value between 0 and 1. *0 and 1 are exclusive.
  • 's formula is y = 1 / (1 + e^-x).
  • is Sigmoid() in PyTorch.
  • is used in:
    • Binary Classification Model.
    • Logistic Regression.
    • LSTM.
    • GRU.
    • GAN.
  • 's pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It avoids Dying ReLU Problem.
  • 's cons:
    • It causes Vanishing Gradient Problem.
    • It's computationally expensive because of exponential operation.
  • 's graph in Desmos:

[Sigmoid graph in Desmos]
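A quick check of Sigmoid() in PyTorch. *The input tensor below is just an example:

```python
import torch
import torch.nn as nn

sigmoid = nn.Sigmoid()

my_tensor = torch.tensor([8., -3., 0., 1.])

# 1 / (1 + e^-x): sigmoid(0) is exactly 0.5.
print(sigmoid(my_tensor))
# tensor([0.9997, 0.0474, 0.5000, 0.7311])
```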

(4) Softmax:

  • can convert input values(xs) to output values each between 0 and 1, whose sum is 1(100%): *Memos:
    • *0 and 1 are exclusive.
    • If input values are [5, 4, -1], then the output values are [0.730, 0.268, 0.002], which is 0.730(73%) + 0.268(26.8%) + 0.002(0.2%) = 1(100%). *Sometimes, approximately 100%.
  • 's formula is y_i = e^(x_i) / (e^(x_1) + e^(x_2) + ... + e^(x_n)).
  • is Softmax() in PyTorch.
  • is used in:
    • Multi-Class Classification Model.
  • 's pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It avoids Dying ReLU Problem.
  • 's cons:
    • It causes Vanishing Gradient Problem.
    • It's computationally expensive because of the exponential and summation operations.
  • 's graph in Desmos:

[Softmax graph in Desmos]

  • 's code from scratch in PyTorch. *dim=0 must be set for sum(), otherwise sum() adds up all the elements and different values are returned for a different D(Dimensional) tensor.
import torch

def softmax(input):
    e_i = torch.exp(input)
    return e_i / e_i.sum(dim=0)
                       # ↑↑↑↑↑ Must be set.

my_tensor = torch.tensor([8., -3., 0., 1.])

print(softmax(my_tensor))
# tensor([9.9874e-01, 1.6681e-05, 3.3504e-04, 9.1073e-04])

my_tensor = torch.tensor([[8., -3.], [0., 1.]])

print(softmax(my_tensor))
# tensor([[9.9966e-01, 1.7986e-02],
#         [3.3535e-04, 9.8201e-01]])

my_tensor = torch.tensor([[[8.], [-3.]], [[0.], [1.]]])

print(softmax(my_tensor))
# tensor([[[9.9966e-01],
#          [1.7986e-02]],
#         [[3.3535e-04],
#          [9.8201e-01]]])
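The from-scratch version above can be cross-checked against the built-in Softmax() in PyTorch, which produces the same values for the same dim:

```python
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=0)

my_tensor = torch.tensor([8., -3., 0., 1.])

print(softmax(my_tensor))
# tensor([9.9874e-01, 1.6681e-05, 3.3504e-04, 9.1073e-04])

# The output values sum to 1(100%), up to floating-point error.
print(softmax(my_tensor).sum())
```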
