
Super Kai (Kazuya Ito)

Activation functions in PyTorch (1)



An activation function is a function or layer that enables a neural network to learn complex (non-linear) relationships by transforming the output of the previous layer. *Without activation functions, a neural network can only learn linear relationships, as the sketch below shows.
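A minimal sketch of that last point (the layer sizes here are arbitrary): two stacked Linear layers with no activation in between collapse into a single linear map, so the extra layer adds no expressive power.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two stacked linear layers with no activation in between.
linear1 = nn.Linear(3, 4, bias=False)
linear2 = nn.Linear(4, 2, bias=False)

x = torch.randn(3)
stacked = linear2(linear1(x))

# They are equivalent to one merged linear layer: (W2 @ W1) @ x.
merged = linear2.weight @ linear1.weight @ x
print(torch.allclose(stacked, merged))  # True
```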

(1) Step function:

  • can convert an input value (x) to 0 or 1. *If x < 0, then 0, while if x >= 0, then 1.
  • is also called Binary step function, Unit step function, Binary threshold function, Threshold function, Heaviside step function or Heaviside function.
  • is heaviside() in PyTorch. *A minimal sketch follows this list.
  • Pros:
    • It's simple, expressing only the two values 0 and 1.
    • It avoids the Exploding Gradient Problem.
  • Cons:
    • It's rarely used in Deep Learning because it has more cons than other activation functions.
    • It can only express the two values 0 and 1, so the created model has bad accuracy, predicting inaccurately. *Activation functions that can express a wider range of values create models with better accuracy, predicting more accurately.
    • It causes the same failure as the Dying ReLU Problem. *Its gradient is 0 everywhere it's defined, so the weights stop being updated.
    • It's non-differentiable at x = 0. *The gradient of the step function doesn't exist at x = 0 during Backpropagation, which differentiates to compute gradients.
  • Its graph in Desmos:

[Graph: step function]
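A minimal sketch of heaviside() on a 1-D tensor (the input values here are just illustrative). The values argument sets the output at x == 0; values=1. matches the definition above.

```python
import torch

x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])

# Second argument gives the output where x == 0.
y = torch.heaviside(x, values=torch.tensor(1.0))
print(y)  # tensor([0., 0., 1., 1., 1.])
```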

(2) Identity:

  • can just return the same value as an input value (x) without any conversion.
  • has the formula y = x.
  • is also called Linear function.
  • is Identity() in PyTorch. *A minimal sketch follows this list.
  • Pros:
    • It's simple, just returning the same value as an input value.
  • Cons:
    • It adds no non-linearity, so it cannot help a neural network learn complex relationships.
  • Its graph in Desmos:

[Graph: identity function]
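A minimal sketch of Identity() on a 1-D tensor (the input values here are just illustrative); the output is the input, unchanged.

```python
import torch
from torch import nn

identity = nn.Identity()

x = torch.tensor([-2.0, 0.0, 3.0])
print(identity(x))  # tensor([-2., 0., 3.])
```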

(3) ReLU (Rectified Linear Unit):

  • can convert an input value (x) to 0 or x. *If x < 0, then 0, while if x >= 0, then x.
  • has the formula y = max(0, x).
  • is ReLU() in PyTorch. *A minimal sketch follows this list.
  • is used in:
    • Binary Classification Model.
    • Multi-Class Classification Model.
    • CNN (Convolutional Neural Network).
    • RNN (Recurrent Neural Network). *RNN() in PyTorch.
    • Transformer. *Transformer() in PyTorch.
    • NLP (Natural Language Processing) based on RNN.
    • GAN (Generative Adversarial Network).
  • Pros:
    • It mitigates the Vanishing Gradient Problem.
  • Cons:
    • It causes the Dying ReLU Problem. *Neurons whose inputs are always negative output 0, so their weights stop being updated.
    • It's non-differentiable at x = 0.
  • Its graph in Desmos:

[Graph: ReLU]
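A minimal sketch of ReLU() on a 1-D tensor (the input values here are just illustrative); negative inputs map to 0, non-negative inputs pass through unchanged.

```python
import torch
from torch import nn

relu = nn.ReLU()

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
print(relu(x))  # tensor([0.0000, 0.0000, 0.0000, 1.5000])
```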
