In my previous article, we touched upon the sequence-to-sequence model, and we came across RNNs (Recurrent Neural Networks).
To understand Recurrent Neural Networks, first we need to understand neural networks.
This article will start with one of the components in a neural network, called activation functions.
Let's begin!
What is an activation function?
A neural network is made up of layers, and each layer is made up of neurons.
Inside a neuron, two things happen:
- Inputs are combined using weights
- The result is passed through an activation function
So you can think of the activation function as a weighing scale. It decides how much a neuron should activate for a given input.
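To make this concrete, here is a minimal sketch of a single neuron in plain Python. The inputs, weights, bias, and the choice of sigmoid as the activation are all made-up values, just for illustration:
import math

def single_neuron(inputs, weights, bias):
    # Step 1: combine the inputs using the weights (weighted sum plus a bias)
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    # Step 2: pass the result through an activation function (sigmoid here)
    return 1 / (1 + math.exp(-z))

# Made-up numbers, just to see a value come out between 0 and 1
print(single_neuron([0.5, -1.0], [0.8, 0.2], bias=0.1))  # ~0.574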
Activation functions as curves
Activation functions are basically curves that transform the input values.
Let's look at the commonly discussed ones:
- ReLU
- Softplus
- Sigmoid
But first, let's set up some tiny Python code so that this doesn't get too theoretical (something I myself hate :)).
We will just import NumPy and Matplotlib to demo this.
import numpy as np
import matplotlib.pyplot as plt
We’ll create a range of input values that we can pass through each activation function to see how they behave across different inputs.
# Generate 400 evenly spaced values from -10 to 10
x = np.linspace(-10, 10, 400)
- -10 → start of the range (the most negative input)
- 10 → end of the range (the most positive input)
- 400 → number of points, giving us a smooth curve when we plot
This array x represents the inputs to the activation functions. By passing all these values through a function like ReLU or Sigmoid, we can visualize how the function transforms inputs to outputs.
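If you want a quick sanity check before plotting, you can inspect the array (this step is optional):
print(x.shape)       # (400,)
print(x[0], x[-1])   # -10.0 10.0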
1. ReLU (Rectified Linear Unit)
ReLU is simply ReLU(x) = max(0, x). In other words:
- If the input is negative, the output is 0.
- If the input is positive, the output is the same value.
We can easily define this as a Python function like so:
def relu(x):
    return np.maximum(0, x)
Let's plot the graph.
plt.figure()
plt.plot(x, relu(x), label="ReLU")
plt.title("ReLU Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.legend()
plt.show()
So this is our ReLU graph.
You can see how simple it is: if the input is negative, the output is 0; otherwise, the output is the input itself.
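For example, passing a handful of values through the relu function we defined above shows the rule directly:
print(relu(np.array([-3.0, -1.0, 0.0, 2.0, 5.0])))
# [0. 0. 0. 2. 5.]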
ReLU is the default choice for most neural networks today.
Let's check Softplus next.
2. Softplus (Smooth ReLU)
- Softplus is basically ReLU, but smooth: Softplus(x) = log(1 + e^x).
- Where ReLU has a sharp corner at 0, Softplus transitions smoothly.
We can represent this in Python like so:
def softplus(x):
    return np.log(1 + np.exp(x))
Let's plot this and see what it gives.
plt.figure()
plt.plot(x, softplus(x), label="Softplus")
plt.title("Softplus Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.legend()
plt.show()
As you can see, the sharp corner at 0 becomes a smooth curve.
Softplus avoids some problems of ReLU, but it’s computationally more expensive, so it’s used less often in practice.
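One small practical note: for very large inputs, np.exp(x) in our softplus can overflow. A numerically safer way to get the same curve, if you ever need it, is NumPy's logaddexp, since log(e^0 + e^x) equals log(1 + e^x). A quick sketch:
def softplus_stable(x):
    # log(exp(0) + exp(x)) == log(1 + exp(x)), computed without overflow
    return np.logaddexp(0, x)

# Far away from 0, Softplus hugs ReLU very closely
print(softplus_stable(10.0))    # ~10.0000454
print(softplus_stable(-10.0))   # ~0.0000454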
3. Sigmoid
Sigmoid squashes any input into a value between 0 and 1: Sigmoid(x) = 1 / (1 + e^-x).
Let's express this in Python:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
And plot it as well.
plt.figure()
plt.plot(x, sigmoid(x), label="Sigmoid")
plt.title("Sigmoid Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.legend()
plt.show()
The output is an S-shaped curve. Because every value lands between 0 and 1, Sigmoid is great for binary classification outputs.
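For instance, a common pattern is to read the sigmoid output as a probability and threshold it at 0.5. The score here is just a made-up number for illustration:
score = 2.0                  # imagined raw output of the last layer
prob = sigmoid(score)        # ~0.88
print("class 1" if prob > 0.5 else "class 0")   # class 1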
That's it for this article. Next, we will look into gradients of these activation functions.
You can try the examples out via the Colab notebook.
Just like activation functions make neurons work efficiently in a network, having the right tools can make your work as a developer much easier. If you’ve ever struggled with repetitive tasks, obscure commands, or debugging headaches, this platform is here to make your life easier. It’s free, open-source, and built with developers in mind.
👉 Explore the tools: FreeDevTools
👉 Star the repo: freedevtools






