Shivank Pandey

Posted on Oct 2, 2023

Building a Neural Network from Scratch in Python

#ai #python #neuralnetwork

Neural networks are powerful machine learning models inspired by the structure and functioning of the human brain. They are widely used in artificial intelligence for tasks such as classification, regression, and pattern recognition. In this tutorial, we'll walk through the process of building a basic neural network from scratch using Python.

A neural network's basic building blocks are interconnected nodes, or "neurons," arranged in layers. These neurons transmit and process information. Here is a brief explanation of what makes up a neural network:

Input Layer: This layer takes in the raw input data, or features, and passes it to the next layer. Each neuron in this layer represents one feature of the input data.

Hidden Layers: There may be one or more hidden layers between the input and output layers. These layers process the data by performing calculations. In a fully connected network, every neuron in a hidden layer is linked to every neuron in the layer before it and the layer after it.

Weights: Each connection between neurons has a weight that reflects the strength of that connection. These weights are adjusted during training to reduce the prediction error.

Activation Function: Each neuron applies an activation function to the weighted sum of its inputs. This function introduces non-linearity into the model, allowing the neural network to learn complex relationships.

Output Layer: The final layer produces the network's output. The number of neurons in this layer depends on the type of problem the network is designed to solve. For example, in binary classification, there might be one neuron producing an output between 0 and 1.

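Bias: Each neuron also has a bias term, which is added to the weighted sum before the activation function is applied. The bias shifts the point at which the neuron activates, so the neuron can produce a useful output even when all of its inputs are zero.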
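To make these pieces concrete, here is a minimal sketch of what a single neuron computes: a weighted sum of its inputs, plus a bias, passed through an activation function. The input, weight, and bias values are made up purely for illustration.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation: maps any real number into the range (0, 1)
    return 1 / (1 + np.exp(-z))

# Hypothetical values, chosen only for illustration
inputs = np.array([0.5, -1.0])   # two input features
weights = np.array([0.8, 0.2])   # one weight per input connection
bias = 0.1

# Weighted sum of inputs plus bias: 0.5*0.8 + (-1.0)*0.2 + 0.1 = 0.3
weighted_sum = np.dot(inputs, weights) + bias

# The activation function turns the weighted sum into the neuron's output
activation = sigmoid(weighted_sum)

print("weighted sum:", weighted_sum)
print("activation:", activation)
```

A full layer does the same thing for many neurons at once, which is why the rest of this tutorial uses matrix multiplication.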

Prerequisites

Before we begin, make sure you have a recent version of Python 3 installed on your system. We will use the NumPy library for numerical computations. If you haven't already, install NumPy by running:

pip install numpy

Step 1: Initialize Weights and Biases

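The first step in building a neural network is to initialize the weights and biases. We will create a simple network with one input layer, one hidden layer, and one output layer. The weights start as random values drawn from a standard normal distribution, and the biases start at zero.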

import numpy as np

# Define the architecture of the neural network
input_size = 2
hidden_size = 3
output_size = 1

# Initialize weights and biases
weights_input_hidden = np.random.randn(input_size, hidden_size)
bias_hidden = np.zeros((1, hidden_size))

weights_hidden_output = np.random.randn(hidden_size, output_size)
bias_output = np.zeros((1, output_size))
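As a quick sanity check, the sketch below confirms that these shapes fit together when a batch of samples passes through both layers. It is an illustration rather than part of the tutorial's code: the short names (`W1`, `b1`, `W2`, `b2`) and the fixed seed are assumptions added here so the example is reproducible.

```python
import numpy as np

# Assumption: a seeded generator, used here only to make the run reproducible
rng = np.random.default_rng(42)

input_size, hidden_size, output_size = 2, 3, 1

# Same shapes as in the tutorial, with shorter hypothetical names
W1 = rng.standard_normal((input_size, hidden_size))   # (2, 3)
b1 = np.zeros((1, hidden_size))                       # (1, 3)
W2 = rng.standard_normal((hidden_size, output_size))  # (3, 1)
b2 = np.zeros((1, output_size))                       # (1, 1)

# A batch of 4 samples with 2 features each
batch = np.zeros((4, input_size))

# (4, 2) @ (2, 3) -> (4, 3); the (1, 3) bias broadcasts across the 4 rows
hidden = batch @ W1 + b1
# (4, 3) @ (3, 1) -> (4, 1); the (1, 1) bias broadcasts the same way
out = hidden @ W2 + b2

print(hidden.shape, out.shape)
```

Each weight matrix has one row per neuron in the layer feeding it and one column per neuron in the layer it feeds, which is what lets a whole batch flow through with two matrix products.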

Step 2: Define Activation Functions

Activation functions introduce non-linearity to the neural network, allowing it to learn complex patterns.

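def sigmoid(x):
    # Logistic function: maps any real number into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Note: x must already be a sigmoid output, s = sigmoid(z).
    # The derivative of the sigmoid at z is then s * (1 - s).
    return x * (1 - x)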

What is the Sigmoid Function?

A sigmoid function is a mathematical function that maps any real-valued number to a value between 0 and 1. The most commonly used sigmoid is the logistic function, 1 / (1 + e^(-x)), which is the one implemented above. It squashes large negative inputs to values very close to 0 and large positive inputs to values very close to 1. In these saturated regions the gradient is nearly zero, which can lead to the vanishing gradient problem in deep networks and make training slower and less stable. For this reason, other activation functions such as ReLU (Rectified Linear Unit) have become more popular in many neural network architectures. For this tutorial, though, we'll use only the sigmoid function.
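The saturation effect is easy to see numerically. The short sketch below evaluates the sigmoid and its derivative at a few sample inputs; the inputs are arbitrary values picked for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # Expects s to be a sigmoid output, not the raw input
    return s * (1 - s)

# Arbitrary sample inputs: strongly negative, zero, strongly positive
for x in [-10.0, 0.0, 10.0]:
    s = sigmoid(x)
    print(f"x = {x:6.1f}  sigmoid = {s:.6f}  derivative = {sigmoid_derivative(s):.6f}")
```

The derivative peaks at 0.25 when the input is 0 and shrinks toward zero as the input moves far from 0 in either direction, which is why gradients can fade when many sigmoid layers are stacked.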


Step 3: Define the Forward Pass

The forward pass involves computing the output of the neural network given the input.

def forward(x):
    hidden_input = np.dot(x, weights_input_hidden) + bias_hidden
    hidden_output = sigmoid(hidden_input)

    final_input = np.dot(hidden_output, weights_hidden_output) + bias_output
    final_output = sigmoid(final_input)

    return final_output
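To see the forward pass with numbers that can be checked by hand, here is a small standalone sketch. It uses hypothetical parameters (all weights set to 1, all biases set to 0) instead of the random ones from Step 1, so the result is deterministic.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical fixed parameters, chosen so the arithmetic is easy to follow
W1 = np.ones((2, 3))
b1 = np.zeros((1, 3))
W2 = np.ones((3, 1))
b2 = np.zeros((1, 1))

x = np.array([[1.0, 0.0]])  # a single sample with two features

hidden_input = x @ W1 + b1             # each hidden neuron receives 1*1 + 0*1 = 1
hidden_output = sigmoid(hidden_input)  # each entry is sigmoid(1), about 0.731
final_input = hidden_output @ W2 + b2  # sum of the three hidden outputs
final_output = sigmoid(final_input)    # a single value between 0 and 1

print("hidden output:", hidden_output)
print("final output:", final_output)
```

The same two steps (matrix product plus bias, then sigmoid) are repeated for each layer, so adding more layers only means repeating the pattern.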

Step 4: Define the Backpropagation Algorithm

Backpropagation is used to update the weights and biases based on the error in the predictions.

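def backward(x, y, output):
    # The parameters are module-level arrays that are updated in place,
    # so they must be declared global inside the function
    global weights_input_hidden, weights_hidden_output, bias_hidden, bias_output

    # Recompute the hidden activations, since forward() returns only the final output
    hidden_output = sigmoid(np.dot(x, weights_input_hidden) + bias_hidden)

    # Output-layer error and delta (sigmoid_derivative expects a sigmoid output)
    error = y - output
    d_output = error * sigmoid_derivative(output)

    # Propagate the error back to the hidden layer
    error_hidden = d_output.dot(weights_hidden_output.T)
    d_hidden = error_hidden * sigmoid_derivative(hidden_output)

    # Update weights and biases, scaled by the learning rate
    weights_hidden_output += hidden_output.T.dot(d_output) * learning_rate
    weights_input_hidden += x.T.dot(d_hidden) * learning_rate
    bias_output += np.sum(d_output, axis=0, keepdims=True) * learning_rate
    bias_hidden += np.sum(d_hidden, axis=0, keepdims=True) * learning_rate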
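One way to gain confidence in the backpropagation formulas is a numerical gradient check. The sketch below is a standalone illustration, not part of the tutorial's code. It assumes the loss is half the sum of squared errors, for which the update terms above are exactly the negative gradient, and it uses hypothetical short names (`W1`, `b1`, `W2`, `b2`) and a fixed seed.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss(W1, b1, W2, b2, X, y):
    # Assumed loss: half the sum of squared errors
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    return 0.5 * np.sum((y - out) ** 2)

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = rng.standard_normal((2, 3)); b1 = np.zeros((1, 3))
W2 = rng.standard_normal((3, 1)); b2 = np.zeros((1, 1))

# Analytic gradients, using the same delta formulas as backward()
h = sigmoid(X @ W1 + b1)
out = sigmoid(h @ W2 + b2)
d_output = (y - out) * out * (1 - out)
d_hidden = (d_output @ W2.T) * h * (1 - h)
grad_W2 = -(h.T @ d_output)  # gradient of the loss with respect to W2
grad_W1 = -(X.T @ d_hidden)  # gradient of the loss with respect to W1

def numerical_grad(param, eps=1e-6):
    # Central finite differences: nudge one entry at a time, then restore it
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        original = param[idx]
        param[idx] = original + eps
        plus = loss(W1, b1, W2, b2, X, y)
        param[idx] = original - eps
        minus = loss(W1, b1, W2, b2, X, y)
        param[idx] = original
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

max_diff_W1 = np.max(np.abs(grad_W1 - numerical_grad(W1)))
max_diff_W2 = np.max(np.abs(grad_W2 - numerical_grad(W2)))
print("max difference for W1:", max_diff_W1)
print("max difference for W2:", max_diff_W2)
```

If the analytic and numerical gradients agree to within a tiny tolerance, the delta formulas are consistent with the assumed loss. Adding `d_output` and `d_hidden` terms to the weights, as `backward` does, is then the same as subtracting the gradient scaled by the learning rate.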

Step 5: Train the Neural Network

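Now, let's train the neural network on a simple dataset: the XOR truth table, where the target is 1 only when exactly one of the two inputs is 1. Because the weights are initialized randomly, the exact outputs will differ from run to run.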

# Define the dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# Set hyperparameters
learning_rate = 0.1
epochs = 10000

# Train the neural network
for epoch in range(epochs):
    output = forward(X)
    backward(X, y, output)

# Test the trained network
output = forward(X)
print("Final Output after Training:")
print(output)
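The network's raw outputs are values between 0 and 1, so a common follow-up is to threshold them into class labels and measure the error. The sketch below shows one way to do that. The `output` array in it is a hypothetical stand-in made up for illustration, not a result produced by the training loop above.

```python
import numpy as np

y = np.array([[0], [1], [1], [0]])

# Hypothetical outputs for illustration only, NOT real training results
output = np.array([[0.05], [0.93], [0.94], [0.07]])

# Threshold at 0.5 to turn probabilities into 0/1 predictions
predictions = (output > 0.5).astype(int)

# Fraction of samples where the prediction matches the target
accuracy = np.mean(predictions == y)

# Mean squared error between the targets and the raw outputs
mse = np.mean((y - output) ** 2)

print("predictions:", predictions.ravel())
print("accuracy:", accuracy)
print("mean squared error:", mse)
```

In your own run, replace the stand-in array with the result of `forward(X)`. If the accuracy stays below 1.0, training for more epochs or restarting with a different random initialization may help.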

Final thoughts

In this article, we demonstrated how to create a basic neural network from scratch using Python. We covered initializing weights and biases, defining activation functions, implementing the forward pass, and training with backpropagation. This provides a foundational overview of neural networks and can serve as a springboard for further study in deep learning.

Here's a post that goes in depth on how neural networks work.
