zekcrates
Let’s Build a Deep Learning Library from Scratch Using NumPy (Part 1)

Introduction

We are going to build our own PyTorch-like deep learning library from scratch. We will call it babygrad. We are starting with a blank file and NumPy, and we won't stop until we have a functional autograd engine and have trained some decent models (MNIST, CNNs) with it.

What This Series Is About

This is not a deep learning “how to use libraries” tutorial.

Instead, we’ll:

  • Start from a blank Python file
  • Wrap NumPy arrays
  • Track operations
  • Build a computation graph
  • Implement backpropagation ourselves


What is a Tensor?

In any deep learning library (e.g., PyTorch), the Tensor is the fundamental building block. A Tensor is a wrapper around a NumPy array. Think of NumPy as the raw data and the Tensor as the container that holds this raw data and also remembers the history of its parents. This history is what makes backpropagation possible later.

a = babygrad.Tensor([1, 2, 3])
b = babygrad.Tensor([1, 2, 3])
c = a + b
print(c._inputs)
>>> [Tensor([1. 2. 3.], requires_grad=False), Tensor([1. 2. 3.], requires_grad=False)]
print(c._op)
>>> <babygrad.ops.Add object at 0x7f0cfcfcc3a0>

Who created c? a and b, so they become the inputs of c.
What happened between a and b that led to c? The + operation.
This becomes the op of c. These tensors and the op are now part of the computation graph, which will be used to do backpropagation.

What is a Computation graph?

A graph that shows:

  • Numbers (Tensors) as nodes.
  • Operations (ops) as nodes.
  • Edges showing how data flows from inputs → operations → outputs.
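To make the node/edge picture concrete, here is a minimal hand-built sketch of how `_op` and `_inputs` wire a graph together. The `Add` class and the free-standing `add` function are hypothetical stand-ins for what babygrad will do inside operator overloads, not the library's final API:

```python
import numpy as np

class Add:
    # op node: it just names the operation that happened
    def forward(self, a, b):
        return a + b

class Tensor:
    # minimal stand-in for the article's Tensor class
    def __init__(self, data):
        self.data = np.array(data, dtype="float32")
        self._op = None      # which op created this Tensor (None for leaves)
        self._inputs = []    # the parent Tensors that fed that op

def add(a, b):
    op = Add()
    out = Tensor(op.forward(a.data, b.data))
    out._op = op             # edge: op -> output
    out._inputs = [a, b]     # edges: parents -> op
    return out

a, b = Tensor([1, 2, 3]), Tensor([1, 2, 3])
c = add(a, b)
print(c.data)                # [2. 4. 6.]
print(c._inputs == [a, b])   # True: c remembers its parents
```

Walking `_inputs` backwards from `c` visits every node that contributed to it, and that walk is exactly what backpropagation will traverse.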

Implementing the Tensor class

Let's look at the backbone of our library. A Tensor needs to track its data, its gradient, and its parents (if any).
A Tensor needs to:

  • Store its data
  • Store its gradient (computed later)
  • Know whether it should track gradients
  • Remember how it was created

import numpy as np

NDArray = np.ndarray

def _ensure_tensor(val):
    return val if isinstance(val, Tensor) else Tensor(val, requires_grad=False)

class Tensor:
    def __init__(self, data, *, device=None, dtype="float32", requires_grad=False):
        if isinstance(data, Tensor):
            if dtype is None:
                dtype = data.dtype
            self.data = data.numpy().astype(dtype)
        elif isinstance(data, np.ndarray):
            self.data = data.astype(dtype if dtype is not None else data.dtype)
        else:
            self.data = np.array(data, dtype=dtype if dtype is not None else "float32")
        self.grad = None
        self.requires_grad = requires_grad
        self._op = None       # the op that created this Tensor (None for leaves)
        self._inputs = []     # the parent Tensors that fed that op
        self._device = device if device else "cpu"

    @property
    def shape(self):
        return self.data.shape

    @property
    def dtype(self):
        return self.data.dtype

    @property
    def ndim(self):
        return self.data.ndim

    @property
    def size(self):
        return self.data.size

    @property
    def device(self):
        return self._device

    def __repr__(self):
        return f"Tensor({self.data}, requires_grad={self.requires_grad})"

    def __str__(self):
        return str(self.data)

    def backward(self):
        pass  # We will implement this in the next part.

No matter what the input data is, we must always convert it into an NDArray.

The input data could be:

  • A Tensor
  • An NDArray
  • A list
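The list and NDArray branches of the constructor can be exercised in isolation. The `to_ndarray` helper below is a hypothetical extraction of those two branches (it is not part of babygrad), just to show that every input ends up as an `np.ndarray` of the requested dtype:

```python
import numpy as np

def to_ndarray(data, dtype="float32"):
    # mirrors the constructor's ndarray and "anything else" branches
    if isinstance(data, np.ndarray):
        return data.astype(dtype)
    return np.array(data, dtype=dtype)

print(to_ndarray([1, 2, 3]).dtype)        # float32 (list converted)
print(to_ndarray(np.arange(3)).dtype)     # float32 (int64 array cast down)
```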

What is requires_grad?
requires_grad controls whether a Tensor participates in the computation graph.

  • requires_grad=True → gradients will be tracked
  • requires_grad=False → no gradients will be computed

Simple Methods for Tensor class

We'd like to introduce some simple methods for the Tensor class that will come in handy later.

.numpy()

Once we have a Tensor, we'd like to extract the raw NumPy array from it without risking anything inside the Tensor, which is why we return a copy.

class Tensor: 
    def numpy(self):
        return self.data.copy()

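Because `.numpy()` returns a copy, mutating the result cannot corrupt the Tensor's internal buffer. A quick check, using a trimmed stand-in for the class above:

```python
import numpy as np

class Tensor:
    # trimmed stand-in for the full Tensor class
    def __init__(self, data):
        self.data = np.array(data, dtype="float32")
    def numpy(self):
        return self.data.copy()

t = Tensor([1.0, 2.0, 3.0])
arr = t.numpy()
arr[0] = 99.0           # mutate only the copy
print(t.data[0])        # 1.0 — the Tensor's own buffer is untouched
```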

.detach()

Sometimes we'd like to have the same Tensor, but detached from the computation graph. So we just create a new Tensor with the same data and requires_grad=False.

class Tensor:
    def detach(self):
        return Tensor(self.data, requires_grad=False) 
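One subtlety worth noting: because our constructor runs the data through `np.array`/`astype`, this `detach` copies the underlying buffer (PyTorch's `detach`, by contrast, shares storage with the original). A minimal sketch with a trimmed stand-in class:

```python
import numpy as np

class Tensor:
    # trimmed stand-in for the full Tensor class
    def __init__(self, data, requires_grad=False):
        self.data = np.array(data, dtype="float32")
        self.requires_grad = requires_grad
    def detach(self):
        return Tensor(self.data, requires_grad=False)

a = Tensor([1.0, 2.0], requires_grad=True)
b = a.detach()
print(b.requires_grad)      # False — b is outside the graph
print(b.data is a.data)     # False — the buffer was copied here
```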

What's Next?

We've laid the foundation!
In Part 2, we'll implement the heart of deep learning: Autograd
