zekcrates
Let’s Build a Deep Learning Library from Scratch Using NumPy (Part 1)

Introduction

We are going to build our own PyTorch-like deep learning library from scratch. We will call it babygrad. We are starting with a blank file and NumPy, and we won't stop until we have a functional autograd engine and have trained some decent models (MNIST, CNNs) with it.

What This Series Is About

This is not a deep learning “how to use libraries” tutorial.

Instead, we’ll:

  • Start from a blank Python file
  • Wrap NumPy arrays
  • Track operations
  • Build a computation graph
  • Implement backpropagation ourselves


What is a Tensor?

In any deep learning library (e.g., PyTorch), the Tensor is the fundamental building block. A Tensor is a wrapper around a NumPy array. Think of NumPy as the raw data and the Tensor as the container that holds this raw data and also remembers the history of its parents. This history is what makes backpropagation possible later.

a = babygrad.Tensor([1, 2, 3])
b = babygrad.Tensor([1, 2, 3])
c = a + b
print(c._inputs)
>>> [Tensor([1. 2. 3.], requires_grad=False), Tensor([1. 2. 3.], requires_grad=False)]
print(c._op)
>>> <babygrad.ops.Add object at 0x7f0cfcfcc3a0>

Who created c? a and b, so they become the inputs of c.
What happened between a and b that led to c? The + operation.
This becomes the op of c. These tensors and the op are now part of the computation graph, which will be used to do backpropagation.

What is a Computation graph?

A graph that shows:

  • Numbers (Tensors) as nodes.
  • Operations (ops) as nodes.
  • Edges showing how data flows from inputs → operations → outputs.
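To make the node/edge picture concrete, here is a minimal hand-built sketch of how `_op` and `_inputs` wire a graph together. The `Add` class and the free-standing `add` function are hypothetical stand-ins for what babygrad will do inside operator overloads, not the library's final API:

```python
import numpy as np

class Add:
    # op node: it just names the operation that happened
    def forward(self, a, b):
        return a + b

class Tensor:
    # minimal stand-in for the article's Tensor class
    def __init__(self, data):
        self.data = np.array(data, dtype="float32")
        self._op = None      # which op created this Tensor (None for leaves)
        self._inputs = []    # the parent Tensors that fed that op

def add(a, b):
    op = Add()
    out = Tensor(op.forward(a.data, b.data))
    out._op = op             # edge: op -> output
    out._inputs = [a, b]     # edges: parents -> op
    return out

a, b = Tensor([1, 2, 3]), Tensor([1, 2, 3])
c = add(a, b)
print(c.data)                # [2. 4. 6.]
print(c._inputs == [a, b])   # True: c remembers its parents
```

Walking `_inputs` backwards from `c` visits every node that contributed to it, and that walk is exactly what backpropagation will traverse.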

Implementing the Tensor class

Let's look at the backbone of our library. A Tensor needs to track its data, its gradient, and its parents (if any).
A Tensor needs to:

  • Store its data
  • Store its gradient (computed later)
  • Know whether it should track gradients
  • Remember how it was created

import numpy as np

NDArray = np.ndarray

def _ensure_tensor(val):
    return val if isinstance(val, Tensor) else Tensor(val, requires_grad=False)

class Tensor:
    def __init__(self, data, *, device=None, dtype="float32", requires_grad=False):
        if isinstance(data, Tensor):
            if dtype is None:
                dtype = data.dtype
            self.data = data.numpy().astype(dtype)
        elif isinstance(data, np.ndarray):
            self.data = data.astype(dtype if dtype is not None else data.dtype)
        else:
            self.data = np.array(data, dtype=dtype if dtype is not None else "float32")
        self.grad = None
        self.requires_grad = requires_grad
        self._op = None       # the op that created this Tensor (None for leaves)
        self._inputs = []     # the parent Tensors that fed that op
        self._device = device if device else "cpu"

    @property
    def shape(self):
        return self.data.shape

    @property
    def dtype(self):
        return self.data.dtype

    @property
    def ndim(self):
        return self.data.ndim

    @property
    def size(self):
        return self.data.size

    @property
    def device(self):
        return self._device

    def __repr__(self):
        return f"Tensor({self.data}, requires_grad={self.requires_grad})"

    def __str__(self):
        return str(self.data)

    def backward(self):
        pass  # We will implement this in the next part.

No matter what the input data is, we must always convert it into an NDArray.

The input data could be:

  • A Tensor
  • An NDArray
  • A list
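The list and NDArray branches of the constructor can be exercised in isolation. The `to_ndarray` helper below is a hypothetical extraction of those two branches (it is not part of babygrad), just to show that every input ends up as an `np.ndarray` of the requested dtype:

```python
import numpy as np

def to_ndarray(data, dtype="float32"):
    # mirrors the constructor's ndarray and "anything else" branches
    if isinstance(data, np.ndarray):
        return data.astype(dtype)
    return np.array(data, dtype=dtype)

print(to_ndarray([1, 2, 3]).dtype)        # float32 (list converted)
print(to_ndarray(np.arange(3)).dtype)     # float32 (int64 array cast down)
```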

What is requires_grad?
requires_grad controls whether a Tensor participates in the computation graph.

  • requires_grad=True → gradients will be tracked
  • requires_grad=False → no gradients will be computed

Simple Methods for Tensor class

We'd like to introduce some simple methods for the Tensor class that will come in handy later.

.numpy()

Once we have a Tensor, we'd like to extract the raw NumPy array from it without risking anything inside the Tensor, which is why we return a copy.

class Tensor: 
    def numpy(self):
        return self.data.copy()

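Because `.numpy()` returns a copy, mutating the result cannot corrupt the Tensor's internal buffer. A quick check, using a trimmed stand-in for the class above:

```python
import numpy as np

class Tensor:
    # trimmed stand-in for the full Tensor class
    def __init__(self, data):
        self.data = np.array(data, dtype="float32")
    def numpy(self):
        return self.data.copy()

t = Tensor([1.0, 2.0, 3.0])
arr = t.numpy()
arr[0] = 99.0           # mutate only the copy
print(t.data[0])        # 1.0 — the Tensor's own buffer is untouched
```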

.detach()

Sometimes we'd like to have the same Tensor, but detached from the computation graph. So we just create a new Tensor with the same data and requires_grad=False.

class Tensor:
    def detach(self):
        return Tensor(self.data, requires_grad=False) 
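One subtlety worth noting: because our constructor runs the data through `np.array`/`astype`, this `detach` copies the underlying buffer (PyTorch's `detach`, by contrast, shares storage with the original). A minimal sketch with a trimmed stand-in class:

```python
import numpy as np

class Tensor:
    # trimmed stand-in for the full Tensor class
    def __init__(self, data, requires_grad=False):
        self.data = np.array(data, dtype="float32")
        self.requires_grad = requires_grad
    def detach(self):
        return Tensor(self.data, requires_grad=False)

a = Tensor([1.0, 2.0], requires_grad=True)
b = a.detach()
print(b.requires_grad)      # False — b is outside the graph
print(b.data is a.data)     # False — the buffer was copied here
```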

What's Next?

We've laid the foundation!
In Part 2, we'll implement the heart of deep learning: Autograd
