Dev Khadka

Posted on Nov 10, 2019

Learning Numpy by Playing: Indexing, Broadcasting, Vectorization and More

#numpy #beginners #broadcasting #vectorization

Introduction

This blog is meant to be playground for beginners to learn about numpy by trying real codes. I have kept texts content as little as possible and code examples as much as possible.

This is also meant to be quick future reference guide of numpy features you already learned. The output of each cell has details describing results.

Prerequisite

Basic programming knowledge
Some familiarity with python (loops, arrays etc.)

What Will We Cover

Basics

Creating Array
Understanding Structure of Numpy Array (dimension, shape and strides)
Data Types and Casting
Indexing Methods
Array Operations

Advance

Broadcasting
Vectorization
Ufunc and Numba

More for Exploration

What is Numpy?

Numpy is fundamental computing library for python. It supports N-Dimensional array and provides simple and efficient array operations.

NumPy is library of algorithms written in the C language which stores data in a contiguous block of memory, independent of other built-in Python objects and can operate on this memory without any type checking or other python overheads. NumPy arrays also use much less memory than built-in Python sequences.

Why Use Numpy?

Python was not initially designed for numerical computation. As python is interpreted language it is inherently slower than compiled languages like C. So numpy fills that gap here, following are few advantages of using numpy

It provides efficient multidimensional array operations, both memory and computation wise
It provides fast mathematical operations on entire array without need to use loop
It also provides scientific operations related to linear algebra, statistics, Fourier transform and more
It provides tool for interoperability with c and c++

How to Play With Numpy?

I would recommend two ways to play around with Numpy

kaggle or Google Colab: you can jump right into coding without needing any setup
Jupyter Notebook: you need to install jupyter notebook and then install numpy library using pip (numpy may be already installed if you have anaconda or miniconda)

Following This Tutorial

You can try this tutorial on kaggle or google colab by forking or making copy of following notebooks respectively

Fork this kaggle notebook
or
Make copy of this colab notebook

Some Tips On Jupyter Notebook

To see auto completion suggestions press tab
To see parameters of a function press shift + tab after typing function name and '('. eg. type np.asarray( then press shift + tab
To view doc string use '?' like np.asarray? then press shift + enter

Enough of reading!!
Lets get our hand dirty

Creating Array

From python list

import numpy as np
print(np.array([1,2,3,4]))

print('\n', 'array of 16 bit integers')
print(np.array([1,2,3,4], dtype=np.int16))

print('\n', '2 dimensional array')
print(np.array([[1,2,3], [4,5,6]]))

    [1 2 3 4]

     array of 16 bit integers
    [1 2 3 4]

     2 dimensional array
    [[1 2 3]
     [4 5 6]]

Numpy's Methods

print('Numpy array from range')
print(np.arange(3,8))

print('\n', '2D 3X3 array of zeros')
print(np.zeros((3,3)))

print('\n', '2D 2X3 array of ones')
print(np.ones((2,3)))

print('\n', 'Triangular array with ones at and below diagonal')
print(np.tri(3, 4))

print('\n', 'Index matrix with ones at diagonal')
print(np.eye(3))

print('\n', '20 equally spaced values between 1 and 5')
print(np.linspace(1, 5, 20))

    Numpy array from range
    [3 4 5 6 7]

     2D 3X3 array of zeros
    [[0. 0. 0.]
     [0. 0. 0.]
     [0. 0. 0.]]

     2D 2X3 array of ones
    [[1. 1. 1.]
     [1. 1. 1.]]

     Triangular array with ones at and below diagonal
    [[1. 0. 0. 0.]
     [1. 1. 0. 0.]
     [1. 1. 1. 0.]]

     Index matrix with ones at diagonal
    [[1. 0. 0.]
     [0. 1. 0.]
     [0. 0. 1.]]

     20 equally spaced values between 1 and 5
    [1.         1.21052632 1.42105263 1.63157895 1.84210526 2.05263158
     2.26315789 2.47368421 2.68421053 2.89473684 3.10526316 3.31578947
     3.52631579 3.73684211 3.94736842 4.15789474 4.36842105 4.57894737
     4.78947368 5.        ]

Using `np.random`

print('3X2 array of uniformly distributed number between 0 and 1')
print(np.random.rand(3,2))

print('\n', 'Normally distributed random numbers with mean=0 and std=1')
print(np.random.randn(3,3))

print('\n', 'Randomly choose integers from a range (>=5, <11)')
print(np.random.randint(5, 11, size=(2,2)))

print('\n', "Randomly selects a permutation from array")
print(np.random.permutation([2,3,4,5,6]))

print('\n', "This is equivalent to rolling dice 10 times and counting \
occurance of getting each side")
print(np.random.multinomial(10, [1/6]*6))

    3X2 array of uniformly distributed number between 0 and 1
    [[0.32564739 0.97679242]
     [0.26588925 0.89020385]
     [0.18024366 0.90879681]]

     Normally distributed random numbers with mean=0 and std=1
    [[-1.18661884 -0.43561077  1.21316858]
     [-1.47847545  0.69296328  1.01348937]
     [-0.03562709 -1.90675623  0.44639003]]

     Randomly choose integers from a range (>=5, <11)
    [[8 7]
     [8 9]]

     Randomly selects a permutation from array
    [6 4 3 2 5]

     This is equivalent to rolling dice 10 times and counting occurance of getting each side
    [3 2 2 2 0 1]

Understanding Structure of Numpy Array (dimension, shape and strides)

import numpy as np
arr = np.array([[1,2,3], [2,3,1], [3,3,3]])

print('Number of array dimensions')
print(arr.ndim)

print('\nShape of array is tuple giving size of each dimension')
print(arr.shape)

print('\nstrides gives byte steps to be moved in memory to get to next \
index in each dimension')
print(arr.strides)

print('\nByte size of each item')
print(arr.itemsize)

    Number of array dimensions
    2

    Shape of array is tuple giving size of each dimension
    (3, 3)

    strides gives byte steps to be moved in memory to get to next index in each dimension
    (24, 8)

    Byte size of each item
    8

More on Strides

print('Slice indexing is done by changing strides, as in examples below')

print('Strides of original array')
print(arr.strides)

print('\n', 'Slice with step of 2 is done by multiplying stride(byte step size) by 2 in that dimension')
print(arr[::2].strides)

print('\n', 'Reverse index will negate the stride')
print(arr[::-1].strides)

print('\n', 'Transpose will swap the stride of the dimensions')
print(arr.T.strides)

    Slice indexing is done by changing strides, as in examples below
    Strides of original array
    (24, 8)

     Slice with step of 2 is done by multiplying stride(byte step size) by 2 in that dimension
    (48, 8)

     Reverse index will negate the stride
    (-24, 8)

     Transpose will swap the stride of the dimensions
    (8, 24)

Some Stride Tricks: Inner product by changing strides

It is very rare that you may want to use these tricks but it helps us understand how indexing in numpy works

as_strided function returns a view to an array with different strides and shape

from numpy.lib.stride_tricks import as_strided

arr1 = np.arange(5)
print('arr1: ', arr1)

arr2 = np.arange(3)
print('arr2: ', arr2)

print('\n', 'Adding a dimension with stride 0 allows us to repeat array in that dimension without making copy')

print('\n', 'Making stride 0 for rows repeats rows.')
print('As step size is zero to move to next row it will give same row repeatedly')
r_arr1 = as_strided(arr1, strides=(0,arr1.itemsize), shape=(len(arr2),len(arr1)))
print(r_arr1)

print('\n', 'Making stride 0 for columns repeats columns.')
r_arr2 = as_strided(arr1, strides=(arr2.itemsize, 0), shape=(len(arr2),len(arr1)))
print(r_arr2, '\n')

print('Inner product: product of every value of arr1 to every value of arr2')
print(r_arr1 * r_arr2)

    arr1:  [0 1 2 3 4]
    arr2:  [0 1 2]

     Adding a dimension with stride 0 allows us to repeat array in that dimension without making copy

     Making stride 0 for rows repeats rows.
    As step size is zero to move to next row it will give same row repeatedly
    [[0 1 2 3 4]
     [0 1 2 3 4]
     [0 1 2 3 4]]

     Making stride 0 for columns repeats columns.
    [[0 0 0 0 0]
     [1 1 1 1 1]
     [2 2 2 2 2]] 

    Inner product: product of every value of arr1 to every value of arr2
    [[0 0 0 0 0]
     [0 1 2 3 4]
     [0 2 4 6 8]]

Using Broadcast

print('Above example is equivalent to using broadcast to do inner product')
print(arr1[np.newaxis, :] * arr2[:, np.newaxis])

print('arr1[np.newaxis, :].strides => ', arr1[np.newaxis, :].strides)
print('arr2[:, np.newaxis].strides => ', arr2[:, np.newaxis].strides)

    Above example is equivalent to using broadcast to do inner product
    [[0 0 0 0 0]
     [0 1 2 3 4]
     [0 2 4 6 8]]
    arr1[np.newaxis, :].strides =>  (0, 8)
    arr2[:, np.newaxis].strides =>  (8, 0)

Data Types and Casting

Notes

Numpy array can store items of only one data type
np_array.dtype attribute will give dtype of the array
following table shows some common datatypes with their string names


Numpy Attribute                                 | String Name                | Description
------------------------------------------------------------------------------------------------------
np.int8, np.int16, np.int32, np.int64           | '<i1', '<i2', '<i4', '<i8' | signed int
np.uint8, np.uint16, np.uint32, np.uint64       | '<u1', '<u2', '<u4', '<u8' | unsigned int
np.float16, np.float32, np.float64, np.float128 | '<f2', '<f4', '<f8', '<f16'| floats
np.string_                                      |'S1', 'S10', 'S255'         | string of bytes (ascii)
np.str                                          |'U1', 'U255'                | string of unicode characters
np.datetime64                                   |'M8'                        | date time
np.Object                                       |'O'                         | python object
np.bool                                         |'?'                         | boolean

Break down of string name '<u8': here '<' means little-endian byte order, 'u' means unsigned int and '8' means 8 bytes. Other options for byte order are '>' big endian and '=' system default
All of the array initialization functions discussed above takes 'dtype' parameter to set datatype of the array eg: np.random.randint(5, 11, size=(2,2), dtype=np.int8)

Casting

import numpy as np
arr = np.arange(5, dtype='<f4')
print('arr: ', arr)

print('\n', 'Cast to integer using astype function which will make copy of the array')
display(arr.astype(np.int8))

print('\n', 'By default casting is unsafe which will ignore the overflow. e.g. `2e10` is converted to 0')
arr[3] = 2e10
print(arr.astype('<i1'))

print('\n', 'Casting from string to float')
sarr = np.array("1 2 3 4 5.0".split())
print(sarr)
print(sarr.astype('<f4'))

print('\n', 'Use casting="safe" for doing safe casting, which will raise error if overflow')
# print(arr.astype('<i1', casting='safe'))

    arr:  [0. 1. 2. 3. 4.]

     Cast to integer using astype function which will make copy of the array



    array([0, 1, 2, 3, 4], dtype=int8)



     By default casting is unsafe which will ignore the overflow. e.g. 2e10 is converted to 0
    [0 1 2 0 4]

     Casting from string to float
    ['1' '2' '3' '4' '5.0']
    [1. 2. 3. 4. 5.]

     Use casting="safe" for doing safe casting, which will raise error if overflow

Reshaping

ndarray can be reshaped to any shape as long as total number of element are same

arr = np.arange(20)
print('arr: ', arr)

print('\n', 'reshape 1D arr of length 20 to shape (4,5)')
print(arr.reshape(4,5))

print('\n', 'One item of shape tuple can be -1 in which case the item will be calculated by numpy')
print('For total size to be 20 missing value must be 5')
print(arr.reshape(2,2,-1))

    arr:  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

     reshape 1D arr of length 20 to shape (4,5)
    [[ 0  1  2  3  4]
     [ 5  6  7  8  9]
     [10 11 12 13 14]
     [15 16 17 18 19]]

     One item of shape tuple can be -1 in which case the item will be calculated by numpy
    For total size to be 20 missing value must be 5
    [[[ 0  1  2  3  4]
      [ 5  6  7  8  9]]

     [[10 11 12 13 14]
      [15 16 17 18 19]]]

Array View With Different dtype

arr.view() method gives new view for same data with new dtype. Creating view with different dtype is not same as casting. eg. if we have ndarray of np.float32 ('<f4') creating view with dtype byte ('<i8') will read 4 bytes float data as individual bytes

arr = np.arange(5, dtype='<i2')
print('arr: ', arr)

print('\n', 'View with dtype "<i1" for array of dtype "<i2" will breakdown items to bytes')
print(arr.view('<i1'))

print('\n', 'Changing little-endian to big-endian will change value as they use different byte order')
print(arr.view('>i2'))

print('\n', 'Following will give individual bytes in memory of each items')
arr = np.arange(5, dtype='<f2')
print(arr)
print(arr.view('<i1'))

    arr:  [0 1 2 3 4]

     View with dtype "<i1" for array of dtype "<i2" will breakdown items to bytes
    [0 0 1 0 2 0 3 0 4 0]

     Changing little-endian to big-endian will change value as they use different byte order
    [   0  256  512  768 1024]

     Following will give individual bytes in memory of each items
    [0. 1. 2. 3. 4.]
    [ 0  0  0 60  0 64  0 66  0 68]

Indexing Methods

Integer and Slice Indexing

This method of indexing is similar to indexing used in python list
Slicing always create view to the array ie. does not copy the array

import numpy as np
arr = np.arange(20)
print("arr: ", arr)

print('\n', 'Get item at index 4(5th item) of the array')
print(arr[4])

print('\n', 'Assign 0 to index 4 of array')
arr[4] = 0
print(arr)

print('\n', 'Get items in the index range 4 to 10 not including 10')
print(arr[4:10])

print('\n', 'Set 1 to alternate items starting at index 4 to 10 ')
arr[4:10:2] = 1
print(arr)

    arr:  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

     Get item at index 4(5th item) of the array
    4

     Assign 0 to index 4 of array
    [ 0  1  2  3  0  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

     Get items in the index range 4 to 10 not including 10
    [0 5 6 7 8 9]

     Set 1 to alternate items starting at index 4 to 10 
    [ 0  1  2  3  1  5  1  7  1  9 10 11 12 13 14 15 16 17 18 19]

Slice Indexing in 2D Array

For multidimensional array slice index can be separated using comma

arr = np.arange(20).reshape(4,5)

print('arr:\n', arr)

print('\n', 'Set 0 to first 3 rows and and last 2 columns')
arr[:3, -2:] = 1
print(arr)

    arr:
     [[ 0  1  2  3  4]
     [ 5  6  7  8  9]
     [10 11 12 13 14]
     [15 16 17 18 19]]

     Set 0 to first 3 rows and and last 2 columns
    [[ 0  1  2  1  1]
     [ 5  6  7  1  1]
     [10 11 12  1  1]
     [15 16 17 18 19]]

Boolean Indexing

Boolean array of same shape as original array (or broadcastable to the shape) can be used as index. Which will select the items where index value is true
Boolean array can also be used to filter array with certain conditions
Boolean indexing will return copy instead of view to the array

arr = np.arange(6).reshape(2,3)
print('arr:\n', arr)

print('\n', 'Following index will gives last two items of 1st row and 1st element of 2nd row')
indx = np.array([[False, True, True], [True, False,False]])
arr[indx]

print('\n', 'Boolean index to filter values greater than 3 from arr')
filter_indx = arr>3
print('Filter Index:\n', filter_indx)

print('\n', 'Set 3 to values greater than 3 in arr')
arr[filter_indx] = 3
print(arr)

    arr:
     [[0 1 2]
     [3 4 5]]

     Following index will gives last two items of 1st row and 1st element of 2nd row

     Boolean index to filter values greater than 3 from arr
    Filter Index:
     [[False False False]
     [False  True  True]]

     Set 3 to values greater than 3 in arr
    [[0 1 2]
     [3 3 3]]

Fancy Indexing

Fancy Indexing means using array of index(integer) as index to get all items at once
Fancy Indexing will also return copy instead of view to the array

import numpy as np

arr = np.arange(10)
print('arr:\n', arr)

print('\n', 'Get items at indexes 3,5 and 7 at once')
print(arr[[3,5,7]])

print('\n', 'Sorting arr based on another array "values"')
np.random.seed(5)
values = np.random.rand(10)
print('values:\n', values)
print('\n', 'np.argsort instead of returning sorted values will return array of indexes which will sort the array')
indexes = np.argsort(values) 
print('indexes:\n', indexes)
print('Sorted array:\n', arr[indxes])

print('\n', 'You can also use fancy indexing to get same item multiple times')
print(arr[[0,1,1,2,2,2,3,3,3,3]])

    arr:
     [0 1 2 3 4 5 6 7 8 9]

     Get items at indexes 3,5 and 7 at once
    [3 5 7]

     Sorting arr based on another array "values"
    values:
     [0.22199317 0.87073231 0.20671916 0.91861091 0.48841119 0.61174386
     0.76590786 0.51841799 0.2968005  0.18772123]

     np.argsort instead of returning sorted values will return array of indexes which will sort the array
    indexes:
     [9 2 0 8 4 7 5 6 1 3]



    ---------------------------------------------------------------------------

    NameError                                 Traceback (most recent call last)

    <ipython-input-14-30ac3b9b19e2> in <module>()
         14 indexes = np.argsort(values)
         15 print('indexes:\n', indexes)
    ---> 16 print('Sorted array:\n', arr[indxes])
         17 
         18 print('\n', 'You can also use fancy indexing to get same item multiple times')


    NameError: name 'indxes' is not defined

Tuple Indexing

Multi-dimensional array can be indexed using tuple of integer array of equal length, where each array in tuple will index corresponding dimension
If number of index-array in tuple is less than the dimension of array being indexed they will be used to index lower dimension (i.e. dimension starting from 0 to length of tuple)

arr2 = np.arange(15).reshape(5,3)
print('arr2:\n', arr2)

print('\n', 'Will give items at index (4,0) and (1,2)')
indx = ([4,1],[0,2])
print(arr2[indx])

print('\n', 'Tuple of length one will return rows')
indx = ([4,1],)
print(arr2[indx])

    arr2:
     [[ 0  1  2]
     [ 3  4  5]
     [ 6  7  8]
     [ 9 10 11]
     [12 13 14]]

     Will give items at index (4,0) and (1,2)
    [12  5]

     Tuple of length one will return rows
    [[12 13 14]
     [ 3  4  5]]

Assignment With Advance Indexing

Advance Indexing(i.e. Boolean, Fancy and Tuple) returns copy instead of view to the indexed array. But direct assignment using those index will change the original array, this feature is for convenience. But if we chain the indexing it may behaves in a way which seems to be somewhat unexpected

import numpy as np

arr = np.arange(10)
print('arr: ', arr)

print('\n', 'Direct assignment will change the original array')
arr[[3,5,7]] = -1
print(arr)

print('\n', 'When we chain the indexing it will not work')
arr[[3,5,7]][0] = -2
print(arr)

print('\n', 'But chaining index will work with slicing indexing')
arr[3:8:2][0] = -2
print(arr)

    arr:  [0 1 2 3 4 5 6 7 8 9]

     Direct assignment will change the original array
    [ 0  1  2 -1  4 -1  6 -1  8  9]

     When we chain the indexing it will not work
    [ 0  1  2 -1  4 -1  6 -1  8  9]

     But chaining index will work with slicing indexing
    [ 0  1  2 -2  4 -1  6 -1  8  9]

Mixed Indexing

In multi dimensional array we can use different indexing method (slicing, boolean and fancy) for each dimension at same time
For mixture of boolean and fancy index to work, number of True's in boolean index must be equal to length of fancy index

arr = np.arange(64).reshape(4,4,4)
print('arr: ', arr)
print('\n', 'Following mixed indexing will select 1st and 3rd item in 0th dimension')
print('and item at index 0 and 2 at 1st dimension and item at index >=2')
print(arr[[True, False, True, False], [0,2], 2:])

    arr:  [[[ 0  1  2  3]
      [ 4  5  6  7]
      [ 8  9 10 11]
      [12 13 14 15]]

     [[16 17 18 19]
      [20 21 22 23]
      [24 25 26 27]
      [28 29 30 31]]

     [[32 33 34 35]
      [36 37 38 39]
      [40 41 42 43]
      [44 45 46 47]]

     [[48 49 50 51]
      [52 53 54 55]
      [56 57 58 59]
      [60 61 62 63]]]

     Following mixed indexing will select 1st and 3rd item in 0th dimension
    and item at index 0 and 2 at 1st dimension and item at index >=2
    [[ 2  3]
     [42 43]]

Array Operations

Simple Array Operations

Numpy provides simple syntax to perform mathematical and logical operations between array of compatible shapes. Here compatible shape means, shape of one of the array can be expanded to match shape of other using broadcast rule, which we'll discuss below
For this section we'll only see two cases
- Arrays has same shape in which case operation will be element wise
- One of operand is scalar in which case operation will be done between scalar and each element of array
These operations between arrays are called vectorization and are way faster than same operation using loop.
Vectorization are faster because it is implemented in C and don't have overheads like type checking etc.

import numpy as np

print('Evaluate expression (x1*x2 - 3*x1 + 30) for x1 and x2 in range 0 to 10')
x1 = np.linspace(0,10,20)
x2 = np.linspace(0, 10, 20)
z = x1*x2 - 3*x1 + 30
print(z)

print('\n', 'Spatial distance between corresponding points in two array')
p1 = np.random.rand(20,2)
p2 = np.random.rand(20,2)

'''np.sum will add values along given axis (dimension). If shape of array is (3,4,5)')
then axis 0,1 and 2 corresponds to dimension with length 3, 4 and 5 respectively'''
d = np.sum((p1-p2)**2, axis=1)
print(d)

print('\n', 'Element wise comparison, ">=" will give boolean array with True where element')
print('of p2 is greater than or equal to p1')
r = p2>=p1
print(r)

print('\n', 'Element wise logical operation, "&" will give True where point of p2 is ahead')
print('in both x and y direction from corresponding point in p1')
print(r[:,0] & r[:,1])

    Evaluate expression (x1*x2 - 3*x1 + 30) for x1 and x2 in range 0 to 10
    [ 30.          28.69806094  27.9501385   27.75623269  28.11634349
      29.03047091  30.49861496  32.52077562  35.09695291  38.22714681
      41.91135734  46.14958449  50.94182825  56.28808864  62.18836565
      68.64265928  75.65096953  83.2132964   91.32963989 100.        ]

     Spatial distance between corresponding points in two array
    [0.54052263 0.17505988 0.59108818 0.41593393 0.03548522 0.29946201
     0.84649163 0.24975051 0.90016153 0.54062043 0.00097261 0.39826495
     0.64710327 0.40655563 0.00531519 0.94567232 0.33333277 0.01713418
     0.53797027 0.48080742]

     Element wise comparison, ">=" will give boolean array with True where element
    of p2 is greater than or equal to p1
    [[ True False]
     [False False]
     [False False]
     [ True  True]
     [ True False]
     [ True  True]
     [ True  True]
     [ True False]
     [ True False]
     [False  True]
     [ True False]
     [False False]
     [ True False]
     [ True False]
     [False False]
     [ True False]
     [False False]
     [ True False]
     [False  True]
     [False False]]

     Element wise logical operation, "&" will give True where point of p2 is ahead
    in both x and y direction from corresponding point in p1
    [False False False  True False  True  True False False False False False
     False False False False False False False False]

Functions for Array Operations

Numpy also has function version of above operations like np.add, np.substract, np.divide, np.greater_equal, np.logical_and and more
Array operations we see in section above using operators like +, * are operator overloaded version of function operation
Function version of the operation will give us extra parameters to customize, one of commonly used parameter is out. It is None by default, which will create a new array for result.
If We pass an array with shape and dtype matching to expected result to out parameter result will be filled to the passed array. It will be efficient memory wise if we are doing multiple operations

import numpy as np

print('Evaluate expression (x1*x2 - 3*x1 + 30) using functions')
x1 = np.linspace(0,10,20)
x2 = np.linspace(0, 10, 20)

'''Create empty output array with expected shape'''
z = np.empty_like(x1)

'''Code is not very clean as using operator but it will perform very well memory wise'''
np.multiply(x1, x2, out=z)
np.subtract(z, 3*x1, out=z)
np.add(z, 30, out=z)
print(z)

    Evaluate expression (x1*x2 - 3*x1 + 30) using functions
    [ 30.          28.69806094  27.9501385   27.75623269  28.11634349
      29.03047091  30.49861496  32.52077562  35.09695291  38.22714681
      41.91135734  46.14958449  50.94182825  56.28808864  62.18836565
      68.64265928  75.65096953  83.2132964   91.32963989 100.        ]

Broadcasting

Rules for broadcasting

When doing array operations between two array's whose shape doesn't match exactly then few simple steps are taken which changes shapes to match each other if they are compatible.

Check Array Dimensions: If dimensions doesn't match added to left of array with smaller dimension
Match Shape On Each Dimension: If shape in any dimension doesn't match and shape is 1 for one of the array then repeat it to match shape of other array in that dimension
Raise Error if Dimension and Shape Not Matched: If dimension and shape don't match till this step then raise error

Let's Visualize the Broadcasting Rule With Custom Implementation

lets do our custom implementation to visualize in code how the broadcast rule works

import numpy as np

arr1 = np.arange(20).reshape(10,2)
arr2 = np.random.rand(2)
arr3 = arr2.copy()

print('arr1.shape: ', arr1.shape)
print('arr2.shape: ', arr2.shape)

# Step 1: Check Array Dimensions
print('\n', 'arr1 has dimension 2 and arr2 has dimension 1, so add 1 dimension to\
left side of arr2')
# np.newaxis is convenient way of adding new dimension
arr2 = arr2[np.newaxis, :]
print('arr1.shape: ', arr1.shape)
print('arr2.shape: ', arr2.shape)

# Step 2: Match Shape On Each Dimension
print('\n', 'Now in axis=0 arr1 has 10 items and arr2 has one item, so repeat it 10\
times to match arr2')
arr2 = np.repeat(arr2, 10, axis=0)
print('arr1.shape: ', arr1.shape)
print('arr2.shape: ', arr2.shape)

print('\n', 'Now both array has same dimension and shape, we can multiply them')
print('arr1*arr2:\n', arr1*arr2)

print('\n', 'Lets see if broadcasting also produce same result')
print('arr1*arr3:\n', arr1*arr3)

    arr1.shape:  (10, 2)
    arr2.shape:  (2,)

     arr1 has dimension 2 and arr2 has dimension 1, so add 1 dimension toleft side of arr2
    arr1.shape:  (10, 2)
    arr2.shape:  (1, 2)

     Now in axis=0 arr1 has 10 items and arr2 has one item, so repeat it 10times to match arr2
    arr1.shape:  (10, 2)
    arr2.shape:  (10, 2)

     Now both array has same dimension and shape, we can multiply them
    arr1*arr2:
     [[ 0.          0.11111075]
     [ 1.71941377  0.33333225]
     [ 3.43882755  0.55555375]
     [ 5.15824132  0.77777525]
     [ 6.8776551   0.99999675]
     [ 8.59706887  1.22221824]
     [10.31648264  1.44443974]
     [12.03589642  1.66666124]
     [13.75531019  1.88888274]
     [15.47472397  2.11110424]]

     Lets see if broadcasting also produce same result
    arr1*arr3:
     [[ 0.          0.11111075]
     [ 1.71941377  0.33333225]
     [ 3.43882755  0.55555375]
     [ 5.15824132  0.77777525]
     [ 6.8776551   0.99999675]
     [ 8.59706887  1.22221824]
     [10.31648264  1.44443974]
     [12.03589642  1.66666124]
     [13.75531019  1.88888274]
     [15.47472397  2.11110424]]

Let's Try Few Examples With Shapes

you can try it by creating arrays of given shape and doing some operation between them

Before Broadcast        |Step 1                      | Step 2 and 3  
Shapes of arr1 and arr2 |                            | Shapes of result 
-------------------------------------------------------------------------
(3, 1, 5); (4, 1)       | (3, 1, 5); (1, 4, 1)       | (3, 4, 5)        
(10,); (1, 10)          | (10, 1); (1, 10)           | (10, 10)         
(2, 2, 2); (2, 3)       | (2, 2, 2); (1, 2, 3)       | Not Broadcastable
(2, 2, 2, 1); (2, 3)    | (2, 2, 2, 1); (1, 1, 2, 3) | (2, 2, 2, 2, 3)

Some Usage of Broadcast

Evaluate Linear Equation Using Broadcast

print("Let's evaluate equation c1*x1 + c2*x2 + c3*x3 for 100 points at once")
points = np.random.rand(100,3)
coefficients = np.array([5, -2, 11])
results = np.sum(points*coefficients, axis=1)
print('results first 10:\n', results[:10])
print('results.shape: ', results.shape)

    Let's evaluate equation c1*x1 + c2*x2 + c3*x3 for 100 points at once
    results first 10:
     [ 6.35385279  0.85639146 12.87683079  5.99433896  4.50873972 10.44691041
      3.87407211  6.62954602 11.00386582 10.09247866]
    results.shape:  (100,)

Find Common Elements Between Arrays

np.random.seed(5)
## Get 20 random value from 0 to 99
arr1 = np.random.choice(50, 20, replace=False)
arr2 = np.random.choice(50, 15, replace=False)
print("arr1: ", arr1)
print("arr2: ", arr2)
print('\n', 'arr1 and arr2 are 1D arrays of length 20, 15 respectively.')
print('To make them broadcastable Change shape of arr2 to (15, 1)')
arr2 = arr2.reshape(15, 1)
print('\n', 'Then both arrays will be broadcasted to (15, 20) matrix with all possible pairs')
comparison = (arr1 == arr2)
print('\n', 'comparison.shape: ', comparison.shape)
print('\n', 'Elements of arr1 also in arr2: ', arr1[comparison.any(axis=0)])

    arr1:  [42 29  6 19 28 17  2 43  3 21 31  4 32  0 23  5 48 34 37 26]
    arr2:  [40 37 41 48  4 20 10 18 34 28 19 32 17 22 23]

     arr1 and arr2 are 1D arrays of length 20, 15 respectively.
    To make them broadcastable Change shape of arr2 to (15, 1)

     Then both arrays will be broadcasted to (15, 20) matrix with all possible pairs

     comparison.shape:  (15, 20)

     Elements of arr1 also in arr2:  [19 28 17  4 32 23 48 34 37]

Find k-nearest Neighbors

import numpy as np

np.random.seed(5)

points = np.random.rand(20, 2)
print('To calculate distance between every pair of points make copy of points ')
print('with shape (20, 1, 2) which will broadcast both array to shape (20, 20, 2)', '\n')
cp_points = points.reshape(20, 1, 2)

## calculate x2-x1, y2-y1
diff = (cp_points - points)
print('diff.shape: ', diff.shape)

## calculate (x2-x1)**2 + (y2-y1)**
distance_matrix = np.sum(diff**2, axis=2)
print('distance_matrix.shape: ', distance_matrix.shape, '\n')

## sort by distance along axis 1 and take top 4, one of which is the point itself
top_3 = np.argsort(distance_matrix, axis=1)[:,:4]
print("Get the points with it's 3 nearest neighbors")
points[top_3]

    To calculate distance between every pair of points make copy of points 
    with shape (20, 1, 2) which will broadcast both array to shape (20, 20, 2) 

    diff.shape:  (20, 20, 2)
    distance_matrix.shape:  (20, 20) 

    Get the points with it's 3 nearest neighbors





    array([[[0.22199317, 0.87073231],
            [0.20671916, 0.91861091],
            [0.16561286, 0.96393053],
            [0.08074127, 0.7384403 ]],

           [[0.20671916, 0.91861091],
            [0.22199317, 0.87073231],
            [0.16561286, 0.96393053],
            [0.08074127, 0.7384403 ]],

           [[0.48841119, 0.61174386],
            [0.62878791, 0.57983781],
            [0.69984361, 0.77951459],
            [0.76590786, 0.51841799]],

           [[0.76590786, 0.51841799],
            [0.62878791, 0.57983781],
            [0.69984361, 0.77951459],
            [0.87993703, 0.27408646]],

           [[0.2968005 , 0.18772123],
            [0.32756395, 0.1441643 ],
            [0.28468588, 0.25358821],
            [0.44130922, 0.15830987]],

           [[0.08074127, 0.7384403 ],
            [0.02293309, 0.57766286],
            [0.22199317, 0.87073231],
            [0.20671916, 0.91861091]],

           [[0.44130922, 0.15830987],
            [0.32756395, 0.1441643 ],
            [0.41423502, 0.29607993],
            [0.2968005 , 0.18772123]],

           [[0.87993703, 0.27408646],
            [0.96022672, 0.18841466],
            [0.76590786, 0.51841799],
            [0.5999292 , 0.26581912]],

           [[0.41423502, 0.29607993],
            [0.28468588, 0.25358821],
            [0.44130922, 0.15830987],
            [0.2968005 , 0.18772123]],

           [[0.62878791, 0.57983781],
            [0.48841119, 0.61174386],
            [0.76590786, 0.51841799],
            [0.69984361, 0.77951459]],

           [[0.5999292 , 0.26581912],
            [0.41423502, 0.29607993],
            [0.44130922, 0.15830987],
            [0.87993703, 0.27408646]],

           [[0.28468588, 0.25358821],
            [0.2968005 , 0.18772123],
            [0.32756395, 0.1441643 ],
            [0.41423502, 0.29607993]],

           [[0.32756395, 0.1441643 ],
            [0.2968005 , 0.18772123],
            [0.44130922, 0.15830987],
            [0.28468588, 0.25358821]],

           [[0.16561286, 0.96393053],
            [0.20671916, 0.91861091],
            [0.22199317, 0.87073231],
            [0.08074127, 0.7384403 ]],

           [[0.96022672, 0.18841466],
            [0.87993703, 0.27408646],
            [0.5999292 , 0.26581912],
            [0.76590786, 0.51841799]],

           [[0.02430656, 0.20455555],
            [0.28468588, 0.25358821],
            [0.2968005 , 0.18772123],
            [0.32756395, 0.1441643 ]],

           [[0.69984361, 0.77951459],
            [0.62878791, 0.57983781],
            [0.63979518, 0.9856244 ],
            [0.76590786, 0.51841799]],

           [[0.02293309, 0.57766286],
            [0.00164217, 0.51547261],
            [0.08074127, 0.7384403 ],
            [0.22199317, 0.87073231]],

           [[0.00164217, 0.51547261],
            [0.02293309, 0.57766286],
            [0.08074127, 0.7384403 ],
            [0.02430656, 0.20455555]],

           [[0.63979518, 0.9856244 ],
            [0.69984361, 0.77951459],
            [0.48841119, 0.61174386],
            [0.62878791, 0.57983781]]])

Vectorization

In numpy vectorization means performing optimized operations on sequence of same type of data.
In addition to having clean structure of code vectorized operations are also very performant because codes are compiled an avoids overhead of python, like type checking, memory management etc.
The examples we see on Broadcast section above are also good example of vectorization

Vectorization Vs Loop

Let's say we have a polynomial equation of degree 10 of single variable like a1*x + a2*x^2 + a3*x^3 ... + a10*x^10. Let's try evaluating the equation for large number of x using python only and numpy vectorization see how they compare

import numpy as np
np.random.seed(32)
X = np.random.rand(10000)
coefficients = np.random.randn(10)*20 + 50

def evaluate_polynomial_loop():
  result_loop = np.empty_like(X)
  for i in range(X.shape[0]):
    exp_part = 1
    total = 0
    for j in range(coefficients.shape[0]):
      exp_part *= X[i]
      total+=coefficients[j]*exp_part
    result_loop[i] = total
  return result_loop


def evaluate_polynomial_vect():
  ## repeates x's in 10 columns
  exponents = X[:, np.newaxis] + np.zeros((1, coefficients.shape[0]))
  exponents.cumprod(axis=1, out=exponents)
  result_vect = np.sum(exponents * coefficients, axis=1)
  return result_vect

print('Verify that both gives same result')
print('Loop:\n', evaluate_polynomial_loop()[:10])
print('Vectorization:\n', evaluate_polynomial_vect()[:10])

    Verify that both gives same result
    Loop:
     [222.57782534  30.62439847  59.69953776 373.52687079 123.89007218
     179.70369976   6.49315699 321.685257    73.14575517  69.71437596]
    Vectorization:
     [222.57782534  30.62439847  59.69953776 373.52687079 123.89007218
     179.70369976   6.49315699 321.685257    73.14575517  69.71437596]

Compare

For fair comparison I used numpy array in both whose indexing is much faster than python list. By comparison we see the vectorization is about 80 times faster

%timeit evaluate_polynomial_loop()

    114 ms ± 5.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit evaluate_polynomial_vect()

    1.19 ms ± 79 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Ufunc and Numba

Ufunc: Also called Universal Functions are vectorized wrapper for a function. Ufunc can operate on ndarray and support broadcasting and type casting. np.add, np.multiply etc are examples of Ufunc which are implemented in C. We can create custom Ufunc using np.frompyfunc or using numba.

Numba: Numba is just in time compiler which generate optimized machine code from pure python array and numerical functions. You can use numba.jit decorator on a function which makes the function to be compiled on its 1st run. You can use numba.vectorize decorator to convert python function to Ufunc.

Let's compare different implementations of adding two big arrays as follow

Create Big Arrays

arr1 = np.arange(1000000, dtype='int64')
arr2 = np.arange(1000000, dtype='int64')

Implementation Using Python Loop

def add_arr(arr1, arr2):
  assert len(arr1)==len(arr2), "array must have same length"
  result = np.empty_like(arr1)
  for i in range(len(arr1)):
    result[i] = arr1[i] + arr2[i]
  return result

%timeit _ = add_arr(arr1, arr2)

    596 ms ± 24.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Creating Ufunc Using `np.frompyfunc`

def add(a, b):
  return a+b

vect_add =  np.frompyfunc(add,2,1)

%timeit _ = vect_add(arr1, arr2)

    195 ms ± 12.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Using Numba JIT

'nopython=True' means convert all to machine level code, if can't convert raise error

import numba
@numba.jit(nopython=True)
def add_arr_jit(arr1, arr2):
  assert len(arr1)==len(arr2), "array must have same length"
  result = np.empty_like(arr1)
  for i in range(len(arr1)):
    result[i] = arr1[i] + arr2[i]
  return result

_ = add_arr_jit(arr1, arr2)
%timeit _ = add_arr_jit(arr1, arr2)

    3.66 ms ± 467 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Creating Ufunc Using `numba.vectorize`

'numba.vectorize' takes signature of function being converted as parameter. 'int64(int64,int64)' means takes 2 'int64' parameters and returns 'int64'

import numba

@numba.vectorize(['int64(int64,int64)'], nopython=True)
def vect_add(a, b):
  return a+b

%timeit _ = vect_add(arr1, arr2)

    2.93 ms ± 569 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Conclusion solution using numba.jit and numba.vectorize are performing much better. You can also check how numpy vectorization compares with these

More for Exploration

Some Useful Functions

np.where: element wise if .. then .. Else
np.select: select values from multiple arrays based on multiple conditions
np.concatenate, np.vstack, np.r_, np.hstack, np.c_: join multiple ndarray row wise, column wise or in a given axis
np.ravel, np.flatten: converts multidimensional array to 1D array
np.roll: do circular shift of the array along given axis

Set Operations

*np.unique(x): * gives array of unique elements on the array
*Intersect1d(x, y): * gives 1D array of elements common to both arrays
*Union1d(x, y): * gives 1D array of unique elements from both arrays
*In1d(x, y): * check if each element of x is also present on y and returns array of length equal to x with boolean values
*Setdiff1d(x, y): * gives elements of x not in y
*Setxor1d(x, y): * give elements that is either in x or y but not in both

Save and Load ndarray From Disk

np.save("filename.npy", x): save single numpy array to disk
np.load("filename.npy"): load single numpy array from disk
*np.savez("filename.npz", key1=arr1, key2=arr2): * saves multiple arrays with given key
np.savetxt("filename.npy", x): save single numpy array to disk as delimited text file
np.loadtxt("filename.npy"): load single numpy array from text file

Memory Mapping

To work with large numpy array that doesn't fit in RAM you can use numpy.memmap function to map the array to file in disk. It will transparently loads only segments of array needed for current operations.

np.memmap(filename, dtype, mode, shape): create a memory mapped array to a given file
mmap.flush(): flush all in memory changes to disk

Thank you

Heartily thanks for reading through the blog. Hope it was helpful for getting you up and running in numpy. Any comment, suggestion and constructive criticism are welcome. If you like the content please don’t forget to like 💕

Introduction

Prerequisite

What Will We Cover

What is Numpy?

Why Use Numpy?

How to Play With Numpy?

Following This Tutorial

Some Tips On Jupyter Notebook

Creating Array

From python list

Numpy's Methods

Using np.random

Understanding Structure of Numpy Array (dimension, shape and strides)

More on Strides

Some Stride Tricks: Inner product by changing strides

Using Broadcast

Data Types and Casting

Casting

Reshaping

Array View With Different dtype

Indexing Methods

Integer and Slice Indexing

Slice Indexing in 2D Array

Boolean Indexing

Fancy Indexing

Tuple Indexing

Assignment With Advance Indexing

Mixed Indexing

Array Operations

Simple Array Operations

Functions for Array Operations

Broadcasting

Rules for broadcasting

Let's Visualize the Broadcasting Rule With Custom Implementation

Let's Try Few Examples With Shapes

Some Usage of Broadcast

Evaluate Linear Equation Using Broadcast

Find Common Elements Between Arrays

Find k-nearest Neighbors

Vectorization

Vectorization Vs Loop

Compare

Ufunc and Numba

Create Big Arrays

Implementation Using Python Loop

Creating Ufunc Using np.frompyfunc

Using Numba JIT

Creating Ufunc Using numba.vectorize

More for Exploration

Some Useful Functions

Set Operations

Save and Load ndarray From Disk

Memory Mapping

Thank you

Using `np.random`

Creating Ufunc Using `np.frompyfunc`

Creating Ufunc Using `numba.vectorize`