chamodperera

Posted on

# Replacing For Loops with Vectorization in Python

Loops are an integral part of programming languages, as they allow for the repetition of a block of code a specific number of times or until a certain condition is met. However, loops can be slow and resource-intensive, especially when the operations inside the loop are repeated many times. This can be a disadvantage when working with large datasets, as loops can limit performance.

One way to improve the performance of these types of operations is through a technique called `Vectorization`. With this approach, operations can be performed on entire arrays or datasets at once, rather than looping through each element individually. This can be much more efficient, particularly when working with large datasets or when the operations are repeated many times. Vectorization is a useful tool for tasks such as data analysis, machine learning, and financial modeling. In this article, let's see how this technique can enhance the performance of these types of operations.

## What is Vectorization

Vectorization is a technique that allows for faster computation by storing and manipulating data in an array or vector format rather than as individual units. The fundamental structure for storing and manipulating data in a vectorized manner is the N-dimensional array, or ND array, which is a data structure that represents a multi-dimensional grid of values.

Vectorized operations enable element-wise operations to be performed on entire ND arrays using a single function call, rather than iterating over the array and applying the operation to each element individually. When working with ND arrays that have more than two dimensions, vectorized operations can still be used by processing each element in the array one at a time. For example, with a 2D array, you can iterate over the rows and process one row at a time.

``````for  i  in range(len(A)):
B[i] = A[i] * 2
``````

This loop multiplies each element in the array A by 2 and stores the result in the corresponding element of B.

``````# Vectorized implementation
B = A * 2
``````

This vectorized operation performs the same operation as the loop, but the operation is performed on the entire array at once.

## Getting Started with NumPy

`NumPy` is the most used library for working with vectorized operations in Python. It provides N-dimensional arrays, a wide variety of vectorized operations, and an easy-to-use API. NumPy also provides functions for performing vectorized operations on ND arrays, including mathematical operations, statistical operations, and operations related to sorting, searching, and indexing. These features make NumPy well-suited for working with vectorized operations on multi-dimensional arrays.

Installation :

``````pip install numpy
``````
``````import numpy as np  #import package
``````

Create an ND array

``````a = np.array([[1,2,3],[4,5,6]]) #2-Dimentional array
``````

In the above example, you've created a 2D array. The dimensions of the array are represented by sets of square brackets, `[]`. The dimensions of a ND array are called `axes`. The number of axes is called the `rank`. You can check the no of dimensions, and `shape` ( A tuple having the no of elements in each dimension ) by calling the following methods.

``````a.ndim
``````

This returns the no of dimensions

``````>> 2
``````
``````a.shape
``````

This returns a tuple containing the no of elements in each dimension

``````>> (2,3)
``````

## Why Vectorization

When you perform a vectorized operation on an ND array using NumPy, the operation is applied to each element in the array simultaneously, rather than iterating over the array and performing the operation on each element individually. This can be much faster than using a loop to perform the same operation, particularly when working with large datasets or when the operation is computationally expensive. Numpy vectorized operations are implemented using low-level, highly optimized code that can take advantage of the specialized architecture of modern processors, resulting in significant performance improvements NumPy. Learn more. Let’s see an example,

``````b = np.random.rand(10000,1)
``````

Here `b` is a two-dimensional array of shape `(10000,1)`. `random.rand()` is a numpy method that assigns a random value (between 0,1) to every element in the array.

Now Let's get the sum of all the values in

``````
using *For loops* and *Numpy*.

![](https://i.imgur.com/H72n7eg.png)

Calculate sum using For loops

```python
sum = 0
for i in range(b.shape[0]):
sum += i
``````

Vectorized Implementation

``````sum = np.sum(b)
``````

Performance comparision,

``````import timeit   #timeit is a package used for tracking elapsed time for both operations
``````

Measure the execution time of the for loop implementation

``````for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
sum = 0
for i in range(b.shape[0]):
sum += i
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"b = np.random.rand(10000,1)",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
``````
``````>> For loop elapsed time: 0.61476 seconds
``````

Measure the execution time of the vectorized implementation

``````vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="sumt = np.sum(b)",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"b = np.random.rand(10000,1)",
# Execute the code 1000 times
number=1000
)
print("vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
``````
``````>> vectorized elapsed time: 0.01074 seconds
``````

The vectorized implementation is nearly 50 times faster than the non-vectorized version. This performance difference becomes more significant for larger calculations.

From now on, we'll get into some important methods of NumPy that will be essential for your project if you use vectorization. By understanding how these operations work, you can make more informed decisions about how to structure your code and choose the most efficient approaches for your specific use case. In some cases, working with multi-dimensional arrays can be challenging. The `reshape()` function in NumPy is a useful tool for converting an array to a different shape. It allows you to change the dimensions of an array by specifying the new shape as an argument. Remember you can always use this to convert your multidimensional array into a 2D or 1D array.

``````b = np.random.rand(10,5,8) #3 dimensional array
a = np.reshape(b,(400)) #convert to 1D array

a.shape
``````
``````>> (400,)
``````

Note that here, the total number of elements in an array should match the new shape when reshaping the array. For example, think that the original dimensions of the array are `d1, d2, d3,..., dN` and the new reshaped dimensions are `r1, r2,..,rM`. The condition that must be satisfied in order to reshape the array is `d1 * d2 * d3 * ... * dN = r1 * r2 * r3 * ... * rM`.

## Operations on Vectors

Now let's look at some common vectorized operations using `NumPy` and compare the performance of for loop implementations to vectorized implementations.

### Element-wise operations

Element-wise operations are performed on each element of an array individually. Some examples of element-wise operations include addition, subtraction, multiplication, and division.

To add arrays using a for loop, we can iterate over their elements and add them element-wise. For example, consider the following 2D arrays.:

Define two 2D arrays

``````a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
``````

Add the arrays using a nested for loop

``````result = np.empty(a.shape) # Initialize an empty result array

for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] + b[i, j]
print(result)
``````
``````>> [[ 6  8] [10 12]]
``````

To add two vectors using vectorization, we can simply use the `+` operator.

``````result = a + b
print(result)
``````
``````>> [[ 6  8] [10 12]]
``````

To compare the performance of these two approaches, we can use the `%timeit`magic command in IPython or Jupyter notebooks, or the `timeit` module in a script.

Measure the execution time of the for loop implementation

``````for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] + b[i, j]
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])\n" +
"result = np.empty(a.shape)",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
``````
``````For loop elapsed time: 0.00324 seconds
``````

Measure the execution time of the vectorized implementation

``````vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="result = a + b",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])",
# Execute the code 1000 times
number=1000
)
print("Vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
``````
``````Vectorized elapsed time: 0.00051 seconds
``````

Similarly, subtraction, multiplication, and division can be performed using the same method.

`Broadcasting` is a powerful feature in element-wise operations that allows operations to be performed between arrays of different shapes. It allows the smaller array to be `"broadcast"` or `resized` to match the shape of the larger array so that the element-wise operation can be performed.

#### Multiplying by a scalar

To multiply the array a by a scalar using a for loop, we can iterate over its components and multiply them by the scalar. For example,

Define a 2D array

``````  a = np.array([[1, 2], [3, 4]])
``````

Define a scalar

``````  s = 2
``````

Multiply the array by the scalar using a nested for loop

``````  result = np.empty(a.shape) # Initialize an empty result array

for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] * s
print(result)
``````
``````  >> [[2 4] [6 8]]
``````

To multiply a vector by a scalar using vectorization, we can simply use the `*` operator. Here, the scalar value `s` is `broadcast` to a temporary array of the same shape as `a`, and the multiplication is performed element-wise. This allows us to easily multiply a vector by a scalar value without having to write a loop.

Multiply the array by the scalar using vectorization

``````  result = a * s
print(result)
``````
``````  >> [[2 4] [6 8]]
``````

Performance comparition,

Measure the execution time of the for loop implementation

``````  for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] * s
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"s = 2\n" +
"result = np.empty(a.shape)",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
``````
``````For loop elapsed time: 0.00320 seconds
``````

Measure the execution time of the vectorized implementation

``````  vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="result = a * s",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"s = 2",
# Execute the code 1000 times
number=1000
)
print("Vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
``````
``````  Vectorized elapsed time: 0.00106 seconds
``````

### Dot product

The `dot product`, also known as the inner product or scalar product, is a binary operation that takes two arrays and returns a scalar. It is originally defined as the sum of the element-wise products of the two 1D arrays. However, when applied to multidimensional arrays, the output can be different than a scalar. It may return an array of higher dimensions, depending on the implementation.

#### Vector dot product

The `vector dot product` takes two `1D arrays` (vectors) and returns a scalar. For example, given two vectors a and b with n elements each, the vector dot product is calculated as `a[0]*b[0] + a[1]*b[1] + ... + a[n-1]*b[n-1]`.

To calculate the vector dot product of two 1D arrays (vectors) using a for loop, we can iterate over the elements of each array and multiply them element-wise, then sum the results.

Define two vectors

``````a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
``````
``````dot_product = 0

for i in range(len(a)):
dot_product += a[i] * b[i]
print(dot_product)
``````
``````>> 385
``````

To calculate the vector dot product using vectorization, use the `np.dot()` function.

``````dot_product = np.dot(a, b)
print(dot_product)
``````
``````>> 385
``````

Performance comparision,

Measure the execution time of the for loop implementation

``````for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
dot_product = 0
for i in range(a.shape[0]):
dot_product += a[i] * b[i]
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\n" +
"b = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
``````
``````For loop elapsed time: 0.00366 seconds
``````

Measure the execution time of the vectorized implementation

``````vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="dot_product = np.dot(a, b)",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\n" +
"b = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])",
# Execute the code 1000 times
number=1000
)
print("vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
``````
``````vectorized elapsed time: 0.00193 seconds
``````

#### Matrix multipication

The `matrix multipication` is the dot product applied to two `2D arrays`(matrices) which returns another 2D array. It is calculated by taking the dot product of each row of the first matrix with each column of the second matrix. For example, given two matrices A with m rows and n columns, and B with p rows and q columns, the matrix multipication is poerformed as follows:

``````result[i, j] = A[i, 0]*B[0, j] + A[i, 1]*B[1, j] + ... + A[i, n-1]*B[n-1, j]
``````

This results in a 2D array with `m` rows and `q` columns.
For example, given the following matrices:

``````A = [[1, 2, 3], [4, 5, 6]]  # A has 2 rows and 3 columns
B = [[7, 8], [9, 10], [11, 12]]  # B has 3 rows and 2 columns
``````
``````result = [[58 64], [139 154]]  # Result has 2 rows and 2 columns
``````

To calculate the dot product of two matrices using a for loop, we can iterate over their elements and multiply them element-wise, then sum the products. For example:

Define two 2D arrays

``````a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
``````
``````result = np.zeros((a.shape[0], b.shape[1]))

for i in range(a.shape[0]):
for j in range(b.shape[1]):
for k in range(a.shape[1]):
result[i, j] += a[i, k] * b[k, j]
print(result)
``````
``````>> [[19 22] [43 50]]
``````

To calculate the dot product of two matrices using the numpy library, we can use the `np.matmul()` function.

``````result = np.matmul(a, b)
print(result)
``````
``````>> [[19 22] [43 50]]
``````

Performace comparision,

Measure the execution time of the for loop implementation

``````for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
for i in range(a.shape[0]):
for j in range(b.shape[1]):
for k in range(a.shape[1]):
result[i, j] += a[i, k] * b[k, j]
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])\n" +
"result = np.zeros((a.shape[0], b.shape[1]))",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
``````
``````For loop elapsed time: 0.00811 seconds
``````

Measure the execution time of the vectorized implementation

``````vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="result = np.matmul(a, b)",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])",
# Execute the code 1000 times
number=1000
)
print("Vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
``````
``````Vectorized elapsed time: 0.00131 seconds
``````

However, both `np.dot()` and `np.matmul()` functions can be used to perform a matrix-matrix dot product, but they may return different results when applied to 1D arrays. When applied to 1D arrays, `np.dot()` performs the vector dot product, while `np.matmul()` performs element-wise multiplication. For example,

Define two 1D arrays

``````u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
``````

Calculate the dot product using `np.dot()`

``````result = np.dot(u, v)
print(result)
``````
``````>> 32
``````

Calculate the dot product using `np.matmul()`

``````result = np.matmul(u, v)
print(result)
``````
``````>> [4 10 18]
``````

Using `np.matmul()` is the recomended way to perform matrix multipication.

## Advanced techniques for manipulating elements in NumPy arrays

NumPy provides a set of built-in methods for working with multidimensional arrays that are similar in functionality to those provided by Python for working with lists. Some examples of these methods include `np.sort()`, `np.append()`, etc.

### `concatenate()`

The `concatenate()` function allows you to combine multiple arrays into a single array.Here is an example of concatenating two 3D arrays using a for loop,

Define two 3D arrays

``````array_1 = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
array_2 = [[[13, 14, 15], [16, 17, 18]], [[19, 20, 21], [22, 23, 24]]]
``````

Iterate over the elements in each array and append them to the combined list

``````combined_array = [] # Initialize an empty list to store the combined arrays

for array in array_1:
combined_array.append(array)
for array in array_2:
combined_array.append(array)

# Convert the list to a numpy array
combined_array = np.array(combined_array)
print(combined_array)
``````
``````>> [[[ 1  2  3]
[ 4  5  6]]

[[ 7  8  9]
[10 11 12]]

[[13 14 15]
[16 17 18]]

[[19 20 21]
[22 23 24]]]
``````

The default behavior of the `concatenate()` function is to concatenate the arrays along the `first axis` (axis 0). This means that the arrays are joined together one after the other along the first axis, resulting in a new array with the same number of dimensions as the original arrays but with an increased size along the first axis. You can change the axis along which the arrays are concatenated by providing the `axis` parameter.

Use the `concatenate()` function to combine the two arrays

``````combined_array = np.concatenate((array_1, array_2))
print(combined_array)
``````
``````>> [[[ 1  2  3]
[ 4  5  6]]

[[ 7  8  9]
[10 11 12]]

[[13 14 15]
[16 17 18]]

[[19 20 21]
[22 23 24]]]
``````

On the other hand `hstack()` and `vstack()` functions can also be used to combine arrays horizontally or vertically, respectively. The `hstack()` function is used to stack arrays in sequence `horizontally` (along the second axis), whereas the `vstack()` function is used to stack arrays in sequence `vertically` (first axis).

Using hstack to stack the arrays horizontally,

``````result_hstack = np.hstack((array_1, array_2))
print(result_hstack)
``````
``````>> [[[ 1  2  3] [ 4  5  6] [13 14 15] [16 17 18]]
[[ 7  8  9] [10 11 12] [19 20 21] [22 23 24]]]
``````

Using vstack to stack the arrays vertically,

``````result_vstack = np.vstack((array_1, array_2))
print(result_vstack)
``````
``````>> [[[ 1  2  3] [ 4  5  6]]
[[ 7  8  9] [10 11 12]]
[[13 14 15] [16 17 18]]
[[19 20 21] [22 23 24]]]
``````

### `sort()`

The `sort()` function sorts the elements in a NumPy array in ascending order. Here is an example of sorting an array using a for loop,

Define a 3D array

``````data = [[[3, 1, 4, 3], [2, 1, 4, 3]], [[4, 5, 6, 4], [5, 6, 7, 5]]]
``````

Iterate over the elements in the array and add them to the list

``````sorted_elements = [] # Initialize an empty list to store the sorted elements

for array in data:
for subarray in array:
for element in subarray:
sorted_elements.append(element)

# Sort the list in ascending order
sorted_elements.sort()
print(sorted_elements)
``````
``````>> [1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7]
``````

The same example using the `sort()` function,

``````sorted_elements = np.sort(data)
print(sorted_elements)
``````
``````>> [1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7]
``````

### `where()`

The `where()` function in `NumPy` allows to return elements from one of two possible arrays depending on whether a given condition is met. The elements where the condition is true will come from the first array, while the elements where the condition is false will come from the second array.Here is an example,

Define arrays

``````x = np.array([1, 2, 3, 4, 5])
y = np.array([10, 20, 30, 40, 50])
``````
``````condition = x > 2
``````

The `condition` variable is a boolean array that indicates which elements of x are greater than 2.

Iterate over the elements of the input arrays and the condition array, and use an if statement to decide which element to add to the output array.

``````result = []
for i in range(len(x)):
if condition[i]:
result.append(x[i])
else:
result.append(y[i])
result = np.array(result)
print(result)
``````
``````>>> [10 20 3 4 5]
``````

The `np.where()` function uses the condition to select elements from `x` where the condition is `True` and elements from `y` where the condition is `False`.

``````result = np.where(condition, x, y)
print(result)
``````
``````>>> [10 20 3 4 5]
``````

## Working with a dataset

Working with a dataset can often involve calculating statistical measures such as the mean, standard deviation, and variance. This can be a tedious process if you have to do it manually using for loops, especially for large datasets. Fortunately, NumPy provides vectorized functions that can perform these calculations efficiently on an entire array of data at once.

In the other hand Python's built-in `math` module also provides functions for these calculations like the mean, standard deviation, and variance. The real advantage of using `NumPy` for statistical calculations comes with the axis argument, which allows you to calculate statistical measures along a specific dimension of a multidimensional array. This cannot be done with Python's built-in `math` module. Let's see some examples,

Define a 2D array

``````data = np.array([[1, 2, 3], [4, 5, 6]])
``````

### Median

The median of a data set is the middle value when the data is arranged in ascending or descending order. If the data set has an odd number of values, the median is the middle value. If the data set has an even number of values, the median is the average of the two middle values. Learn more

Calculate the median of each row using a for loop

``````medians = np.empty(data.shape[0])
for i in range(data.shape[0]):
medians[i] = np.median(data[i, :])
print(medians)
``````
``````>> [2. 5.]
``````

To calculate the median using `Numpy` library, you can use the `np.median()`function.

``````medians = np.median(data, axis=1)
print(medians)
``````
``````>> [2. 5.]
``````

In these examples, data is a 2D NumPy array and the `axis=1` argument tells the function to compute the median for each `row`. If you want to compute the median for each `column` you can change the axis argument to `axis=0`.

### Mean

The mean (average) of a data set is found by adding all numbers in the data set and then dividing by the number of values in the set. Learn more

Calculate the mean of each column using a for loop

``````means = np.empty(data.shape[1])
for i in range(data.shape[1]):
means[i] = np.mean(data[:, i])
print(means)
``````
``````>> [2.5 3.5 4.5]
``````

To calculate the mean using `Numpy` library, you can use the `np.mean()` function.

``````means = np.mean(data, axis=0)
print(means)
``````
``````>> [2.5 3.5 4.5]
``````

### Standard Deviation

The standard deviation is a measure of how spread out the data is from the `mean`. It is the square root of the average of the squared deviations from the mean. `std = sqrt(mean(x)), where x = abs(a - a.mean())**2`. A low standard deviation indicates that the data is tightly grouped around the mean, while a high standard deviation indicates that the data is more spread out. Learn more

``````# Calculate the standard deviation of each column using a for loop
stds = np.empty(data.shape[1])
for i in range(data.shape[1]):
stds[i] = np.std(data[:, i])
print(stds)
``````
``````>> [1.5 1.5 1.5]
``````

To calculate the mean using `Numpy` library, you can use the `np.std()` function.

``````# Calculate the standard deviation of each column using np.std() with the axis argument
stds = np.std(data, axis=0)
print(stds)
``````
``````>> [1.5 1.5 1.5]
``````

## Can For loops be completely replaced?

Now you know how to use vectorized operations and Numpy functions to optimize the performance and readability of your code. However, in some cases, the sequential nature of for loops may still be necessary. For example, when iteratively updating values in a function, where the output at time t depends on the output at time t-1, a for loop may be the only option. This is because vectorized operations are typically performed in parallel, so the output at t-1 cannot be used to calculate the output at t in a vectorized manner.

``````def update_values(prev):
new = prev + np.log(prev + 2)
return new

x = 0
y = 0
iterations = 10

for i in range(iterations):
x += update_values(x)
y += x + 10

print(x)
print(y)

>> 1150.00069088284
>> 2367.2066419361945
``````

## Conclusion

Considering vectorization other than explicit For Loops where necessary can make your next project more robust since it is an accelerated way to carry out calculations. But as above there can be times when using Loops is the only option. So choose wisely by analyzing the problem you are working on. If it can be solved in a parallel nature, moving to vectorization is the best option. Here I've shown you some important methods of NumPy to implement vectorization efficiently. You can also find the documentation here . Make sure to check that out to find some more important details.