DEV Community

Hussein Mahdi

NumPy vs List: SIMD Vector Processing with Concrete Mathematical Example

SIMD (Single Instruction, Multiple Data) is a parallel processing paradigm that exploits the vector units built into modern CPUs. Contemporary processors contain specialized registers, typically 128, 256, or 512 bits wide, that hold several data elements at once. A SIMD instruction applies the same operation to every element in the register with one instruction, rather than issuing a separate instruction for each element.
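How many values fit in one register follows from simple division: register width divided by element width. The sketch below works that arithmetic out for the three register widths mentioned above. The instruction-set labels (SSE, AVX/AVX2, AVX-512) and the helper name `lanes` are illustrative choices, not something the article's examples depend on.

```python
# Lanes per register = register width in bits // element width in bits.
# The labels below are the common x86 names for 128-, 256- and 512-bit registers.
REGISTER_BITS = {"SSE": 128, "AVX/AVX2": 256, "AVX-512": 512}
ELEMENT_BITS = {"float32": 32, "float64": 64}


def lanes(register_bits, element_bits):
    """Return how many elements of the given width fit in one register."""
    return register_bits // element_bits


lane_table = {
    (isa, dtype): lanes(rbits, ebits)
    for isa, rbits in REGISTER_BITS.items()
    for dtype, ebits in ELEMENT_BITS.items()
}

for (isa, dtype), count in lane_table.items():
    print(f"{isa:9s} {dtype}: {count} values per register")
```

A 256-bit register therefore holds four 64-bit floats or eight 32-bit floats, which is why the element type matters as much as the register width.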

The Task

Consider the operation of adding two arrays, each containing eight floating-point numbers:

Array A: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Array B: [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
Result:  [1.5, 3.5, 5.5, 7.5, 9.5, 11.5, 13.5, 15.5]

Traditional Python List Approach

Using standard Python lists, the computation executes sequentially:

list_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
list_b = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
result = []

for i in range(len(list_a)):
    result.append(list_a[i] + list_b[i])

This approach processes one addition per iteration, requiring eight separate operations executed sequentially. Each iteration involves retrieving Python objects, extracting numerical values, performing the addition, creating a new object, and appending it to the result list.
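For comparison, here is a more idiomatic spelling of the same loop using `zip` and a list comprehension. It is only a sketch of the pure-Python route: the code is shorter, but the interpreter still performs one addition per pair and creates one new float object per result.

```python
list_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
list_b = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]

# zip pairs the elements up; each x + y is still a separate Python-level addition.
result = [x + y for x, y in zip(list_a, list_b)]
print(result)

# Every result is its own Python float object, not a raw 8-byte value.
print(all(type(value) is float for value in result))
```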

NumPy with SIMD Acceleration

NumPy expresses the same computation as a single vectorized expression:

import numpy as np

array_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
array_b = np.array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5])
result = array_a + array_b

When this vectorized operation executes, NumPy hands the loop to compiled code, which can use SIMD instructions where the CPU and the NumPy build support them. NumPy's default floating-point type is 64 bits wide, so on a processor with 256-bit AVX instructions the computation proceeds as follows. The CPU loads four values from each array into 256-bit vector registers. A single vector add instruction then performs all four additions in parallel. The processor loads the next four values and repeats the parallel operation. Eight scalar additions are reduced to two vector additions. On an array this small the fixed cost of the call dominates; the benefit becomes visible on large arrays.
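The two-step picture above can be imitated in plain NumPy by adding the arrays four elements at a time. This is only a conceptual model, assuming four float64 lanes per 256-bit register; it does not reproduce what NumPy's compiled loops actually do internally, and the name `LANES` is an assumption made for the illustration.

```python
import numpy as np

array_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
array_b = np.array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5])

LANES = 4  # assumption: 256-bit register / 64-bit float64 elements

chunks = []
for start in range(0, len(array_a), LANES):
    # Each slice addition stands in for one vector add instruction.
    chunks.append(array_a[start:start + LANES] + array_b[start:start + LANES])

chunked_result = np.concatenate(chunks)

print(len(chunks))  # number of simulated vector additions
print(np.array_equal(chunked_result, array_a + array_b))
```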

Note 1: Type Coercion in NumPy

What Happened?

import numpy as np

a = np.array([1, 2, 4, 5, "code"])

When you created the array with mixed integers and a string, NumPy identified that these elements could not coexist with their original data types. The system resolved this incompatibility by converting all elements to a single common type that could accommodate every value. In this case, NumPy selected the string data type (specifically, Unicode string) because strings can represent both numeric characters and alphabetic characters, whereas integers cannot represent text.

The output reveals that the first element, originally the integer 1, has been converted to the string '1'. All elements in the array are now strings, maintaining the fundamental requirement of homogeneity. A data type such as <U21 denotes a Unicode string of at most 21 characters. On platforms where the default integer is 64 bits wide, that width is chosen so the text form of any such integer fits, not just the longest element actually present.

import numpy as np

a = np.array([1, 2, 4, 5, "code"])
print(a)  # ['1' '2' '4' '5' 'code'] (every element is now a string)
print(a.dtype)  # This will show: <U21 or similar (Unicode string)
print(type(a[0]))  # This will show: <class 'numpy.str_'>

Type Coercion Hierarchy

NumPy follows a specific hierarchy when performing automatic type conversion. If an array contains integers and floating-point numbers, all integers convert to floats. If an array contains numbers and strings, all numbers convert to strings. If an array contains booleans and integers, booleans convert to integers (True becomes 1, False becomes 0). The target type is chosen so that it can represent the source values, with one caveat: integers larger than 2**53 can lose precision when converted to 64-bit floats.
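The hierarchy can be checked directly by inspecting `dtype` after construction. The sketch below uses made-up sample values; the last lines illustrate the one case where conversion to float is not exact.

```python
import numpy as np

ints_and_floats = np.array([1, 2, 3.5])
numbers_and_text = np.array([1, 2.5, "code"])
bools_and_ints = np.array([True, False, 7])

print(ints_and_floats.dtype)        # float64
print(numbers_and_text.dtype.kind)  # 'U' marks a Unicode string dtype
print(bools_and_ints)               # [1 0 7]

# A very large integer does not survive the trip through float64 unchanged.
big = 2**53 + 1
as_float = np.array([big, 0.5])[0]
print(int(as_float) == big)         # False
```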


Performance Implications

When our array contains string representations of numbers rather than actual numeric types, NumPy cannot utilize SIMD vector processing or other numerical optimizations.
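If every element of such an array happens to be numeric text, one possible way back to a numeric dtype is `astype`. The sample values here are hypothetical; the conversion raises an error as soon as one element (such as "code") cannot be parsed as a number.

```python
import numpy as np

# The single string "6" forces the whole array to a Unicode string dtype.
text_numbers = np.array([1, 2, 4, 5, "6"])
print(np.issubdtype(text_numbers.dtype, np.number))  # False

# Convert back to integers; this works only if every element parses as an integer.
numeric = text_numbers.astype(np.int64)
print(numeric.dtype, numeric.sum())
```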

Note 2: Shape Requirements in NumPy

Why the Error?

import numpy as np

a = np.array([[1, 2, 7, 4], [1, 2, 4, 6]])
print(a, "\n", a.shape, "\n", a.ndim)

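NumPy requires every dimension of a multi-dimensional array to have a consistent size, which gives the array a rectangular (uniform) shape. The array above satisfies this requirement: both rows hold four elements, so its shape is (2, 4). If the rows differ in length, for example [1, 2, 7, 4] and [1, 2, 4], recent NumPy versions raise a ValueError because no rectangular shape fits the data.

Python lists permit this irregular structure without issue, allowing nested lists of varying lengths.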
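A small experiment makes the difference concrete. The names `rectangular` and `ragged` are illustrative. The exact behavior depends on the NumPy version: recent releases raise a ValueError for ragged input, while older ones built an object array and emitted a deprecation warning, so the sketch handles both outcomes.

```python
import numpy as np

rectangular = [[1, 2, 7, 4], [1, 2, 4, 6]]
ragged = [[1, 2, 7, 4], [1, 2, 4]]

print(np.array(rectangular).shape)   # (2, 4)
print([len(row) for row in ragged])  # [4, 3]: a plain list accepts this

try:
    np.array(ragged)
    outcome = "no error (older NumPy: object array with a deprecation warning)"
except ValueError as exc:
    outcome = f"ValueError: {exc}"

print(outcome)
```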

Alternative Approaches for Irregular Data

a = np.array([[1, 2, 7, 4], [1, 2, 4]], dtype=object)

This approach stores references to Python list objects rather than the numeric data itself. The result is a one-dimensional array of shape (2,) whose elements are lists, so NumPy cannot apply vectorized numeric operations or SIMD processing to it.

List Advantages and Use Cases

Python lists maintain advantages in specific contexts. Their flexibility permits storing mixed data types and facilitates dynamic resizing operations. Lists prove superior for small datasets where NumPy's initialization overhead outweighs computational benefits, for operations requiring frequent insertions or deletions at arbitrary positions, and for general-purpose programming tasks not centered on numerical computation.
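The flexibility described above can be seen in a short side-by-side sketch with made-up values: a list grows in place and accepts any type, whereas `np.insert` leaves the original array untouched and returns a newly allocated one.

```python
import numpy as np

values = [1.0, 2.0, 4.0]
values.insert(2, 3.0)   # modifies the list in place
values.append("five")   # mixed types are fine in a list

arr = np.array([1.0, 2.0, 4.0])
new_arr = np.insert(arr, 2, 3.0)  # returns a new array; arr is unchanged

print(values)
print(arr, new_arr)
```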


Comparative Analysis: NumPy Arrays vs. Python Lists

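| Attribute | NumPy Arrays | Python Lists |
|---|---|---|
| Homogeneous data types | Yes | No |
| Heterogeneous data types | No | Yes |
| Contiguous memory storage | Yes | No |
| SIMD vector processing support | Yes | No |
| Vectorized operations | Yes | No |
| Low memory overhead | Yes | No |
| Fast numerical computations | Yes | No |
| Efficient element-wise operations | Yes | No |
| Dynamic resizing efficiency | No | Yes |
| Arbitrary position insertion/deletion | No | Yes |
| Mixed type storage capability | No | Yes |
| Broadcasting functionality | Yes | No |
| Multi-dimensional indexing | Yes | No |
| Built-in mathematical functions | Yes | No |
| Integration with scientific libraries | Yes | No |
| Suitable for general-purpose programming | No | Yes |
| Optimal for small datasets | No | Yes |
| Memory-efficient for large numeric data | Yes | No |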
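Two of the attributes listed, memory overhead and broadcasting, are easy to check by hand. The sketch below assumes CPython, where each list element is a separate float object referenced by a pointer. `sys.getsizeof` gives only an approximate footprint, so the comparison is indicative rather than exact.

```python
import sys

import numpy as np

n = 1000
py_list = [float(i) for i in range(n)]
np_array = np.arange(n, dtype=np.float64)

# List footprint: the pointer array plus one float object per element.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = np_array.nbytes  # 8 bytes per float64 element

print(array_bytes)               # 8000
print(list_bytes > array_bytes)  # the list needs more memory for the same data

# Broadcasting: the scalars 2.0 and 1.0 are applied to every element.
scaled = np_array[:3] * 2.0 + 1.0
print(scaled)                    # [1. 3. 5.]
```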

Conclusion

This comparison demonstrates that NumPy and Python lists serve complementary purposes within the Python ecosystem. NumPy excels in numerical computation contexts where performance and memory efficiency matter, while Python lists provide flexibility and convenience for general-purpose programming tasks requiring heterogeneous data structures or frequent modifications.


