DEV Community

Abhishek Patange

NumPy Basics for Data Handling in Python

When starting your journey in Data Science or Machine Learning, one of the first libraries you’ll encounter is NumPy.

Why? Because pandas and scikit-learn are built on top of NumPy, and TensorFlow and PyTorch offer tensor types that closely mirror NumPy arrays and convert to and from them. If you understand NumPy arrays, you’ll have an easier time working with almost any data library in Python.

In this post, we’ll cover the essentials: creating arrays, indexing, reshaping, performing operations, and using useful functions.


Creating NumPy Arrays

A NumPy array, formally known as an ndarray, is the fundamental data structure provided by the NumPy (Numerical Python) library. It is a powerful, N-dimensional array object optimized for numerical and scientific computing in Python.

import numpy as np

# From Python list
arr = np.array([1, 2, 3, 4, 5]) 
print(arr)

# 2D array
mat = np.array([[1, 2, 3], [4, 5, 6]])
print(mat)

# Special arrays
zeros = np.zeros((3, 3))      # 3x3 array of 0.0 (float64 by default)
ones = np.ones((2, 4))        # 2x4 array of 1.0
rand = np.random.rand(2, 3)   # 2x3 array of uniform random floats in [0, 1)

print(zeros) 
print(ones)  
print(rand)
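
To see what "N-dimensional" means in practice, it can help to inspect a few attributes that every ndarray carries. The snippet below is only an illustrative sketch; the array contents and the name `floats` are made up for this example.

```python
import numpy as np

# Hypothetical example array: 2 rows, 3 columns
mat = np.array([[1, 2, 3], [4, 5, 6]])

print(mat.ndim)   # number of dimensions → 2
print(mat.shape)  # size along each dimension → (2, 3)
print(mat.size)   # total number of elements → 6
print(mat.dtype)  # an integer type (exact width depends on the platform)

# Unlike a Python list, all elements of an array share one dtype,
# so mixing ints and floats upcasts everything to float
floats = np.array([1, 2.5, 3])
print(floats.dtype)  # float64
```

Checking `shape` and `dtype` first is usually the quickest way to debug an unexpected result.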

Indexing and Slicing

NumPy provides powerful mechanisms for accessing and manipulating elements within arrays through indexing and slicing.

Matrix representation, e.g. mat = [[1, 2, 3], [4, 5, 6]]:

          col 0  col 1  col 2
row 0 ->    1      2      3
row 1 ->    4      5      6

arr = np.array([10, 20, 30, 40, 50])

print(arr[0])     # First element → 10
print(arr[-1])    # Last element → 50
print(arr[1:4])   # Index 1 up to (not including) 4 → [20 30 40]

# 2D indexing

mat = np.array([[1, 2, 3], [4, 5, 6]])
print(mat[0, 1])  # Row 0, Col 1 → 2
print(mat[:, 2])  # All rows, Col 2 → [3, 6]
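
One detail worth knowing when manipulating elements through slices: a basic slice is a view onto the original data, not a copy. A minimal sketch, with `block` and `safe` as names chosen just for this illustration:

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6]])

# All rows, columns 1 onward → [[2 3] [5 6]]
block = mat[:, 1:]

# Basic slices are views: writing through one changes the original array
block[0, 0] = 99
print(mat[0, 1])   # 99

# Use .copy() when an independent array is needed
safe = mat[:, 1:].copy()
safe[0, 0] = -1
print(mat[0, 1])   # still 99
```

Views avoid copying large arrays, but they also mean an innocent-looking assignment can modify data you meant to keep.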

Reshaping & Flattening

In NumPy, reshaping and flattening are fundamental operations for changing the structure of an array.

Reshaping changes an array's dimensions while keeping the total number of elements the same; the reshape() method does this. Flattening collapses an array of any shape back into one dimension; the flatten() method returns the result as a new copy.

arr = np.arange(1, 13)   # Numbers 1–12
print(arr) # -> [1,2,3,4,5,6,7,8,9,10,11,12]

reshaped = arr.reshape(3, 4)  # 3 rows, 4 cols
print(reshaped) # -> [[1,2,3,4],[5,6,7,8],[9,10,11,12]]

flat = reshaped.flatten()     # Back to 1D
print(flat) # -> [1,2,3,4,5,6,7,8,9,10,11,12]
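
Because reshape() must preserve the total number of elements, two behaviours follow that are easy to check. The sketch below reuses the same 12-element array; the name `inferred` is just for this example.

```python
import numpy as np

arr = np.arange(1, 13)   # 12 elements

# -1 asks NumPy to infer that dimension from the total size (12 / 4 = 3 rows)
inferred = arr.reshape(-1, 4)
print(inferred.shape)    # (3, 4)

# flatten() returns a copy, so editing it leaves the source untouched
flat = inferred.flatten()
flat[0] = 100
print(inferred[0, 0])    # 1

# A shape whose element count does not match raises ValueError
try:
    arr.reshape(5, 3)    # 15 slots for 12 elements
except ValueError as err:
    print("cannot reshape:", err)
```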


Mathematical Operations

NumPy makes vectorized operations simple (no loops required).

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a + b)   # Element-wise addition
print(a * b)   # Element-wise multiplication (not a matrix product)
print(a ** 2)  # Square each element

# Statistics
print(a.mean())   # Average 
print(b.max())    # Maximum
print(b.min())    # Minimum
print(np.std(b))  # Standard deviation (population, ddof=0 by default)
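
To make "no loops required" concrete, here is a small sketch that computes the same sum with an explicit Python loop and with the vectorized form. It also shows the difference between `*` and `@`, which is a common source of confusion. The names `looped` and `vectorized` are made up for this illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Explicit Python loop, element by element
looped = np.array([x + y for x, y in zip(a, b)])

# Vectorized form: same result, no loop in Python code
vectorized = a + b
print(np.array_equal(looped, vectorized))  # True

# A scalar is applied to every element (broadcasting)
print(a * 10)   # elements: 10, 20, 30, 40

# * multiplies element by element; @ computes the dot product
print(a * b)    # elements: 10, 40, 90, 160
print(a @ b)    # 300
```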


Useful Functions

# Range of values (start, stop, step); the stop value 10 is excluded
arr = np.arange(0, 10, 2)
print(arr)  # [0 2 4 6 8]

# Linearly spaced values (both endpoints included by default)
lin = np.linspace(0, 1, 5)
print(lin)  # [0.   0.25 0.5  0.75 1. ]

# Identity matrix
I = np.eye(3)
print(I)

# Random values
rand = np.random.randn(3, 3)  # Standard normal distribution (mean 0, std 1)
print(rand)
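
Random values differ on every run, which makes results hard to reproduce. One option, sketched below, is NumPy's seeded Generator API (`np.random.default_rng`); the seed value 42 is an arbitrary choice for this example. The sketch also contrasts how arange and linspace treat the end value.

```python
import numpy as np

# Seeded generator: the same seed gives the same sequence on each run
rng = np.random.default_rng(seed=42)
sample = rng.standard_normal((3, 3))
print(sample.shape)   # (3, 3)

# A second generator with the same seed reproduces the values
repeat = np.random.default_rng(seed=42).standard_normal((3, 3))
print(np.array_equal(sample, repeat))   # True

# arange excludes the stop value; linspace includes it by default
print(np.arange(0, 1, 0.25))   # values: 0, 0.25, 0.5, 0.75
print(np.linspace(0, 1, 5))    # values: 0, 0.25, 0.5, 0.75, 1
```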

🧠 Why NumPy Matters in ML/AI

  1. Data Wrangling: Vectorized operations run in compiled code, so math on large arrays is much faster than equivalent Python loops.
  2. Matrix Algebra: Used in linear regression, neural networks, and deep learning.
  3. Interoperability: pandas and scikit-learn work with NumPy arrays directly, and TensorFlow and PyTorch tensors convert to and from them.
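
As a rough illustration of the matrix algebra point, the sketch below fits a straight line with `np.linalg.lstsq`. The data is made up and follows y = 2x + 1 exactly; real ML libraries do far more than this, so treat it as a toy example rather than how any particular library works.

```python
import numpy as np

# Made-up data that follows y = 2x + 1 exactly
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1

# Design matrix: a column of ones (intercept) and a column of x (slope)
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution of X @ coeffs ≈ y
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coeffs
print(intercept, slope)   # approximately 1.0 and 2.0
```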
