By the end of this session you will learn:
• What NumPy is
• Why NumPy is faster than Python lists
• Core NumPy operations
• Data manipulation techniques
• A real-world example
1. Introduction
Python is widely used in data science, machine learning, and AI systems.
At the core of these systems lies a powerful library called NumPy.
NumPy stands for Numerical Python and provides:
- High-performance arrays
- Mathematical operations
- Matrix computations
- Fast vectorized operations
Many popular libraries are built on top of NumPy:
- Pandas
- Scikit-Learn
- TensorFlow
- PyTorch
Understanding NumPy helps us understand how modern AI systems process data efficiently.
2. Why NumPy? (Python List vs NumPy Array)
Python lists are flexible but not optimized for large numerical computations.
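NumPy arrays store elements of a single type in contiguous memory, and their operations run in compiled C code, which makes them significantly faster for numerical work.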
Example: Performance Comparison
import numpy as np
import time
size = 1000000
list1 = list(range(size))
list2 = list(range(size))
# time.perf_counter() is a high-resolution clock intended for measuring short durations
start = time.perf_counter()
result = [x + y for x, y in zip(list1, list2)]
print("Python list time:", time.perf_counter() - start)
arr1 = np.arange(size)
arr2 = np.arange(size)
start = time.perf_counter()
result = arr1 + arr2
print("NumPy array time:", time.perf_counter() - start)
Explanation:
NumPy performs operations on entire arrays at once, a technique called vectorization.
This removes the need for explicit Python loops; the looping happens inside NumPy's compiled code.
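To check that vectorization changes how the work is done and not the result, a minimal sketch like the one below compares an explicit loop with the vectorized expression. The names `loop_sum` and `vector_sum` are illustrative and not part of the example above.

```python
import numpy as np

a = np.arange(5)         # 0, 1, 2, 3, 4
b = np.arange(5) * 10    # 0, 10, 20, 30, 40

# Explicit Python loop: one element at a time
loop_sum = np.empty_like(a)
for i in range(len(a)):
    loop_sum[i] = a[i] + b[i]

# Vectorized: the same element-wise addition, done inside NumPy
vector_sum = a + b

print(loop_sum)
print(vector_sum)
print(np.array_equal(loop_sum, vector_sum))
```

Both approaches should print the same array; only the vectorized form avoids per-element work in the Python interpreter.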
3. Creating NumPy Arrays
NumPy arrays are called ndarrays (N-dimensional arrays).
Basic Array Creation
import numpy as np
arr = np.array([1,2,3,4,5])
print(arr)
Creating Arrays of Zeros
zeros = np.zeros((3,3))
print(zeros)
Creates a 3×3 matrix of zeros.
Creating Arrays of Ones
ones = np.ones((2,4))
print(ones)
Creates a 2×4 matrix filled with ones.
Creating Ranges
numbers = np.arange(0,10,2)
print(numbers)
Output
[0 2 4 6 8]
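Every array created this way holds elements of one data type, which is part of why NumPy can process them quickly. The sketch below inspects the `dtype` attribute of a few arrays; the variable names are illustrative, and the exact integer type depends on the platform.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.zeros(3)
mixed = np.array([1, 2.5, 3])   # integers are upcast so all elements share one type

print(ints.dtype)     # an integer type such as int64 (platform dependent)
print(floats.dtype)   # float64
print(mixed.dtype)    # float64
```

This also explains why `np.zeros` and `np.ones` print values like `0.` and `1.`: they default to floating-point elements.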
4. Array Shapes and Dimensions
NumPy arrays can be multi-dimensional.
Example:
matrix = np.array([
[1,2,3],
[4,5,6],
[7,8,9]
])
print(matrix.shape)
Output
(3, 3)
Meaning:
3 rows
3 columns
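Besides `shape`, arrays expose `ndim` (number of dimensions) and `size` (total number of elements), and `reshape` rearranges the same data into a different shape. A small sketch, with illustrative variable names:

```python
import numpy as np

flat = np.arange(1, 10)        # 1D array holding 1..9
matrix = flat.reshape(3, 3)    # same values arranged as 3 rows x 3 columns

print(flat.shape)     # (9,)
print(matrix.shape)   # (3, 3)
print(matrix.ndim)    # 2
print(matrix.size)    # 9
```

The new shape must hold exactly as many elements as the original, so a 9-element array can become 3×3 but not 2×4.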
5. Vectorized Operations
NumPy allows mathematical operations without loops.
Example:
arr = np.array([10,20,30,40])
print(arr + 5)
print(arr * 2)
print(arr / 10)
Output
[15 25 35 45]
[20 40 60 80]
[1. 2. 3. 4.]
Explanation:
The operation is automatically applied to each element of the array.
Stretching the scalar to match the array's shape in this way is called broadcasting.
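Broadcasting is not limited to scalars. As a minimal sketch (the arrays here are made up for illustration), a 1D array can be combined with every row of a 2D array when the trailing dimensions match:

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
])
row = np.array([10, 20, 30])

# matrix has shape (2, 3) and row has shape (3,):
# NumPy applies row to each of the two rows of matrix.
result = matrix + row
print(result)
```

If the shapes are incompatible, for example adding an array of length 2 to this matrix, NumPy raises a `ValueError` instead of guessing.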
6. Indexing and Slicing
1D Array
data = np.array([10,20,30,40,50])
print(data[0])
print(data[1:4])
Output
10
[20 30 40]
2D Array
matrix = np.array([
[1,2,3],
[4,5,6],
[7,8,9]
])
print(matrix[1,2])
Output
6
Extract Column
print(matrix[:,1])
Output
[2 5 8]
Explanation
: selects all rows, and 1 selects the second column (index 1)
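The same slicing syntax extracts rows and sub-matrices. The following sketch reuses the 3×3 matrix from this section; the variable names are illustrative.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

second_row = matrix[1, :]     # row at index 1, all columns
top_right = matrix[:2, 1:]    # rows 0-1, columns 1-2
last_column = matrix[:, -1]   # negative indices count from the end

print(second_row)
print(top_right)
print(last_column)
```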
7. Aggregation Functions
NumPy provides built-in functions for data analysis.
Example:
data = np.array([10,20,30,40])
print("Mean:", np.mean(data))
print("Sum:", np.sum(data))
print("Max:", np.max(data))
print("Min:", np.min(data))
Output
Mean: 25.0
Sum: 100
Max: 40
Min: 10
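On 2D data, the same functions accept an `axis` argument to aggregate per column or per row. A minimal sketch with a made-up 2×3 array:

```python
import numpy as np

table = np.array([
    [10, 20, 30],
    [40, 50, 60],
])

column_means = table.mean(axis=0)   # collapse the rows: one mean per column
row_sums = table.sum(axis=1)        # collapse the columns: one sum per row

print(column_means)
print(row_sums)
```

A helpful way to read `axis` is "the dimension that disappears": `axis=0` removes the row dimension, leaving one value per column.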
8. Boolean Filtering (Very Useful for Data Cleaning)
Example dataset:
scores = np.array([55,78,90,34,88,67])
Find students who passed:
passed = scores[scores > 60]
print(passed)
Output
[78 90 88 67]
Explanation:
NumPy allows filtering data without loops.
This is heavily used in data preprocessing pipelines.
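Conditions can also be combined or used to label data. The sketch below uses the same scores; the pass mark of 60 and the upper bound of 90 are only illustrative thresholds.

```python
import numpy as np

scores = np.array([55, 78, 90, 34, 88, 67])

# Combine conditions with & (and) or | (or); each condition needs parentheses
mid_range = scores[(scores > 60) & (scores < 90)]

# np.where picks a value per element based on the condition
labels = np.where(scores > 60, "pass", "fail")

# Count how many elements satisfy a condition
num_passed = np.count_nonzero(scores > 60)

print(mid_range)
print(labels)
print(num_passed)
```

Note that Python's `and` and `or` keywords do not work element-wise on arrays, which is why `&` and `|` are used here.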
9. Real-World Example – Netflix Recommendation System
Recommendation systems power platforms like Netflix.
Users and movies can be represented as vectors.
Movie List
movies = ["Interstellar", "Inception", "Titanic", "Avengers"]
User Ratings Matrix
ratings = np.array([
[5,4,1,1], # Alice
[4,5,1,1], # Bob
[1,1,5,4] # Charlie
])
print(ratings)
Rows represent users
Columns represent movies
Cosine Similarity
Recommendation systems often compute similarity between users.
Formula
similarity = (A · B) / (|A| |B|)
Implementing Similarity
from numpy.linalg import norm
alice = ratings[0]
bob = ratings[1]
charlie = ratings[2]
sim_alice_bob = np.dot(alice,bob) / (norm(alice)*norm(bob))
sim_alice_charlie = np.dot(alice,charlie) / (norm(alice)*norm(charlie))
print("Alice vs Bob:", sim_alice_bob)
print("Alice vs Charlie:", sim_alice_charlie)
Expected output
Alice vs Bob: ~0.98
Alice vs Charlie: ~0.42
Explanation
Alice and Bob have similar movie taste.
Charlie has different preferences.
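With more users, computing each pair by hand becomes tedious. One possible approach, sketched below, normalizes every row to unit length and uses a single matrix product to get all pairwise similarities at once. The helper name `cosine_similarity_matrix` is hypothetical, not a NumPy function.

```python
import numpy as np

ratings = np.array([
    [5, 4, 1, 1],  # Alice
    [4, 5, 1, 1],  # Bob
    [1, 1, 5, 4],  # Charlie
])

def cosine_similarity_matrix(m):
    """Return the cosine similarity between every pair of rows in m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)  # shape (n_users, 1)
    unit = m / norms                                  # scale each row to length 1
    return unit @ unit.T                              # dot products of unit vectors

sim = cosine_similarity_matrix(ratings)
print(np.round(sim, 2))
```

Entry `sim[i, j]` is the similarity between user `i` and user `j`, so the matrix is symmetric and its diagonal is 1. This sketch assumes no user has an all-zero ratings row, which would cause a division by zero.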
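Recommendation Logic
If Alice hasn't watched Avengers, we can recommend the unwatched movie that a similar user rated highest.
alice_ratings = np.array([5,4,1,0])  # 0 = not watched yet
bob_ratings = ratings[1]
# Hide movies Alice has already rated, then pick Bob's highest-rated remaining movie
candidates = np.where(alice_ratings == 0, bob_ratings, -1)
recommended_index = np.argmax(candidates)
print("Recommended movie:", movies[recommended_index])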
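Relying on a single similar user is fragile. A common refinement, sketched here under simplifying assumptions, predicts a score for each movie as the similarity-weighted average of the other users' ratings and then recommends the best unseen movie. The helper `cosine` and the variable names are hypothetical, and treating 0 as "not rated" is a convention of this example only.

```python
import numpy as np

movies = ["Interstellar", "Inception", "Titanic", "Avengers"]
alice = np.array([5, 4, 1, 0])   # 0 marks a movie Alice has not rated
others = np.array([
    [4, 5, 1, 1],  # Bob
    [1, 1, 5, 4],  # Charlie
])

def cosine(a, b):
    """Cosine similarity between two 1D vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# How similar Alice is to each other user
weights = np.array([cosine(alice, other) for other in others])

# Predicted score per movie: similarity-weighted average of the others' ratings
predicted = weights @ others / weights.sum()

# Exclude movies Alice has already rated, then take the best remaining one
unseen = alice == 0
predicted[~unseen] = -np.inf
best = int(np.argmax(predicted))

print("Recommended movie:", movies[best])
print("Predicted score:", round(float(predicted[best]), 2))
```

A production system would also skip recommendations whose predicted score is low; with only one unseen movie here, the sketch returns it regardless of the score.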