DEV Community

Min Xiong
🔥 KNN Explained in 5 Minutes (Python + Iris Dataset) — Beginner Guide

🧠 Why KNN Is So Popular

Machine learning can feel complicated…

KNN isn’t.

No training loops.
No gradients.
No heavy math.

Just one idea:

Similar data points are close to each other.

⚙️ How KNN Works

KNN is a lazy learning algorithm: there is no training step that builds a model.

Instead, it:

📦 Stores all training data
📏 Computes distance to new data
🔍 Finds the K nearest neighbors
🗳️ Uses their labels to predict

👉 Majority vote = classification
👉 Average = regression
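The four steps above fit in a few lines of NumPy. This is a toy sketch of the classification case (majority vote), not scikit-learn's implementation:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Step 1-2: the "stored" data is just X_train; compute Euclidean
    # distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Step 3: indices of the k closest points
    nearest = np.argsort(distances)[:k]
    # Step 4: majority vote among their labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Tiny example: two well-separated clusters
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # → 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # → 1
```

Swapping the vote for `np.mean(y_train[nearest])` would turn the same function into a regressor.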


📏 Distance Matters (Core Idea)

Everything in KNN depends on how we measure distance.

📐 Euclidean vs Manhattan vs Minkowski
🔹 Euclidean Distance
Straight-line distance
Default in most cases
Best for continuous features

👉 Think: “as the crow flies”

🔹 Manhattan Distance
Moves in grid-like paths
Sum of absolute differences

👉 Think: “walking through city blocks”
🔹 Minkowski Distance
General version of both
Controlled by parameter p
p = 1 # Manhattan
p = 2 # Euclidean

👉 One formula → multiple distance types
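The "one formula" claim is easy to check numerically. A small NumPy sketch (the points `a` and `b` are arbitrary example values):

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

# Euclidean (p=2): straight-line distance
euclidean = np.linalg.norm(a - b)   # sqrt(3^2 + 4^2) = 5.0

# Manhattan (p=1): sum of absolute differences
manhattan = np.sum(np.abs(a - b))   # |3| + |4| = 7.0

# Minkowski: generalizes both via the parameter p
def minkowski(a, b, p):
    return np.sum(np.abs(a - b) ** p) ** (1 / p)

print(minkowski(a, b, 1))  # 7.0 → Manhattan
print(minkowski(a, b, 2))  # 5.0 → Euclidean
```

In scikit-learn, `KNeighborsClassifier` exposes the same knob through its `p` parameter (default `p=2`, i.e. Euclidean).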


🌸 Example: Iris Dataset

The Iris dataset is perfect for beginners.

3 flower species
4 features:
Sepal length
Sepal width
Petal length
Petal width

👉 Goal: predict species

💻 Python Example (Complete)

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset (150 samples, 4 features, 3 classes)
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# "Training" just stores the data
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Evaluate on the held-out set
y_pred = knn.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

# Classify a brand-new measurement:
# [sepal length, sepal width, petal length, petal width]
new_sample = [[5.1, 3.5, 1.4, 0.2]]
prediction = knn.predict(new_sample)
print("Predicted:", iris.target_names[prediction][0])
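The script above fixes K at 3. In practice, K is a hyperparameter worth tuning; a quick way is cross-validation. A sketch (the candidate K values here are just illustrative; odd values avoid ties in a two-way vote):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score each candidate K with 5-fold cross-validation
for k in [1, 3, 5, 7, 9]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"K={k}: mean accuracy = {scores.mean():.3f}")
```

Small K fits the training data tightly (and picks up noise); large K smooths the decision boundary. Cross-validation lets the data decide.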

🧠 Key Takeaways
✅ Pros
Simple and intuitive
No training phase
Great for beginners
⚠️ Cons
Slow at prediction time on large datasets (every query scans all stored points)
Sensitive to noisy labels and irrelevant features
Needs feature scaling (large-range features dominate the distance)
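The scaling caveat is easy to handle: put a scaler and the classifier in a pipeline, so the scaler is fit only on the training data. A minimal sketch using `StandardScaler` (Iris barely needs it since its features share units, but the pattern is the same for any dataset):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline fits the scaler on X_train only, then applies the
# same transformation to X_test inside score()/predict()
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X_train, y_train)
print("Accuracy:", model.score(X_test, y_test))
```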

🎯 When Should You Use KNN?

Use KNN when:

Dataset is small
Data is well-labeled
You need a quick baseline
🧩 One-Line Summary

Store data → Find neighbors → Vote → Predict
