K-Means makes you pick the number of clusters up front and favors compact, roughly round blobs. DBSCAN does neither: it grows clusters from dense regions, discovers the count on its own, and flags outliers as noise. Here it is, clustering two interleaved moons that K-Means can't separate cleanly.
🌌 Try it (drag eps + minPts): https://dev48v.infy.uk/ml/day16-dbscan.html
Two knobs, no K
- eps — the neighborhood radius.
- minPts — how many neighbors (within eps) a point needs to count as "dense."
That's it. You never say how many clusters there are.
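To make the two knobs concrete, here is a minimal sketch of the neighborhood check in plain Python. The names `region_query`, `points`, and `min_pts` are illustrative, not taken from the demo page, and the sketch assumes the common convention that a point counts as its own neighbor (scikit-learn's `min_samples` works the same way).

```python
import math

def region_query(points, i, eps):
    """Return indices of all points within eps of points[i].

    Assumption: the point itself is included in its own neighborhood.
    """
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

# Hypothetical toy data: three points close together, one far away.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0)]
eps, min_pts = 1.0, 3

for i in range(len(points)):
    neighbors = region_query(points, i, eps)
    is_dense = len(neighbors) >= min_pts
    print(i, neighbors, is_dense)
```

With these values the first three points each see three neighbors and count as dense, while the far point sees only itself. This brute-force scan is O(n) per query; real implementations usually swap in a spatial index.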
Three kinds of points
- Core — has ≥ minPts neighbors within eps (sits in a dense region).
- Border — within eps of a core point, but not dense itself.
- Noise — neither. The loners. DBSCAN labels them as outliers instead of forcing them into a cluster.
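The three definitions above translate almost directly into code. This is a rough sketch rather than the page's implementation: `classify_points` is a hypothetical helper, the data is made up, and a point is again assumed to count as its own neighbor.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise'."""
    # Neighborhood of every point (brute force, includes the point itself).
    neighborhoods = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(neigh) >= min_pts for neigh in neighborhoods]

    kinds = []
    for i, neigh in enumerate(neighborhoods):
        if is_core[i]:
            kinds.append("core")
        elif any(is_core[j] for j in neigh):
            # Not dense itself, but within eps of a core point.
            kinds.append("border")
        else:
            kinds.append("noise")
    return kinds

# Hypothetical toy data: a tight trio, one point on the fringe, one loner.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (1.4, 0.0), (5.0, 5.0)]
kinds = classify_points(points, eps=1.0, min_pts=3)
print(kinds)  # ['core', 'core', 'core', 'border', 'noise']
```

The fringe point at (1.4, 0.0) has only one other point within eps, so it isn't dense, but that neighbor is core, which makes it a border point.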
How clusters form
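Pick an unvisited point. If it's core, start a cluster and grow it outward with a BFS: every point within eps joins the cluster, and each newly added core point extends the search through its own neighbors. Border points join but don't expand the search further. Repeat until every point has been visited. Because clusters chain through density, DBSCAN finds arbitrary shapes (moons, rings, blobs), and the count emerges from the data.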
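The expansion described above can be sketched in a few dozen lines. Treat this as one possible from-scratch version under stated assumptions, not the exact code on the demo page: the function name `dbscan`, the `NOISE = -1` label, and the toy data are all illustrative, and neighborhoods are precomputed by brute force (needs Python 3.8+ for `math.dist`).

```python
import math
from collections import deque

NOISE = -1  # label for points that end up in no cluster

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, 2, ... for clusters, -1 for noise."""
    n = len(points)
    # Brute-force neighborhoods; each point counts as its own neighbor.
    neighborhoods = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster_id = 0

    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighborhoods[i]) < min_pts:
            labels[i] = NOISE  # provisional: may become a border point later
            continue

        # i is a core point: start a new cluster and expand it with a BFS.
        labels[i] = cluster_id
        queue = deque(neighborhoods[i])
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # reached from a core point: border
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            if len(neighborhoods[j]) >= min_pts:
                queue.extend(neighborhoods[j])  # only core points keep expanding
        cluster_id += 1

    return labels

# Hypothetical toy data: two short chains and one far-away outlier.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0), (12, 0), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

Nobody told the function there are two clusters; the count falls out of which points are density-connected. Note that a border point within reach of two clusters goes to whichever cluster gets to it first, so its label can depend on point order.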
When to use it
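Great for spatial/geo data and anomaly detection (noise = anomalies). Weak when clusters have very different densities, because a single eps can't suit all of them, and in high dimensions, where distances between points become too similar for a radius to separate dense from sparse.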
🔨 Built from scratch (region query → core/border/noise → BFS expand) on the page: https://dev48v.infy.uk/ml/day16-dbscan.html
Part of MachineLearningFromZero. 🌐 https://dev48v.infy.uk