DEV Community

MustafaLSailor
Hierarchical clustering

Hierarchical clustering (sometimes called hierarchical segmentation) comes in two main types: agglomerative (bottom-up) and divisive (top-down).

Agglomerative Hierarchical Clustering: In this bottom-up approach, each data point starts as its own cluster. At each step the algorithm merges the two most similar clusters, and the process continues until all data points belong to a single cluster. Similarity between points is usually measured with a distance such as Euclidean distance. The distance between two clusters is then defined by a linkage criterion: single linkage (the closest pair of points), complete linkage (the farthest pair), or average linkage (the mean over all pairs).
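To make the merge loop concrete, here is a minimal pure-Python sketch of the bottom-up procedure. It is written for readability, not speed, and the function names (`cluster_distance`, `agglomerative`) and the sample points are made up for illustration. Ties between equally close pairs are broken by whichever pair is found first, which is an arbitrary choice.

```python
from itertools import combinations
from math import dist


def cluster_distance(a, b, points, linkage="single"):
    """Distance between clusters a and b (lists of point indices)."""
    pairwise = [dist(points[i], points[j]) for i in a for j in b]
    if linkage == "single":      # closest pair of members
        return min(pairwise)
    if linkage == "complete":    # farthest pair of members
        return max(pairwise)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerative(points, linkage="single"):
    """Merge the two closest clusters until one remains; return the merges."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(
                clusters[pair[0]], clusters[pair[1]], points, linkage
            ),
        )
        d = cluster_distance(clusters[a], clusters[b], points, linkage)
        merged = clusters[a] + clusters[b]
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        # Replace the two clusters with their union (a < b, so pop b first).
        clusters.pop(b)
        clusters.pop(a)
        clusters.append(merged)
    return merges


# Hypothetical sample data: two tight pairs and one outlying point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
for left, right, d in agglomerative(points, linkage="single"):
    print(left, "+", right, f"at distance {d:.2f}")
```

Each printed line is one merge, so the output read top to bottom is the hierarchy being built from the leaves upward.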

Divisive Hierarchical Clustering: In this top-down approach, the algorithm starts with a single cluster containing all data points. At each step it splits one cluster into two. The process continues until each data point is a cluster on its own.
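The text does not say how a cluster should be split, and several strategies exist. The sketch below assumes one simple heuristic chosen purely for illustration: always split the cluster with the largest diameter, use its two most distant members as seeds, and send every other point to the nearer seed. The function names and sample points are hypothetical.

```python
from itertools import combinations
from math import dist


def diameter(cluster, points):
    """Largest pairwise distance inside a cluster (0 for a singleton)."""
    return max(
        (dist(points[i], points[j]) for i, j in combinations(cluster, 2)),
        default=0.0,
    )


def bisect(cluster, points):
    """Split a cluster around its two most distant members (a heuristic)."""
    seed_a, seed_b = max(
        combinations(cluster, 2),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    left, right = [], []
    for i in cluster:
        # seed_b always goes right so that neither side can end up empty;
        # every other point joins the nearer seed (ties go to seed_a).
        nearer_a = dist(points[i], points[seed_a]) <= dist(points[i], points[seed_b])
        if i != seed_b and nearer_a:
            left.append(i)
        else:
            right.append(i)
    return left, right


def divisive(points):
    """Split the widest cluster repeatedly until all clusters are singletons."""
    clusters = [list(range(len(points)))]  # one cluster holding everything
    splits = []
    while any(len(c) > 1 for c in clusters):
        # Only clusters with at least two members can still be split.
        widest = max(
            (c for c in clusters if len(c) > 1),
            key=lambda c: diameter(c, points),
        )
        clusters.remove(widest)
        left, right = bisect(widest, points)
        splits.append((widest, left, right))
        clusters.extend([left, right])
    return splits


# Hypothetical sample data: two tight pairs and one outlying point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
for parent, left, right in divisive(points):
    print(parent, "->", left, "|", right)
```

Read top to bottom, the output is the hierarchy being built from the root downward, the mirror image of the agglomerative case.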

Both methods produce a hierarchy that can be drawn as a dendrogram, a tree-like diagram that shows how clusters are merged or split. A user can "cut" the dendrogram at any height: the branches that cross the cut become the clusters, so a lower cut yields more, smaller clusters and a higher cut yields fewer, larger ones.
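As a sketch of what cutting looks like in practice, the example below uses SciPy's `scipy.cluster.hierarchy` module, which is an assumption on my part since the post does not name a library. `linkage` builds the merge tree once, and `fcluster` cuts it either by a target number of clusters or by a height. The sample data and the choice of average linkage are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical sample data: two tight pairs and one outlying point.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]])

# Build the full merge tree once (average linkage, Euclidean distance).
Z = linkage(X, method="average", metric="euclidean")

# Cut by asking for at most k clusters...
labels_k2 = fcluster(Z, t=2, criterion="maxclust")
labels_k3 = fcluster(Z, t=3, criterion="maxclust")

# ...or cut at a height: merges above distance 2.0 are undone.
labels_h = fcluster(Z, t=2.0, criterion="distance")

print("k=2:     ", labels_k2)
print("k=3:     ", labels_k3)
print("height 2:", labels_h)
```

The tree is computed a single time and cut several ways afterwards, which is what makes the method convenient when the number of clusters is not known up front.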

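Hierarchical clustering is useful for exploring the natural hierarchical structure of data, or when the number of clusters is not known in advance. However, standard agglomerative implementations work from all pairwise distances, so time and memory grow at least quadratically with the number of points. This makes the method slow on large data sets, where faster algorithms such as K-Means are often preferred.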

Dendrogram
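A dendrogram is drawn from a table of merges. As a hedged illustration, assuming SciPy as the library (the post does not name one), the sketch below prints that table and the left-to-right leaf order that a plot would use. Passing `no_plot=True` makes `dendrogram` return the layout without drawing anything; dropping that argument draws the figure when matplotlib is installed. The sample data is made up.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical sample data: two tight pairs and one outlying point.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]])
Z = linkage(X, method="average")

# Each row of Z is one merge: the two cluster ids joined, the height
# (distance) of the join, and the size of the resulting cluster.
# Ids >= len(X) refer to clusters created by earlier rows.
for left, right, height, size in Z:
    print(f"join {int(left)} and {int(right)} at height {height:.2f} (size {int(size)})")

# Compute the dendrogram layout without drawing it.
layout = dendrogram(Z, no_plot=True)
print("leaf order, left to right:", layout["leaves"])
```

The heights in the third column are the vertical positions of the horizontal bars in the plotted tree, which is why a horizontal cut at a given height corresponds to undoing every merge above it.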
