DEV Community

The Coding Dino

Clustering Algorithms: K-Means

Introduction

K-Means is an unsupervised machine learning algorithm. It divides the data points into k groups (called clusters), where each data point belongs to exactly one cluster. K-Means aims to group similar data points into the same cluster by making each cluster as compact as possible, which in turn keeps different clusters well separated.

Working of K-Means Algorithm

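Each cluster has a center (also called a centroid), which is the average of the data points currently assigned to that cluster. It represents the cluster but does not have to be one of the data points. A data point is added to the cluster whose center is closest to it. Closeness is measured using the squared Euclidean distance, that is, the sum of the squared differences between the coordinates of the point and those of the center.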
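The distance measure and the assignment rule described above can be written out as follows. The symbols are assumptions introduced here for illustration: x is a data point with D coordinates, \mu_j is the center of cluster j, and c(x) is the cluster that x gets assigned to.

```latex
d(x, \mu_j) = \lVert x - \mu_j \rVert^2 = \sum_{i=1}^{D} (x_i - \mu_{j,i})^2,
\qquad
c(x) = \arg\min_{j \in \{1, \dots, k\}} d(x, \mu_j)
```

Because only the ordering of the distances matters for choosing the closest center, the square root of the ordinary Euclidean distance can be skipped.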

Algorithm

  1. Select the number of clusters, k.
  2. Choose k data points as the initial cluster centers (either at random, or spaced as far apart as possible).
  3. Until the cluster assignments stop changing, repeat the following:
    1. For each data point, calculate the squared distance between it and each cluster center.
    2. Assign each data point to the cluster with the closest center.
    3. Recalculate the center of each cluster as the average of all data points assigned to it.
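
The steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names, the `seed` and `max_iter` parameters, and the choice to leave an empty cluster's center unchanged are all assumptions made for this example. Initial centers are picked at random from the data points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    """Return (centers, assignments) for a list of coordinate tuples."""
    rng = random.Random(seed)
    # Pick k distinct data points as the initial cluster centers.
    centers = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the closest center.
        new_assignments = [nearest_center(p, centers) for p in points]
        if new_assignments == assignments:
            break  # assignments did not change, so the algorithm has converged
        assignments = new_assignments
        # Update step: each center becomes the average of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = mean_point(members)
    return centers, assignments
```

Calling `kmeans(points, 2)` on a list such as `[(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]` returns the two centers and, for each point, the index of the cluster it ended up in. Which cluster gets index 0 depends on the random initial centers.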

Additional Information

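  1. K-Means clustering is highly sensitive to the initially chosen cluster centers, and different starting centers can lead to different final clusters. Hence, K-Means is usually run several times with different starting centers, keeping the run with the smallest total squared distance between the data points and their cluster centers.
  2. If you do not know the best number of clusters for your data, try the algorithm with several values of k and compare the results. The total squared distance always tends to shrink as k grows, so a common choice is the smallest k beyond which it stops improving noticeably.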
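
Both suggestions can be tried out with a short sketch like the one below. It is only an illustration under stated assumptions: `run_kmeans`, `best_of_restarts`, and `cost_per_k` are hypothetical helper names, and the "cost" used to compare runs is assumed to be the total squared distance between each point and its cluster center.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def run_kmeans(points, k, rng, max_iter=100):
    """One K-Means run from random starting centers.

    Returns (centers, labels, cost), where cost is the total squared
    distance between each point and the center of its cluster.
    """
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    cost = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, cost


def best_of_restarts(points, k, restarts=10, seed=0):
    """Run K-Means several times and keep the run with the lowest cost."""
    rng = random.Random(seed)
    runs = (run_kmeans(points, k, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[2])


def cost_per_k(points, k_values, restarts=10, seed=0):
    """Map each candidate k to the lowest cost found across the restarts."""
    return {k: best_of_restarts(points, k, restarts, seed)[2] for k in k_values}
```

Printing `cost_per_k(points, range(1, 6))` for a dataset gives one cost per value of k. Looking at where the cost stops dropping sharply is one informal way to pick k, though the final choice still depends on what grouping is useful for the data at hand.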

For further information, please check out https://stanford.edu/~cpiech/cs221/handouts/kmeans.html (image taken from here)
