Machine learning is one of the core concepts of data science, and it forms the foundation for much of modern AI.
Have you ever wondered how recommendation systems such as YouTube's work? How are they able to recommend just the right content? Part of the answer is unsupervised machine learning.
Unsupervised Machine Learning
It is a type of machine learning where the model is fed raw data without any labels.
The model then learns and makes sense of the data by discovering patterns and relationships on its own.
How does it work?
- To learn these patterns and relationships, unsupervised machine learning relies heavily on mathematical concepts such as distance measures.
- Data points with similar features are placed together in the same group.
Models of Unsupervised Machine Learning
Two common families of unsupervised machine learning models are clustering and dimensionality reduction.
i. Clustering
Just as the name suggests, clustering involves grouping data into different clusters such that data points in the same cluster have very similar features while data points in different clusters are very different.
There are different algorithms used in clustering:
- K-means
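In K-means clustering, the user defines the number of desired clusters (K) that the algorithm is supposed to form.
A distance metric then comes into play: the algorithm measures the distance from each data point to every centroid (the center of a cluster) and assigns the point to its nearest centroid. Standard K-means uses the Euclidean distance, while variants such as K-medians use the Manhattan distance. Each centroid is then recomputed as the mean of the points assigned to it, and the two steps repeat until the assignments stop changing.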
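To make the assign-and-update loop concrete, here is a minimal sketch of K-means in plain Python. It assumes Euclidean distance and picks the starting centroids at random from the input points. The function names (`assign`, `k_means`), the `seed` parameter and the sample points are made up for illustration, and in practice a library implementation would be the better choice.

```python
import math
import random


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def k_means(points, k, max_iterations=100, seed=0):
    """Illustrative K-means: returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: start from k of the input points chosen at random.
    centroids = rng.sample(points, k)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = assign(points, centroids)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Centroids stopped moving, so the clustering is stable.
        centroids = new_centroids
    return assign(points, centroids), centroids


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

The first three points should end up sharing one label and the last three the other. Which group gets label 0 depends on the random starting centroids.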
- Hierarchical clustering
It involves building a hierarchy of data points, which creates a tree of clusters. There are two types of hierarchical clustering: agglomerative, which is a bottom-up approach, and divisive, which is a top-down approach.
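The bottom-up (agglomerative) approach can be sketched in a few lines of plain Python. This version assumes single linkage, meaning the distance between two clusters is the distance between their closest members. That is only one of several possible linkage rules. The function name `agglomerative` and the sample points are made up for illustration.

```python
import math


def agglomerative(points, n_clusters):
    """Illustrative single-linkage agglomerative clustering.

    Returns (clusters, merges), where clusters holds lists of point indices
    and merges records each merge as (cluster_a, cluster_b, distance).
    """
    # Bottom-up start: every point is its own cluster.
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    math.dist(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two closest clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
clusters, merges = agglomerative(points, n_clusters=2)
print(clusters)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

The list of merges is the tree: reading it from first to last shows which groups were joined and at what distance. A divisive algorithm would build the tree in the opposite direction, starting from one large cluster and splitting it.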
ii. Dimensionality Reduction
At times, we encounter datasets that have a great many features, some of which add no meaningful value to the data we are trying to make sense of.
In such cases, we use dimensionality reduction, which works by reducing the number of variables while preserving key information. The model filters through the noise and gets rid of the unnecessary dimensions.
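One widely used dimensionality reduction technique is principal component analysis (PCA), which the text above does not name, so treat it here as one example among several. The sketch below is a simplified PCA in plain Python. It finds only the first principal component, using power iteration on the covariance matrix, and projects three-feature rows onto that single direction. The function name `first_principal_component` and the sample rows are made up for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Illustrative PCA step: returns (direction, projected values)."""
    n, d = len(rows), len(rows[0])
    # Center each feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the features.
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # normalise, which converges to the direction of greatest variance.
    # Assumption: the start vector is not orthogonal to that direction.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Each row collapses to a single number: its coordinate along v.
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected


# Hypothetical data: feature 2 is roughly twice feature 1,
# and feature 3 is almost constant (it carries little information).
rows = [
    (1.0, 2.1, 0.5),
    (2.0, 3.9, 0.4),
    (3.0, 6.2, 0.6),
    (4.0, 8.1, 0.5),
    (5.0, 9.8, 0.5),
]
direction, projected = first_principal_component(rows)
print([round(x, 3) for x in direction])
print([round(x, 3) for x in projected])
```

The nearly constant third feature should get a weight close to zero in `direction`, which is the sense in which the uninformative dimension is dropped. The five rows are reduced from three numbers each to one.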
My thoughts:
There is so much to learn from unsupervised machine learning models, which exist to help us understand data beyond what is obvious to the human eye.
Their ability to recognize patterns and relationships makes them powerful, because real-world data is messy and noisy!