As a beginner in the world of Machine Learning (ML), I want to document the things I learn and explain them in a clear manner. Over a series of articles I plan to walk through different elements of Machine Learning, from a beginner level up to intermediate.
This short article will look at only two things:
- What is Machine learning?
- The four main training processes (Supervised, Unsupervised, Semi-Supervised & Reinforcement).
What is Machine Learning (ML)?
ML is the process of solving a problem through the following steps:
Gathering a dataset (tabular data)
Building a statistical model based on the dataset
Using that model to make inferences (draw conclusions) that solve a practical problem.
A Small ML Example
Traffic data is gathered from a central server
A prediction model is built from the dataset, one row per car, with location and speed as the input data
A “model” is created to estimate upcoming congestion areas so that they can be prevented
A model is simply a file that identifies patterns (makes predictions) based on the input data.
Machine learning in such scenarios helps to estimate the regions where congestion could be found.
Daily traffic experiences (data) are used to train a model to essentially see the future.
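The three steps above (gather data, build a model, use it for inference) can be sketched in a few lines. Everything here is invented for illustration; the regions, speeds and threshold are not from any real traffic system:

```python
# Each gathered "row" of data: (region, speed_kmh) for one car reading.
readings = [
    ("north", 18), ("north", 22), ("north", 15),
    ("south", 55), ("south", 60), ("south", 48),
]

def build_model(rows):
    """'Train' by computing the average observed speed per region."""
    speeds = {}
    for region, speed in rows:
        speeds.setdefault(region, []).append(speed)
    return {region: sum(s) / len(s) for region, s in speeds.items()}

def predict_congestion(model, region, threshold_kmh=30):
    """Inference: flag a region whose average speed falls below the threshold."""
    return model[region] < threshold_kmh

model = build_model(readings)
print(predict_congestion(model, "north"))  # True  - slow traffic, likely congested
print(predict_congestion(model, "south"))  # False - traffic flowing freely
```

A real model would of course be far richer than a per-region average, but the shape is the same: historical data in, a reusable artefact out, predictions on demand.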
Machine Learning Types
Here are the four types of ML, with a short description of each to help us identify the differences later on.
- Supervised: uses data that is labelled; when a dataset is a labelled collection, each row of data is called a feature vector.
- Unsupervised: the data here is unlabelled; during training the model discovers patterns and new information on its own.
- Semi-Supervised: only a small amount of the data is labelled, with the rest being largely unlabelled; as a hybrid of the two above, it is popular for classifying text documents.
- Reinforcement: no dataset is given to the model in this type; training is very different, as the process starts out random and the model makes sequences of decisions to try to get the highest return value possible. Very popular in gaming and a staple of AWS DeepRacer.
We mentioned “feature vector” for supervised learning; as an example, just imagine this data as the height, weight and gender of a person.
The most important part here is that every data row shares the same set of features, so examples can be compared and differentiated by their feature values.
An example of this would be spam detection for emails: you have two labels here, spam and not_spam.
So the goal of supervised learning is to use a dataset to produce a model that takes a feature vector as an input and outputs a prediction.
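That goal can be shown with a deliberately tiny sketch. The feature vectors below are invented (say, counts of links, of the word “free”, and of exclamation marks per email), and the “model” is a simple nearest-neighbour lookup rather than anything you would use in production:

```python
import math

# Labelled dataset: (feature_vector, label) pairs.
training_data = [
    ([5, 3, 7], "spam"),
    ([4, 4, 6], "spam"),
    ([0, 0, 1], "not_spam"),
    ([1, 0, 0], "not_spam"),
]

def predict(features):
    """1-nearest-neighbour: return the label of the closest training row."""
    nearest = min(training_data,
                  key=lambda row: math.dist(row[0], features))
    return nearest[1]

print(predict([6, 2, 8]))  # spam     - close to the spam examples
print(predict([0, 1, 0]))  # not_spam - close to the clean examples
```

Feature vector in, predicted label out: that input/output contract is what defines supervised learning, regardless of how sophisticated the model behind it is.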
The difference from supervised learning is that the data is unlabeled, and the goal is to find unknown patterns within your dataset.
An example of this would be fraud detection (or anomaly detection, as the broader umbrella term).
It’s effective at detecting unseen or rare events: an unsupervised model built from non-anomalous training examples (data rows) can answer the question, how different is event x from a “typical” example in the dataset?
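One minimal way to ask “how different is event x from a typical example” is to score new events by their distance from the mean of the (assumed non-anomalous) training data. The transaction amounts and threshold here are invented for illustration:

```python
from statistics import mean, stdev

# Transaction amounts assumed to represent normal behaviour (no labels).
normal_amounts = [20.0, 25.0, 22.0, 30.0, 18.0, 27.0]

mu = mean(normal_amounts)
sigma = stdev(normal_amounts)

def is_anomalous(amount, z_threshold=3.0):
    """Flag events more than z_threshold standard deviations from the mean."""
    return abs(amount - mu) / sigma > z_threshold

print(is_anomalous(24.0))   # False - close to typical transactions
print(is_anomalous(500.0))  # True  - unlike anything seen in training
```

Notice that no labels were needed: the model only learned what “typical” looks like, and anything far from typical gets flagged.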
It’s worth showing the differences between these 2 techniques before explaining semi-supervised.
So for semi-supervised learning the dataset contains both labeled and unlabeled examples. Usually the unlabeled portion will be a much higher percentage than the labeled. The outcome is the same as supervised learning: to use the labeled data to produce a model.
Then what’s the point of semi-supervised learning and having more unlabeled data?
You might think that extra unlabeled data will harm your model training, but look at it this way: you’re actually adding more information about your problem.
Given that the labeled data came from a similar sample set as the unlabeled data, you have in effect given the model a better picture of the probability distribution underlying your entire dataset.
If the data you are working with is challenging to label, consider semi-supervised learning to help ease the process.
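One common semi-supervised recipe is self-training: fit on the small labeled set, use that model to pseudo-label the unlabeled rows, then refit on everything. The sketch below uses an invented 1-D feature and a nearest-mean “model” purely for illustration:

```python
def fit(rows):
    """'Train' a nearest-mean model: average the feature value per label."""
    values = {}
    for x, label in rows:
        values.setdefault(label, []).append(x)
    return {label: sum(v) / len(v) for label, v in values.items()}

def predict(model, x):
    """Return the label whose mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

labeled = [(1.0, "short"), (9.0, "long")]   # small labeled set
unlabeled = [1.5, 2.0, 8.0, 8.5]            # larger unlabeled set

model = fit(labeled)
# Pseudo-label the unlabeled rows with the current model...
pseudo = [(x, predict(model, x)) for x in unlabeled]
# ...then retrain on the labeled and pseudo-labeled rows combined.
model = fit(labeled + pseudo)

print(predict(model, 3.0))  # short
print(predict(model, 7.0))  # long
```

The retrained means sit closer to where the data actually clusters, which is the sense in which the unlabeled rows added information about the problem.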
Reinforcement learning (RL) is the training of machine learning models to make a sequence of decisions.
It is a subfield of ML where the machine, or “agent” in this learning context, learns in an environment whose state it can ingest as a vector of features.
A policy is a function (similar to a model in supervised learning) that takes the feature vector of the state supplied by the environment and produces an action, aiming for the maximum reward (usually a float value). As the process is sequential, so too is the decision making. The long-term goal of the RL lifecycle is to continually optimise the actions taken to achieve better rewards.
The agent learns to achieve a goal in an uncertain, potentially complex environment. In RL, an artificial intelligence faces a game-like situation.
The most popular gamified example of RL is AWS DeepRacer.
For DeepRacer, the goal is to:
- Navigate the virtual track (environment)
- Gather info like car position and distance from the centerline (state)
- Try a random speed and direction (action)
- Log the result from the policy (reward)
- Repeat the process until the termination count is reached (episodes)
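The loop above can be mimicked with a toy environment: a 1-D “track” where the agent learns, per position (state), which steering action keeps it near the centerline. Everything here, the track, the rewards, the epsilon-greedy exploration, is invented for illustration and is far simpler than real DeepRacer training:

```python
import random

random.seed(0)
ACTIONS = [-1, 0, +1]   # steer left, go straight, steer right
q = {}                  # (state, action) -> estimated reward (the "policy" table)

def reward(state, action):
    """Environment: higher reward the closer the car ends to centerline 0."""
    return -abs(state + action)

for episode in range(500):                        # repeat until termination count
    state = random.choice([-2, -1, 0, 1, 2])      # car position (state)
    if random.random() < 0.1:
        action = random.choice(ACTIONS)           # explore: random action
    else:
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))  # exploit
    r = reward(state, action)                     # log the result (reward)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + 0.5 * (r - old)    # nudge the estimate toward r

# After training, the greedy policy steers back toward the centre from any state.
policy = {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0))
          for s in [-2, -1, 0, 1, 2]}
print(policy)
```

The learned table plays the role of the policy: state in, action out, refined episode after episode by the rewards the environment hands back.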