DEV Community

Youdiowei Eteimorde


Introduction to Machine Learning

The field of Machine Learning has made many breakthroughs in the last couple of years, and it seems this is only the beginning. The web has also evolved in many ways. It is no longer used just to host websites; it can also host web applications that rival native apps. The beauty of the web is that you can share anything with just one link.

In this series of articles, I will introduce Machine Learning to you, the web developer who is curious about the field. Throughout the series, we will learn various concepts and implement them in code. No prior knowledge of Machine Learning, mathematics, or statistics is needed. All you need to follow along is familiarity with JavaScript.

Some intermediate JavaScript knowledge will help, such as ES6 features, promises, async/await, and basic data structures. Even if you are not familiar with most of these, you can learn them as you go.

What is Machine Learning

Machine Learning is the practice of teaching a machine to perform a task without explicitly programming the rules for that task. Normally, as a programmer, you give instructions to your computer in the form of code. In ML, you give it data instead, and it learns the rules from that data.

Data and Datasets

Data is the most important part of the Machine Learning process. In Machine Learning, a collection of data is organized into a dataset. Depending on the task, a dataset can be structured, as in tables, or unstructured, as in images. In a labeled dataset like the one below, each data point contains the actual data and its corresponding label.

Sentence         | Label
-----------------|------
I hate you       | 0
I love you       | 1
Go f*ck yourself | 0
Burn in Hades    | 0
Have a nice day  | 1
This is awesome  | 1

The table above is a dataset for the task of sentiment analysis. Sentiment analysis involves predicting whether the sentiment of a sentence is positive or negative. Each row holds the data (the sentence) and a corresponding label that tells you whether the sentence is positive (1) or negative (0).
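In JavaScript, one way to hold this dataset is an array of objects. This is only a sketch: the field names `sentence` and `label` and the variable names are illustrative choices, not something a library requires.

```javascript
// The sentiment dataset from the table above, as an array of data points.
// The field names `sentence` and `label` are illustrative assumptions.
const dataset = [
  { sentence: "I hate you", label: 0 },
  { sentence: "I love you", label: 1 },
  { sentence: "Go f*ck yourself", label: 0 },
  { sentence: "Burn in Hades", label: 0 },
  { sentence: "Have a nice day", label: 1 },
  { sentence: "This is awesome", label: 1 },
];

// Separate the data (X) from the labels (y).
const X = dataset.map((point) => point.sentence);
const y = dataset.map((point) => point.label);

console.log(X.length, y.length); // 6 6
```

Keeping the data and the labels in two parallel arrays, X and y, is a common convention in ML code: the label at position i in y belongs to the data at position i in X.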

Datasets are usually divided into two:

  • Training sets
  • Testing sets

The training set is used to train the ML model while the testing set is used to check the performance of the model after training.

Let's take our sentiment dataset and split it into training and testing sets.

Training set

Sentence         | Label
-----------------|------
I hate you       | 0
I love you       | 1
Go f*ck yourself | 0
This is awesome  | 1

Testing set

Sentence        | Label
----------------|------
Burn in Hades   | 0
Have a nice day | 1
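The same split can be sketched in code. `splitByIndex` is a hypothetical helper written for this example, not a library function, and the held-out indices are picked by hand so the result matches the two tables above.

```javascript
// The full sentiment dataset (field names are illustrative assumptions).
const dataset = [
  { sentence: "I hate you", label: 0 },
  { sentence: "I love you", label: 1 },
  { sentence: "Go f*ck yourself", label: 0 },
  { sentence: "Burn in Hades", label: 0 },
  { sentence: "Have a nice day", label: 1 },
  { sentence: "This is awesome", label: 1 },
];

// Hypothetical helper: rows whose index is in `testIndices` go to the
// testing set, every other row goes to the training set.
function splitByIndex(data, testIndices) {
  const heldOut = new Set(testIndices);
  const train = data.filter((_, i) => !heldOut.has(i));
  const test = data.filter((_, i) => heldOut.has(i));
  return { train, test };
}

// Hold out rows 3 and 4 ("Burn in Hades" and "Have a nice day").
const { train, test } = splitByIndex(dataset, [3, 4]);

console.log(train.length, test.length); // 4 2
```

In practice the split is usually made at random rather than by hand, so that the testing set is a fair sample of the whole dataset.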

Real-world datasets are usually much larger and can contain millions of data points. In general, the more good-quality data that is available, the better the model tends to perform.

What is a Model

A model takes in data and produces a desired result. To get there, the model goes through a process called training. Training is the act of showing a model data and its corresponding labels. It is usually performed on the training set. After it has been supplied with data and labels, the model is able to pick up patterns in the data.

A model could be a mathematical equation or an advanced algorithm that is capable of learning. For the sake of simplicity, let's create a dummy model using JavaScript classes.

class Model{

  constructor(){
    // Initialize model
  }

  train(X,y){
    // train model using the training set
  }

  test(Xtest, ytest){
    // test model using the testing set
  }

  predict(x){
     // predict the result of x 
  }
}

const model = new Model()

The Model above is a skeleton with no implementation yet, but it helps show how actual ML models work. The train method takes in the training set as X and y, where X represents the data and y represents its labels, and handles the training of the model. The test method takes in a testing set and uses it to evaluate the performance of the model. The testing phase usually returns an accuracy score.

model.test(Xtest, ytest)
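An accuracy score is the fraction of predictions that match the labels. The sketch below assumes the predictions have already been produced; the `accuracy` function and the example predictions are hypothetical and only illustrate the arithmetic.

```javascript
// Accuracy = number of correct predictions / total number of predictions.
function accuracy(predictions, labels) {
  if (predictions.length !== labels.length || labels.length === 0) {
    throw new Error("predictions and labels must be non-empty and the same length");
  }
  const correct = predictions.filter((p, i) => p === labels[i]).length;
  return correct / labels.length;
}

// Hypothetical predictions for the two testing sentences,
// compared with their true labels [0, 1].
console.log(accuracy([0, 0], [0, 1])); // 0.5
```

A score of 0.5 here means one of the two testing sentences was classified correctly.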

When it is time to use your model, you can call the predict method. Let us say we trained our model on the sentiment dataset above and want to see whether it performs well on data it has never seen.

let sent = "I love this world"
model.predict(sent) 

If our model was trained properly, it should predict 1, meaning the sentence is positive.
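To make train and predict concrete, here is a toy word-voting classifier. It is an assumption made for illustration and not a real ML algorithm: every word in a positive training sentence gets a +1 vote, every word in a negative one gets a -1 vote, and a new sentence is positive if its words add up to more than zero. The class name `WordVoteModel` and the `tokenize` helper are hypothetical.

```javascript
// Toy word-voting classifier: just enough logic to make train() and
// predict() concrete. It is not a real ML algorithm.
class WordVoteModel {
  constructor() {
    // word -> (times seen in positive sentences - times seen in negative ones)
    this.wordScores = new Map();
  }

  // Hypothetical helper: lowercase the sentence and split it on whitespace.
  tokenize(sentence) {
    return sentence.toLowerCase().split(/\s+/).filter(Boolean);
  }

  train(X, y) {
    X.forEach((sentence, i) => {
      const vote = y[i] === 1 ? 1 : -1;
      for (const word of this.tokenize(sentence)) {
        this.wordScores.set(word, (this.wordScores.get(word) ?? 0) + vote);
      }
    });
  }

  predict(x) {
    let score = 0;
    for (const word of this.tokenize(x)) {
      score += this.wordScores.get(word) ?? 0; // unseen words count as 0
    }
    return score > 0 ? 1 : 0;
  }
}

const model = new WordVoteModel();
model.train(
  ["I hate you", "I love you", "Go f*ck yourself", "This is awesome"],
  [0, 1, 0, 1]
);

console.log(model.predict("I love this world")); // 1
```

The sentence is classified as positive because "love" and "this" only appeared in positive training sentences. The same toy model predicts 0 for "Have a nice day", because none of those words were in the training set, which hints at why more training data helps.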

Machine Learning subcategories

Machine Learning is broadly divided into two:

  • Supervised ML
  • Unsupervised ML

1. Supervised Machine Learning: This branch of ML deals with models that require labels to perform their task. The model we built above is a supervised model because it needs the labels y along with its data X. It has two sub-categories:

  • Regression: This involves predicting a continuous value, such as the price of a product given a set of inputs.
  • Classification: This involves predicting a discrete value, such as whether a sentence is positive or negative.
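The difference between the two shows up in the type of value that comes out. In this sketch both functions are hypothetical and their numbers are made up rather than learned from data; the only point is that regression returns a continuous number while classification returns one of a fixed set of labels.

```javascript
// Regression: the output is a continuous number.
// The base price and price per square meter are made-up values.
function predictPrice(sizeInSquareMeters) {
  const basePrice = 50000; // hypothetical
  const pricePerSquareMeter = 1200; // hypothetical
  return basePrice + pricePerSquareMeter * sizeInSquareMeters;
}

// Classification: the output is one of a fixed set of discrete labels,
// here 1 (positive) or 0 (negative).
function classifySentiment(score) {
  return score > 0 ? 1 : 0;
}

console.log(predictPrice(80)); // 146000
console.log(classifySentiment(2.5)); // 1
console.log(classifySentiment(-0.7)); // 0
```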

Here are a few supervised learning models:

2. Unsupervised Machine Learning: This branch of ML deals with models that don't require labels to perform their task. If our model were unsupervised, here's how it would work.

const model = new Model()
model.train(X)

This train method doesn't require the y labels for training. All it needs is data.
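As an illustration of training without labels, here is a minimal sketch of clustering, one common kind of unsupervised learning. It assumes one-dimensional numeric data and exactly two groups (a tiny version of k-means); the class name `TwoMeansModel` is hypothetical and the technique is chosen here only as an example.

```javascript
// Minimal 1-D k-means sketch with k = 2 (assumed example technique).
class TwoMeansModel {
  constructor() {
    this.centers = [];
  }

  // Index (0 or 1) of the center closest to x.
  nearestCenter(x) {
    const [a, b] = this.centers;
    return Math.abs(x - a) <= Math.abs(x - b) ? 0 : 1;
  }

  // Note: train() only receives X. There are no labels.
  train(X, iterations = 10) {
    // Start the two centers at the smallest and largest values.
    this.centers = [Math.min(...X), Math.max(...X)];

    for (let step = 0; step < iterations; step++) {
      // Assign every point to its nearest center.
      const groups = [[], []];
      for (const x of X) {
        groups[this.nearestCenter(x)].push(x);
      }
      // Move each center to the mean of its group (keep it if the group is empty).
      this.centers = this.centers.map((center, i) =>
        groups[i].length > 0
          ? groups[i].reduce((sum, v) => sum + v, 0) / groups[i].length
          : center
      );
    }
  }

  predict(x) {
    return this.nearestCenter(x);
  }
}

const model = new TwoMeansModel();
model.train([1, 2, 3, 10, 11, 12]);

console.log(model.centers); // [ 2, 11 ]
console.log(model.predict(9)); // 1
```

The model discovers on its own that the numbers fall into a low group and a high group. The group numbers 0 and 1 are just names for the clusters it found, not labels that were supplied during training.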

Here are a few Unsupervised models:

This article has served as an intro to the field of ML. It is not an in-depth guide; it is more of a build-up to what is to come in the series. For a more in-depth guide, check out the following:

In the next article, we will be looking at the fundamental data structure of ML, the tensor.

Top comments (3)

Melanie Eureka Ngome

It looks overwhelming at a glance 😅 but you wrote it so well 💜

Youdiowei Eteimorde

😃 Thank you so much more articles will be coming soon in the series

Diseyi

This is a good intro. You made it look easy.