Machine Learning for Beginners

Justin Goldstein — Mon, 08 Mar 2021 03:22:04 +0000

Machine Learning covers computer programs that use a dataset to improve a prediction function (called a model), cluster data points, or detect anomalies. For each of these use cases, you choose which algorithm to implement. If you have chosen a prediction model, you must decide whether it will have many inputs or few, whether its output is a discrete value such as 0 or 1 or any value between 0 and 1, and how many outputs it has. If you are clustering data points or detecting anomalies, there are several other ways to adjust your model. All of this is to say that although your machine is "learning," you have to do a lot of the work yourself. This article walks through the basic types of Machine Learning and the common algorithms associated with them, and concludes by describing the tools developers and data scientists use to implement them.

Machine Learning can be broken down into supervised learning and unsupervised learning. There is also reinforcement learning, which I will not touch on here because it is likely too complex to use when you are getting started. Supervised learning enables you to create a model from labeled data, which means that the inputs you use to train your model have been assigned the correct outputs beforehand, usually by humans. Examples of supervised learning include binary or multi-class classification and regression. Neural networks are a family of models, most often trained with supervised learning, that can perform binary or multi-class classification. They are essentially a very complex model that can be represented as a network of connected nodes.

Unsupervised learning enables you to create a model from unlabeled data. Examples include K-means clustering and principal component analysis. K-means clustering can be used for tasks such as customer segmentation, and principal component analysis can reduce the number of features so that a model fits faster while largely maintaining its performance. Unsupervised learning typically finds relationships between data points and groups them. In my experience, supervised learning is more common because it is more versatile and can often do the same things as unsupervised learning. However, unsupervised learning does not require labeled data, which is more difficult to acquire than unlabeled data.
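To make the customer segmentation idea concrete, here is a minimal sketch of K-means clustering in plain Python. The customer data, the function name k_means, and the choice of two clusters are all assumptions made for illustration; in practice you would more likely reach for a library implementation.

```python
import random

def k_means(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups with plain K-means."""
    rng = random.Random(seed)
    # Start from k randomly chosen points as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = (
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                )
    return centroids, clusters

# Hypothetical customers: (years as a customer, average order value).
# Note that no labels are provided anywhere.
customers = [(1, 20), (2, 25), (1, 22), (8, 200), (9, 210), (10, 190)]
centroids, clusters = k_means(customers, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The algorithm is never told which customers belong together. It only alternates between assigning points to the nearest centroid and recomputing the centroids, which is why no labeled data is needed.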

There are various tools that developers and data scientists use to implement Machine Learning algorithms. Often, engineers will use Python with libraries such as scikit-learn, TensorFlow, and Keras to build their models, and rely on pandas and NumPy to work with their data. There is also a suite of automated Machine Learning tools that further abstract away the nitty-gritty. The most popular of these are Google Cloud AutoML, Amazon SageMaker, and Microsoft Azure AutoML. Almost every tool you will find requires you to have some knowledge of the intuition behind Machine Learning. Tools like SageMaker can be really difficult to use because they require you to understand how to improve your model on your own, which takes time. If you are looking for an end-to-end platform that takes away this hassle, check out Telepath AI’s AutoML tools. In the field of data science, tools such as JupyterLab, Jupyter Notebook, and Anaconda are widely used, and they are often prerequisites for tutorials online. If your aim is to build models yourself, you might also consider using a high-level language such as Octave/MATLAB because this allows you to get started implementing algorithms much more quickly.

If your goal is to learn about Machine Learning in considerable depth, I strongly recommend taking an online class on Coursera, because Machine Learning can seem like a daunting field to explore and much of the literature online is uninformed. If you are at the beginning stage of your Machine Learning journey, I hope this article acquainted you with the areas of interest within this field and exposed you to some of the tools that you will be using along the way.

An Intuitive Introduction to Machine Learning

Justin Goldstein — Fri, 05 Mar 2021 22:51:58 +0000

“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” -- Tom Mitchell

This description of Machine Learning by Tom Mitchell in his introductory ML book is often cited. What exactly does it mean? In this article, I hope to clear up this definition and explain some of the jargon used in Machine Learning. I will mainly focus on supervised learning, which includes regression, logistic regression, and neural networks. Supervised learning enables you to create a model from labeled data, which means that the inputs you use to train your model have been assigned the correct outputs by humans beforehand.

Concretely, Machine Learning is nothing more than creating a function (like f(x)) that gives you a desired output or outputs. The only catch is that you only determine the structure of the function and a computer program controls the parameters or weights of the function. In more complex uses of Machine Learning, you might have a series of programs that break your goal down into different parts, which allows you to perform more difficult tasks like text detection.

Here are some key terms associated with Machine Learning (read in order):

Parameters: Imagine you are trying to create a line of best fit. You need to find its slope and intercept. These are called the parameters of your line because they determine its shape, and therefore what value the line predicts for a given input. This can be written as y = parameter1 * x + parameter0. You might see people using Θ or 𝛽 to denote parameters:

$$h(x) = \Theta_0 x_0 + \Theta_1 x_1$$

where x0 = 1.
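As a rough sketch of the line described in this entry, the prediction can be written as a small function of the parameters. The names predict and theta, and the example values, are made up for illustration.

```python
def predict(theta, x):
    """Return the line's prediction: theta[0] * x0 + theta[1] * x, with x0 = 1."""
    features = [1.0, x]  # x0 = 1 pairs with the intercept theta[0]
    return sum(t * f for t, f in zip(theta, features))

theta = [2.0, 0.5]  # hypothetical intercept (parameter0) and slope (parameter1)
print(predict(theta, 4.0))  # intercept + slope * 4.0
```

Fixing x0 at 1 is what lets the intercept be treated like any other parameter: every parameter is simply multiplied by its matching input.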

Features: In machine learning, "features" are different characteristics of your data. If each row in your dataset represented a different customer, you might have columns that tell you more information about that customer, like how long they have been your customer, their last purchase, how old they are, etc. Each of those columns is considered a "feature."

Model: The prediction function you use to turn your inputs into an output or outputs.

Example: One record, or row, in your dataset, described by its features.

Dataset: A table with each example as a row and each feature as a column. Often you will see your dataset represented as an m×n matrix, where m is the number of examples and n is the number of features.
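To tie the terms so far together, here is a small, made-up customer dataset stored as a list of rows. The feature names and values are hypothetical and only meant to show where m, n, examples, and features appear.

```python
# Hypothetical customer dataset: each row is one example, each column one feature.
feature_names = ["years_as_customer", "last_purchase_usd", "age"]
dataset = [
    [1.0, 20.0, 34.0],
    [8.0, 200.0, 51.0],
    [3.0, 75.0, 27.0],
    [5.0, 120.0, 42.0],
]

m = len(dataset)     # number of examples (rows)
n = len(dataset[0])  # number of features (columns)

first_example = dataset[0]  # one record: all features of a single customer
ages = [row[feature_names.index("age")] for row in dataset]  # one feature across all examples

print(m, n)
print(first_example)
print(ages)
```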

Cost Function: The function you use to evaluate your model. Imagine you have a prediction model that classifies images as either images of eyes or images not of eyes. For one “example,” your input is an image. Your output is 1 if it is an eye and 0 if it is not. To represent your image (which is black and white), you translate it into a grid of pixels, each represented by a number from 0 to 1. Your prediction function takes in all of these pixels and outputs a number from 0 to 1 representing how much the image resembles an eye. Informally, you can use your cost function to grade your prediction function. Your cost function gives you a high penalty (or "cost") if your prediction isn't very accurate. In scoring your model’s performance on an entire m×n dataset, you will often see the average cost divided by 2. This is represented as 1/(2m) times the sum of the cost for each example, where m is the number of examples in your dataset.
Example cost function for regression (the commonly used squared-error cost):

$$J(\Theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x_i) - y_i \right)^2$$

Here, J(Θ) is your cost, h(xi) is your prediction for example xi, and yi is the true label of xi. Your cost function is always a function of your parameters. When you adjust the parameters of your model, this can increase or decrease your model’s cost.
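Here is one way the 1/(2m) average described above could look in code. It assumes the squared-error cost that is commonly used for regression and a straight-line model; the function names and the tiny dataset are invented for the example.

```python
def predict(theta, x):
    """Straight-line model: intercept theta[0] plus slope theta[1] times x."""
    return theta[0] + theta[1] * x

def cost(theta, xs, ys):
    """Squared-error cost: 1/(2m) times the sum of (prediction - label)^2."""
    m = len(xs)
    return sum((predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # hypothetical labels that follow y = 2x exactly

perfect = cost([0.0, 2.0], xs, ys)  # parameters that match the data
worse = cost([0.0, 1.0], xs, ys)    # slope too small, so predictions miss

print(perfect, worse)
```

The data stays the same in both calls. Only the parameters change, and the cost changes with them, which is what it means for the cost to be a function of your parameters.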

Gradient Descent: The basic procedure you use to improve your model in a supervised learning algorithm. Gradient descent uses your cost function to adjust your parameters in the direction that lowers the cost. In 2-D, if you graphed your cost on the y axis and a parameter on the x axis, you would see a bowl-like graph (this is the case for convex cost functions such as the regression cost). The goal of gradient descent is to minimize the error of your model by going through many iterations to get to the bottom of the bowl. In each iteration, you set your parameters equal to your current parameters minus 𝛼 times the derivative of your cost function with respect to your parameters, where 𝛼 is some positive number called the learning rate. If you are on the left side of the bowl and your parameters are too low, the derivative of your cost function is negative. The update is then parameters = parameters - some negative number, which is the same as parameters = parameters + some positive number, so your parameters increase. This is equivalent to taking a step to the right in your cost function bowl. If your parameters are too high, the derivative is positive, so the update is parameters = parameters - some positive number, and you step to the left. Either way you move closer to the cost function’s minimum, and you repeat this until you reach the bottom of the bowl. With a bowl-shaped cost function and a sufficiently small learning rate, this procedure converges to the best possible parameters.
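The update rule described in this entry can be sketched for a line of best fit as follows. This is a minimal illustration, assuming a squared-error cost averaged with 1/(2m); the learning rate, iteration count, and data are arbitrary choices for the example, not recommended settings.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=2000):
    """Fit y = theta0 + theta1 * x by repeatedly stepping down the cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        # Prediction error for every example under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Derivatives of the 1/(2m) squared-error cost with respect to each parameter.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # parameters = parameters - alpha * derivative
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # hypothetical labels that follow y = 2x + 1

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

Both parameters start at 0, which is too low for this data, so the derivatives start out negative and the first updates increase the parameters, just as in the left-side-of-the-bowl case. If the learning rate is set too large, the steps can overshoot the bottom of the bowl instead of settling into it.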

If you understand gradient descent and the other terms, you understand the most fundamental concepts of supervised Machine Learning. In essence, supervised Machine Learning is the process of stepping down the cost function to get a model with the smallest cost. If you want to get started quickly with implementing Machine Learning for your business, check out Telepath AI's solutions.

DEV Community: Justin Goldstein
