DEV Community

James Stewart

Keys to Designing Machine Learning Models

Machine Learning, a branch of Artificial Intelligence in which systems learn patterns from data, has become the most popular AI approach today. Neural networks, its best-known family of models, are loosely inspired by the human nervous system. Its roots go back to the first artificial neuron models of the 1940s; since then it has gone through countless developments and debates, and it now shapes fields such as computer vision and natural language processing (NLP). Today it is hard to call yourself an AI engineer without a deep knowledge of machine learning.

  • Why do we use Machine Learning? Machine learning is growing in importance because of the ever larger volume and variety of data, cheaper and more accessible computing power, and the availability of high-speed Internet. Because a model learns its parameters from data rather than from hand-written rules, it adapts well to these large datasets, although the result often behaves like a black box that is hard to interpret. New Machine Learning models are introduced all the time, which gives businesses and developers a wide choice when building AI systems that suit their needs.

  • What are the steps to design a Machine Learning model? Machine Learning is simple to start with but difficult to master. It looks easy and non-theoretical at the beginning, but mastering it requires deeper understanding and a mathematical background. Many entry-level developers design their own Machine Learning model by combining all sorts of layers and then say "I'm a master in Machine Learning". Do all of those models fit the datasets that arise in real life? Absolutely not. Datasets differ in their key features and in the domain and technical context they come from, so Machine Learning developers need to be comfortable with mathematics. Machine Learning is built on mathematical theory, such as matrix algebra, and a strong mathematical background is what separates a master from a beginner.

Then how do we develop an ML model?

First, we need to understand the basic layers.

There are several types of layers in a neural network, and each plays a different role in the model. For example, Convolutional layers extract local features from the input, Pooling layers downsample those features to a smaller size, and Dense (fully connected) layers combine the features into the final outputs. Each layer type gives the model a different function, so we need to find a good combination of these layers to be a real master.
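To make the three roles concrete, here is a minimal sketch in plain Python that works on a 1-D signal. It is not how a real framework implements layers; the function names (`conv1d`, `max_pool1d`, `dense`), the kernel, and the weights are all assumptions chosen for illustration.

```python
# Toy 1-D versions of the three layer types, in plain Python.
# All names and numbers are illustrative, not taken from a real library.

def conv1d(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def max_pool1d(features, size=2):
    """Keep the largest value in each non-overlapping window."""
    return [
        max(features[i:i + size])
        for i in range(0, len(features) - size + 1, size)
    ]

def dense(inputs, weights, bias):
    """Fully connected layer: one weighted sum per output unit."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, bias)
    ]

signal = [0, 0, 1, 1, 1, 0, 0]
edge_kernel = [-1, 1]  # responds where the signal rises or falls

features = conv1d(signal, edge_kernel)  # feature extraction
pooled = max_pool1d(features, size=2)   # downsampling
outputs = dense(pooled, weights=[[1.0, 1.0, 1.0]], bias=[0.0])

print(features)  # [0, 1, 0, 0, -1, 0]
print(pooled)    # [1, 0, 0]
print(outputs)   # [1.0]
```

The convolution turns the raw signal into "edge" features, pooling halves their length while keeping the strongest response, and the dense layer mixes what is left into a single output.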

Second, we need to know what activation functions are used for.

A Machine Learning model can involve millions of neurons, but how do we define what each neuron does? The human brain has tens of billions of neurons, each responding differently to its inputs, and we need something similar in a Machine Learning model. This is done by the activation function. The activation function applies a non-linear transformation to the output of each layer; without it, a stack of layers would collapse into a single linear mapping. Some activation functions also squash the output into a limited range. Three common activation functions are ReLU, Sigmoid and Softmax, and each plays a different role: ReLU is the usual choice for hidden layers, Sigmoid maps a value into the range (0, 1) and is typically used for binary classification outputs, and Softmax turns a vector of scores into probabilities for multi-class classification. Regression outputs usually use no activation (a linear output) instead.
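The three functions are short enough to write out by hand. The following is a minimal sketch using only the standard library; deep learning frameworks ship their own vectorized versions, so treat these definitions as an illustration rather than a reference implementation.

```python
import math

def relu(x):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, x)

def sigmoid(x):
    """Map any real number into (0, 1), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max keeps exp() stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
print(softmax([1.0, 2.0, 3.0]))
```

Note that ReLU has no upper bound, while Sigmoid and Softmax both produce values between 0 and 1, which is why the latter two are natural choices for output layers that represent probabilities.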

Third, we need to be familiar with loss functions and the relation between loss and accuracy.

Evaluating the learning process involves two metrics - accuracy and loss. Loss measures the gap between the real (target) data and the model's output, so it shows how closely the trained model matches real-world data. The higher the loss, the less accurate the model and its weights are. Training directly minimizes the loss, while accuracy is the number we usually care about and report. The two normally move in opposite directions, but not always: if the training loss keeps falling and the training accuracy is high while the loss on validation data starts to rise, the model is overfitting. We have to understand this relation and find the right way to reduce the loss while increasing the accuracy on data the model has not seen.
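A small example can show why loss and accuracy are related but not interchangeable. The sketch below assumes a classification task with cross-entropy as the loss; the two sets of predictions (`confident` and `hesitant`) are made-up numbers, not results from a real model.

```python
import math

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the correct class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def accuracy(probs, labels):
    """Fraction of samples whose most probable class is the true one."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)

labels = [0, 1, 0]

# Hypothetical predicted probabilities from two models.
confident = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]
hesitant = [[0.6, 0.4], [0.4, 0.6], [0.55, 0.45]]

for name, probs in [("confident", confident), ("hesitant", hesitant)]:
    print(name, accuracy(probs, labels), round(cross_entropy(probs, labels), 3))
```

Both models classify every sample correctly, so their accuracy is identical, yet the hesitant model has a clearly higher loss. This is why it helps to track both numbers, on training and on validation data, rather than accuracy alone.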

Fourth, we need to know how to fine-tune pre-trained models.

The world is full of different datasets. Each dataset can contain millions of input/output pairs, and designing a new model and training it from scratch takes a lot of time. The learning process often involves billions of matrix calculations, which costs both time and money. How can we save on this? The answer is fine-tuning. Fine-tuning lets developers build a new model on top of a well-known pre-trained one: they replace the last few layers with their own layers and train those on the new dataset, often keeping the earlier layers frozen. This often increases the accuracy of the ML model, because the pre-trained model has already learned many general features that appear in real-world data.
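The idea can be sketched without any framework. In the toy example below, `PRETRAINED_WEIGHTS` stands in for the layers of a pre-trained model and is never updated, while `train_head` trains only a new output layer on top of it. The weights, the dataset, and the hyperparameters are all invented for illustration; a real project would load an actual pre-trained model through its framework of choice.

```python
import math

# Stand-in for a pre-trained feature extractor. Its weights are FROZEN:
# nothing below ever updates them. The numbers are made up.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]

def frozen_features(x):
    """Apply the frozen layer followed by ReLU."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)))
        for row in PRETRAINED_WEIGHTS
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    """Frozen features, then the new trainable output layer."""
    f = frozen_features(x)
    return sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias)

def train_head(data, epochs=200, lr=0.5):
    """Train only the new output layer with gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = frozen_features(x)
            error = predict(x, weights, bias) - y  # log-loss gradient w.r.t. the logit
            weights = [w - lr * error * fi for w, fi in zip(weights, f)]
            bias -= lr * error
    return weights, bias

# Hypothetical new task: label 1 when the first input is the larger one.
data = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
weights, bias = train_head(data)

for x, y in data:
    print(x, y, round(predict(x, weights, bias), 3))
```

Only three numbers (two weights and a bias) are learned here, which mirrors why fine-tuning is cheap: most of the model is reused as it is and only the small new part needs training.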

The Machine Learning world is always changing, and it is a challenge to adapt these technologies to real life. Developers often face difficulties when designing an ML system for a business, so we need to master these key points to be real Machine Learning developers.

Please write a comment or contact me if you have any questions or opinions on my article.

Thank you
