In this post, I would like to give an overview of the different baseline models and benchmark approaches that can be used as reference points when evaluating the performance of machine learning models. I divide baseline models into three buckets by the technique used: naive, heuristic, and machine/statistical learning models.
Naive baselines.
After processing and cleaning the data and splitting it into train and test sets, we can start with the simplest baseline models, e.g., Random Prediction, Zero Rule, and One Rule.
Random Prediction.
For simplicity, consider a binary classification problem (e.g., target = 0 or 1). In Random Prediction, we don't use any feature from the dataset; we just randomly generate a prediction (0 or 1) for the target, so we expect a more or less uniform distribution of the predicted classes.
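As a rough illustration, Random Prediction can be sketched in a few lines of plain Python. The function names, the toy labels, and the fixed seed are assumptions made for this example, not part of any particular library.

```python
import random

def random_baseline(n_samples, classes=(0, 1), seed=0):
    """Predict a class uniformly at random for each sample, ignoring all features."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return [rng.choice(classes) for _ in range(n_samples)]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy test labels (made up for illustration)
y_test = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = random_baseline(len(y_test))
print("Random baseline accuracy:", accuracy(y_test, y_pred))
```

On a large binary test set, the accuracy of this baseline should hover around 0.5, whatever the class balance is.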
ZeroR.
When dealing with an imbalanced classification problem (e.g., many more 0s than 1s in binary classification), Zero Rule (ZeroR) can be used. This method also relies only on the target and ignores the features: it always predicts the majority class.
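A minimal sketch of ZeroR, assuming the majority class is learned from the training labels only. The helper names and the toy labels are hypothetical.

```python
from collections import Counter

def zero_rule_fit(y_train):
    """Return the most frequent class in the training labels."""
    return Counter(y_train).most_common(1)[0][0]

def zero_rule_predict(majority_class, n_samples):
    """Predict the majority class for every sample, ignoring all features."""
    return [majority_class] * n_samples

# Toy imbalanced labels (made up for illustration)
y_train = [0, 0, 0, 1, 0, 0, 1, 0]
y_test = [0, 1, 0, 0]

majority = zero_rule_fit(y_train)
y_pred = zero_rule_predict(majority, len(y_test))
accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
print(majority, accuracy)  # 0 0.75
```

The accuracy of ZeroR equals the share of the majority class in the test set, which is why accuracy alone can be misleading on imbalanced data.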
OneR.
One Rule (OneR) uses one feature at a time and relates it to the target. For each feature, we build a frequency table (feature value vs. target) and predict the most frequent target class for each feature value. We do that for all features and choose the one whose rule has the smallest error.
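The frequency-table idea can be sketched as follows for categorical features. This is a simplified version under my own assumptions: rows are dictionaries, errors are counted on the training data, and the feature names and values are invented for the example.

```python
from collections import Counter, defaultdict

def one_rule_fit(rows, y):
    """Return (feature, rule, errors) for the single feature with the fewest training errors."""
    best = None
    for feature in rows[0]:
        # Frequency table: feature value -> counts of each target class
        table = defaultdict(Counter)
        for row, label in zip(rows, y):
            table[row[feature]][label] += 1
        # Rule: for each feature value, predict its most frequent class
        rule = {value: counts.most_common(1)[0][0] for value, counts in table.items()}
        errors = sum(rule[row[feature]] != label for row, label in zip(rows, y))
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best

def one_rule_predict(feature, rule, rows, default):
    """Apply the rule; fall back to `default` for feature values not seen in training."""
    return [rule.get(row[feature], default) for row in rows]

# Toy data (made up for illustration)
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
y = [1, 1, 0, 0, 1, 1]

feature, rule, errors = one_rule_fit(rows, y)
print(feature, rule, errors)  # outlook {'sunny': 1, 'rainy': 0} 1
```

Numeric features would need to be binned first, which this sketch does not cover.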
Heuristics.
"A heuristic technique is any approach to problem-solving that uses a practical method or various shortcuts in order to produce solutions that may not be optimal but are sufficient given a limited timeframe or deadline" (source). A heuristic is a method that is often based on industry experience and subject matter expertise. For example, in a time-series sales problem, the latest sales data is usually more important than sales data from a few years ago; using this rule, specific features can be engineered and used for the prediction.
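One possible way to encode the "recent sales matter more" rule is a recency-weighted average, sketched below. The exponential weighting scheme, the `decay` parameter, and the sales figures are my own assumptions for illustration; a real heuristic would come from domain knowledge.

```python
def recency_weighted_forecast(sales, decay=0.5):
    """Weighted average of past sales where the latest observation has weight 1,
    the one before it `decay`, then decay**2, and so on."""
    latest_first = sales[::-1]
    weights = [decay ** age for age in range(len(latest_first))]  # age 0 = latest
    return sum(w * s for w, s in zip(weights, latest_first)) / sum(weights)

# Toy monthly sales, oldest first (made up for illustration)
monthly_sales = [100, 100, 100, 200]

plain_mean = sum(monthly_sales) / len(monthly_sales)
forecast = recency_weighted_forecast(monthly_sales)
print(plain_mean, round(forecast, 2))  # 125.0 153.33
```

The weighted forecast sits closer to the latest value than the plain mean does. The same quantity could also be used as an engineered feature for a downstream model.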
Machine and Statistical Learning Baselines.
Here we can include some statistical methods. For example, for time-series regression problems, we can predict the target using a Moving Average or an AutoRegressive Integrated Moving Average (ARIMA) model.
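A minimal sketch of a Moving Average baseline with a walk-forward evaluation, assuming a one-step-ahead forecast and a fixed window. The window size, function names, and series are illustrative choices. ARIMA is left out because it needs a dedicated statistics library.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    return sum(history[-window:]) / window

def walk_forward_mae(series, n_test, window=3):
    """Forecast each of the last `n_test` points from the data before it
    and return the mean absolute error."""
    start = len(series) - n_test
    errors = []
    for t in range(start, len(series)):
        prediction = moving_average_forecast(series[:t], window)
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)

# Toy series with a steady upward trend (made up for illustration)
series = [10, 12, 14, 16, 18, 20]
print(moving_average_forecast(series))      # 18.0
print(walk_forward_mae(series, n_test=3))   # 4.0
```

On a trending series like this one, the moving average lags behind the actual values, which is the kind of weakness a more complex model would be expected to improve on.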
Among non-parametric models, a decision tree or even a random forest (depending on the problem) could serve as a baseline.
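To keep the example self-contained, here is a sketch of the simplest possible decision tree: a depth-1 tree (a decision stump) on numeric features, written in plain Python. This is my own simplification for illustration. In practice, a library implementation such as scikit-learn's `DecisionTreeClassifier` would be the usual choice, and the data below is made up.

```python
from collections import Counter

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Find the (feature index, threshold) split with the fewest training errors.
    Returns None if no split separates the data into two non-empty groups."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= threshold]
            right = [label for row, label in zip(X, y) if row[j] > threshold]
            if not left or not right:
                continue
            left_class, right_class = majority(left), majority(right)
            errors = (sum(label != left_class for label in left)
                      + sum(label != right_class for label in right))
            if best is None or errors < best["errors"]:
                best = {"feature": j, "threshold": threshold,
                        "left": left_class, "right": right_class, "errors": errors}
    return best

def predict_stump(stump, X):
    """Send each row left or right of the threshold and return that side's class."""
    return [stump["left"] if row[stump["feature"]] <= stump["threshold"] else stump["right"]
            for row in X]

# Toy data: two numeric features, binary target (made up for illustration)
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [6.0, 1.0], [7.0, 6.0], [8.0, 2.0]]
y = [0, 0, 0, 1, 1, 1]

stump = fit_stump(X, y)
print(stump["feature"], stump["threshold"], stump["errors"])  # 0 3.0 0
print(predict_stump(stump, [[2.5, 0.0], [6.5, 0.0]]))         # [0, 1]
```

A single split like this is easy to explain and cheap to fit, which matches what we want from a baseline.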
Summary.
There are many specific benchmarks within each bucket that we can use to measure the performance of the main machine learning model. It is good to keep in mind that baseline models (or approaches) should be explainable, computationally inexpensive, and relatively simple. So, if we propose a more complex model to solve the problem, we can compare it to the baseline and see whether the complex model brings more value (e.g., predictive capacity) than the baseline.