Abzal Seitkaziyev

Gradient Boosting Regressor Example

Photo by Michael Dziedzic on Unsplash

In the previous post, I briefly explained Gradient Boosting using a classification problem. Here I will give a step-by-step explanation of how the Gradient Boosting Regressor works, using sklearn and Python, to complement the theory given here. I did this exercise mainly to build an intuition for the processes inside gradient-boosted trees, and by doing so to avoid treating the algorithm as a 'black box'.

I used a dataset with car prices (source) for this purpose. To make it easy to trace the processes inside the gradient-boosted trees, I used a small portion of the data with a minimal number of trees (m=2) and a shallow tree depth (max_depth=2).
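The setup can be sketched as follows. Note that the car-price data below is a hypothetical stand-in (the post's actual dataset is not reproduced here): a single feature, car age in years, against price.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical stand-in for the post's car-price data:
# car age in years vs. price.
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y_train = np.array([21000.0, 18000.0, 15000.0, 13000.0, 9000.0, 7000.0])

# A deliberately tiny ensemble so each tree can be traced by hand:
# two trees (n_estimators=2), each at most two levels deep (max_depth=2).
gbr = GradientBoostingRegressor(n_estimators=2, max_depth=2,
                                learning_rate=0.1)
gbr.fit(X_train, y_train)
print(gbr.predict(X_train))
```

With so few, shallow trees the model underfits badly, but every node and leaf of both trees can be inspected by hand.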

1) First, we initialize the model by computing the initial prediction Pred_0, which is the mean price in the training dataset. Then we calculate the initial residuals: Res_0 = train['price'] - Pred_0. See below.

(Image: training prices with the initial prediction Pred_0 and residuals Res_0)
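In code, the initialization step looks like this (using hypothetical prices standing in for train['price']):

```python
import numpy as np

# Hypothetical prices standing in for train['price'].
prices = np.array([21000.0, 18000.0, 15000.0, 13000.0, 9000.0, 7000.0])

# Pred_0 is simply the mean of the training prices,
# and Res_0 is each price minus that mean.
pred_0 = prices.mean()
res_0 = prices - pred_0
print(pred_0)
print(res_0)
```

By construction, the initial residuals always sum to zero, since they are deviations from the mean.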

2) Here we fit all data points (each row's features and Res_0) to the first tree. This tree is built using MSE as the splitting criterion.

(Image: the first decision tree)

Each value in a leaf is calculated as the mean of the residuals that fall into that leaf. Then the prediction is updated:
Pred_1 = Pred_0 + learning_rate*output_value_1

Then we calculate the new residuals:
Res_1 = train['price']-Pred_1

Node #2, 3, 5, and 6 Predictions and Residuals:

(Image: predictions and residuals for nodes #2, 3, 5, and 6)

3) Here we fit all data points (each row's features and Res_1) to the second tree. This tree is also built using MSE as the criterion.

(Image: the second decision tree)

Node #5 predictions are shown below.
(Image: node #5 predictions)
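Applying the same update once more gives the second-tree step (same hypothetical data):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Same hypothetical car-age/price data as above.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
prices = np.array([21000.0, 18000.0, 15000.0, 13000.0, 9000.0, 7000.0])
learning_rate = 0.1

# First-tree step: initialize with the mean, fit a tree to Res_0.
pred_0 = np.full_like(prices, prices.mean())
tree_1 = DecisionTreeRegressor(max_depth=2).fit(X, prices - pred_0)
pred_1 = pred_0 + learning_rate * tree_1.predict(X)

# Second-tree step: fit a new MSE tree to the updated residuals Res_1.
res_1 = prices - pred_1
tree_2 = DecisionTreeRegressor(max_depth=2).fit(X, res_1)
pred_2 = pred_1 + learning_rate * tree_2.predict(X)
res_2 = prices - pred_2
print(res_2)
```

Each tree is fit to the residuals left over by everything before it, which is the essence of boosting.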

4) We continue this iterative training process. Here I used only two trees for simplicity.
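The whole procedure collapses into a short loop. As a sanity check, for the squared-error loss this manual loop should reproduce sklearn's GradientBoostingRegressor predictions (again on the hypothetical data; friedman_mse is the criterion the ensemble uses for its internal trees):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import GradientBoostingRegressor

# Same hypothetical car-age/price data as above.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([21000.0, 18000.0, 15000.0, 13000.0, 9000.0, 7000.0])
learning_rate, n_trees, max_depth = 0.1, 2, 2

# Manual gradient boosting: start from the mean, then repeatedly fit a
# small tree to the current residuals and take a shrunken step.
pred = np.full_like(y, y.mean())
for m in range(n_trees):
    residuals = y - pred
    # friedman_mse matches the criterion GradientBoostingRegressor
    # uses for its internal trees.
    tree = DecisionTreeRegressor(max_depth=max_depth,
                                 criterion="friedman_mse")
    tree.fit(X, residuals)
    pred = pred + learning_rate * tree.predict(X)

# sklearn follows the same recipe for the squared-error loss, so the
# predictions should agree.
gbr = GradientBoostingRegressor(n_estimators=n_trees, max_depth=max_depth,
                                learning_rate=learning_rate)
gbr.fit(X, y)
print(np.allclose(pred, gbr.predict(X)))
```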

You can refer to the detailed code and step-by-step walkthrough in chapter 5 here.
