Abzal Seitkaziyev
Adaptive Boosting (Part 2)

In the previous post, we outlined how the AdaBoost algorithm works.

Here we explore scikit-learn's implementation of the AdaBoost classifier with decision trees in a bit more detail.

The AdaBoost classifier has two options for the algorithm parameter: 'SAMME' and 'SAMME.R'. By default, it uses 'SAMME.R'.
SAMME stands for Stagewise Additive Modeling with a Multi-class Exponential cost function, and the .R stands for Real.

SAMME uses a different 'decision influence' weight (alpha) for each weak learner (please see part 1, step 2), while SAMME.R assigns an equal weight to every weak learner and directly calculates the class probabilities. "The SAMME.R algorithm typically converges faster than SAMME, achieving a lower test error with fewer boosting iterations." (source)
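To make the alphas concrete, here is a minimal sketch of the SAMME weight formula, given a weak learner's weighted error rate on a K-class problem. The function name and the example error value are illustrative, not part of scikit-learn's API:

```python
import numpy as np

def samme_alpha(err, n_classes, learning_rate=1.0):
    # SAMME weight of a weak learner: the lower its weighted error,
    # the larger its say in the final vote. The log(K - 1) term
    # extends the two-class AdaBoost formula to K classes.
    return learning_rate * (np.log((1.0 - err) / err) + np.log(n_classes - 1))

# e.g., a stump with 30% weighted error on a binary problem
print(samme_alpha(0.3, n_classes=2))  # ~0.85
```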

Basically, in the previous post, AdaBoost was explained using the SAMME algorithm.

Example of using SAMME:

```python
from sklearn.ensemble import AdaBoostClassifier

# Instantiate the model with 100 decision trees as weak learners
a_boost = AdaBoostClassifier(n_estimators=100, algorithm='SAMME',
                             random_state=42)

# Train the model on the training data
a_boost.fit(X_train, y_train)
```
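To see the convergence behavior quoted above in practice, staged_predict reports the partial ensemble's predictions after every boosting iteration. A sketch, assuming a held-out split X_test, y_test (not defined in this post):

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Test error of the partial ensemble after each boosting iteration
errors = [1 - accuracy_score(y_test, y_pred)
          for y_pred in a_boost.staged_predict(X_test)]

# Best iteration and its test error
print(np.argmin(errors) + 1, np.min(errors))
```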

Examples of the weak learners (stumps):

```python
# Using sklearn.tree.plot_tree here; the original post used a custom helper
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

trees = a_boost.estimators_

# Plot the first stump (`features` is assumed to be the list of feature names)
plot_tree(trees[0], feature_names=features)
plt.show()

# Plot the last stump
plot_tree(trees[-1], feature_names=features)
plt.show()
```
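As a quick sanity check on the fitted model above, each weak learner is a depth-1 DecisionTreeClassifier (a stump), which is the default base estimator:

```python
# Each weak learner is a DecisionTreeClassifier with max_depth=1 (a stump)
print(type(trees[0]).__name__)  # DecisionTreeClassifier
print(trees[0].get_depth())     # 1
```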

[Image] First stump in the ensemble

[Image] Last stump in the ensemble

Weak learners' weights (alphas) in the final decision:

```python
# Alphas (weights in the final decision) of the first 10 weak learners
a_boost.estimator_weights_[:10]
```
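The companion attribute estimator_errors_ holds each weak learner's weighted error; comparing the two arrays shows, as a rough check, that the alphas shrink as a stump's error approaches 0.5:

```python
# Weighted classification error of the first 10 weak learners
a_boost.estimator_errors_[:10]
```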

