In the previous post, we outlined how the AdaBoost algorithm works.
Here we explore scikit-learn's implementation of the AdaBoost classifier with decision trees a bit further.
The AdaBoost classifier has two options for the algorithm parameter: 'SAMME.R' and 'SAMME'. By default, it uses 'SAMME.R'.
SAMME stands for Stagewise Additive Modeling using a Multi-class Exponential loss function, and the .R stands for Real.
SAMME uses a different 'decision influence' weight (alpha) for each weak learner (please see part 1, step 2), while SAMME.R assigns an equal weight to every weak learner and works directly with predicted class probabilities. "The SAMME.R algorithm typically converges faster than SAMME, achieving a lower test error with fewer boosting iterations." (source)
In other words, the previous post explained AdaBoost in terms of the SAMME algorithm.
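One way to check the "converges faster, fewer boosting iterations" claim yourself is staged_score, which reports the test accuracy after each boosting round. Here is a minimal sketch; the toy dataset from make_classification and the train/test split are assumptions for illustration, not part of the original example:

```python
# Sketch: watch test accuracy improve round by round with staged_score.
# The dataset here is synthetic, purely for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = AdaBoostClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# staged_score yields the held-out accuracy after 1, 2, ..., 100 rounds
staged = list(clf.staged_score(X_test, y_test))
print(f"after 10 rounds: {staged[9]:.3f}, after 100 rounds: {staged[-1]:.3f}")
```

Running this for both algorithm settings (where your scikit-learn version still supports both) lets you compare how quickly each one drives the test error down.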
Example of using SAMME:
from sklearn.ensemble import AdaBoostClassifier

# Instantiate the model with 100 weak learners and the SAMME algorithm
a_boost = AdaBoostClassifier(n_estimators=100, algorithm='SAMME',
                             random_state=42)
# Train the model on the training data
a_boost.fit(X_train, y_train)
Examples of the weak learners (stumps):
trees = a_boost.estimators_
# plot the first stump (plot_tree is a custom helper function, not defined in this post)
plot_tree(trees[0], features)
# plot the last stump
plot_tree(trees[-1], features)
Weak learners' weights (alphas) in the final decision:
# alphas (weights in the final decision) of the first 10 weak learners
a_boost.estimator_weights_[:10]
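To see the contrast described above, here is a sketch that inspects the alphas of a SAMME model: with SAMME they vary from round to round, whereas a SAMME.R model stores a constant weight for every learner. The synthetic dataset is an assumption for illustration, and the try/except guards the algorithm keyword, which was deprecated in scikit-learn 1.6 and later removed (after which SAMME is the only behaviour):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# toy data, for illustration only
X, y = make_classification(n_samples=300, random_state=42)

# Guard the algorithm keyword for newer scikit-learn versions,
# where it has been removed and SAMME is the default behaviour
try:
    model = AdaBoostClassifier(n_estimators=20, algorithm='SAMME',
                               random_state=42)
except TypeError:
    model = AdaBoostClassifier(n_estimators=20, random_state=42)
model.fit(X, y)

# With SAMME, the alphas differ from round to round
print(np.round(model.estimator_weights_[:5], 3))
```

If you fit the same data with algorithm='SAMME.R' on a scikit-learn version that still supports it, estimator_weights_ comes back as all ones, reflecting that SAMME.R spreads influence through class probabilities rather than per-learner alphas.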