Gabriele Boccarusso

How I am learning machine learning - week 10: evaluating a model in scikit-learn (part 2)

Last week we saw some ways to evaluate a machine learning model, but they were only for classification problems. Let's see one more way to evaluate a classification model and then focus on regression model evaluations.

Table of contents:

  • Confusion matrix
  • Regression model evaluators
  • Coefficient of determination
  • The Mean Absolute Error (MAE)
  • The Mean Squared Error (MSE)
  • When to use them?
  • Using the scoring parameter
  • Last thoughts

Confusion matrix

The confusion matrix is a way to compare what the model predicted with what it was supposed to predict. It gets this name because it lets us know where the model is getting confused and predicting one label instead of another. Using the same estimator as last week, we can import what we need:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# load the iris dataset into the same dataframe we used last week
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['target'] = iris.target

X = iris_df.drop('target', axis=1)
y = iris_df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15)

clf = SVC()
clf.fit(X_train, y_train)

y_preds = clf.predict(X_test)
```

Now we can import the dedicated function and compute the confusion matrix:

*(image: calculating the confusion matrix of a model using the dedicated scikit-learn function)*
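A minimal sketch of what that screenshot shows, using scikit-learn's confusion_matrix (the conf_mat name comes from later in the post):

```python
from sklearn.metrics import confusion_matrix

# rows are the true labels, columns are the predicted ones
conf_mat = confusion_matrix(y_test, y_preds)
conf_mat
```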
But what does this mean? The false positive rate (fpr) and true positive rate (tpr) that we saw in the last post come from exactly this kind of table: every cell of the matrix tells us whether the model predicted a class correctly or not. We can get a quick overview with the pandas function crosstab:

*(image: using the crosstab function to view our confusion matrix)*
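Roughly what that looks like; the row and column labels are my own choice here, not necessarily the ones in the screenshot:

```python
import pandas as pd

pd.crosstab(y_test, y_preds,
            rownames=['actual labels'],
            colnames=['predicted labels'])
```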
Knowing that 0, 1, and 2 are our labels Setosa, Versicolor, and Virginica, we can see where the model predicted the right label and where it predicted a wrong one.
Having saved the result of the confusion_matrix function into conf_mat, we can even import the seaborn library to plot it in a more fashionable way:

*(image: using the seaborn heatmap to visualize the result of the confusion matrix function)*
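A sketch of that plot; the annot and cbar arguments are my choices, not necessarily the ones in the screenshot:

```python
import seaborn as sns
import matplotlib.pyplot as plt

# annot=True writes the count inside every cell of the heatmap
sns.heatmap(conf_mat, annot=True, cbar=False)
plt.show()
```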

Regression model evaluators

Until now we have seen how to evaluate a classification model and how to use a binary classification evaluator for multi-class classification. Evaluating a regression model is much simpler than what we have already done, despite being just as vital to machine learning in general.

Coefficient of determination

The coefficient of determination, or R^2 (R-squared), is the score method that we have seen since the beginning of the series, but let's understand how it works. Let's begin by importing a regression dataset:
*(image: importing a csv file using pandas; you can find the data here)*
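A sketch of that import. The file name is hypothetical (the original post links to the dataset it uses), and setting the date column as the index is an assumption, so that the row-wise sum below only covers the county columns:

```python
import pandas as pd

# hypothetical file name for the Hungary chickenpox dataset
chickenpox = pd.read_csv('hungary_chickenpox.csv', index_col=0)
chickenpox.head()
```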
Looking at the data, we see all the cases of chickenpox in Hungary divided by county. We want to predict the total cases based on how many cases there were in every county. We'll create a column with the total cases for every day; to do so we need a single line of code:

```python
# row-wise sum across all the county columns
chickenpox['total'] = chickenpox.sum(axis=1)
```

This will create an additional column with the sum of all the cases in every row:
*(image: the additional column with the sum of the others in pandas)*
Now we can fit our model and score it:
*(image: fitting and scoring a regression model)*
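The exact estimator in the screenshot isn't visible in this text version; a sketch assuming a random forest regressor (any scikit-learn regressor would work the same way):

```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

# predict the total cases from the per-county counts
X = chickenpox.drop('total', axis=1)
y = chickenpox['total']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15)

model = RandomForestRegressor()  # assumed estimator
model.fit(X_train, y_train)
model.score(X_test, y_test)  # for regressors, score() returns R^2
```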

The result is the same kind of number we have already seen many times: a value between 0.0 and 1.0 that corresponds to a percentage. The coefficient of determination compares every prediction with the test labels. We can use the dedicated function from the library to calculate it:
*(image: using the r2_score function from the metrics module of the scikit-learn library)*
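A sketch of that step with r2_score, the function the caption names:

```python
from sklearn.metrics import r2_score

y_preds = model.predict(X_test)
r2_score(y_test, y_preds)
```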

Because the R-squared score compares two arrays, if we compare the test labels with themselves we'll get a score of 100%: 1.0.
*(image: comparing the test labels with themselves in the R-squared, or coefficient of determination, score)*
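Which, in code, is simply:

```python
r2_score(y_test, y_test)  # 1.0
```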

The Mean Absolute Error (MAE)

The mean absolute error score calculates the absolute difference between every actual value and every predicted one:
*(image: calculating the differences between the test and predicted values)*
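Roughly what that dataframe looks like; the column names here are my own:

```python
import pandas as pd

df = pd.DataFrame({'actual values': y_test,
                   'predicted values': y_preds})
df['difference'] = df['predicted values'] - df['actual values']
df.head()
```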
Absolute means that every number is taken as positive, so a more complete way to build the dataframe would be:
*(image: visualizing the mean absolute error in pandas by hand)*
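The same column, this time with the absolute value taken:

```python
# abs() makes every difference positive
df['difference'] = (df['predicted values'] - df['actual values']).abs()
df.head()
```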
To complete this evaluation we just have to calculate the mean of the difference column. Let's compare our calculation with the dedicated function:
*(image: calculating the mean absolute error by hand and comparing it with the one calculated by the dedicated scikit-learn function)*
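A sketch of that comparison, using mean_absolute_error from sklearn.metrics:

```python
from sklearn.metrics import mean_absolute_error

# by hand: the mean of the absolute differences
print(df['difference'].mean())
# with the dedicated function
print(mean_absolute_error(y_test, y_preds))
```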
And from what we can see, we did everything right.
The MAE tells us how large the error of our model is: through this evaluation we now know that, on average, our model's predictions are off by about 74 cases, in either direction.

The Mean Squared Error (MSE)

The mean squared error is identical to the MAE, but every value of the difference column is squared:
*(image: how to calculate the mean squared error in pandas by hand)*
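A sketch of the by-hand version next to mean_squared_error from sklearn.metrics:

```python
from sklearn.metrics import mean_squared_error

# by hand: square the differences, then take the mean
print(((df['predicted values'] - df['actual values']) ** 2).mean())
# with the dedicated function
print(mean_squared_error(y_test, y_preds))
```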

When to use them?

  • R-squared gives you an overall accuracy score but doesn't tell you where the model is going wrong.
  • MAE gives a good indication of how wrong your model is on average.
  • MSE amplifies larger differences more than MAE does.

In this case, if 200 cases are twice as bad as 100 cases we could use the MAE, but if the disease has a very high rate of transmission then 200 cases would be more than twice as bad as 100, and we should pay more attention to the MSE.
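A quick numeric illustration of that amplification: an error of 200 is exactly twice an error of 100 under the MAE, but four times as large under the MSE.

```python
import numpy as np

error_small, error_large = np.array([100]), np.array([200])

# MAE ratio: 200 / 100 = 2.0
print(np.abs(error_large).mean() / np.abs(error_small).mean())
# MSE ratio: 200**2 / 100**2 = 4.0
print((error_large ** 2).mean() / (error_small ** 2).mean())
```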

Using the scoring parameter

We can get the same results using the cross-validation function, cross_val_score, and changing its scoring parameter:
*(image: using the scikit-learn cross-validation function to calculate the coefficient of determination of a model)*
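A sketch assuming cross_val_score is the function in the screenshot; note that scikit-learn negates error metrics so that a higher score is always better:

```python
from sklearn.model_selection import cross_val_score

# the default scoring for a regressor is R^2
print(cross_val_score(model, X, y, scoring='r2'))
# error metrics come back negated: closer to 0 is better
print(cross_val_score(model, X, y, scoring='neg_mean_absolute_error'))
```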

For the full list, you can go here.

Last thoughts

We have seen all the most important ways to evaluate a machine learning model, avoiding mindlessly using the metrics functions without knowing what they do. If you have any doubts, feel free to leave a comment.
