DEV Community

artem-evstafev

Intuition behind ROC AUC and Gini coefficient

Introduction

Anyone who has ever built a classification model knows that few metrics measure model quality without depending directly on the classification threshold. The ROC AUC score is one of the most widely accepted of them, but its meaning is not intuitively clear. How good is a model with AUC = 70%? How much room for improvement is left? In some cases, should I keep using classification or switch to a regression model? And how is AUC connected with the corresponding characteristics of a regression model, the coefficient of determination and the correlation coefficient? In this article I will try to show that the Gini coefficient (equal to 2 * AUC - 1) approximately corresponds to the correlation coefficient in regression tasks, and that the coefficient of determination is approximately Gini^2.
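To make the two quantities concrete, here is a minimal sketch in plain Python. It uses the standard pairwise interpretation of AUC: the share of (positive, negative) pairs in which the positive example receives the higher score. The function names `roc_auc` and `gini` and the toy data are illustrative assumptions, not code from the article.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient as defined in the article: 2 * AUC - 1."""
    return 2 * roc_auc(labels, scores) - 1


# Toy data (made up for illustration): two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(roc_auc(labels, scores))  # 0.75: 3 of the 4 pairs are ordered correctly
print(gini(labels, scores))     # 0.5
```

No threshold appears anywhere in the computation, which is why AUC and Gini do not depend on the threshold chosen for the classifier. The pairwise loop is quadratic in the sample size, so it is meant for intuition rather than for large datasets.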

Result

Image description
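One way to sanity-check the claimed relationship numerically is a small simulation. The sketch below rests on assumptions that are mine and not necessarily the article's setup: the hidden variable is standard normal, the model score is correlated with it at a chosen level `rho`, and the class label is 1 when the hidden variable is above its median (zero). The helper names `roc_auc`, `pearson` and `simulate` are hypothetical.

```python
import math
import random


def roc_auc(labels, scores):
    """Rank-based AUC (Mann-Whitney U statistic); assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum_pos = sum(
        rank for rank, i in enumerate(order, start=1) if labels[i] == 1
    )
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate(rho, n=4000, seed=0):
    """Simulate a hidden variable, a correlated score, and a binary label."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n)]
    noise = [rng.gauss(0, 1) for _ in range(n)]
    # The score has unit variance and correlation rho with the hidden variable.
    scores = [rho * h + math.sqrt(1 - rho ** 2) * e for h, e in zip(hidden, noise)]
    # Assumption: the observed class is the hidden variable cut at its median.
    labels = [1 if h > 0 else 0 for h in hidden]
    auc = roc_auc(labels, scores)
    return {"auc": auc, "gini": 2 * auc - 1, "corr": pearson(hidden, scores)}


for rho in (0.2, 0.4, 0.6, 0.8):
    r = simulate(rho)
    print(f"rho={rho:.1f}  corr={r['corr']:.3f}  AUC={r['auc']:.3f}  Gini={r['gini']:.3f}")
```

Under these particular assumptions the simulated Gini rises with `rho` and stays in its neighbourhood, typically somewhat below it, which is in line with "approximately corresponds" rather than exact equality. A different cut point or a non-normal hidden variable may change how close the two values are.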

Conclusion

All in all, we can see that the Gini coefficient (= 2 * AUC - 1) corresponds to the correlation coefficient in regression models and describes how much we know about the hidden variable. For example, AUC = 70% gives Gini = 2 * 0.7 - 1 = 40%, so Gini^2 = 16%: we can explain about 16% of the variance of the hidden variable and cannot explain the remaining 84%.
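The arithmetic in this example is easy to repeat for other AUC values. The short sketch below applies the article's approximate rule (explained share = Gini^2); the function names `gini_from_auc` and `explained_share` are hypothetical helpers introduced only for illustration.

```python
def gini_from_auc(auc):
    """Gini coefficient from AUC, with AUC given as a fraction (0.7, not 70)."""
    return 2 * auc - 1


def explained_share(auc):
    """Approximate share of hidden-variable variance explained: Gini squared."""
    return gini_from_auc(auc) ** 2


for auc in (0.5, 0.7, 0.9, 1.0):
    g = gini_from_auc(auc)
    share = explained_share(auc)
    print(f"AUC={auc:.0%}  Gini={g:.0%}  explained={share:.0%}  unexplained={1 - share:.0%}")

# The AUC = 70% row prints:
# AUC=70%  Gini=40%  explained=16%  unexplained=84%
```

Because the Gini coefficient is squared, the explained share grows slowly at first: by this rule a model has to reach AUC = 90% (Gini = 80%) before it explains roughly 64% of the variance.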
