Accuracy Score for Evaluation of Classification on Imbalanced Data
Abstract
Imbalanced data pose a challenge to the evaluation of trained machine-learning models due to evaluation bias, which can lead to inappropriate models for real-world applications. Data augmentation is a possible remedy, but it introduces uncertainty stemming from the augmentation strategy and the data quality. In such uncertain settings, it is beneficial to compare measures that rely on diverse statistical principles to assess classification performance on imbalanced data. Here, we approach this challenge by proposing the accuracy score (AC-score), which combines sensitivity and specificity into an unbiased measure with two relevant properties. First, the AC-score is symmetric with respect to the positive and negative classes, which makes it suitable for cases where both classes are equally important. Second, the AC-score penalizes models whose predictions favor one class over the other more strongly than other existing measures do. We show that the AC-score is more conservative than the geometric mean, AUC-ROC and balanced accuracy, and we discuss the cases in which this offers an advantage, providing empirical evidence on artificial and real-data classifications.
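As context for the comparison above, the sketch below computes the reference measures named in the abstract (sensitivity, specificity, balanced accuracy, geometric mean, AUC-ROC) from a confusion matrix and predicted scores using standard scikit-learn calls. This is only an illustration under assumed binary labels in {0, 1}; the AC-score itself is defined later in the paper and is not reproduced here, and the toy labels, predictions and scores are hypothetical.

```python
# Minimal sketch (assumes scikit-learn and NumPy are available).
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def reference_measures(y_true, y_pred, y_score):
    # Confusion-matrix counts for binary labels {0, 1}.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)        # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "geometric_mean": np.sqrt(sensitivity * specificity),
        "auc_roc": roc_auc_score(y_true, y_score),
    }

# Hypothetical imbalanced toy sample: a classifier that favors the majority
# (negative) class keeps specificity high while sensitivity drops.
rng = np.random.default_rng(0)
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.array([0] * 95 + [1] * 5)
y_score = np.concatenate([rng.uniform(0.0, 0.6, 90), rng.uniform(0.3, 1.0, 10)])
print(reference_measures(y_true, y_pred, y_score))
```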