Good Classification Measures and How to Find Them

Cited by: 0
Authors
Gosgens, Martijn [1]
Zhiyanov, Anton [2]
Tikhonov, Alexey [3]
Prokhorenkova, Liudmila [4]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
[2] HSE Univ, Yandex Res, Moscow, Russia
[3] Yandex, Berlin, Germany
[4] HSE Univ, Yandex Res, MIPT, Moscow, Russia
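Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Vol. 34
Keywords
AUC
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract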
Several performance measures can be used for evaluating classification results: accuracy, F-measure, and many others. Can we say that some of them are better than others, or, ideally, choose one measure that is best in all situations? To answer this question, we conduct a systematic analysis of classification performance measures: we formally define a list of desirable properties and theoretically analyze which measures satisfy which properties. We also prove an impossibility theorem: some desirable properties cannot be simultaneously satisfied. Finally, we propose a new family of measures satisfying all desirable properties except one. This family includes the Matthews Correlation Coefficient and a so-called Symmetric Balanced Accuracy that was not previously used in classification literature. We believe that our systematic approach gives an important tool to practitioners for adequately evaluating classification results.
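As a rough illustration of why the choice of measure matters, the sketch below computes three of the measures named in the abstract (accuracy, F-measure with beta = 1, and the Matthews Correlation Coefficient) from a binary confusion matrix using their standard textbook definitions. The function name `binary_measures`, the example counts, and the convention of returning 0.0 when a denominator vanishes are assumptions made for this example; they are not taken from the paper.

```python
import math


def binary_measures(tp, fp, fn, tn):
    """Return accuracy, F1, and MCC for a binary confusion matrix.

    Assumption: when a denominator is zero, the measure is reported
    as 0.0. This is a common convention, not a rule from the paper.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical data set: 90 positive and 10 negative examples.
# Classifier A labels every example as positive.
always_positive = binary_measures(tp=90, fp=10, fn=0, tn=0)

# Classifier B makes 5 errors on each class.
informative = binary_measures(tp=85, fp=5, fn=5, tn=5)

for name, scores in [("A", always_positive), ("B", informative)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On these made-up counts both classifiers reach the same accuracy and nearly the same F1, while MCC separates them. This is the kind of disagreement between measures that the abstract's question is about.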
Pages: 12