Weighted kappa measures for ordinal multi-class classification performance

Cited by: 31
Authors
Yilmaz, Ayfer Ezgi [1]
Demirhan, Haydar [2]
Affiliations
[1] Hacettepe Univ, Dept Stat, Ankara, Turkiye
[2] RMIT Univ, Sch Sci, Math Sci Discipline, Melbourne, Australia
Keywords
Accuracy; Agreement measures; Evaluation metric; Matthews correlation coefficient; Performance metric; Ordinal classifier; Ordinal labels; R package
DOI
10.1016/j.asoc.2023.110020
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Assessing the classification performance of ordinal classifiers is a challenging problem under imbalanced data compositions. Considering the critical impact of the metrics on the choice of classifiers, employing a metric with the highest performance is crucial. Although Cohen's kappa measure is used for performance assessment, there are better-performing agreement measures under different formations of ordinal confusion matrices. This research implements weighted agreement measures as evaluation metrics for ordinal classifiers. The applicability of agreement and mainstream performance metrics to various practice fields under challenging data compositions is assessed. The sensitivity of the metrics in detecting subtle distinctions between ordinal classifiers is analyzed. Five kappa-like agreement measures with six weighting schemes are employed as evaluation metrics. Their reliability/usefulness is compared to the mainstream and recently proposed metrics, including F1, Matthews correlation coefficient, and informational agreement. The performance of 37 metrics is analyzed in two extensive numerical studies, including synthetic confusion matrices and real datasets. Promising metrics under practical circumstances are identified, and recommendations about the best metric to evaluate ordinal classifiers under different conditions are made. Overall, the weighted Scott's pi-measure is found useful, sensitive to small differences in the classification performance, and reliable under general conditions. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 16
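Illustration (not part of the indexed record): the abstract refers to weighted kappa-like agreement measures computed from an ordinal confusion matrix. The following is a minimal sketch of two of them, weighted Cohen's kappa and weighted Scott's pi, using their textbook definitions. The function names, the `scheme` and `measure` parameters, and the choice of linear and quadratic weights are assumptions made for this example; the record does not say which six weighting schemes the paper uses, and this is not the authors' implementation.

```python
import numpy as np

def agreement_weights(k, scheme="linear"):
    """Agreement weights for k ordered classes (hypothetical helper)."""
    i, j = np.indices((k, k))
    d = np.abs(i - j) / (k - 1)      # normalized distance between class ranks
    if scheme == "unweighted":
        return np.eye(k)             # only exact matches count
    if scheme == "linear":
        return 1.0 - d
    if scheme == "quadratic":
        return 1.0 - d ** 2
    raise ValueError(f"unknown scheme: {scheme}")

def weighted_agreement(cm, scheme="linear", measure="kappa"):
    """Chance-corrected weighted agreement from a confusion matrix.

    measure="kappa": Cohen-style chance term from the two separate marginals.
    measure="pi":    Scott-style chance term from the averaged marginals.
    """
    p = np.asarray(cm, dtype=float)
    p = p / p.sum()                  # joint proportions
    w = agreement_weights(p.shape[0], scheme)
    row, col = p.sum(axis=1), p.sum(axis=0)
    p_obs = (w * p).sum()            # weighted observed agreement
    if measure == "kappa":
        p_exp = (w * np.outer(row, col)).sum()
    elif measure == "pi":
        m = (row + col) / 2.0
        p_exp = (w * np.outer(m, m)).sum()
    else:
        raise ValueError(f"unknown measure: {measure}")
    return (p_obs - p_exp) / (1.0 - p_exp)

# Two classifiers with the same accuracy: one confuses class 0 with the
# adjacent class 1, the other confuses class 0 with the distant class 2.
near = [[10, 5, 0], [0, 15, 0], [0, 0, 15]]
far = [[10, 0, 5], [0, 15, 0], [0, 0, 15]]
for name, cm in (("near", near), ("far", far)):
    print(name,
          round(weighted_agreement(cm, "unweighted", "kappa"), 4),
          round(weighted_agreement(cm, "linear", "kappa"), 4),
          round(weighted_agreement(cm, "linear", "pi"), 4))
```

In this toy comparison the unweighted kappa cannot separate the two classifiers, while the linearly weighted versions penalize the distant error more heavily, which is the kind of sensitivity to ordinal structure the abstract describes.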