Weighted kappa measures for ordinal multi-class classification performance

Cited by: 31

Authors:

Yilmaz, Ayfer Ezgi [1]

Demirhan, Haydar [2]

Affiliations:

[1] Hacettepe Univ, Dept Stat, Ankara, Turkiye

[2] RMIT Univ, Sch Sci, Math Sci Discipline, Melbourne, Australia

Source:

APPLIED SOFT COMPUTING | 2023 / Vol. 134

Keywords:

Accuracy; Agreement measures; Evaluation metric; Matthews correlation coefficient; Performance metric; Ordinal classifier; Ordinal labels; R PACKAGE; ACCURACY;

DOI:

10.1016/j.asoc.2023.110020

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Assessing the classification performance of ordinal classifiers is a challenging problem under imbalanced data compositions. Considering the critical impact of the metrics on the choice of classifiers, employing a metric with the highest performance is crucial. Although Cohen's kappa measure is used for performance assessment, there are better-performing agreement measures under different formations of ordinal confusion matrices. This research implements weighted agreement measures as evaluation metrics for ordinal classifiers. The applicability of agreement and mainstream performance metrics to various practice fields under challenging data compositions is assessed. The sensitivity of the metrics in detecting subtle distinctions between ordinal classifiers is analyzed. Five kappa-like agreement measures with six weighting schemes are employed as evaluation metrics. Their reliability/usefulness is compared to the mainstream and recently proposed metrics, including F1, Matthews correlation coefficient, and informational agreement. The performance of 37 metrics is analyzed in two extensive numerical studies, including synthetic confusion matrices and real datasets. Promising metrics under practical circumstances are identified, and recommendations about the best metric to evaluate ordinal classifiers under different conditions are made. Overall, the weighted Scott's pi-measure is found useful, sensitive to small differences in the classification performance, and reliable under general conditions. (c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
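
As a rough illustration of the kind of metric the abstract describes, the sketch below computes weighted Cohen's kappa and weighted Scott's pi from an ordinal confusion matrix. It is not the authors' implementation: the function names are hypothetical, only linear agreement weights are shown (the abstract does not list the six weighting schemes or the five measures), and the formulas are the textbook definitions, with chance agreement taken from the product of the row and column marginals for kappa and from the averaged marginals for pi.

```python
import numpy as np

def linear_weights(k):
    """Linear agreement weights w_ij = 1 - |i - j| / (k - 1), assuming k >= 2."""
    idx = np.arange(k)
    return 1.0 - np.abs(idx[:, None] - idx[None, :]) / (k - 1)

def weighted_agreement(cm, weights, kind="kappa"):
    """Weighted chance-corrected agreement from a k x k confusion matrix.

    kind="kappa": chance agreement uses row marginals times column marginals.
    kind="pi":    chance agreement uses the average of the two marginals.
    Undefined (division by zero) when chance agreement equals 1.
    """
    p = np.asarray(cm, dtype=float)
    p = p / p.sum()                      # cell proportions
    row = p.sum(axis=1)                  # true-class marginals
    col = p.sum(axis=0)                  # predicted-class marginals
    p_o = (weights * p).sum()            # weighted observed agreement
    if kind == "kappa":
        p_e = (weights * np.outer(row, col)).sum()
    elif kind == "pi":
        q = (row + col) / 2.0
        p_e = (weights * np.outer(q, q)).sum()
    else:
        raise ValueError("kind must be 'kappa' or 'pi'")
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical 3-class ordinal confusion matrix (rows: true, columns: predicted).
cm = [[30, 5, 1],
      [4, 20, 6],
      [0, 3, 11]]
w = linear_weights(3)
kappa_w = weighted_agreement(cm, w, kind="kappa")
pi_w = weighted_agreement(cm, w, kind="pi")
```

With linear weights, a prediction one class away from the truth earns partial credit, which is what separates these measures from unweighted kappa on ordinal labels. The two measures coincide whenever the row and column marginals are equal.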

Pages: 16
