How to evaluate classifier performance in the presence of additional effects: A new POD-based approach allowing certification of machine learning approaches

Cited by: 3
Authors
Ameyaw, Daniel Adofo [1]
Deng, Qi [1]
Soeffker, Dirk [1]
Affiliations
[1] Univ Duisburg Essen, Chair Dynam & Control, Lotharstr 1-21, D-47057 Duisburg, Germany
Source
MACHINE LEARNING WITH APPLICATIONS | 2022 / Vol. 7
Keywords
Performance evaluation; Classifier; Probability of Detection; Human behavior prediction; MODEL;
DOI
10.1016/j.mlwa.2021.100220
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classifiers are well-known machine learning algorithms for assigning data to classes. Whether a classifier suits a specific task depends on the application and the datasets, so selecting an approach often requires performance evaluation. Existing measures such as the receiver operating characteristic (ROC) and precision-recall curves are popular for evaluating classifier performance; however, neither directly addresses the influence of additional and possibly unknown (process) parameters on the classification results. In this contribution, this limitation is discussed and addressed by adapting and extending the Probability of Detection (POD) measure. The POD is a probabilistic method for quantifying the reliability of a diagnostic procedure, taking into account the statistical variability of sensor and measurement properties. The introduced approach is applied to driving behavior prediction data, which serves as an illustrative example. Based on the introduced POD-related evaluation, different classifiers can be clearly distinguished with respect to their ability to predict the correct intended driver behavior as a function of the remaining time (here assumed to be the process parameter) before the event itself. The introduced approach provides a new diagnostic and comprehensive interpretation of the quality of a classification model.
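The idea of reading detection performance as a function of a process parameter can be illustrated with a minimal sketch. It is not the POD procedure of the paper: it only computes a binned empirical detection rate (the fraction of true events that a classifier detects) per interval of the process parameter, here the remaining time before the event. The function name `empirical_pod`, the bin edges, and the toy data are all hypothetical and chosen purely for illustration.

```python
import numpy as np


def empirical_pod(y_true, y_pred, process_param, bin_edges):
    """Binned empirical detection rate as a function of a process parameter.

    Illustrative assumption: a plain hit/miss ratio per bin, not a fitted
    POD model with confidence bounds.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    process_param = np.asarray(process_param, dtype=float)

    # Only actual events (positives) contribute to a detection probability.
    hits = y_pred[y_true].astype(float)
    param = process_param[y_true]

    n_bins = len(bin_edges) - 1
    # np.digitize returns 1..n_bins for values inside [first edge, last edge);
    # values outside that range get an index that matches no bin below.
    idx = np.digitize(param, bin_edges) - 1

    pod = np.full(n_bins, np.nan)  # NaN marks bins without any true event
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            pod[b] = hits[in_bin].mean()
    return pod


# Hypothetical toy data: 1 = event occurred / event predicted.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
remaining_time = [0.5, 0.8, 1.5, 1.2, 2.5, 2.7, 0.3, 2.2]  # seconds before event
edges = [0.0, 1.0, 2.0, 3.0]

pod = empirical_pod(y_true, y_pred, remaining_time, edges)
print(pod)  # detection rate for the bins [0, 1), [1, 2), [2, 3)
```

Computing such a curve for each candidate classifier on the same bins gives a rough view of how early each one detects the event. A model-based POD evaluation would additionally fit a parametric curve and report confidence bounds, which this sketch omits.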
Pages: 12