Evaluating and tuning predictive data mining models using receiver operating characteristic curves

Cited by: 31
Authors
Sinha, AP [1]
May, JH
Affiliations
[1] Univ Wisconsin, Sch Business Adm, Madison, WI 53706 USA
[2] Univ Pittsburgh, Katz Grad Sch Business, Pittsburgh, PA 15260 USA
[3] Univ Pittsburgh, Artificial Intelligence Management Lab, Pittsburgh, PA 15260 USA
Keywords
binary classification; credit evaluation; data mining; decision analysis; misclassification costs; performance evaluation and tuning; predictive models; ROC curves;
DOI
10.1080/07421222.2004.11045815
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this study, we conduct an empirical analysis of the performance of five popular data mining methods (neural networks, logistic regression, linear discriminant analysis, decision trees, and nearest neighbor) on two binary classification problems from the credit evaluation domain. Whereas most studies comparing data mining methods have employed accuracy as a performance measure, we argue that, for problems such as credit evaluation, the focus should be on minimizing misclassification cost. We first generate receiver operating characteristic (ROC) curves for the classifiers and use the area under the curve (AUC) measure to compare aggregate performance of the five methods over the spectrum of decision thresholds. Next, using the ROC results, we propose a method for tuning the classifiers by identifying optimal decision thresholds. We compare the methods based on expected costs across a range of cost-probability ratios. In addition to expected cost and AUC, we evaluate the models on the basis of their generalizability to unseen data, their scalability to other problems in the domain, and their robustness against changes in class distributions. We found that the performance of logistic regression and neural network models was superior under most conditions. In contrast, decision tree and nearest neighbor models yielded higher costs and were much less generalizable and robust than the other models. An important finding of this research is that the models can be effectively tuned post hoc to make them cost sensitive, even though they were built without incorporating misclassification costs.
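The abstract's idea of tuning a classifier post hoc from its ROC curve can be illustrated with a minimal sketch. This is not the authors' procedure; it assumes that label 1 marks the positive class, that a case is predicted positive when its score is at or above the threshold, and that expected cost is the prior-weighted sum of false positive and false negative costs. All function names, scores, and cost values below are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, FPR, TPR) triples, one per candidate threshold.

    Assumption: a case is predicted positive when score >= threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(float("inf"), 0.0, 0.0)]  # threshold above every score
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def expected_cost(fpr, tpr, p_pos, cost_fp, cost_fn):
    """Expected misclassification cost per case at one ROC operating point."""
    return (1 - p_pos) * fpr * cost_fp + p_pos * (1 - tpr) * cost_fn


def best_threshold(points, p_pos, cost_fp, cost_fn):
    """Pick the ROC point (threshold, FPR, TPR) with the lowest expected cost."""
    return min(
        points,
        key=lambda p: expected_cost(p[1], p[2], p_pos, cost_fp, cost_fn),
    )


# Hypothetical scores and labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print("AUC:", auc(points))
# A false negative five times as costly as a false positive pushes the
# chosen threshold down; the reverse ratio pushes it up.
print(best_threshold(points, p_pos=0.5, cost_fp=1, cost_fn=5))
print(best_threshold(points, p_pos=0.5, cost_fp=5, cost_fn=1))
```

Because the threshold is chosen after training, the same scored model can be re-tuned for a different cost ratio or class prior without refitting, which is the sense in which the tuning is post hoc.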
Pages: 249-280
Page count: 32