Benchmarking state-of-the-art classification algorithms for credit scoring

Cited by: 535
Authors
Baesens, B
Van Gestel, T
Viaene, S
Stepanova, M
Suykens, J
Vanthienen, J
Affiliations
[1] Katholieke Univ Leuven, Dept Appl Econ Sci, B-3000 Louvain, Belgium
[2] Financial Serv Grp, UBS AG, Zurich, Switzerland
Keywords
credit scoring; classification; benchmarking
DOI
10.1057/palgrave.jors.2601545
Chinese Library Classification (CLC) number
C93 [Management science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g., logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
Pages: 627-635
Page count: 9