Feature Selection in a Credit Scoring Model

Cited by: 30
Authors
Laborda, Juan [1]
Ryoo, Seyong [2]
Affiliations
[1] Univ Carlos III, Dept Business Adm, Madrid 28903, Spain
[2] Katholieke Univ Leuven, Leuven Stat Res Ctr, B-3000 Leuven, Belgium
Keywords
operational research in banking; machine learning; credit scoring; classification algorithms; feature selection methods; SUPPORT VECTOR MACHINES; ART CLASSIFICATION ALGORITHMS; LEARNING-METHODS; PREDICTION; BANKRUPTCY; REGRESSION; PROBABILITY; ACCURACY
DOI
10.3390/math9070746
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
This paper proposes four classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) to identify which candidates are likely to default in a credit scoring model. Three feature selection methods are used to mitigate the overfitting that the curse of dimensionality causes in these classification algorithms: one filter method (Chi-squared test and correlation coefficients) and two wrapper methods (forward stepwise selection and backward stepwise selection). The performance of these three methods is assessed using two measures: the mean absolute error and the number of selected features. The methodology is applied to a valuable database from Taiwan. The results suggest that forward stepwise selection yields superior performance for each of the classification algorithms used. The conclusions are related to those in the literature, and their managerial implications are analyzed.
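As an illustration of the wrapper idea named in the abstract, the following is a minimal sketch of forward stepwise selection scored by mean absolute error, paired with a small K-nearest neighbors classifier. It is not the authors' implementation. The synthetic data, the value k = 3, the single train/validation split, the stopping rule (stop when no candidate feature lowers the validation error), and all function names are assumptions made for this example.

```python
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote (labels 0/1) among the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0


def project(rows, features):
    """Keep only the columns listed in `features`."""
    return [[row[j] for j in features] for row in rows]


def mean_absolute_error(X_tr, y_tr, X_va, y_va, features):
    """Validation MAE of k-NN restricted to `features` (0/1 labels)."""
    tr = project(X_tr, features)
    va = project(X_va, features)
    errors = [abs(knn_predict(tr, y_tr, x) - y) for x, y in zip(va, y_va)]
    return sum(errors) / len(errors)


def forward_stepwise(X_tr, y_tr, X_va, y_va):
    """Greedily add the feature that most reduces validation MAE.

    Assumed stopping rule: stop when no remaining feature improves MAE.
    """
    remaining = list(range(len(X_tr[0])))
    selected, best_mae = [], float("inf")
    while remaining:
        mae, j = min(
            (mean_absolute_error(X_tr, y_tr, X_va, y_va, selected + [j]), j)
            for j in remaining
        )
        if mae >= best_mae:
            break
        selected.append(j)
        remaining.remove(j)
        best_mae = mae
    return selected, best_mae


# Synthetic stand-in data (not the Taiwan database): only feature 0
# determines the label; features 1 to 3 are pure noise.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(300)]
y = [1 if row[0] > 0 else 0 for row in X]  # 1 = default, 0 = no default
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]

selected, best_mae = forward_stepwise(X_tr, y_tr, X_va, y_va)
print("selected features:", selected, "validation MAE:", round(best_mae, 3))
```

Backward stepwise selection would run the same loop in reverse, starting from all features and removing the one whose removal lowers the error most. Both measures mentioned in the abstract are visible here: the validation MAE and the length of `selected`.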
Pages: 22