Deep learning for credit scoring: Do or don't?

Cited by: 134
Authors
Gunnarsson, Bjorn Rafn [1]
Broucke, Seppe vanden [1, 2]
Baesens, Bart [1, 3]
Oskarsdottir, Maria [4]
Lemahieu, Wilfried [1]
Affiliations
[1] Katholieke Univ Leuven, Res Ctr Informat Syst Engn LIRIS, Naamsestr 69, B-3000 Leuven, Belgium
[2] UGent, Dept Business Informat & Operat Management, B-9000 Ghent, Belgium
[3] Univ Southampton, Dept Decis Analyt & Risk, Southampton, Hants, England
[4] Reykjavik Univ, Dept Comp Sci, Menntavegi 1, IS-101 Reykjavik, Iceland
Keywords
Decision support systems; Risk analysis; Credit scoring; Deep learning; Bayesian statistical testing; ART CLASSIFICATION ALGORITHMS; DATA MINING METHODS; NEURAL-NETWORKS; OPTIMIZATION; INTELLIGENCE; CLASSIFIERS; MACHINE; DEFAULT; DESIGN; TESTS;
DOI
10.1016/j.ejor.2021.03.006
Chinese Library Classification (CLC) number
C93 [Management Science];
Subject classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
Developing accurate analytical credit scoring models has become a major focus for financial institutions. For this purpose, numerous classification algorithms have been proposed for credit scoring. However, the application of deep learning algorithms for classification has been largely ignored in the credit scoring literature. The main motivation for this research is to consider the appropriateness of deep learning algorithms for credit scoring. To this end, two deep learning architectures are constructed, namely a multilayer perceptron network and a deep belief network, and their performance is compared to that of two conventional methods and two ensemble methods for credit scoring. The models are then evaluated using a range of credit scoring data sets and performance measures. Furthermore, Bayesian statistical testing procedures are introduced in the context of credit scoring and compared to frequentist non-parametric testing procedures, which have traditionally been considered best practice in credit scoring. This comparison will highlight the benefits of Bayesian statistical procedures and secure empirical findings. Two main conclusions emerge from comparing the different classification algorithms for credit scoring. Firstly, the ensemble method XGBoost is the best-performing method for credit scoring of all the methods considered here. Secondly, deep neural networks do not outperform their shallower counterparts and are considerably more computationally expensive to construct. Therefore, based on this comparison, deep learning algorithms do not seem to be appropriate models for credit scoring, and XGBoost should be preferred over the other credit scoring methods considered here when classification performance is the main objective of credit scoring activities. (c) 2021 Elsevier B.V. All rights reserved.
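As an illustration of the kind of Bayesian comparison the abstract refers to, the following is a minimal Python sketch of a Bayesian sign test over per-data-set performance differences. The abstract does not say which Bayesian procedure the authors use, so the choice of test, the function name `bayesian_sign_test`, the width of the region of practical equivalence (ROPE), the prior strength, and the example differences are all assumptions made here for illustration.

```python
import random


def bayesian_sign_test(diffs, rope=0.01, prior_strength=1.0,
                       n_samples=20000, seed=0):
    """Sketch of a Bayesian sign test on paired performance differences.

    diffs: metric of model A minus metric of model B, one value per data set.
    Returns the Monte Carlo posterior probability that each outcome
    (B better, practically equivalent, A better) is the most likely one.
    """
    rng = random.Random(seed)

    # Count data sets left of, inside, and right of the ROPE.
    n_left = sum(d < -rope for d in diffs)
    n_rope = sum(-rope <= d <= rope for d in diffs)
    n_right = sum(d > rope for d in diffs)

    # Dirichlet posterior; the prior pseudo-count is placed on the ROPE
    # (an assumption of this sketch).
    alpha = [n_left, n_rope + prior_strength, n_right]

    wins = [0, 0, 0]
    for _ in range(n_samples):
        # A Dirichlet draw is a normalised vector of gamma draws. Only the
        # largest component matters here, so normalisation is skipped.
        g = [rng.gammavariate(a, 1.0) if a > 0 else 0.0 for a in alpha]
        wins[g.index(max(g))] += 1

    return {
        "b_better": wins[0] / n_samples,
        "equivalent": wins[1] / n_samples,
        "a_better": wins[2] / n_samples,
    }


if __name__ == "__main__":
    # Hypothetical AUC differences (model A minus model B) on eight data
    # sets; these numbers are made up and are not taken from the paper.
    example_diffs = [0.03, 0.05, 0.02, 0.04, 0.06, 0.03, 0.00, 0.05]
    print(bayesian_sign_test(example_diffs))
```

Unlike a frequentist non-parametric test, which yields a p-value under the null hypothesis of no difference, this style of test reports a posterior probability for each of the three outcomes directly, including practical equivalence.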
Pages: 292-305
Page count: 14