Deep learning for credit scoring: Do or don't?

Cited by: 134

Authors:

Gunnarsson, Bjorn Rafn ^{[1]}

Broucke, Seppe vanden ^{[1,2]}

Baesens, Bart ^{[1,3]}

Oskarsdottir, Maria ^{[4]}

Lemahieu, Wilfried ^{[1]}

Affiliations:

[1] Katholieke Univ Leuven, Res Ctr Informat Syst Engn LIRIS, Naamsestr 69, B-3000 Leuven, Belgium

[2] UGent, Dept Business Informat & Operat Management, B-9000 Ghent, Belgium

[3] Univ Southampton, Dept Decis Analyt & Risk, Southampton, Hants, England

[4] Reykjavik Univ, Dept Comp Sci, Menntavegi 1, IS-101 Reykjavik, Iceland

Source:

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH | 2021 / Volume 295 / Issue 01

Keywords:

Decision support systems; Risk analysis; Credit scoring; Deep learning; Bayesian statistical testing; ART CLASSIFICATION ALGORITHMS; DATA MINING METHODS; NEURAL-NETWORKS; OPTIMIZATION; INTELLIGENCE; CLASSIFIERS; MACHINE; DEFAULT; DESIGN; TESTS;

DOI:

10.1016/j.ejor.2021.03.006

Chinese Library Classification (CLC) number:

C93 [Management Science];

Subject classification code:

12 ; 1201 ; 1202 ; 120202 ;

Abstract:

Developing accurate analytical credit scoring models has become a major focus for financial institutions. For this purpose, numerous classification algorithms have been proposed for credit scoring. However, the application of deep learning algorithms for classification has been largely ignored in the credit scoring literature. The main motivation for this research is to consider the appropriateness of deep learning algorithms for credit scoring. To this end, two deep learning architectures are constructed, namely a multilayer perceptron network and a deep belief network, and their performance is compared to that of two conventional methods and two ensemble methods for credit scoring. The models are then evaluated using a range of credit scoring data sets and performance measures. Furthermore, Bayesian statistical testing procedures are introduced in the context of credit scoring and compared to frequentist non-parametric testing procedures which have traditionally been considered best practice in credit scoring. This comparison will highlight the benefits of Bayesian statistical procedures and secure empirical findings. Two main conclusions emerge from comparing the different classification algorithms for credit scoring. Firstly, the ensemble method, XGBoost, is the best performing method for credit scoring of all the methods considered here. Secondly, deep neural networks do not outperform their shallower counterparts and are considerably more computationally expensive to construct. Therefore, deep learning algorithms do not seem to be appropriate models for credit scoring based on this comparison and XGBoost should be preferred over the other credit scoring methods considered here when classification performance is the main objective of credit scoring activities. © 2021 Elsevier B.V. All rights reserved.


Pages: 292-305

Page count: 14

References: 79 in total
