Credit scoring based on a Bagging-cascading boosted decision tree

Times Cited: 0
Authors
Zou, Yao [1 ]
Gao, Changchun [1 ]
Xia, Meng [2 ]
Pang, Congyuan [1 ]
Affiliations
[1] Donghua University, Glorious Sun School of Business & Management, Shanghai, People's Republic of China
[2] Donghua University, College of Information Science & Technology, Shanghai 201620, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Credit scoring; ensemble learning; boosting; Bagging-cascading; feature selection; classification; models; algorithms
DOI
10.3233/IDA-216228
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Establishing precise credit scoring models to predict the probability of default is vital for credit risk management. Machine learning models, especially ensemble learning approaches, have substantially improved credit scoring performance. Bagging ensembles improve performance by reducing prediction variance, while boosting ensembles reduce prediction error by controlling prediction bias. In this study, we propose a hybrid ensemble method that combines the advantages of the Bagging ensemble strategy with the boosting optimization pattern, balancing the bias-variance tradeoff. The proposed method uses XGBoost as the base learner, which ensures low-bias predictions, and trains the base learners with a Bagging strategy to prevent over-fitting. The resulting Bagging-boosting ensembles are then assembled in a cascading fashion, making the new hybrid algorithm well suited to balancing the bias-variance tradeoff in credit scoring. Experimental results on the Australian, German, Japanese, and Taiwan datasets show that the proposed Bagging-cascading boosted decision tree yields more accurate credit scoring results.
Pages: 1557-1578
Number of pages: 22