Effective Credit Risk Prediction Using Ensemble Classifiers With Model Explanation

Cited by: 5
Authors
Aruleba, Idowu [1]
Sun, Yanxia [1]
Affiliations
[1] Univ Johannesburg, Dept Elect & Elect Engn Sci, ZA-2006 Johannesburg, South Africa
Funding
National Research Foundation, Singapore;
Keywords
Predictive models; Random forests; Decision trees; Boosting; Ensemble learning; Accuracy; Training; Explainable AI; Machine learning; Financial management; Risk management; CART; credit risk; ensemble learning; XAI; machine learning; SHAP; MACHINE; CLASSIFICATION; REGRESSION; ALGORITHM; TREES;
DOI
10.1109/ACCESS.2024.3445308
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Credit risk prediction is a critical task in the financial industry, allowing lenders to assess the likelihood of a borrower defaulting on a loan. Traditional machine learning (ML) classifiers have been widely used for this purpose, but they often struggle with imbalanced data and lack interpretability, making it challenging for financial institutions to make informed decisions. This article explores the use of ensemble classifiers and the synthetic minority over-sampling technique with edited nearest neighbors (SMOTE-ENN) in credit risk prediction, aiming to improve classification performance. The ensemble classifiers include Random Forest, adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). The study addresses the class imbalance issue by combining ensemble classifiers with SMOTE-ENN resampling, and employs SHapley Additive exPlanations (SHAP) for model interpretability. The experimental results show that the proposed approach improves classification performance. Specifically, on the German credit dataset, XGBoost outperformed the other models with a Recall of 0.930 and a Specificity of 0.846, while Random Forest obtained the best performance on the Australian dataset, achieving a Recall of 0.907 and a Specificity of 0.922. Additionally, the integration of SHAP enhanced the models' transparency by providing insight into the contribution of individual features, which is crucial for informed financial decision-making.
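The abstract reports results as Recall and Specificity. As a minimal sketch of how these two metrics are conventionally computed from binary predictions, the following uses only the Python standard library. The function name, the toy labels, and the choice of 1 as the positive class are illustrative assumptions, not taken from the paper.

```python
def recall_and_specificity(y_true, y_pred, positive=1):
    """Return (recall, specificity) for binary labels.

    recall      = TP / (TP + FN)  (share of actual positives that were found)
    specificity = TN / (TN + FP)  (share of actual negatives that were found)
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against division by zero when a class is absent.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, specificity


# Toy labels, made up for illustration only (assumed: 1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
recall, specificity = recall_and_specificity(y_true, y_pred)
print(f"recall={recall:.3f} specificity={specificity:.3f}")
```

Reporting both values matters on imbalanced data, where a classifier can score well on one metric simply by favoring the majority class.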
Pages: 115015-115025
Number of pages: 11