Feature Enhanced Ensemble Modeling With Voting Optimization for Credit Risk Assessment

Cited by: 3
Authors
Yang, Dongqi [1]
Xiao, Binqing [1]
Affiliations
[1] Nanjing Univ, Sch Management & Engn, Nanjing 210008, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Natural Science Foundation of China;
Keywords
Risk management; Predictive models; Data models; Adaptation models; Accuracy; Training; Soft sensors; Credit risk; ensemble modeling; feature enhancement; model interpretability; voting optimization; PERFORMANCE; PREDICTION;
DOI
10.1109/ACCESS.2024.3445499
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Machine learning methods have gained widespread utilization in small and micro enterprise credit risk assessment. However, the practical application of these methods encounters a conundrum involving accuracy and interpretability. In this study, a multi-stage ensemble model is proposed to enhance the model's interpretability. To strengthen predictive portraits, a multi-feature enhancement method is proposed to integrate non-financial behavioral information and soft information on credit rating into the annual loan ledger data, thereby bolstering the explanatory capacity of the features. To rectify the issue of data imbalance and avoid information loss, a new bagging-based oversampling method is proposed to oversample the minority class samples in multiple parallelized subsets divided by the bagging strategy. To unleash the performance potential of base classifiers, a new voting-weight optimization method is proposed to optimize the soft voting weights of the candidate base classifiers. The experimental results on an annual loan ledger dataset of a commercial bank in China (with an accuracy of 97.9%, an area under the curve of 0.97, a logistic loss of 0.07, a Brier score of 0.01, and a Kolmogorov-Smirnov statistic of 0.38) and on five other public datasets indicate excellent model fit. By focusing on the widespread soft information and data structures characteristic of SME loan risk assessment data, an additional SHAP model explanation method enhances interpretability. This method reveals that the enhanced 'debt-to-income ratio,' along with non-financial behavioral information and features derived from soft information, are essential for predicting loan defaults. Such enhancements help to alleviate the issue of information asymmetry in SME loan risk assessment.
Pages: 115124 - 115136
Number of pages: 13
Related papers
50 in total
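The abstract mentions optimizing the soft voting weights of candidate base classifiers but does not say how the optimization is carried out. The following is only a minimal sketch of the general idea, assuming a coarse grid search over non-negative weights that sum to 1 and using logistic loss on a validation set as the objective. The function names, the grid search itself, and the toy probabilities are illustrative assumptions, not the authors' method or data.

```python
import math
from itertools import product


def log_loss(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def soft_vote(probas, weights):
    """Weighted average of the default probabilities from each classifier."""
    n_samples = len(probas[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probas))
        for i in range(n_samples)
    ]


def optimize_weights(probas, y_true, steps=10):
    """Grid search over the weight simplex (assumed stand-in optimizer)."""
    k = len(probas)
    best_w, best_loss = None, float("inf")
    # Enumerate the first k-1 weights; the last one is whatever remains.
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) > steps:
            continue
        w = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        loss = log_loss(y_true, soft_vote(probas, w))
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Hypothetical validation labels (1 = default) and classifier outputs.
y_val = [0, 0, 1, 1, 0, 1]
probas = [
    [0.1, 0.2, 0.9, 0.8, 0.1, 0.9],  # informative classifier
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # uninformative classifier
    [0.6, 0.7, 0.4, 0.3, 0.6, 0.4],  # misleading classifier
]

weights, loss = optimize_weights(probas, y_val)
uniform_loss = log_loss(y_val, soft_vote(probas, [1 / 3] * 3))
print("weights:", weights)
print("optimized loss:", round(loss, 4), "uniform loss:", round(uniform_loss, 4))
```

On this toy data the search shifts weight toward the informative classifier, which lowers the validation loss relative to equal weighting. A grid search scales poorly with the number of classifiers, so a practical setup would likely use a continuous optimizer instead.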
  • [41] An ensemble model of QSAR tools for regulatory risk assessment
    Pradeep, Prachi
    Povinelli, Richard J.
    White, Shannon
    Merrill, Stephen J.
    JOURNAL OF CHEMINFORMATICS, 2016, 8 : 1 - 9
  • [42] Application of conditional value at risk for credit risk optimization
    Misankova, Maria
    Spuchlakova, Erika
    SELECTED PAPERS OF 5TH WORLD CONFERENCE ON BUSINESS, ECONOMICS AND MANAGEMENT (BEM-2016), 2017, : 146 - 152
  • [43] Cancer Classification Utilizing Voting Classifier with Ensemble Feature Selection Method and Transcriptomic Data
    Khatun, Rabea
    Akter, Maksuda
    Islam, Md. Manowarul
    Uddin, Md. Ashraf
    Talukder, Md. Alamin
    Kamruzzaman, Joarder
    Azad, Akm
    Paul, Bikash Kumar
    Almoyad, Muhammad Ali Abdulllah
    Aryal, Sunil
    Moni, Mohammad Ali
    GENES, 2023, 14 (09)
  • [44] THE IMPACT OF CREDIT RISK ASSESSMENT ON CREDIT ACTIVITY OF COMMERCIAL BANKS
    Ljubic, Marijana
    Pavlovic, Vladan
    Milacic, Srecko
    ECONOMIC AND SOCIAL DEVELOPMENT: 5TH EASTERN EUROPEAN ECONOMIC AND SOCIAL DEVELOPMENT CONFERENCE ON SOCIAL RESPONSIBILITY, 2015, : 29 - 37
  • [45] An ensemble machine learning approach for forecasting credit risk of agricultural SMEs' investments in agriculture 4.0 through supply chain finance
    Belhadi, Amine
    Kamble, Sachin S.
    Mani, Venkatesh
    Benkhati, Imane
    Touriki, Fatima Ezahra
    ANNALS OF OPERATIONS RESEARCH, 2021, 345 (2) : 779 - 807
  • [46] A machine learning approach combining expert knowledge with genetic algorithms in feature selection for credit risk assessment
    Lappas, Pantelis Z.
    Yannacopoulos, Athanasios N.
    APPLIED SOFT COMPUTING, 2021, 107
  • [47] Integrating data augmentation and hybrid feature selection for small sample credit risk assessment with high dimensionality
    Zhang, Xiaoming
    Yu, Lean
    Yin, Hang
    Lai, Kin Keung
    COMPUTERS & OPERATIONS RESEARCH, 2022, 146
  • [48] OptSelect: An algorithm for ensemble feature selection and stability assessment
    Lee, Eva K.
    Uppal, Karan
    2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020, : 1979 - 1986
  • [49] Random Subspace Ensemble With Enhanced Feature for Hyperspectral Image Classification
    Jiang, Mengying
    Fang, Yi
    Su, Yuanchao
    Cai, Guofa
    Han, Guojun
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2020, 17 (08) : 1373 - 1377
  • [50] Credit risk optimization using factor models
    Saunders, David
    Xiouros, Costas
    Zenios, Stavros A.
    ANNALS OF OPERATIONS RESEARCH, 2007, 152 : 49 - 77