Combining weighted SMOTE with ensemble learning for the class-imbalanced prediction of small business credit risk

Cited by: 27
Authors
Abedin, Mohammad Zoynul [1, 2]
Guotai, Chi [3]
Hajek, Petr [4]
Zhang, Tong [5]
Affiliations
[1] Teesside Univ, Int Business Sch, Dept Finance Performance & Mkt, Middlesbrough TS1 3BX, Cleveland, England
[2] Hajee Mohammad Danesh Sci & Technol Univ, Dept Finance & Banking, Dinajpur 5200, Bangladesh
[3] Dalian Univ Technol, Fac Econ & Management, Dalian 116024, Peoples R China
[4] Univ Pardubice, Inst Syst Engn & Informat, Fac Econ & Adm, Sci & Res Ctr, Pardubice 53210, Czech Republic
[5] Dalian Univ Technol, Fac Econ & Management, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Small business; Credit risk; Imbalanced data; Oversampling; Weighted SMOTE; Ensemble learning; DEFAULT PREDICTION; EMPIRICAL-ANALYSIS; CLASSIFICATION ALGORITHMS; MEDIUM ENTERPRISES; FEATURE-SELECTION; NEURAL-NETWORKS; MACHINE; SMES; MODEL; COST;
DOI
10.1007/s40747-021-00614-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In small business credit risk assessment, the default and nondefault classes are highly imbalanced. To overcome this problem, this study proposes an extended ensemble approach rooted in the weighted synthetic minority oversampling technique (WSMOTE), which is called WSMOTE-ensemble. The proposed ensemble classifier hybridizes WSMOTE and Bagging with sampling composite mixtures to guarantee the robustness and variability of the generated synthetic instances and, thus, minimize the small business class-skewed constraints linked to default and nondefault instances. The original small business dataset used in this study comprises 3111 records from a Chinese commercial bank. Through a thorough experimental study of extensively skewed data-modeling scenarios, a multilevel experimental setting was established for a rare event domain. Based on appropriate evaluation measures, this study suggests that the random forest classifier used in the WSMOTE-ensemble model provides a good trade-off between performance on the default class and performance on the nondefault class. The ensemble solution improved the accuracy of the minority class by 15.16% in comparison with its competitors. This study also shows that sampling methods outperform nonsampling algorithms. With these contributions, this study fills a noteworthy knowledge gap and adds several unique insights regarding the prediction of small business credit risk.
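The abstract describes combining weighted SMOTE with Bagging but does not give the weighting rule or the sampling details. The sketch below is therefore only an illustration under stated assumptions: the per-instance `weights` are supplied by the caller, synthetic points are drawn by plain SMOTE-style interpolation between minority neighbors, and each bag pairs a bootstrap of the majority class with an oversampled minority class. The function names `weighted_smote` and `balanced_bags` are hypothetical, and training the base learner (the study reports a random forest) is left out.

```python
import math
import random


def weighted_smote(minority, weights, n_synthetic, k=3, rng=None):
    """Create n_synthetic rows by interpolating between minority rows.

    weights[i] is the relative chance that minority[i] is picked as the
    seed row. ASSUMPTION: weights come from the caller; the abstract does
    not define how they are computed. Needs at least two minority rows.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a seed row with probability proportional to its weight.
        i = rng.choices(range(len(minority)), weights=weights, k=1)[0]
        seed_row = minority[i]
        # k nearest minority neighbors of the seed (Euclidean distance).
        others = [row for j, row in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda row: math.dist(seed_row, row))[:k]
        neighbor = rng.choice(neighbors)
        # Place the synthetic row somewhere on the seed-neighbor segment.
        gap = rng.random()
        synthetic.append([s + gap * (n - s) for s, n in zip(seed_row, neighbor)])
    return synthetic


def balanced_bags(majority, minority, weights, n_bags=5, seed=0):
    """Build n_bags balanced training sets as (rows, labels) pairs.

    Label 0 marks the majority (nondefault) class, 1 the minority (default)
    class. ASSUMPTION: the majority class is at least as large as the
    minority class.
    """
    rng = random.Random(seed)
    bags = []
    for _ in range(n_bags):
        # Bootstrap the majority class, then top up the minority class.
        boot_majority = rng.choices(majority, k=len(majority))
        n_needed = len(boot_majority) - len(minority)
        synthetic = weighted_smote(minority, weights, n_needed, rng=rng)
        rows = boot_majority + minority + synthetic
        labels = [0] * len(boot_majority) + [1] * (len(minority) + len(synthetic))
        bags.append((rows, labels))
    return bags
```

Each returned bag could then be passed to any classifier, with predictions combined by majority vote. Because every bag draws its own synthetic rows, the bags differ from one another, which is the kind of variability the abstract attributes to the ensemble.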
Pages: 3559-3579
Page count: 21
Related papers
67 in total
  • [1] Tax Default Prediction Using Feature Transformation-Based Machine Learning
    Abedin, Mohammad Zoynul
    Chi, Guotai
    Uddin, Mohammed Mohi
    Satu, Md Shahriare
    Khan, Imran
    Hajek, Petr
    [J]. IEEE ACCESS, 2021, 9 : 19864 - 19881
  • [2] Weighted SMOTE-ensemble algorithms: Evidence from Chinese Imbalance Credit Approval Instances
    Abedin, Mohammad Zoynul
    Guotai, Chi
    Moula, Fahmida E.
    [J]. 2019 2ND INTERNATIONAL CONFERENCE ON DATA INTELLIGENCE AND SECURITY (ICDIS 2019), 2019, : 208 - 211
  • [3] An optimized support vector machine intelligent technique using optimized feature selection methods: evidence from Chinese credit approval data
    Abedin, Mohammad Zoynul
    Guotai, Chi
    Fahmida-E-Moula
    Zhang, Tong
    Hassan, M. Kabir
    [J]. JOURNAL OF RISK MODEL VALIDATION, 2019, 13 (02): : 1 - 46
  • [4] Topological applications of multilayer perceptrons and support vector machines in financial decision support systems
    Abedin, Mohammad Zoynul
    Guotai, Chi
    Fahmida-E-Moula
    Azad, A. S. M. Sohel
    Khan, Mohammed Shamim Uddin
    [J]. INTERNATIONAL JOURNAL OF FINANCE & ECONOMICS, 2019, 24 (01) : 474 - 507
  • [5] Credit default prediction using a support vector machine and a probabilistic neural network
    Abedin, Mohammad Zoynul
    Guotai, Chi
    Colombage, Sisira
    Fahmida-E-Moula
    [J]. JOURNAL OF CREDIT RISK, 2018, 14 (02): : 1 - 27
  • [6] Bank competition, lending relationships and firm default risk: An investigation of Italian SMEs
    Agostino, Mariarosaria
    Gagliardi, Francesca
    Trivieri, Francesco
    [J]. INTERNATIONAL SMALL BUSINESS JOURNAL-RESEARCHING ENTREPRENEURSHIP, 2012, 30 (08): : 907 - 943
  • [7] Modelling credit risk for SMEs: Evidence from the US market
    Altman, Edward I.
    Sabato, Gabriele
    [J]. ABACUS-A JOURNAL OF ACCOUNTING FINANCE AND BUSINESS STUDIES, 2007, 43 (03): : 332 - 357
  • [8] Probabilistic modeling and visualization for bankruptcy prediction
    Antunes, Francisco
    Ribeiro, Bernardete
    Pereira, Francisco
    [J]. APPLIED SOFT COMPUTING, 2017, 60 : 831 - 843
  • [9] Early stage SME bankruptcy: does the local banking market matter?
    Arcuri, Giuseppe
    Levratto, Nadine
    [J]. SMALL BUSINESS ECONOMICS, 2020, 54 (02) : 421 - 436
  • [10] Behr P, 2007, J SMALL BUS MANAGE, V45, P194