Using semi-supervised classifiers for credit scoring

Cited by: 27

Authors:

Kennedy, K. [1]

Mac Namee, B. [1]

Delany, S. J. [1]

Affiliation:

[1] Dublin Inst Technol, Appl Intelligence Res Ctr, Dublin 8, Ireland

Source:

JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY | 2013 / Vol. 64 / Issue 04

Keywords:

banking; credit scoring; low-default portfolio; supervised classification; one-class classification; benchmarking; BANKRUPTCY PREDICTION; MODELS; PROBABILITIES;

DOI:

10.1057/jors.2011.30

Chinese Library Classification (CLC) number:

C93 [Management Science];

Discipline classification code:

12 ; 1201 ; 1202 ; 120202 ;

Abstract:

In credit scoring, low-default portfolios (LDPs) are those for which very little default history exists. This makes it problematic for financial institutions to estimate a reliable probability of a customer defaulting on a loan. Banking regulation (Basel II Capital Accord), and best practice, however, necessitate an accurate and valid estimate of the probability of default. In this article the suitability of semi-supervised one-class classification (OCC) algorithms as a solution to the LDP problem is evaluated. The performance of OCC algorithms is compared with the performance of supervised two-class classification algorithms. This study also investigates the suitability of oversampling, which is a common approach to dealing with LDPs. Assessment of the performance of one- and two-class classification algorithms using nine real-world banking data sets, which have been modified to replicate LDPs, is provided. Our results demonstrate that only in the near or complete absence of defaulters should semi-supervised OCC algorithms be used instead of supervised two-class classification algorithms. Furthermore, we demonstrate for data sets whose class labels are unevenly distributed that optimising the threshold value on classifier output yields, in many cases, an improvement in classification performance. Finally, our results suggest that oversampling produces no overall improvement to the best performing two-class classification algorithms. Journal of the Operational Research Society (2013) 64, 513-529. doi:10.1057/jors.2011.30
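The abstract's point about optimising the threshold on classifier output for unevenly distributed class labels can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy scores and labels are invented, the function names (`score_at_threshold`, `best_threshold`) are hypothetical, and the harmonic mean of sensitivity and specificity is simply one reasonable criterion for imbalanced data, not necessarily the measure used in the article.

```python
from typing import Sequence, Tuple


def harmonic_mean(sensitivity: float, specificity: float) -> float:
    """Harmonic mean of the two class-wise accuracies (0 if both are 0)."""
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


def score_at_threshold(scores: Sequence[float], labels: Sequence[int],
                       threshold: float) -> float:
    """Quality of the rule 'predict default when score >= threshold'.

    Labels: 1 = defaulter, 0 = non-defaulter (an assumed encoding).
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return harmonic_mean(sensitivity, specificity)


def best_threshold(scores: Sequence[float],
                   labels: Sequence[int]) -> Tuple[float, float]:
    """Scan every observed score as a candidate cut-off and keep the best."""
    candidates = sorted(set(scores))
    best = max(candidates, key=lambda t: score_at_threshold(scores, labels, t))
    return best, score_at_threshold(scores, labels, best)


# Invented toy data: 8 non-defaulters and 2 defaulters (an imbalanced sample).
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]

default_quality = score_at_threshold(scores, labels, 0.5)
threshold, tuned_quality = best_threshold(scores, labels)
print(f"fixed 0.5 cut-off: {default_quality:.3f}")
print(f"tuned cut-off {threshold}: {tuned_quality:.3f}")
```

On this toy sample the fixed 0.5 cut-off misses one of the two defaulters, while the tuned cut-off catches both at the cost of two false alarms. In practice the threshold would be chosen on a validation set rather than on the data used for the final evaluation.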


Pages: 513-529

Page count: 17
