A new approach for reject inference in credit scoring using kernel-free fuzzy quadratic surface support vector machines

Cited by: 43
Authors
Tian, Ye [1, 2]
Yong, Ziyang [1]
Luo, Jian [3]
Affiliations
[1] Southwestern Univ Finance & Econ, Sch Business Adm, Chengdu 611130, Sichuan, Peoples R China
[2] Southwestern Univ Finance & Econ, Collaborat Innovat Ctr Financial Secur, Chengdu 611130, Sichuan, Peoples R China
[3] Dongbei Univ Finance & Econ, Sch Management Sci & Engn, Dalian 116025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Reject inference; Credit scoring; Kernel-free quadratic surface SVM; Outlier detection; Online lending; CLASSIFICATION; OPTIMIZATION; MODEL; PERFORMANCE; SELECTION; IMPROVE; SVM;
DOI
10.1016/j.asoc.2018.08.021
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Credit scoring models have offered benefits to lenders and borrowers for many years. In practice, however, these models are normally built on a sample of accepted applicants and fail to consider the rejected applicants. This can cause sample bias, an important statistical issue, especially in online lending, where a large proportion of requests are rejected. Reject inference is a method for inferring how rejected applicants would have behaved had they been granted credit, and for incorporating this information to rebuild a more accurate credit scoring system. Motivated by the good performance of SVM models in this area, this paper proposes a new approach based on the state-of-the-art kernel-free fuzzy quadratic surface SVM model. Our method not only performs very well in classification, like some of the latest works, but also handles major issues of classical SVM models, such as searching for proper kernel functions and solving complex models. In addition, this paper is the first to eliminate the adverse effect of outliers in credit scoring. We use two real-world loan data sets to compare our method with several benchmark methods. One of the data sets is particularly valuable for the study of reject inference because the outcomes of rejected applicants are partially known. The numerical results strongly demonstrate the superiority of the proposed method in applicability, accuracy and efficiency. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 96-105
Page count: 10