A decision tree classifier for credit assessment problems in big data environments

Cited by: 13
Authors
Chern, Ching-Chin [1]
Lei, Weng-U [1]
Huang, Kwei-Long [2]
Chen, Shu-Yi [3]
Affiliations
[1] Natl Taiwan Univ, Dept Informat Management, 50,Lane 144,Sec 4,Keelung Rd, Taipei 106, Taiwan
[2] Natl Taiwan Univ, Inst Ind Engn, Room 109,IYC Bldg 1,Sec 4,Roosevelt Rd, Taipei 106, Taiwan
[3] Ming Chuan Univ, Dept Informat Management, 5 De Ming Rd, Guishan Township 333, Taoyuan County, Taiwan
Keywords
Credit assessment; Decision tree; Big data; Data mining; Record linkage; Ensemble classification; Regression; Strategy
DOI
10.1007/s10257-021-00511-w
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
Financial institutions have long sought to reduce the risk of consumer loans by improving their credit assessment methods. As new information and network technologies enable massive data collection from many different sources, credit assessment has become a challenge in the big data environment. Complicated processing is required to deal with vast, messy data sources and ever-changing loan regulations. This study proposes a decision tree credit assessment approach (DTCAA) to solve the credit assessment problem in a big data environment. Decision tree models offer good interpretability and easily understood rules, with competitive performance. In addition, DTCAA features various data consolidation methods that eliminate some of the noise in raw data and facilitate the construction of decision trees. Using a large-volume data set from one of the biggest car collateral loan companies in Taiwan, this study verifies the efficiency and validity of DTCAA. The results indicate that DTCAA is competitive in various situations and across multiple factors, supporting its applicability to credit assessment practice.
Pages: 363-386
Number of pages: 24