Machine Learning-Based Empirical Investigation for Credit Scoring in Vietnam's Banking

Cited by: 3
Authors
Khanh Quoc Tran [1, 2]
Binh Van Duong [1, 2]
Linh Quang Tran [1, 2]
An Le-Hoai Tran [1, 2]
An Trong Nguyen [1, 2]
Kiet Van Nguyen [1, 2]
Affiliations
[1] University of Information Technology, Ho Chi Minh City, Vietnam
[2] Vietnam National University, Ho Chi Minh City, Vietnam
Source
ADVANCES AND TRENDS IN ARTIFICIAL INTELLIGENCE. FROM THEORY TO PRACTICE, IEA/AIE 2021, PT II | 2021 / Vol. 12799
Keywords
Credit scoring; Prediction; Machine learning; Ensemble models; Data mining; PERFORMANCE
DOI
10.1007/978-3-030-79463-7_48
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In th[…]ons for credit scoring in Vietnam with machine learning models based on our submissions for the Kalapa Credit Score Challenge. We conduct experiments with modern machine learning methods based on ensemble learning: LightGBM, CatBoost, and Random Forest. Our experimental results are better than those of single-model algorithms such as Support Vector Machine (SVM) and Logistic Regression. We achieve an F1-score of 0.83 (Random Forest), placing sixth on the leaderboard. We then analyze the advantages and disadvantages of the models used, propose suitable measures for similar problems in the future, and evaluate the results to select the best model. To the best of our knowledge, this is the first work of its kind in Vietnamese banking.
Pages: 564-574
Page count: 11