Feature Selection Engineering for Credit Risk Assessment in Retail Banking

Cited by: 9
Authors
Jemai, Jaber [1]
Zarrad, Anis [2]
Affiliations
[1] Higher Coll Technol, Comp & Informat Syst Div, Abu Dhabi 32092, U Arab Emirates
[2] Univ Birmingham Dubai, Sch Comp Sci, Dubai 73000, U Arab Emirates
Keywords
feature selection engineering; credit risk assessment; machine learning; classification; ART CLASSIFICATION ALGORITHMS; SCORING MODELS
DOI
10.3390/info14030200
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In classification, feature selection engineering helps choose the most relevant data attributes to learn from. It determines which features to reject on the assumption that they contribute little to discriminating between the labels. The effectiveness of a classifier depends largely on the set of selected features. In this paper, we identify the best features to learn from in the context of credit risk assessment in the financial industry. Financial institutions face the risk of approving the loan request of a customer who may later default, or of rejecting the request of a customer who would repay their debt without defaulting. We propose a feature selection engineering approach to identify the main features to consult when assessing the risk of a loan request. We use several feature selection methods: univariate feature selection (UFS), recursive feature elimination (RFE), feature importance using decision trees (FIDT), and the information value (IV). We implement two variants of the XGBoost classifier on the open data set provided by the Lending Club platform to evaluate and compare the performance of the feature selection methods. The results show that all four feature selection techniques find the most relevant features.
Pages: 11
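Of the four methods named in the abstract, the information value (IV) is the one that reduces to a short formula, so a minimal sketch may help. It uses the commonly cited definition, IV = sum over categories of (share of good loans - share of bad loans) * ln(share of good / share of bad). The function name `information_value`, the `smoothing` parameter, and the toy data are assumptions made for illustration; the abstract does not say how the authors binned or smoothed the Lending Club features.

```python
import math
from collections import defaultdict


def information_value(values, defaults, smoothing=0.5):
    """Information value (IV) of one categorical (or pre-binned) feature.

    values    -- feature value (category or bin label) for each loan
    defaults  -- 1 if the loan defaulted ("bad"), 0 if it was repaid ("good")
    smoothing -- count added to every cell so an empty cell cannot give log(0);
                 an assumption of this sketch, not taken from the paper
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for value, defaulted in zip(values, defaults):
        if defaulted:
            bad[value] += 1
        else:
            good[value] += 1

    categories = set(good) | set(bad)
    total_good = sum(good.values()) + smoothing * len(categories)
    total_bad = sum(bad.values()) + smoothing * len(categories)

    iv = 0.0
    for category in categories:
        pct_good = (good[category] + smoothing) / total_good
        pct_bad = (bad[category] + smoothing) / total_bad
        weight_of_evidence = math.log(pct_good / pct_bad)
        iv += (pct_good - pct_bad) * weight_of_evidence
    return iv


# Hypothetical toy data: home ownership separates the outcomes,
# while the parity of an ID digit carries no signal.
home = ["RENT"] * 50 + ["OWN"] * 50
id_digit = ["even", "odd"] * 50
defaulted = [1] * 30 + [0] * 20 + [1] * 10 + [0] * 40

iv_home = information_value(home, defaulted)
iv_noise = information_value(id_digit, defaulted)

# Rank features by IV, highest first; low-IV features are rejection candidates.
ranked = sorted(
    {"home": iv_home, "id_digit": iv_noise}.items(),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

A feature whose good and bad loans are spread identically across its categories scores an IV of zero, which is why the ID-digit feature ranks last here. Continuous features would need to be binned before being passed to this function.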