Feature Selection Engineering for Credit Risk Assessment in Retail Banking

Cited by: 6
Authors
Jemai, Jaber [1]
Zarrad, Anis [2]
Affiliations
[1] Higher Coll Technol, Comp & Informat Syst Div, Abu Dhabi 32092, U Arab Emirates
[2] Univ Birmingham Dubai, Sch Comp Sci, Dubai 73000, U Arab Emirates
Keywords
feature selection engineering; credit risk assessment; machine learning; classification; ART CLASSIFICATION ALGORITHMS; SCORING MODELS;
DOI
10.3390/info14030200
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In classification, feature selection engineering helps in choosing the most relevant data attributes to learn from. It determines the set of features to be rejected on the assumption that they contribute little to discriminating between the labels. The effectiveness of a classifier depends mainly on the set of selected features. In this paper, we identify the best features to learn from in the context of credit risk assessment in the financial industry. Financial institutions face two risks: approving the loan request of a customer who may default later, or rejecting the request of a customer who would have repaid the debt without default. We propose a feature selection engineering approach to identify the main features to refer to in assessing the risk of a loan request. We use different feature selection methods, including univariate feature selection (UFS), recursive feature elimination (RFE), feature importance using decision trees (FIDT), and the information value (IV). We implement two variants of the XGBoost classifier on the open data set provided by the Lending Club platform to evaluate and compare the performance of the different feature selection methods. The results show that the four feature selection techniques find the most relevant features.
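Of the four methods named in the abstract, the information value (IV) is the easiest to illustrate by hand. The sketch below uses the standard weight-of-evidence definition, IV = sum over bins of (%good - %bad) * ln(%good / %bad). It is not the authors' implementation: the abstract does not describe their binning or code, and the function name, the smoothing constant, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def information_value(categories, defaults, smoothing=0.5):
    """Information value of one categorical feature against a binary label.

    categories: feature value for each loan (toy data, not Lending Club data)
    defaults:   1 if the loan defaulted ("bad"), 0 otherwise ("good")
    smoothing:  count added to every cell so that empty cells do not
                produce log(0); the value 0.5 is an assumption
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [good count, bad count]
    for cat, label in zip(categories, defaults):
        counts[cat][1 if label else 0] += 1

    n_bins = len(counts)
    total_good = sum(good for good, _ in counts.values()) + smoothing * n_bins
    total_bad = sum(bad for _, bad in counts.values()) + smoothing * n_bins

    iv = 0.0
    for good, bad in counts.values():
        pct_good = (good + smoothing) / total_good
        pct_bad = (bad + smoothing) / total_bad
        woe = math.log(pct_good / pct_bad)  # weight of evidence for this bin
        iv += (pct_good - pct_bad) * woe
    return iv


# Hypothetical features: "grade" separates defaulters, "noise" does not.
defaults = [0, 0, 0, 1, 0, 1, 1, 1]
grade = ["A", "A", "A", "A", "B", "B", "B", "B"]
noise = ["x", "y", "y", "x", "x", "x", "y", "y"]

iv_grade = information_value(grade, defaults)
iv_noise = information_value(noise, defaults)
print(f"IV(grade) = {iv_grade:.3f}, IV(noise) = {iv_noise:.3f}")
```

A feature whose bins have the same share of good and bad loans scores an IV of zero, so ranking features by IV and dropping the lowest ones is one way to carry out the rejection step the abstract describes.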
Pages: 11