Evolving Transparent Credit Risk Models: A Symbolic Regression Approach Using Genetic Programming

Cited by: 5
Authors
Sotiropoulos, Dionisios N. [1]
Koronakos, Gregory [1]
Solanakis, Spyridon V. [1]
Affiliations
[1] Univ Piraeus, Dept Informat, 80 Karaoli & Dimitriou Str, Piraeus 18534, Greece
Keywords
credit risk assessment; neural networks; support vector machines; genetic programming; radial basis functions networks; LOGISTIC-REGRESSION; NETWORK
DOI
10.3390/electronics13214324
CLC classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Credit scoring is a cornerstone of financial risk management, enabling financial institutions to assess the likelihood of loan default. However, widely recognized contemporary credit risk metrics, like FICO (Fair Isaac Corporation) or Vantage scores, remain proprietary and inaccessible to the public. This study aims to devise an alternative credit scoring metric that mirrors the FICO score, using an extensive dataset from Lending Club. The challenge lies in the limited available insights into both the precise analytical formula and the comprehensive suite of credit-specific attributes integral to the FICO score's calculation. Our proposed metric leverages basic information provided by potential borrowers, eliminating the need for extensive historical credit data. We aim to articulate this credit risk metric in a closed analytical form with variable complexity. To achieve this, we employ a symbolic regression method anchored in genetic programming (GP). Here, the Occam's razor principle guides evolutionary bias toward simpler, more interpretable models. To ascertain our method's efficacy, we juxtapose the approximation capabilities of GP-based symbolic regression with established machine learning regression models, such as Gaussian Support Vector Machines (GSVMs), Multilayer Perceptrons (MLPs), Regression Trees, and Radial Basis Function Networks (RBFNs). Our experiments indicate that GP-based symbolic regression offers accuracy comparable to these benchmark methodologies. Moreover, the resultant analytical model offers invaluable insights into credit risk evaluation mechanisms, enabling stakeholders to make informed credit risk assessments. This study contributes to the growing demand for transparent machine learning models by demonstrating the value of interpretable, data-driven credit scoring models.
Pages: 37
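The abstract says that Occam's razor biases the evolutionary search toward simpler models but does not give the mechanism. One common way to express such a bias is parsimony pressure: a size penalty added to the fitting error. The sketch below is a minimal illustration of that idea, not the authors' implementation. The tuple-based tree encoding, the primitive set `OPS`, the feature names `income` and `dti`, the penalty weight `alpha`, and the toy data are all assumptions made for this example.

```python
import operator

# Hypothetical primitive set; the abstract does not list the paper's function set.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, row):
    """Evaluate an expression tree on one record (a dict of feature values)."""
    if isinstance(tree, tuple):          # internal node: (op, left, right)
        op, left, right = tree
        return OPS[op](evaluate(left, row), evaluate(right, row))
    if isinstance(tree, str):            # leaf: feature name
        return row[tree]
    return float(tree)                   # leaf: numeric constant


def size(tree):
    """Count the nodes in a tree, used here as a simple complexity measure."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


def mse(tree, rows, targets):
    """Mean squared error of the tree's predictions against the targets."""
    errors = [(evaluate(tree, r) - t) ** 2 for r, t in zip(rows, targets)]
    return sum(errors) / len(errors)


def penalized_fitness(tree, rows, targets, alpha=0.01):
    """Lower is better: fitting error plus an assumed linear size penalty."""
    return mse(tree, rows, targets) + alpha * size(tree)


# Toy records with made-up feature names and values (not Lending Club data).
rows = [
    {"income": 1.0, "dti": 0.5},
    {"income": 2.0, "dti": 0.25},
    {"income": 3.0, "dti": 1.0},
]
targets = [2.0 * r["income"] - r["dti"] for r in rows]

# Two candidates that fit the toy target equally well.
simple = ("sub", ("mul", 2.0, "income"), "dti")
bloated = ("add", simple, ("sub", "dti", "dti"))  # adds a redundant zero term

best = min([bloated, simple], key=lambda t: penalized_fitness(t, rows, targets))
```

Both candidates reproduce the toy target exactly, so the size penalty alone decides the ranking and `best` is the shorter expression. In a full GP run, the same penalized score would drive selection across a population of randomly generated and recombined trees.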