An Explainable Credit Scoring Framework: A Use Case of Addressing Challenges in Applied Machine Learning

Cited by: 1
Authors
Guntay, Levent [1]
Bozan, Erdal [2]
Tigrak, Umit [2]
Durdu, Tolga [2]
Ozkahya, Gulcin Ece [2]
Affiliations
[1] Ozyegin Univ, Ctr Financial Engn, Sch Business, Istanbul, Turkey
[2] R&D Ctr Fibabanka, Istanbul, Turkey
Source
2022 IEEE TECHNOLOGY AND ENGINEERING MANAGEMENT CONFERENCE (TEMSCON EUROPE) | 2022
Keywords
Explainable Model; Credit Scoring; Surrogate Modeling; Machine Learning
DOI
10.1109/TEMSCONEUROPE54743.2022.9802029
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
While Machine Learning (ML) classification algorithms can accurately classify a borrower's credit risk, the determinants of the credit score cannot be interpreted clearly by customers, decision makers and auditors. The lack of transparency of black-box credit scoring mechanisms reduces trust in the banking system and has serious implications for the financing and growth of businesses. Recent regulations in the European Union and the United States require that credit decision mechanisms be explainable and transparent. We present a framework for developing an explainable credit scoring model. Our scientific novelty is to follow a simple and parsimonious Surrogate approach for credit scoring. This approach estimates an explainable white-box model that effectively fits the in-sample forecasts of the most accurate "black-box" model. We implement the Surrogate credit risk framework using check transactions data provided by a Turkish bank. We find that the Surrogate tree's performance is sufficiently close to the performance of the most accurate black-box XGBoost model. Overall, our findings show that it is possible to develop a high-performing explainable credit scoring model with a minimal decrease in model accuracy.
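The surrogate idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (generated with scikit-learn's `make_classification`, standing in for the bank's check-transaction data), scikit-learn's `GradientBoostingClassifier` stands in for the XGBoost black box, and the tree depth of 3 and all variable names are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic stand-in for borrower features and default labels (assumption).
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 1: fit the black-box model on the true labels.
# GradientBoostingClassifier is used here in place of XGBoost (assumption).
black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Step 2: take the black box's in-sample forecasts (predicted default probabilities).
p_train = black_box.predict_proba(X_train)[:, 1]

# Step 3: fit a shallow, readable tree to those forecasts, not to the labels.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, p_train)

# Fidelity: how well the surrogate reproduces the black box in-sample (R^2).
fidelity_r2 = surrogate.score(X_train, p_train)

# Accuracy cost: out-of-sample AUC of each model against the true labels.
auc_black_box = roc_auc_score(y_test, black_box.predict_proba(X_test)[:, 1])
auc_surrogate = roc_auc_score(y_test, surrogate.predict(X_test))

# The surrogate's decision rules can be printed and audited directly.
feature_names = [f"x{i}" for i in range(X.shape[1])]
print(export_text(surrogate, feature_names=feature_names))
print(f"fidelity R^2: {fidelity_r2:.3f}")
print(f"AUC black box: {auc_black_box:.3f}, AUC surrogate: {auc_surrogate:.3f}")
```

Comparing the two AUC values gives a rough sense of the accuracy given up for interpretability; in practice the depth of the surrogate tree would be tuned to balance fidelity against readability.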
Pages: 222-227
Page count: 6