Assessment of Creditworthiness Models Privacy-Preserving Training with Synthetic Data

Cited by: 3
Authors
Munoz-Cancino, Ricardo [1]
Bravo, Cristian [2]
Rios, Sebastian A. [1]
Grana, Manuel [3]
Affiliations
[1] Univ Chile, Dept Ind Engn, Business Intelligence Res Ctr CEINE, Beauchef 851, Santiago 8370456, Chile
[2] Univ Western Ontario, Dept Stat & Actuarial Sci, 1151 Richmond St, London, ON N6A 3K7, Canada
[3] Univ Basque Country, Computat Intelligence Grp, San Sebastian 20018, Spain
Source
HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2022 | 2022 / Vol. 13469
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Credit scoring; Synthetic data; Generative adversarial networks; Variational autoencoders;
DOI
10.1007/978-3-031-15471-3_32
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Credit scoring models are the primary instrument used by financial institutions to manage credit risk. Research on behavioral scoring is scarce because access to data is difficult: financial institutions must maintain the privacy and security of borrowers' information, which keeps them from collaborating in research initiatives. In this work, we present a methodology for evaluating the performance of models trained with synthetic data when they are applied to real-world data. Our results show that the quality of synthetic data deteriorates as the number of attributes increases. However, creditworthiness assessment models trained with synthetic data show a reduction of 3% in AUC and 6% in KS compared with models trained with real data. These results are significant because they encourage credit risk research based on synthetic data, making it possible to maintain borrowers' privacy and to address problems that until now have been hampered by the limited availability of information.
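The evaluation idea in the abstract (train on synthetic data, test on real data, compare AUC and KS) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy two-feature borrower data, the per-class Gaussian sampler standing in for the GAN and VAE generators named in the keywords, the nearest-centroid scorer, and the reading of the reported reduction as a relative change.

```python
import random
from statistics import mean, stdev

def split(scores, labels):
    """Separate scores of defaulters (label 1) from non-defaulters (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg

def auc(scores, labels):
    """Chance that a random defaulter outscores a random non-defaulter (ties count 0.5)."""
    pos, neg = split(scores, labels)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ks(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos, neg = split(scores, labels)
    gap = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        gap = max(gap, abs(cdf_pos - cdf_neg))
    return gap

def make_real(n, rng):
    """Hypothetical 'real' borrowers: two features, label 1 = default."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0
        x = (rng.gauss(1.0 if y else 0.0, 1.0), rng.gauss(0.5 if y else 0.0, 1.0))
        rows.append((x, y))
    return rows

def synthesize(rows, n, rng):
    """Stand-in generator (NOT a GAN or VAE): independent Gaussians fitted per class."""
    stats = {}
    for label in (0, 1):
        cols = list(zip(*[x for x, y in rows if y == label]))
        stats[label] = [(mean(c), stdev(c)) for c in cols]
    p_default = mean(y for _, y in rows)
    out = []
    for _ in range(n):
        y = 1 if rng.random() < p_default else 0
        out.append((tuple(rng.gauss(m, s) for m, s in stats[y]), y))
    return out

def fit(rows):
    """Nearest-centroid direction: mean(defaulters) - mean(non-defaulters)."""
    cols1 = list(zip(*[x for x, y in rows if y == 1]))
    cols0 = list(zip(*[x for x, y in rows if y == 0]))
    return [mean(a) - mean(b) for a, b in zip(cols1, cols0)]

def score(w, rows):
    """Linear score w . x for every row."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x, _ in rows]

rng = random.Random(0)
train, test = make_real(2000, rng), make_real(500, rng)
synthetic = synthesize(train, 2000, rng)
labels = [y for _, y in test]

# Both models are evaluated on the same real hold-out set.
results = {}
for name, data in (("real", train), ("synthetic", synthetic)):
    s = score(fit(data), test)
    results[name] = (auc(s, labels), ks(s, labels))

for i, metric in enumerate(("AUC", "KS")):
    real, synth = results["real"][i], results["synthetic"][i]
    print(f"{metric}: real={real:.3f} synthetic={synth:.3f} "
          f"relative change={(synth - real) / real:+.1%}")
```

The key design point is that the hold-out set is always real data, so any gap between the two rows of `results` reflects what was lost by training on the generated sample rather than on the original one.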
Pages: 375-384
Page count: 10
Related papers
50 in total
  • [1] Privacy-Preserving Synthetic Smart Meters Data
    Del Grosso, Ganesh
    Pichler, Georg
    Piantanida, Pablo
    2021 IEEE POWER & ENERGY SOCIETY INNOVATIVE SMART GRID TECHNOLOGIES CONFERENCE (ISGT), 2021,
  • [2] Privacy-Preserving Synthetic Location Data in the Real World
    Cunningham, Teddy
    Cormode, Graham
    Ferhatosmanoglu, Hakan
    PROCEEDINGS OF 17TH INTERNATIONAL SYMPOSIUM ON SPATIAL AND TEMPORAL DATABASES, SSTD 2021, 2021, : 23 - 33
  • [3] Privacy-Preserving Anomaly Detection Using Synthetic Data
    Mayer, Rudolf
    Hittmeir, Markus
    Ekelhart, Andreas
    DATA AND APPLICATIONS SECURITY AND PRIVACY XXXIV, DBSEC 2020, 2020, 12122 : 195 - 207
  • [4] Synthetic data for privacy-preserving clinical risk prediction
    Qian, Zhaozhi
    Callender, Thomas
    Cebere, Bogdan
    Janes, Sam M.
    Navani, Neal
    van der Schaar, Mihaela
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [5] DataSynthesizer: Privacy-Preserving Synthetic Datasets
    Ping, Haoyue
    Stoyanovich, Julia
    Howe, Bill
    SSDBM 2017: 29TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2017,
  • [6] A systematic review of privacy-preserving techniques for synthetic tabular health data
    Tobias Hyrup
    Anton D. Lautrup
    Arthur Zimek
    Peter Schneider-Kamp
    Discover Data, 3 (1):
  • [7] SoK: Privacy-Preserving Data Synthesis
    Hu, Yuzheng
    Wu, Fan
    Li, Qinbin
    Long, Yunhui
    Garrido, Gonzalo Munilla
    Ge, Chang
    Ding, Bolin
    Forsyth, David
    Li, Bo
    Song, Dawn
    45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2024, 2024, : 4696 - 4713
  • [8] Generating Synthetic Health Sensor Data for Privacy-Preserving Wearable Stress Detection
    Lange, Lucas
    Wenzlitschke, Nils
    Rahm, Erhard
    SENSORS, 2024, 24 (10)
  • [9] Experimental Evaluation for Risk Assessment of Privacy Preserving Synthetic Data
    Chida, Koji
    Kakuta, Susumu
    Itakura, Hiroyuki
    Ishihara, Ichiro
    Yoshioka, Kosuke
    Takeuchi, Hiroshi
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2024, 2024, 14986 : 224 - 236
  • [10] Towards Privacy-Preserving Relational Data Synthesis via Probabilistic Relational Models
    Luttermann, Malte
    Moeller, Ralf
    Hartwig, Mattis
    KI 2024: ADVANCES IN ARTIFICIAL INTELLIGENCE, KI 2024, 2024, 14992 : 175 - 189