Improving Risk Predictions by Preprocessing Imbalanced Credit Data

Cited by: 0
Authors
Garcia, Vicente [1]
Isabel Marques, Ana [2]
Salvador Sanchez, Jose [1]
Affiliations
[1] Univ Jaume 1, Inst New Imaging Technol, Dept Comp Languages & Syst, Av Vicent Sos Baynat S-N, Castellon de La Plana 12071, Spain
[2] Univ Jaume 1, Dep Business Adm & Mkt, Castellon de La Plana 12071, Spain
Source
NEURAL INFORMATION PROCESSING, ICONIP 2012, PT II | 2012 / Vol. 7664
Keywords
Credit scoring; Class imbalance; Classification; Resampling; Finance; CLASSIFICATION; DEFAULT; SMOTE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Imbalanced credit data sets refer to databases in which the class of defaulters is heavily under-represented in comparison to the class of non-defaulters. This is a very common situation in real-life credit scoring applications, but it has still received little attention. This paper investigates whether data resampling can be used to improve the performance of learners built from imbalanced credit data sets, and whether the effectiveness of resampling is related to the type of classifier. Experimental results demonstrate that learning with the resampled sets consistently outperforms the use of the original imbalanced credit data, independently of the classifier used.
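The abstract does not say which resampling methods were evaluated, so the following is only a minimal sketch of one common option, random oversampling of the minority class, and is not the authors' procedure. The function name `random_oversample` and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Rows of the original data that belong to this class.
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): 1 = defaulter (minority), 0 = non-defaulter (majority).
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

In practice, resampling of this kind is applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.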
Pages: 68-75
Page count: 8
Related papers
50 in total
  • [1] Machine Learning on Imbalanced Data in Credit Risk
    Birla, Shiivong
    Kohli, Kashish
    Dutta, Akash
    7TH IEEE ANNUAL INFORMATION TECHNOLOGY, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE IEEE IEMCON-2016, 2016,
  • [2] Application of Preprocessing Methods to Imbalanced Clinical Data: An Experimental Study
    Wilk, Szymon
    Stefanowski, Jerzy
    Wojciechowski, Szymon
    Farion, Ken J.
    Michalowski, Wojtek
    INFORMATION TECHNOLOGIES IN MEDICINE, ITIB 2016, VOL 1, 2016, 471 : 503 - 515
  • [3] A-SMOTE: A New Preprocessing Approach for Highly Imbalanced Datasets by Improving SMOTE
    Hussein, Ahmed Saad
    Li, Tianrui
    Yohannese, Chubato Wondaferaw
    Bashir, Kamal
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2019, 12 (02) : 1412 - 1422
  • [4] Improving interpolation-based oversampling for imbalanced data learning
    Zhu, Tuanfei
    Lin, Yaping
    Liu, Yonghe
    KNOWLEDGE-BASED SYSTEMS, 2020, 187
  • [5] Systematic literature review of preprocessing techniques for imbalanced data
    Felix, Ebubeogu Amarachukwu
    Lee, Sai Peck
    IET SOFTWARE, 2019, 13 (06) : 479 - 496
  • [6] Machine Learning for Prediction of Imbalanced Data: Credit Fraud Detection
    Thanh Cong Tran
    Tran Khanh Dang
    PROCEEDINGS OF THE 2021 15TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2021), 2021,
  • [7] Improving Software-Quality Predictions With Data Sampling and Boosting
    Seiffert, Chris
    Khoshgoftaar, Taghi M.
    Van Hulse, Jason
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2009, 39 (06) : 1283 - 1294
  • [8] A descriptive study of variable discretization and cost-sensitive logistic regression on imbalanced credit data
    Zhang, Lili
    Ray, Herman
    Priestley, Jennifer
    Tan, Soon
    JOURNAL OF APPLIED STATISTICS, 2020, 47 (03) : 568 - 581
  • [9] Safe Level OUPS for Improving Target Concept Learning in Imbalanced Data Sets
    Rivera, William A.
    Asparouhov, Ognian
    IEEE SOUTHEASTCON 2015, 2015,
  • [10] An experimental comparison of classification algorithms for imbalanced credit scoring data sets
    Brown, Iain
    Mues, Christophe
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (03) : 3446 - 3453