Credit Default Prediction on Time-Series Behavioral Data Using Ensemble Models

Cited by: 1
Authors
Guo, Kangshuai [1]
Luo, Shichao [2]
Liang, Ming [3]
Zhang, Zhongjian [1]
Yang, Huabin [1]
Wang, Yan [1]
Zhou, Yingjie [4]
Affiliations
[1] Univ Elect Sci & Technol China, Chengdu, Peoples R China
[2] CITIC aiBank Corp Ltd, Beijing, Peoples R China
[3] China Construct Bank, Beijing, Peoples R China
[4] Sichuan Univ, Chengdu, Sichuan, Peoples R China
Keywords
Credit default prediction; Time-series behavior analysis; Machine learning; RISK;
DOI
10.1109/IJCNN54540.2023.10191783
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Over the past few decades, credit default prediction has been central to managing risk in consumer lending. It allows lenders to optimize lending decisions, which leads to a better customer experience and sound business economics. Models already exist to help manage this risk, but there is still room for models that outperform those currently in use. In this paper, we propose a solution for credit default prediction on doubly anonymized information that includes customer profile information and time-series behavioral data. Specifically, we use the industrial-scale dataset provided by the American Express credit default prediction competition to build a machine learning model that challenges the model currently in use. By analyzing the doubly anonymized information, we design an effective data processing flow, analyze the impact of the time-series behavioral data, and derive useful latent features through feature augmentation and feature engineering, arriving at a multi-model hybrid prediction scheme. The model consists of three modules: LightGBM, XGBoost, and Local-Ensemble. We use different feature combinations and individualized prediction schemes for each model to achieve efficient learning. Finally, the predictions of the individual models are combined by a score-driven ensemble strategy to produce the final result. Experiments show that the proposed method has clear advantages on the credit default prediction problem. We validate our model in the American Express credit default prediction competition, ranking in the top 10 of 4,874 teams.
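The abstract does not spell out how the score-driven ensemble combines the three modules. One common reading, shown here purely as an illustration, is to weight each model's predicted default probability by its validation score. The function name `score_weighted_blend`, the weighting rule, and every number below are assumptions made for this sketch and are not taken from the paper.

```python
from typing import Dict, List


def score_weighted_blend(
    predictions: Dict[str, List[float]],
    validation_scores: Dict[str, float],
) -> List[float]:
    """Blend per-model probabilities, weighting each model by its validation score.

    Assumption: weights are the validation scores normalized to sum to 1.
    The paper's actual score-driven strategy may differ.
    """
    if set(predictions) != set(validation_scores):
        raise ValueError("predictions and validation_scores must cover the same models")

    total = sum(validation_scores.values())
    if total <= 0:
        raise ValueError("validation scores must sum to a positive value")
    weights = {name: score / total for name, score in validation_scores.items()}

    n_rows = len(next(iter(predictions.values())))
    if any(len(p) != n_rows for p in predictions.values()):
        raise ValueError("all models must predict the same number of rows")

    # Accumulate the weighted contribution of every model, row by row.
    blended = [0.0] * n_rows
    for name, preds in predictions.items():
        for i, p in enumerate(preds):
            blended[i] += weights[name] * p
    return blended


# Hypothetical per-customer default probabilities from the three modules.
preds = {
    "lightgbm": [0.10, 0.80, 0.40],
    "xgboost": [0.20, 0.70, 0.50],
    "local_ensemble": [0.15, 0.90, 0.30],
}
# Hypothetical validation scores (higher is better).
scores = {"lightgbm": 0.79, "xgboost": 0.78, "local_ensemble": 0.80}

final = score_weighted_blend(preds, scores)
```

Because the weights are non-negative and sum to 1, each blended value stays between the smallest and largest individual prediction for that customer. With equal scores the scheme reduces to a plain average.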
Pages: 9