Learning from class-imbalance and heterogeneous data for 30-day hospital readmission

Cited by: 18
Authors
Du, Guodong [1]
Zhang, Jia [1]
Li, Shaozi [1]
Li, Candong [2]
Affiliations
[1] Xiamen Univ, Dept Artificial Intelligence, Xiamen 361005, Peoples R China
[2] Fujian Univ Tradit Chinese Med, Coll Tradit Chinese Med, Fuzhou 350122, Peoples R China
Keywords
30-day readmission prediction; Heterogeneous data; Class-imbalance data; Sample weight learning; Large margin property; FEATURE-SELECTION; PREDICTION; FRAMEWORK; MODELS; TIME;
DOI
10.1016/j.neucom.2020.08.064
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predicting 30-day hospital readmission is a core research task in the development of personalized healthcare. However, the imbalanced class distribution and the heterogeneity of electronic health records are major obstacles to building an effective machine learning model for this task. To address these issues, we propose a new 30-day readmission prediction algorithm. First, we handle class imbalance by learning sample weights based on a hypothesis margin loss. At the same time, we account for data heterogeneity by learning weights for the heterogeneous data sources, which improves generalization ability. On this basis, we construct an optimization framework involving two sets of variables: sample weights and source weights. The readmission prediction is obtained by iterative optimization. Finally, we conduct experiments on three real-world readmission datasets to verify the effectiveness of the proposed method. The experimental results show that the proposed algorithm is well suited to the task of 30-day hospital readmission prediction. © 2020 Published by Elsevier B.V.
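The abstract names a hypothesis margin loss and per-source weights but gives no formulas. As a rough illustration only, the sketch below computes the standard hypothesis margin, 0.5 * (distance to nearest miss - distance to nearest hit), using a distance that is a weighted sum of per-source Euclidean distances. The function name, the way the sources are combined, and the toy data are assumptions made for this example, not the authors' formulation. Every class is assumed to have at least two samples.

```python
import numpy as np

def hypothesis_margins(sources, y, source_weights):
    """Hypothesis margin of each sample under a source-weighted distance.

    sources: list of arrays, each of shape (n_samples, d_s), one per data source
    y: class labels, shape (n_samples,)
    source_weights: non-negative weights, one per source (assumed combination rule)
    """
    y = np.asarray(y)
    n = y.shape[0]
    dist = np.zeros((n, n))
    for w, X in zip(source_weights, sources):
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - X[None, :, :]
        dist += w * np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    same = y[:, None] == y[None, :]
    near_hit = np.where(same, dist, np.inf).min(axis=1)    # closest same-class sample
    near_miss = np.where(~same, dist, np.inf).min(axis=1)  # closest other-class sample
    return 0.5 * (near_miss - near_hit)

# Toy data: one source, four samples on a line, two classes.
toy_sources = [np.array([[0.0], [1.0], [5.0], [6.0]])]
toy_labels = [0, 0, 1, 1]
toy_margins = hypothesis_margins(toy_sources, toy_labels, [1.0])
print(toy_margins)
```

Samples with a small or negative margin lie close to the other class. A weighting scheme could give such samples more influence, though how the paper maps margins to sample weights is not stated in the abstract.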
Pages: 27-35
Number of pages: 9