Performance comparison of multi-label learning algorithms on clinical data for chronic diseases

Cited by: 46
Authors
Zufferey, Damien [1, 2]
Hofer, Thomas [1]
Hennebert, Jean [2]
Schumacher, Michael [1]
Ingold, Rolf [2]
Bromuri, Stefano [1]
Affiliations
[1] Univ Appl Sci & Arts Western Switzerland, Inst Informat Syst, AISLab, CH-3960 Sierre, Switzerland
[2] Univ Fribourg, DIVA Res Grp, Dept Informat, Bd Perolles 90, CH-1700 Fribourg, Switzerland
Keywords
Multi-label learning; Complex patient; Chronic disease; Clinical data; Summary statistics; MISSING DATA; SYSTEMATIC ANALYSIS; CLASSIFICATION; SCALE; PREDICTION; DESIGN; WORDS; BAG
DOI
10.1016/j.compbiomed.2015.07.017
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We are motivated by the problem of classifying the diseases of chronically ill patients to assist physicians in their everyday work. Our goal is to provide a performance comparison of state-of-the-art multi-label learning algorithms for the analysis of multivariate sequential clinical data from medical records of patients affected by chronic diseases. The multi-label learning approach appears to be a good candidate for modeling overlapping medical conditions, which are typical of chronically ill patients. The availability of such a comparison study should make it easier to evaluate new algorithms. Regarding the method, we choose a summary statistics approach for processing the sequential clinical data, so that the extracted features maintain an interpretable link to their corresponding medical records. The publicly available MIMIC-II dataset, which contains more than 19,000 patients with chronic diseases, is used in this study. For the comparison we selected the following multi-label algorithms: ML-kNN, AdaBoostMH, binary relevance, classifier chains, HOMER and RAkEL. Regarding the results, binary relevance approaches, despite their elementary design and their independence assumption concerning the chronic illnesses, perform optimally in most scenarios, in particular for the detection of relevant diseases. In addition, binary relevance approaches scale up to large datasets and are easy to learn. However, the RAkEL algorithm, despite its scalability problems when confronted with large datasets, performs well in the scenario that consists of ranking the labels according to the dominant disease of the patient. (C) 2015 Elsevier Ltd. All rights reserved.
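As a rough illustration of the two ideas named in the abstract, the sketch below collapses each patient's measurement sequences into summary statistics and then trains one independent binary classifier per disease (binary relevance). It is not the authors' implementation: the variable names (`glucose`, `systolic_bp`), the label names, the choice of statistics, the nearest-centroid base learner, and the four synthetic patients are all assumptions made for the example, and no MIMIC-II data is used.

```python
from statistics import mean, pstdev


def summary_features(series):
    """Collapse one variable-length measurement sequence into four statistics."""
    return [min(series), max(series), mean(series), pstdev(series)]


def patient_vector(record):
    """Concatenate the statistics of every variable, in sorted variable order,
    so each feature position maps back to a named clinical variable."""
    vector = []
    for name in sorted(record):
        vector.extend(summary_features(record[name]))
    return vector


class CentroidClassifier:
    """Toy binary base learner (an assumption): pick the nearer class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, target in zip(X, y) if target == cls]
            if rows:  # skip a class that never occurs in the training data
                self.centroids[cls] = [mean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda cls: squared_distance(self.centroids[cls]))


class BinaryRelevance:
    """Train one independent binary model per label, ignoring label correlations."""

    def __init__(self, labels, make_base=CentroidClassifier):
        self.labels = list(labels)
        self.make_base = make_base

    def fit(self, X, Y):
        # Y is a list of label sets; each label gets its own 0/1 target column.
        self.models = {
            label: self.make_base().fit(X, [int(label in y) for y in Y])
            for label in self.labels
        }
        return self

    def predict(self, x):
        return {label for label, m in self.models.items() if m.predict_one(x) == 1}


# Synthetic, hypothetical patients: variable name -> sequence of measurements.
records = [
    {"glucose": [180, 200, 190], "systolic_bp": [120, 118, 122]},
    {"glucose": [95, 100, 90], "systolic_bp": [160, 165, 158]},
    {"glucose": [185, 210, 195], "systolic_bp": [162, 170, 166]},
    {"glucose": [92, 98, 96], "systolic_bp": [117, 121, 119]},
]
label_sets = [{"diabetes"}, {"hypertension"}, {"diabetes", "hypertension"}, set()]

X = [patient_vector(r) for r in records]
model = BinaryRelevance(["diabetes", "hypertension"]).fit(X, label_sets)

new_patient = {"glucose": [190, 205], "systolic_bp": [164, 168]}
predicted = model.predict(patient_vector(new_patient))
```

Because every feature is a named statistic of a named variable, a prediction can be traced back to the underlying record, which is the interpretability property the abstract attributes to the summary statistics approach. The independence assumption is visible in `fit`: no model sees any other label's targets.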
Pages: 34-43
Page count: 10
Related papers
50 in total
  • [21] A novel approach for learning label correlation with application to feature selection of multi-label data
    Che, Xiaoya
    Chen, Degang
    Mi, Jusheng
    INFORMATION SCIENCES, 2020, 512 (512) : 795 - 812
  • [22] Imbalanced and missing multi-label data learning with global and local structure
    Su, Xinpei
    Xu, Yitian
    INFORMATION SCIENCES, 2024, 677
  • [23] Multi-Label Learning with Label Enhancement
    Shao, Ruifeng
    Xu, Ning
    Geng, Xin
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018 : 437 - 446
  • [24] Multi-label learning with kernel local label information
    Fu, Xiaozhen
    Li, Deyu
    Zhai, Yanhui
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [25] Partial Multi-label Learning using Label Compression
    Yu, Tingting
    Yu, Guoxian
    Wang, Jun
    Domeniconi, Carlotta
    Zhang, Xiangliang
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2020), 2020 : 761 - 770
  • [26] Learning with Latent Label Hierarchy from Incomplete Multi-Label Data
    Pei, Yuanli
    Fern, Xiaoli
    Raich, Raviv
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018 : 2075 - 2080
  • [27] LAIM discretization for multi-label data
    Cano, Alberto
    Maria Luna, Jose
    Gibaja, Eva L.
    Ventura, Sebastian
    INFORMATION SCIENCES, 2016, 330 : 370 - 384
  • [28] Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms
    Bromuri, Stefano
    Zufferey, Damien
    Hennebert, Jean
    Schumacher, Michael
    JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 51 : 165 - 175
  • [29] Effective lazy learning algorithm based on a data gravitation model for multi-label learning
    Reyes, Oscar
    Morell, Carlos
    Ventura, Sebastian
    INFORMATION SCIENCES, 2016, 340 : 159 - 174
  • [30] PTML Multi-Label Algorithms: Models, Software, and Applications
    Ortega-Tenezaca, Bernabe
    Quevedo-Tumailli, Viviana
    Bediaga, Harbil
    Collados, Jon
    Arrasate, Sonia
    Madariaga, Gotzon
    Munteanu, Cristian R.
    Cordeiro, M. Natalia D. S.
    Gonzalez-Diaz, Humbert
    CURRENT TOPICS IN MEDICINAL CHEMISTRY, 2020, 20 (25) : 2326 - 2337