Predicting Long-COVID Sequelae: A Multi-Label Classification Approach

Cited by: 0
Authors
Bellan, Mattia [1]
Chiocchetti, Annalisa [1]
Dossena, Marco [2,3]
Irwin, Christopher [2,3]
Piovesan, Luca [2]
Portinale, Luigi [2]
Affiliations
[1] Univ Piemonte Orientale, Ctr Translat Res Autoimmune & Allerg Dis CAAD, Novara, Italy
[2] Univ Piemonte Orientale, Comp Sci Inst DiSIT, Alessandria, Italy
[3] Univ Campus Biomed, Natl PhD Program Healthcare & Life Sci, Rome, Italy
Keywords
multi-label classification; data augmentation; long-COVID syndrome
DOI
10.1177/17248035251317937
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a study on predicting long-COVID sequelae through multi-label classification (MLC). Data on more than 300 patients were collected during a long-COVID study at Ospedale Maggiore of Novara (Italy), covering both their baseline situation and their condition at acute COVID-19 onset. The goal is to predict the presence of specific long-COVID sequelae after a one-year follow-up. To make the analysis more representative, we investigated both augmenting the dataset, by considering situations where different numbers of complications could arise, and reducing the number of features used for prediction. For augmentation, we applied MLSmote under six different data augmentation policies; for feature reduction, we generated new datasets with a supervised and an unsupervised dimension reduction approach (Relief and PCA, respectively). A representative set of MLC approaches was tested on all the resulting datasets. Results were evaluated in terms of Accuracy, Exact match, Hamming score, and macro-averaged AUC; they show that MLC methods can be useful for predicting specific long-COVID sequelae under the different conditions represented by these datasets. In addition, interpretability was addressed through an approach based on the SHAP method, showing that the method can capture clinical interpretations of specific predictions and that data augmentation techniques do not harm such explanations.
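The abstract names four evaluation measures without defining them. As a rough illustration only, the sketch below computes them on a tiny made-up label matrix using common textbook definitions. These definitions are assumptions, not taken from the paper: "Accuracy" is read as the example-based Jaccard overlap, "Hamming score" as one minus the Hamming loss, and the per-label AUC as the probability that a positive case is scored above a negative one. All data, function names, and the 0.5 threshold are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def exact_match(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of patients whose whole label vector is predicted exactly."""
    hits = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def hamming_score(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of individual label slots predicted correctly (assumed: 1 - Hamming loss)."""
    total = sum(len(t) for t in y_true)
    correct = sum(a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return correct / total


def example_accuracy(y_true: Matrix, y_pred: Matrix) -> float:
    """Mean Jaccard overlap between true and predicted label sets (assumed definition)."""
    scores: List[float] = []
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(t, p) if a == 1 or b == 1)
        scores.append(1.0 if union == 0 else inter / union)
    return sum(scores) / len(scores)


def label_auc(truth: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one label: chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when a label has a single class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: Matrix, y_score: Matrix) -> float:
    """Unweighted mean of the per-label AUC values."""
    n_labels = len(y_true[0])
    aucs = [
        label_auc([int(row[j]) for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(aucs) / n_labels


# Hypothetical toy data: 4 patients, 3 sequelae (1 = present, 0 = absent).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_score = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3], [0.7, 0.4, 0.2], [0.3, 0.1, 0.7]]
# Threshold of 0.5 is an illustrative choice, not one stated in the abstract.
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in y_score]

results = {
    "accuracy": example_accuracy(y_true, y_pred),
    "exact_match": exact_match(y_true, y_pred),
    "hamming_score": hamming_score(y_true, y_pred),
    "macro_auc": macro_auc(y_true, y_score),
}
print(results)
```

On this toy data the third patient has one missed label, which lowers Exact match more than Hamming score; that gap is why multi-label studies typically report several measures side by side.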
Pages: 14