Differential diagnosis of systemic lupus erythematosus and Sjogren's syndrome using machine learning and multi-omics data

Cited by: 15
Authors
Martorell-Marugan, Jordi [1, 2, 3, 4]
Chierici, Marco [3]
Jurman, Giuseppe [3]
Alarcon-Riquelme, Marta E. [5, 6]
Carmona-Saez, Pedro [1, 2, 4]
Affiliations
[1] Univ Granada, Dept Stat, Granada 18071, Spain
[2] Univ Granada, OR, Granada 18071, Spain
[3] Fdn Bruno Kessler, Data Sci Hlth Res Unit, I-38123 Trento, Italy
[4] Univ Granada, GENYO Ctr Genom & Oncol Res Pfizer, Bioinformat Unit, PTS Granada,Andalusian Reg Govt, Granada 18016, Spain
[5] Univ Granada, Andalusian Reg Govt, Genet Complex Dis, GENYO Ctr Genom & Oncol Res Pfizer,PTS Granada, Granada 18016, Spain
[6] Karolinska Inst, Inst Environm Med, Unit Chron Inflammatory Dis, S-17177 Stockholm, Sweden
Keywords
Machine learning; Modeling and prediction; Bioinformatics; Clustering; Classification and association rules; Health; CLASSIFICATION; EXPRESSION; CRITERIA; BIOCONDUCTOR; METHYLATION; VALIDATION; ACCURACY; PACKAGE
DOI
10.1016/j.compbiomed.2022.106373
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Systemic lupus erythematosus and primary Sjogren's syndrome are complex systemic autoimmune diseases that are often misdiagnosed. In this article, we demonstrate the potential of machine learning to perform differential diagnosis of these similar pathologies using gene expression and methylation data from 651 individuals. Furthermore, we analyzed the impact of the heterogeneity of these diseases on the performance of the predictive models, discovering that patients assigned to a specific molecular cluster are misclassified more often and affect the overall performance of the predictive models. In addition, we found that the samples characterized by high interferon activity are the ones predicted most accurately, followed by the samples with high inflammatory activity. Finally, we identified a group of biomarkers that improve the predictions compared with using the whole dataset, and we validated them with external studies from other tissues and technological platforms.
Pages: 7