Establishment and analysis of a disease risk prediction model for the systemic lupus erythematosus with random forest

Cited by: 11
Authors
Chen, Huajian [1]
Huang, Li [1]
Jiang, Xinyue [1]
Wang, Yue [1]
Bian, Yan [1]
Ma, Shumei [1]
Liu, Xiaodong [1, 2, 3]
Affiliations
[1] Wenzhou Med Univ, Sch Publ Hlth & Management, Wenzhou, Peoples R China
[2] Wenzhou Med Univ, South Zhejiang Inst Radiat Med & Nucl Technol, Wenzhou, Peoples R China
[3] Wenzhou Med Univ, Key Lab Watershed Sci & Hlth Zhejiang Prov, Wenzhou, Peoples R China
Keywords
systemic lupus erythematosus; Lasso; random forest; GEO; disease risk prediction model; I INTERFERON; DIAGNOSIS; PATHOGENESIS;
DOI
10.3389/fimmu.2022.1025688
Chinese Library Classification number
R392 [Medical immunology]; Q939.91 [Immunology];
Discipline classification code
100102
Abstract
Systemic lupus erythematosus (SLE) is a latent, insidious autoimmune disease. Building on recent advances in gene sequencing, this study aims to develop a gene-based predictive model for identifying SLE at the genetic level. First, gene expression datasets of SLE whole blood samples were collected from the Gene Expression Omnibus (GEO) database. The datasets were merged and then divided into training and validation datasets in a 7:3 ratio. The training dataset contained 334 SLE samples and 71 healthy samples, and the validation dataset contained 143 SLE samples and 30 healthy samples. The training dataset was used to build the disease risk prediction model, and the validation dataset was used to verify the model's ability to identify SLE. We first analyzed differentially expressed genes (DEGs) and then used Lasso and random forest (RF) to screen out six key genes (OAS3, USP18, RTP4, SPATS2L, IFI27 and OAS1) that are essential for distinguishing SLE samples from healthy samples. We incorporated the six key genes into an RF model and used five repetitions of 10-fold cross-validation to determine the RF model with the optimal mtry. The mean area under the curve (AUC) and the mean accuracy of the models were both over 0.95. On the validation dataset, the model had an AUC of 0.948. On an external validation dataset (GSE99967), the model had an AUC of 0.810, an accuracy of 0.836, and a sensitivity of 0.921. On a second external validation dataset (GSE185047), which consists entirely of SLE patients, the model reached a sensitivity of up to 0.954. The final high-throughput RF model had a mean AUC over 0.9, again showing good results.
In conclusion, we identified key genetic biomarkers and successfully developed a novel disease risk prediction model for SLE that can serve as a new aid for SLE risk prediction and contribute to the identification of SLE.
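The 7:3 split reported in the abstract can be checked with a short sketch. It assumes the split was stratified by class, which the abstract does not state; the function name `stratified_split`, the seed, and the placeholder labels are all hypothetical and stand in for the merged GEO samples. Under that assumption, rounding 70% of each class (477 SLE and 101 healthy samples in total) reproduces the reported counts.

```python
import random
from collections import Counter


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split sample indices into train/validation sets, class by class.

    Assumption: a stratified split; the abstract only gives the 7:3 ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, valid_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_train = round(train_frac * len(idx))  # 70% of each class
        train_idx.extend(idx[:n_train])
        valid_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(valid_idx)


# Placeholder labels matching the class totals implied by the abstract:
# 334 + 143 = 477 SLE samples, 71 + 30 = 101 healthy samples.
labels = ["SLE"] * 477 + ["healthy"] * 101

train_idx, valid_idx = stratified_split(labels)
train_counts = dict(Counter(labels[i] for i in train_idx))
valid_counts = dict(Counter(labels[i] for i in valid_idx))

print("training:", train_counts)
print("validation:", valid_counts)
```

Stratifying keeps the roughly 5:1 ratio of SLE to healthy samples the same in both subsets, which matters when accuracy and AUC are later compared between the training and validation datasets.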
Pages: 13