Identification of COVID-19 severity biomarkers based on feature selection on single-cell RNA-Seq data of CD8+ T cells

Cited by: 6
Authors
Lu, Jian [1, 2]
Meng, Mei [3]
Zhou, XianChao [3]
Ding, Shijian [4]
Feng, KaiYan [5]
Zeng, Zhenbing [1]
Huang, Tao [2, 6]
Cai, Yu-Dong [4]
Affiliations
[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai, Peoples R China
[2] Univ Chinese Acad Sci, Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Biomed Big Data Ctr,CAS Key Lab Computat Biol, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, Ctr Single Cell Om, Sch Publ Hlth, Sch Med,State Key Lab Oncogenes & Related Genes, Shanghai, Peoples R China
[4] Shanghai Univ, Sch Life Sci, Shanghai, Peoples R China
[5] Guangdong AIB Polytech Coll, Dept Comp Sci, Guangzhou, Peoples R China
[6] Univ Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Chinese Acad Sci, CAS Key Lab Tissue Microenvironm & Tumor, Shanghai, Peoples R China
Keywords
COVID-19; severity; CD8(+) T cell; single-cell; feature selection; ANTIGEN;
DOI
10.3389/fgene.2022.1053772
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
The global outbreak of the COVID-19 epidemic has become a major public health problem. COVID-19 virus infection triggers a complex immune response. CD8(+) T cells, in particular, play an essential role in controlling the severity of the disease. However, the mechanism by which CD8(+) T cells regulate COVID-19 remains poorly investigated. In this study, single-cell gene expression profiles from three CD8(+) T cell subtypes (effector, memory, and naive T cells) were downloaded. Each cell subtype included three disease states, namely, acute COVID-19, convalescent COVID-19, and unexposed individuals. The profiles for each cell subtype were analyzed individually in the same way. Irrelevant features in the profiles were first excluded by the Boruta method. The remaining features for each CD8(+) T cell subtype were further analyzed by the Max-Relevance and Min-Redundancy, Monte Carlo feature selection, and light gradient boosting machine methods to obtain three feature lists. These lists were then fed into the incremental feature selection method to determine the optimal features for each cell subtype. Their corresponding genes may be latent biomarkers for determining COVID-19 severity. Genes such as ZFP36, DUSP1, TCR, and IL7R can be confirmed to play an immune regulatory role in COVID-19 infection and recovery. The results of functional enrichment analysis revealed that these important genes may be associated with immune functions, such as response to cAMP, response to virus, T cell receptor complex, T cell activation, and T cell differentiation. This study further established distinct gene expression patterns, represented by classification rules, for the three states of COVID-19 and constructed several efficient classifiers to distinguish COVID-19 severity. The findings of this study provide new insights into the biological processes of CD8(+) T cells in regulating the immune response.
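The incremental feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which classifier or evaluation scheme was used inside the procedure, so a nearest-centroid classifier with leave-one-out accuracy is assumed here purely for illustration. The toy data, the feature ranking, and all function names are hypothetical. The idea shown is only the general one: given a ranked feature list, evaluate a classifier on every top-k prefix and keep the prefix that scores best.

```python
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean distance)."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = [s / counts[label] for s in sums[label]]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy using only the given feature indices."""
    sub = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]:
            hits += 1
    return hits / len(sub)


def incremental_feature_selection(X, y, ranked_features):
    """Score every top-k prefix of the ranked list and return the best prefix,
    its score, and the full list of scores (one per k)."""
    scores = [
        loo_accuracy(X, y, ranked_features[:k])
        for k in range(1, len(ranked_features) + 1)
    ]
    best_k = scores.index(max(scores)) + 1  # smallest k that reaches the top score
    return ranked_features[:best_k], scores[best_k - 1], scores


# Hypothetical toy data: three disease states, 20 "cells" each, four features.
# Features 0 and 1 carry class signal; features 2 and 3 are pure noise.
rng = random.Random(0)
labels = ["acute", "convalescent", "unexposed"]
X, y = [], []
for c, label in enumerate(labels):
    for _ in range(20):
        X.append([
            c + rng.gauss(0, 0.3),
            -c + rng.gauss(0, 0.3),
            rng.gauss(0, 1),
            rng.gauss(0, 1),
        ])
        y.append(label)

# Hypothetical ranking, standing in for a list produced by a ranking method.
ranked = [0, 1, 2, 3]
best_features, best_score, all_scores = incremental_feature_selection(X, y, ranked)
print(best_features, round(best_score, 3))
```

In the study, three ranked lists (one per ranking method) are each passed through this kind of procedure for each cell subtype; the sketch handles a single list.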
Pages: 15