PCA-based unsupervised feature extraction for gene expression analysis of COVID-19 patients

Cited by: 9
Authors
Fujisawa, Kota [1]
Shimo, Mamoru [2]
Taguchi, Y-H [3]
Ikematsu, Shinya [4]
Miyata, Ryota [5]
Affiliations
[1] Tokyo Inst Technol, Sch Life Sci & Technol, Tokyo 1528550, Japan
[2] Univ Ryukyus, Grad Sch Engn & Sci, Nishihara, Okinawa 9030213, Japan
[3] Chuo Univ, Dept Phys, Tokyo 1128551, Japan
[4] Okinawa Coll, Natl Inst Technol, Dept Bioresources Engn, Nago, Okinawa 9052192, Japan
[5] Univ Ryukyus, Fac Engn, Nishihara, Okinawa 9030213, Japan
Keywords
KAPPA-B;
DOI
10.1038/s41598-021-95698-w
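Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract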
Coronavirus disease 2019 (COVID-19) is raging worldwide. This potentially fatal infectious disease is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, the complete mechanism of COVID-19 is not well understood. Therefore, we analyzed gene expression profiles of COVID-19 patients to identify disease-related genes through an innovative machine learning method that enables a data-driven strategy for gene selection from a data set with a small number of samples and many candidates. Principal-component-analysis-based unsupervised feature extraction (PCAUFE) was applied to the RNA expression profiles of 16 COVID-19 patients and 18 healthy control subjects. The results identified 123 genes as critical for COVID-19 progression from 60,683 candidate probes, including immune-related genes. The 123 genes were enriched in binding sites for transcription factors NFKB1 and RELA, which are involved in various biological phenomena such as immune response and cell survival: the primary mediator of canonical nuclear factor-kappa B (NF-kappa B) activity is the heterodimer RelA-p50. The genes were also enriched in histone modification H3K36me3, and they largely overlapped the target genes of NFKB1 and RELA. We found that the overlapping genes were downregulated in COVID-19 patients. These results suggest that canonical NF-kappa B activity was suppressed by H3K36me3 in COVID-19 patient blood.
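The abstract names PCAUFE but does not spell out its steps, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes that each sample is standardized, that the principal component of interest is the one whose sample loadings best separate patients from controls (chosen here with a t-test), that per-gene scores on that component are converted to P-values with a chi-squared distribution, and that genes are kept when the Benjamini-Hochberg-adjusted P-value falls below 0.01. The function names `pca_ufe` and `benjamini_hochberg`, the `alpha` parameter, and the threshold value are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2, ttest_ind


def benjamini_hochberg(p):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest P-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


def pca_ufe(X, is_patient, alpha=0.01):
    """Hypothetical PCAUFE-style gene selection.

    X          : (genes x samples) expression matrix
    is_patient : boolean array, True for patient samples
    Returns (indices of selected genes, index of the component used).
    """
    # Assumption: standardize each sample (column) to zero mean, unit variance.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # U * S holds per-gene scores; rows of Vt hold per-sample loadings.
    U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    # Assumption: use the component whose loadings differ most between groups.
    group_p = [ttest_ind(v[is_patient], v[~is_patient]).pvalue for v in Vt]
    k = int(np.argmin(group_p))
    scores = U[:, k] * S[k]
    # Under a Gaussian null, the squared standardized score is chi-squared (1 dof).
    p = chi2.sf((scores / scores.std()) ** 2, df=1)
    selected = np.flatnonzero(benjamini_hochberg(p) < alpha)
    return selected, k
```

Because no labels enter the gene-scoring step itself (they are used only to choose which component to inspect), this kind of selection can remain usable when samples are few and candidate probes are many, which is the setting the abstract describes.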
Pages: 11
Related papers
81 in total
[1]   GeneSetDB: A comprehensive meta-database, statistical and visualisation framework for gene set analysis [J].
Araki, Hiromitsu ;
Knapp, Christoph ;
Tsai, Peter ;
Print, Cristin .
FEBS OPEN BIO, 2012, 2 :76-82
[2]   Systems biological assessment of immunity to mild versus severe COVID-19 infection in humans [J].
Arunachalam, Prabhu S. ;
Wimmers, Florian ;
Mok, Chris Ka Pun ;
Perera, Ranawaka A. P. M. ;
Scott, Madeleine ;
Hagan, Thomas ;
Sigal, Natalia ;
Feng, Yupeng ;
Bristow, Laurel ;
Tsang, Owen Tak-Yin ;
Wagh, Dhananjay ;
Coller, John ;
Pellegrini, Kathryn L. ;
Kazmin, Dmitri ;
Alaaeddine, Ghina ;
Leung, Wai Shing ;
Chan, Jacky Man Chun ;
Chik, Thomas Shiu Hong ;
Choi, Chris Yau Chung ;
Huerta, Christopher ;
McCullough, Michele Paine ;
Lv, Huibin ;
Anderson, Evan ;
Edupuganti, Srilatha ;
Upadhyay, Amit A. ;
Bosinger, Steve E. ;
Maecker, Holden Terry ;
Khatri, Purvesh ;
Rouphael, Nadine ;
Peiris, Malik ;
Pulendran, Bali .
SCIENCE, 2020, 369 (6508) :1210-+
[3]   Generation and activation of multiple dimeric transcription factors within the NF-κB signaling system [J].
Basak, Soumen ;
Shih, Vincent Feng-Sheng ;
Hoffmann, Alexander .
MOLECULAR AND CELLULAR BIOLOGY, 2008, 28 (10) :3139-3150
[4]   Dimensionality reduction for visualizing single-cell data using UMAP [J].
Becht, Etienne ;
McInnes, Leland ;
Healy, John ;
Dutertre, Charles-Antoine ;
Kwok, Immanuel W. H. ;
Ng, Lai Guan ;
Ginhoux, Florent ;
Newell, Evan W. .
NATURE BIOTECHNOLOGY, 2019, 37 (01) :38-+
[5]   Controlling the false discovery rate: a practical and powerful approach to multiple testing [J].
Benjamini, Y ;
Hochberg, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[6]   The use of the area under the ROC curve in the evaluation of machine learning algorithms [J].
Bradley, AP .
PATTERN RECOGNITION, 1997, 30 (07) :1145-1159
[7]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[8]   Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool [J].
Chen, Edward Y. ;
Tan, Christopher M. ;
Kou, Yan ;
Duan, Qiaonan ;
Wang, Zichen ;
Meirelles, Gabriela Vaz ;
Clark, Neil R. ;
Ma'ayan, Avi .
BMC BIOINFORMATICS, 2013, 14
[9]   Genetic regulatory subnetworks and key regulating genes in rat hippocampus perturbed by prenatal malnutrition: implications for major brain disorders [J].
Chen, Jiaying ;
Zhao, Xinzhi ;
Cui, Li ;
He, Guang ;
Wang, Xinhui ;
Wang, Fudi ;
Duan, Shiwei ;
He, Lin ;
Li, Qiang ;
Yu, Xiaodan ;
Zhang, Fuquan ;
Xu, Mingqing .
AGING-US, 2020, 12 (09) :8434-8458
[10]   TargetMine, an Integrated Data Warehouse for Candidate Gene Prioritisation and Target Discovery [J].
Chen, Yi-An ;
Tripathi, Lokesh P. ;
Mizuguchi, Kenji .
PLOS ONE, 2011, 6 (03)