A Machine Learning Method for Differentiating and Predicting Human-Infective Coronavirus Based on Physicochemical Features and Composition of the Spike Protein

Cited by: 21
Authors
Chao, Wang [1]
Quan, Zou [1,2]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 610054, Peoples R China
[2] Hainan Normal Univ, Hainan Key Lab Computat Sci & Applicat, Haikou 571158, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Coronavirus; Virus-host association; Spike protein; Machine learning; FUNCTIONAL RECEPTOR; NEURAL-NETWORK; BINDING-SITES; IDENTIFICATION;
DOI
10.1049/cje.2021.06.003
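Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification code
0808 ; 0809 ;
Abstract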
Several Coronaviruses (CoVs) are epidemic pathogens that cause severe respiratory syndrome and are associated with significant morbidity and mortality. In this paper, a machine learning method was developed for predicting the risk of human infection posed by CoVs as an early warning system. The proposed Spike-SVM (Support vector machine) model achieved an accuracy of 97.36% for Human-infective CoV (HCoV) and Nonhuman-infective CoV (Non-HCoV) classification. The top informative features that discriminate HCoVs and Non-HCoVs were identified. Spike-SVM is anticipated to be a useful bioinformatics tool for predicting the infection risk posed by CoVs to humans.
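The abstract does not spell out the feature set, so the following is only a minimal sketch of one composition-based descriptor of the kind the title refers to: the amino acid composition of a protein sequence. The function name `amino_acid_composition`, the constant `AMINO_ACIDS`, and the toy sequence are illustrative assumptions, not the authors' code or data. In a pipeline like the one described, vectors of this sort would be passed to a support vector machine classifier from a separate library.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard residue in `sequence`.

    Non-standard symbols (e.g. 'X' or gaps) are ignored; this is a
    simplifying assumption, not something stated in the abstract.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy fragment used only to exercise the function.
toy_fragment = "MFVFLVLLPLVSSQ"
features = amino_acid_composition(toy_fragment)
```

Each sequence yields a 20-dimensional vector whose entries sum to 1, which makes spike proteins of different lengths directly comparable.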
Pages: 815-823
Page count: 9
Related papers
70 in total
[1]   Classification of nuclear receptors based on amino acid composition and dipeptide composition [J].
Bhasin, M ;
Raghava, GPS .
JOURNAL OF BIOLOGICAL CHEMISTRY, 2004, 279 (22) :23262-23266
[2]   Coinfection with COVID-19 and coronavirus HKU1-The critical need for repeat testing if clinically indicated [J].
Chaung, Jenna ;
Chan, Douglas ;
Pada, Surinder ;
Tambyah, Paul A. .
JOURNAL OF MEDICAL VIROLOGY, 2020, 92 (10) :1785-1786
[3]   iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data [J].
Chen, Zhen ;
Zhao, Pei ;
Li, Fuyi ;
Marquez-Lago, Tatiana T. ;
Leier, Andre ;
Revote, Jerico ;
Zhu, Yan ;
Powell, David R. ;
Akutsu, Tatsuya ;
Webb, Geoffrey I. ;
Chou, Kuo-Chen ;
Smith, A. Ian ;
Daly, Roger J. ;
Li, Jian ;
Song, Jiangning .
BRIEFINGS IN BIOINFORMATICS, 2020, 21 (03) :1047-1057
[4]   iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences [J].
Chen, Zhen ;
Zhao, Pei ;
Li, Fuyi ;
Leier, Andre ;
Marquez-Lago, Tatiana T. ;
Wang, Yanan ;
Webb, Geoffrey I. ;
Smith, A. Ian ;
Daly, Roger J. ;
Chou, Kuo-Chen ;
Song, Jiangning .
BIOINFORMATICS, 2018, 34 (14) :2499-2502
[5]   Omics Data and Artificial Intelligence: New Challenges for Gene Therapy Preface [J].
Cheng, Liang .
CURRENT GENE THERAPY, 2020, 20 (01) :1-1
[6]   Computational and Biological Methods for Gene Therapy [J].
Cheng, Liang .
CURRENT GENE THERAPY, 2019, 19 (04) :210-210
[7]   Computational Methods for Identifying Similar Diseases [J].
Cheng, Liang ;
Zhao, Hengqiang ;
Wang, Pingping ;
Zhou, Wenyang ;
Luo, Meng ;
Li, Tianxin ;
Han, Junwei ;
Liu, Shulin ;
Jiang, Qinghua .
MOLECULAR THERAPY-NUCLEIC ACIDS, 2019, 18 :590-604
[8]   gutMDisorder: a comprehensive database for dysbiosis of the gut microbiota in disorders and interventions [J].
Cheng, Liang ;
Qi, Changlu ;
Zhuang, He ;
Fu, Tongze ;
Zhang, Xue .
NUCLEIC ACIDS RESEARCH, 2020, 48 (D1) :D554-D560
[9]   DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features [J].
Chu, Yanyi ;
Kaushik, Aman Chandra ;
Wang, Xiangeng ;
Wang, Wei ;
Zhang, Yufang ;
Shan, Xiaoqi ;
Salahub, Dennis Russell ;
Xiong, Yi ;
Wei, Dong-Qing .
BRIEFINGS IN BIOINFORMATICS, 2021, 22 (01) :451-462
[10]   Hosts and Sources of Endemic Human Coronaviruses [J].
Corman, Victor M. ;
Muth, Doreen ;
Niemeyer, Daniela ;
Drosten, Christian .
ADVANCES IN VIRUS RESEARCH, VOL 100, 2018, 100 :163-188