Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants

Cited by: 17
Authors
Li, Jing [1]
Wu, Ya-Nan [1]
Zhang, Sen [1]
Kang, Xiao-Ping [1]
Jiang, Tao [1]
Affiliations
[1] Beijing Inst Microbiol & Epidemiol, AMMS, State Key Lab Pathogen & Biosecur, Beijing 100071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
dinucleotide composition representation; 3D convolutional neural networks; SARS-CoV-2; variants of concern; human adaptation; CODON USAGE; SELECTION; EVOLUTION; DATABASE;
DOI
10.1093/bib/bbac036
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Explosively emerging SARS-CoV-2 variants challenge current nomenclature schemes based on genetic diversity and biological significance. Genomic composition-based machine learning methods have recently performed well in identifying phenotype-genotype relationships. We introduced a framework involving dinucleotide (DNT) composition representation (DCR) to parse the general human adaptation of RNA viruses and applied a three-dimensional convolutional neural network (3D CNN) analysis to learn the human adaptation of other existing coronaviruses (CoVs) and predict the adaptation of SARS-CoV-2 variants of concern (VOCs). A markedly separable, linear DCR distribution was observed in two major genes, receptor-binding glycoprotein and RNA-dependent RNA polymerase (RdRp), of six families of single-stranded RNA (ssRNA) viruses. Additionally, there was a general host-specific distribution of both the spike proteins and RdRps of CoVs. The 3D CNN based on spike DCR predicted a dominant type II adaptation, with high transmissibility and low pathogenicity, for most Beta, Delta and Omicron VOCs. Type I adaptation, with the opposite transmissibility and pathogenicity, was predicted for SARS-CoV-2 Alpha VOCs (77%) and Kappa variants of interest (58%). The identified adaptive determinants included the D1118H and A570D mutations and local DNTs. Thus, the 3D CNN model based on DCR features predicts a predominantly type II human adaptation of SARS-CoV-2 and is qualified to predict variant adaptation in real time, facilitating the risk assessment of emerging SARS-CoV-2 variants and COVID-19 control.
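To make the idea of a dinucleotide composition concrete, the following is a minimal sketch of how the 16 dinucleotide frequencies of a single sequence could be computed. It is an illustration only and not the DCR encoding used in the paper, whose exact construction is not described in this record. The function name `dinucleotide_frequencies` and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# All 16 possible dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dinucleotide_frequencies(sequence):
    """Return the relative frequency of each of the 16 dinucleotides.

    Hypothetical helper for illustration: overlapping pairs are counted,
    RNA 'U' is treated as 'T', and any pair containing an ambiguous
    base (for example 'N') is ignored.
    """
    seq = sequence.upper().replace("U", "T")
    # Count every overlapping window of length 2.
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Keep only the 16 unambiguous dinucleotides.
    valid = {d: counts.get(d, 0) for d in DINUCLEOTIDES}
    total = sum(valid.values())
    # Normalise to frequencies; an empty or fully ambiguous input gives zeros.
    return {d: (n / total if total else 0.0) for d, n in valid.items()}


if __name__ == "__main__":
    # Made-up toy fragment, not taken from any real genome.
    freqs = dinucleotide_frequencies("AUGUUUGUUUUU")
    for dinucleotide in DINUCLEOTIDES:
        if freqs[dinucleotide] > 0:
            print(dinucleotide, round(freqs[dinucleotide], 3))
```

A vector of this kind, computed per gene, is one simple composition feature; how such features are arranged into the three-dimensional input of a 3D CNN is a design choice specific to the paper and is not reproduced here.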
Pages: 13