Forecasting SARS-CoV-2 spike protein evolution from small data by deep learning and regression

Authors
King, Samuel [1, 2, 3]
Chen, Xinyi E. [1, 4, 5]
Ng, Sarah W. S. [1, 4, 5]
Rostin, Kimia [1, 4, 5]
Hahn, Samuel V. [1, 6]
Roberts, Tylo [1, 4]
Schwab, Janella C. [1, 7]
Sekhon, Parneet [1, 4]
Kagieva, Madina [1, 2, 3]
Reilly, Taylor [1, 2, 3]
Qi, Ruo Chen [1, 8]
Salman, Paarsa [1, 2, 3]
Hong, Ryan J. [1, 4]
Ma, Eric J. [9]
Hallam, Steven J. [1, 4, 9, 10, 11, 12]
Affiliations
[1] Univ British Columbia, BC Canc Agcy, Radiat Oncol, Vancouver, BC, Canada
[2] Univ British Columbia, Dept Bot, Vancouver, BC, Canada
[3] Univ British Columbia, Dept Zool, Vancouver, BC, Canada
[4] Univ British Columbia, Dept Microbiol & Immunol, Vancouver, BC, Canada
[5] Univ British Columbia, Dept Comp Sci, Vancouver, BC, Canada
[6] Univ British Columbia, Dept Chem & Biol Engn, Vancouver, BC, Canada
[7] Univ British Columbia, Fac Land & Food Syst, Vancouver, BC, Canada
[8] Univ British Columbia, Dept Cellular & Physiol Sci, Vancouver, BC, Canada
[9] Univ British Columbia, Grad Program Bioinformat, Vancouver, BC, Canada
[10] Univ British Columbia, Genome Sci & Technol Program, Vancouver, BC, Canada
[11] Univ British Columbia, Life Sci Inst, Vancouver, BC, Canada
[12] Univ British Columbia, ECOSCOPE Training Program, Vancouver, BC, Canada
Source
FRONTIERS IN SYSTEMS BIOLOGY | 2024 / Vol. 4
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
deep learning; regression; protein evolution; SARS-CoV-2; spike protein; small data; predictive model; GAUSSIAN PROCESS REGRESSION; VACCINE; MODEL;
DOI
10.3389/fsysb.2024.1284668
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
The emergence of SARS-CoV-2 variants during the COVID-19 pandemic caused frequent global outbreaks that confounded public health efforts across many jurisdictions, highlighting the need for better understanding and prediction of viral evolution. Predictive models have been shown to support disease prevention efforts, as with the seasonal influenza vaccine, but they require abundant data. For emerging viruses of concern, such models should ideally function with the relatively sparse data typically encountered at the early stages of a viral outbreak. Conventional discrete approaches have proven difficult to develop due to the spurious and reversible nature of amino acid mutations and the overwhelming number of possible protein sequences, which adds computational complexity. We hypothesized that these challenges could be addressed by encoding discrete protein sequences into continuous numbers, effectively reducing the data size while enhancing the resolution of evolutionarily relevant differences. To this end, we developed a viral protein evolution prediction model (VPRE), which reduces amino acid sequences into continuous numbers by using an artificial neural network called a variational autoencoder (VAE) and models their most statistically likely evolutionary trajectories over time using Gaussian process (GP) regression. To demonstrate VPRE, we used a small number of early SARS-CoV-2 spike protein sequences. We show that the VAE can be trained on a synthetic dataset based on these data. To recapitulate evolution along a phylogenetic path, we used only 104 spike protein sequences and trained the GP regression with the numerical variables to project evolution up to 5 months into the future. Our predictions contained novel variants, and the most frequent prediction mapped primarily to a sequence that differed by only a single amino acid from the most reported spike protein within the prediction timeframe. Novel variants in the spike receptor binding domain (RBD) were capable of binding human angiotensin-converting enzyme 2 (ACE2) in silico, with comparable or better binding than previously resolved RBD-ACE2 complexes. Together, these results indicate the utility and tractability of combining deep learning and regression to model viral protein evolution with relatively sparse datasets, toward developing more effective medical interventions.
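The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' VPRE code: it assumes that a VAE has already reduced each sequence to a single continuous latent coordinate, and it fits a zero-mean Gaussian process with a squared-exponential kernel to that coordinate over time. The kernel choice, the hyperparameters (`length_scale`, `variance`, `noise`), the helper names, and the data values are all hypothetical and chosen only for illustration.

```python
import math

def rbf(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential kernel between two scalar time points (assumed kernel)."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(t_train, z_train, t_new, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at each time in t_new."""
    n = len(t_train)
    # Kernel matrix of the training times, with observation noise on the diagonal.
    K = [[rbf(t_train[i], t_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, z_train)  # (K + noise*I)^-1 z
    means, variances = [], []
    for t in t_new:
        k_star = [rbf(t, ti) for ti in t_train]
        v = solve(K, k_star)
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        variances.append(rbf(t, t) - sum(k * vi for k, vi in zip(k_star, v)))
    return means, variances

# Hypothetical data: months since the first sample and one latent coordinate.
t_train = [0.0, 1.0, 2.0, 3.0, 4.0]
z_train = [0.10, 0.25, 0.45, 0.60, 0.80]

# Query one observed month and two future months.
means, variances = gp_predict(t_train, z_train, [2.0, 5.0, 9.0])
for t, m, v in zip([2.0, 5.0, 9.0], means, variances):
    print(f"month {t:.0f}: mean={m:.3f}, variance={v:.3f}")
```

The posterior variance grows as the query time moves away from the observed months, which is one reason a GP is a natural fit for sparse early-outbreak data: the projection carries its own uncertainty. In a full pipeline of this kind, each latent dimension would be projected forward and the sampled latent points decoded back into amino acid sequences by the VAE decoder.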
Pages: 13