Learning the Protein Language Model of SARS-CoV-2 Spike Proteins

Cited by: 0
Authors
Llanes, Paul Vincent [1]
Solano, Geoffrey [1]
Pontiveros, Marc Jermaine [2]
Affiliations
[1] Univ Philippines Manila, Dept Phys Sci & Math, Manila, Philippines
[2] Univ Philippines Diliman, Dept Comp Sci, Quezon City, Philippines
Keywords
SARS-CoV-2; spike proteins; sequence mutations; COVID-19; language modelling; recurrent neural network; Leiden clustering algorithm; viral escape
DOI
10.1109/ICAIIC57133.2023.10067040
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The SARS-CoV-2 virus has continued to evolve, posing an increased risk in terms of infectivity and transmissibility and causing a greater impact on communities worldwide. With the surge of collected SARS-CoV-2 sequences, studies have found that most of the emerging variants are linked to increased mutations in the spike (S) protein, as observed in the Alpha, Beta, Gamma, and Delta variants. Multiple genomic surveillance approaches have been used to monitor the mutational status and spread of the virus; however, most depend heavily on labels attributed to these sequences. Hence, this study presents a system that learns the protein language model of SARS-CoV-2 spike proteins, based on a bidirectional long short-term memory (BiLSTM) recurrent neural network, using sequence data alone. After the sequence embeddings are obtained from the model, clusters are generated using the Leiden clustering algorithm and visualized to monitor similarities between variants in terms of grammatical probability and semantic change. Additionally, the system measures the validity of a user-generated next-generation sequence, capturing potential sequence mutations indicative of viral escape, particularly substitution mutations. Further studies on methods for uncovering the semantic rules that govern spike proteins are recommended to learn more about other viral characteristics that bear on the future course of the COVID-19 pandemic.
Pages: 429-434
Page count: 6
Related papers
50 in total
  • [31] Distinct conformational states of SARS-CoV-2 spike protein
    Cai, Yongfei
    Zhang, Jun
    Xiao, Tianshu
    Peng, Hanqin
    Sterling, Sarah M.
    Walsh, Richard M., Jr.
    Rawson, Shaun
    Rits-Volloch, Sophia
    Chen, Bing
    SCIENCE, 2020, 369 (6511) : 1586 - +
  • [32] Conformational variability of loops in the SARS-CoV-2 spike protein
    Wong, Samuel W. K.
    Liu, Zongjun
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2022, 90 (03) : 691 - 703
  • [33] The SARS-CoV-2 spike protein: balancing stability and infectivity
    Imre Berger
    Christiane Schaffitzel
    Cell Research, 2020, 30 : 1059 - 1060
  • [34] A thermostable, closed SARS-CoV-2 spike protein trimer
    Xiaoli Xiong
    Kun Qu
    Katarzyna A. Ciazynska
    Myra Hosmillo
    Andrew P. Carter
    Soraya Ebrahimi
    Zunlong Ke
    Sjors H. W. Scheres
    Laura Bergamaschi
    Guinevere L. Grice
    Ying Zhang
    James A. Nathan
    Stephen Baker
    Leo C. James
    Helen E. Baxendale
    Ian Goodfellow
    Rainer Doffinger
    John A. G. Briggs
    Nature Structural & Molecular Biology, 2020, 27 : 934 - 941
  • [35] Evaluation of spike protein antigens for SARS-CoV-2 serology
    Jagtap, Suraj
    Ratnasri, K.
    Valloly, Priyanka
    Sharma, Rakhi
    Maurya, Satyaghosh
    Gaigore, Anushree
    Ardhya, Chitra
    Biligi, Dayananda S.
    Desiraju, Bapu Koundinya
    Natchu, Uma Chandra Mouli
    Saini, Deepak Kumar
    Roy, Rahul
    JOURNAL OF VIROLOGICAL METHODS, 2021, 296
  • [36] The SARS-CoV-2 spike protein: balancing stability and infectivity
    Berger, Imre
    Schaffitzel, Christiane
    CELL RESEARCH, 2020, 30 (12) : 1059 - 1060
  • [37] Interaction of SARS-CoV-2 spike protein with amyloid beta
    Izadpanah, Amin
    Alberts, Julie
    Rappaport, Jay
    Datta, Prasun
    JOURNAL OF MEDICAL PRIMATOLOGY, 2023, 52 (05) : 342 - 342
  • [38] Emergence of SARS-CoV-2 spike protein at the vaccination site
    Beck, Annika
    Dietenberger, Hanna
    Kunz, Sebastian N.
    Mellert, Kevin
    Moeller, Peter
    IMMUNITY INFLAMMATION AND DISEASE, 2023, 11 (03)
  • [39] Terahertz and Infrared Spectroscopy of SARS-CoV-2 Spike Protein
    Konnikova, M.
    Heinz, T.
    Mankova, A.
    Cherkasova, O.
    Butylin, A.
    Peng, Y.
    Shkurinov, A.
    2021 46TH INTERNATIONAL CONFERENCE ON INFRARED, MILLIMETER AND TERAHERTZ WAVES (IRMMW-THZ), 2021,
  • [40] Evolution of the SARS-CoV-2 spike protein in the human host
    Antoni G. Wrobel
    Donald J. Benton
    Chloë Roustan
    Annabel Borg
    Saira Hussain
    Stephen R. Martin
    Peter B. Rosenthal
    John J. Skehel
    Steven J. Gamblin
    Nature Communications, 13