Motifs in SARS-CoV-2 evolution

Cited by: 0
Authors
Barrett, Christopher [1, 2]
Bura, Andrei C. [1]
He, Qijun [1]
Huang, Fenix W. [1]
Li, Thomas J. X. [1]
Reidys, Christian M. [1, 3]
Affiliations
[1] Univ Virginia, Biocomplex Inst & Initiat, Charlottesville, VA 22904 USA
[2] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22904 USA
[3] Univ Virginia, Dept Math, Charlottesville, VA 22904 USA
Keywords
site motif; relational structure; coevolution; SARS-CoV-2; genomic surveillance; fitness; information; phylogeny; improves; states; RNA
DOI
10.1261/rna.079557.122
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a novel framework that enhances the prediction of whether a novel lineage poses the threat of eventually dominating the viral population. The framework is based purely on genomic sequence data and requires no previously established biological analysis. Its building blocks are sets of coevolving sites in the alignment (motifs), identified via coevolutionary signals. The collection of such motifs forms a relational structure over the polymorphic sites. Motifs are constructed using distances that quantify the coevolutionary coupling of pairs of sites, and they manifest as clusters of coevolving sites. We present an approach to genomic surveillance based on this notion of relational structure. Our system issues an alert regarding a lineage based on its contribution to drastic changes in the relational structure. We then conduct a comprehensive retrospective analysis of the COVID-19 pandemic based on SARS-CoV-2 genomic sequence data in GISAID from October 2020 to September 2022, across 21 lineages and 27 countries at weekly resolution. We investigate the performance of this surveillance system in terms of its accuracy, timeliness, and robustness. Lastly, we study how well each lineage is classified by such a system.
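The idea of motifs as clusters of coevolving polymorphic sites can be illustrated with a minimal sketch. The abstract does not specify the distance or the clustering procedure, so everything below is an assumption made for illustration: the distance is a normalized variation of information between alignment columns, sites are linked when their distance falls below a hypothetical `threshold`, and motifs are taken as the connected components of the resulting graph. The names `entropy`, `vi_distance`, and `find_motifs` are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(column):
    """Shannon entropy (in nats) of the symbols observed in one column."""
    n = len(column)
    return -sum((c / n) * log(c / n) for c in Counter(column).values())


def vi_distance(col_a, col_b):
    """Normalized variation of information between two columns.

    0 means the columns determine each other; values near 1 mean they
    vary independently. This choice of distance is an assumption.
    """
    h_a, h_b = entropy(col_a), entropy(col_b)
    h_ab = entropy(list(zip(col_a, col_b)))
    if h_ab == 0:
        return 0.0  # both columns are constant
    return (2 * h_ab - h_a - h_b) / h_ab


def find_motifs(alignment, threshold=0.2):
    """Group polymorphic sites into clusters of tightly coupled columns."""
    columns = list(zip(*alignment))
    # Only sites showing more than one symbol can carry a coevolution signal.
    poly = [i for i, col in enumerate(columns) if len(set(col)) > 1]

    # Union-find over polymorphic sites; link pairs below the threshold.
    parent = {i: i for i in poly}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(poly, 2):
        if vi_distance(columns[i], columns[j]) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in poly:
        clusters.setdefault(find(i), []).append(i)
    # Report only clusters with at least two sites.
    return sorted(c for c in clusters.values() if len(c) > 1)


# Toy alignment: sites 0 and 2 change together, site 4 varies independently.
alignment = ["ACGTA", "ACGTC", "TCATA", "TCATC"]
motifs = find_motifs(alignment)
```

In this toy alignment, sites 0 and 2 always change together and end up in one cluster, while site 4 varies independently and is left out. A surveillance system in the spirit of the abstract would recompute such clusters on weekly data and compare them over time; how the paper measures a lineage's contribution to those changes is not stated here and is not modeled in this sketch.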
Pages: 1-15
Page count: 15