Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic

Cited by: 224
Authors
Turakhia, Yatish [1, 2]
Thornlow, Bryan [1, 2]
Hinrichs, Angie S. [2]
De Maio, Nicola [3]
Gozashti, Landen [1, 2, 4, 5]
Lanfear, Robert [6]
Haussler, David [1, 2, 7]
Corbett-Detig, Russell [1, 2, 8]
Affiliations
[1] Univ Calif Santa Cruz, Dept Biomol Engn, Santa Cruz, CA 95064 USA
[2] Univ Calif Santa Cruz, Genom Inst, Santa Cruz, CA 95064 USA
[3] Wellcome Genome Campus, European Mol Biol Lab, European Bioinformat Inst, Cambridge, England
[4] Harvard Univ, Dept Organism & Evolutionary Biol, Cambridge, MA 02138 USA
[5] Harvard Univ, Museum Comparat Zool, Cambridge, MA 02138 USA
[6] Australian Natl Univ, Res Sch Biol, Dept Ecol & Evolut, Canberra, ACT, Australia
[7] Univ Calif Santa Cruz, Howard Hughes Med Inst, Santa Cruz, CA 95064 USA
[8] Natl Res Univ Higher Sch Econ, Moscow, Russia
Funding
Australian Research Council
Keywords
ACCURATE; BRANCHES;
DOI
10.1038/s41588-021-00862-7
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Ultrafast Sample placement on Existing tRees (UShER) is an efficient method that facilitates the addition of new SARS-CoV-2 genome sequences onto the existing phylogeny, aiding in real-time analysis of viral evolution during the COVID-19 pandemic. As the SARS-CoV-2 virus spreads through human populations, the unprecedented accumulation of viral genome sequences is ushering in a new era of 'genomic contact tracing', that is, using viral genomes to trace local transmission dynamics. However, because the viral phylogeny is already so large (and will undoubtedly grow many fold), placing new sequences onto the tree has emerged as a barrier to real-time genomic contact tracing. Here, we resolve this challenge by building an efficient tree-based data structure encoding the inferred evolutionary history of the virus. We demonstrate that our approach greatly improves the speed of phylogenetic placement of new samples and data visualization, making it possible to complete the placements under the constraints of real-time contact tracing. Thus, our method addresses an important need for maintaining a fully updated reference phylogeny. We make these tools available to the research community through the University of California Santa Cruz SARS-CoV-2 Genome Browser to enable rapid cross-referencing of information in new virus sequences with an ever-expanding array of molecular and structural biology data. The methods described here will empower research and genomic contact tracing for SARS-CoV-2 specifically for laboratories worldwide.
Pages: 809 / +
Page count: 22