Online Bayesian Phylodynamic Inference in BEAST with Application to Epidemic Reconstruction

Cited by: 26
Authors
Gill, Mandev S. [1]
Lemey, Philippe [1]
Suchard, Marc A. [2, 3, 4]
Rambaut, Andrew [5, 6]
Baele, Guy [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Microbiol Immunol & Transplantat, Rega Inst, Leuven, Belgium
[2] Univ Calif Los Angeles, David Geffen Sch Med, Dept Human Genet, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Sch Publ Hlth, Dept Biostat, Los Angeles, CA 90024 USA
[4] Univ Calif Los Angeles, David Geffen Sch Med, Dept Biomath, Los Angeles, CA 90095 USA
[5] Univ Edinburgh, Inst Evolutionary Biol, Edinburgh, Midlothian, Scotland
[6] NIH, Fogarty Int Ctr, Bldg 10, Bethesda, MD 20892 USA
Funding
European Research Council; Bill & Melinda Gates Foundation; Wellcome Trust;
Keywords
BEAST; Markov chain Monte Carlo; real-time analysis; Bayesian phylogenetics; pathogen phylodynamics; online inference; CHAIN MONTE-CARLO; PHYLOGENETIC INFERENCE; ZIKA VIRUS; EVOLUTION; TRANSMISSION; SPREAD; MODEL; TIME;
DOI
10.1093/molbev/msaa047
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Reconstructing pathogen dynamics from genetic data as they become available during an outbreak or epidemic represents an important statistical scenario in which observations arrive sequentially in time and one is interested in performing inference in an "online" fashion. Widely used Bayesian phylogenetic inference packages are not set up for this purpose, generally requiring one to recompute trees and evolutionary model parameters de novo when new data arrive. To accommodate increasing data flow in a Bayesian phylogenetic framework, we introduce a methodology to efficiently update the posterior distribution with newly available genetic data. Our procedure is implemented in the BEAST 1.10 software package, and relies on a distance-based measure to insert new taxa into the current estimate of the phylogeny and imputes plausible values for new model parameters to accommodate growing dimensionality. This augmentation creates informed starting values and re-uses optimally tuned transition kernels for posterior exploration of growing data sets, reducing the time necessary to converge to target posterior distributions. We apply our framework to data from the recent West African Ebola virus epidemic and demonstrate a considerable reduction in time required to obtain posterior estimates at different time points of the outbreak. Beyond epidemic monitoring, this framework easily finds other applications within the phylogenetics community, where changes in the data (in terms of alignment changes, sequence addition or removal) present common scenarios that can benefit from online inference.
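The distance-based insertion step mentioned in the abstract can be illustrated with a minimal sketch. This is not the BEAST 1.10 implementation: the `Node` class, the use of a simple p-distance, and the rule of grafting the new taxon as a sibling of its closest existing tip are all assumptions made for illustration. The sketch handles topology only and ignores branch lengths, node heights, and the imputation of new model parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical tree node: tips carry a taxon name, internal nodes carry children."""
    name: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def is_tip(self) -> bool:
        return not self.children


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_taxon(new_seq: str, seqs: Dict[str, str]) -> str:
    """Name of the existing taxon with the smallest p-distance to new_seq."""
    return min(seqs, key=lambda name: p_distance(new_seq, seqs[name]))


def _graft(node: Node, target: str, new_name: str) -> Node:
    """Replace the tip named `target` by a new internal node (target, new tip)."""
    if node.is_tip():
        if node.name == target:
            return Node(children=[node, Node(name=new_name)])
        return node
    node.children = [_graft(child, target, new_name) for child in node.children]
    return node


def insert_taxon(root: Node, new_name: str, new_seq: str, seqs: Dict[str, str]) -> Node:
    """Attach a new taxon next to its closest existing taxon; records its sequence in seqs."""
    target = closest_taxon(new_seq, seqs)
    new_root = _graft(root, target, new_name)
    seqs[new_name] = new_seq  # later insertions can also attach next to this taxon
    return new_root


def newick(node: Node) -> str:
    """Topology-only Newick string (no branch lengths, no trailing semicolon)."""
    if node.is_tip():
        return node.name
    return "(" + ",".join(newick(child) for child in node.children) + ")"


# Toy example: current tree ((A,B),C) and a newly sequenced taxon D.
seqs = {"A": "AAAA", "B": "AAAT", "C": "TTTT"}
tree = Node(children=[Node(children=[Node("A"), Node("B")]), Node("C")])
tree = insert_taxon(tree, "D", "AATT", seqs)
result = newick(tree)
```

In this toy example D differs from B at one of four sites and from A and C at two, so it is grafted next to B. An augmented tree of this kind would serve only as a starting state; the sampler would still have to explore topologies and parameters for the enlarged data set.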
Pages: 1832-1842
Number of pages: 11