Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods

Times cited: 11
Authors
Duchene, Sebastian [1 ]
Duchene, David A. [2 ]
Geoghegan, Jemma L. [3 ]
Dyson, Zoe A. [1 ]
Hawkey, Jane [1 ]
Holt, Kathryn E. [1 ]
Affiliations
[1] Univ Melbourne, Mol Sci & Biotechnol Inst Bio21, Dept Biochem & Mol Biol, Parkville, Vic 3020, Australia
[2] Univ Sydney, Sch Life & Environm Sci, Sydney, NSW 2006, Australia
[3] Macquarie Univ, Dept Biol Sci, Sydney, NSW 2109, Australia
Source
BMC EVOLUTIONARY BIOLOGY | 2018 / Vol. 18
Funding
Medical Research Council (UK); National Health and Medical Research Council (Australia); Wellcome Trust (UK);
Keywords
Bayesian phylogenetics; Phylodynamics; Molecular clock; Bacterial evolution; ESTIMATING EVOLUTIONARY RATES; EPIDEMIC SPREAD; TRANSMISSION; PERFORMANCE; INFERENCE; HISTORY; MODELS; HIV;
DOI
10.1186/s12862-018-1210-5
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Background: Recent developments in sequencing technologies make it possible to obtain genome sequences from a large number of isolates in a very short time. Bayesian phylogenetic approaches can take advantage of these data by simultaneously inferring the phylogenetic tree, evolutionary timescale, and demographic parameters (such as population growth rates), while naturally integrating uncertainty in all parameters. Despite their desirable properties, Bayesian approaches can be computationally intensive, hindering their use for outbreak investigations involving genome data for a large number of pathogen isolates. An alternative to using full Bayesian inference is to use a hybrid approach, where the phylogenetic tree and evolutionary timescale are estimated first using maximum likelihood. Under this hybrid approach, demographic parameters are inferred from estimated trees instead of the sequence data, using maximum likelihood, Bayesian inference, or approximate Bayesian computation. This can vastly reduce the computational burden, but has the disadvantage of ignoring the uncertainty in the phylogenetic tree and evolutionary timescale.
Results: We compared the performance of a fully Bayesian and a hybrid method by analysing six whole-genome SNP data sets from a range of bacteria and simulations. The estimates from the two methods were very similar, suggesting that the hybrid method is a valid alternative for very large datasets. However, we also found that congruence between these methods is contingent on the presence of strong temporal structure in the data (i.e. clocklike behaviour), which is typically verified using a date-randomisation test in a Bayesian framework. To reduce the computational burden of this Bayesian test we implemented a date-randomisation test using a rapid maximum likelihood method, which has similar performance to its Bayesian counterpart.
Conclusions: Hybrid approaches can produce reliable inferences of evolutionary timescales and phylodynamic parameters in a fraction of the time required for fully Bayesian analyses. As such, they are a valuable alternative in outbreak studies involving a large number of isolates.
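The date-randomisation test mentioned in the Results can be illustrated with a minimal Python sketch. It is not the implementation used in the paper. As a stand-in for the maximum likelihood rate estimator, it assumes a simple least-squares regression of root-to-tip distance on sampling date. The function names, the decision rule (the observed rate must fall outside the range of rates from shuffled dates), and the toy data are all assumptions made for illustration.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def date_randomisation_test(dates, distances, n_reps=100, seed=1):
    """Compare the observed rate with rates obtained after shuffling dates.

    Assumption: the rate is approximated by the regression slope of
    root-to-tip distance on sampling date (hypothetical stand-in for a
    proper dating method).
    """
    rng = random.Random(seed)
    observed = slope(dates, distances)
    null_rates = []
    for _ in range(n_reps):
        shuffled = list(dates)
        rng.shuffle(shuffled)  # break the link between dates and sequences
        null_rates.append(slope(shuffled, distances))
    # Assumed criterion: the observed rate lies outside the null range.
    has_signal = observed > max(null_rates) or observed < min(null_rates)
    return observed, null_rates, has_signal


# Toy data (invented): 12 isolates sampled yearly, perfectly clocklike.
toy_dates = list(range(2000, 2012))
toy_distances = [1e-3 * (d - 1990) for d in toy_dates]
rate, null_rates, has_signal = date_randomisation_test(toy_dates, toy_distances)
print(rate, has_signal)
```

In practice each randomised replicate would be re-analysed with the same dating method as the real data; the regression here only keeps the sketch short.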
Pages: 11