Inference of Huge Trees under Maximum Likelihood

Citations: 0
Authors
Izquierdo-Carrasco, Fernando [1]
Stamatakis, Alexandros [1]
Affiliations
[1] Heidelberg Inst Theoret Studies, Heidelberg, Germany
Source
2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW) | 2012
Keywords
Phylogenetic likelihood function; memory vs. runtime trade-offs; memory requirements; RAxML; PHYLOGENETIC INFERENCE; DNA-SEQUENCES; ALGORITHMS;
DOI
10.1109/IPDPSW.2012.309
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The wide adoption of Next-Generation Sequencing technologies in recent years has generated an avalanche of genetic data, which poses new challenges for large-scale maximum likelihood-based phylogenetic analyses. Improving the scalability of search algorithms and reducing the high memory requirements for computing the likelihood represent major computational challenges in this context. We have introduced methods for solving these key problems and provided respective proof-of-concept implementations. Moreover, we have developed a new tree search strategy that can reduce run times by more than 50% while yielding equally good trees (in the statistical sense). To reduce memory requirements, we explored the applicability of external memory (out-of-core) algorithms as well as a concept that trades memory for additional computations in the likelihood function. The latter concept induces only a surprisingly small increase in overall execution times. When 50% of the required RAM is traded for additional computations, the average execution time increases by only 15%. All concepts presented here are sufficiently generic that they can be applied to all programs that rely on the phylogenetic likelihood function. The approaches we have developed will thereby help enable large-scale inference of whole-genome phylogenies.
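The memory-for-computation trade described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation and not RAxML code: the Jukes-Cantor model, the tuple-based tree encoding, the hypothetical `BoundedVectorStore` class, and the least-recently-used eviction policy are all assumptions chosen for brevity. The idea shown is only that inner-node likelihood vectors dropped from a bounded store can be recomputed on demand without changing the likelihood.

```python
import math
from collections import OrderedDict

STATES = "ACGT"

def jc_prob(t):
    """Jukes-Cantor probabilities for branch length t: (no change, specific change)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e

class BoundedVectorStore:
    """Hypothetical store holding at most `capacity` inner-node vectors (LRU eviction)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = OrderedDict()
        self.computations = 0  # number of inner-node vectors computed so far

def clv(node, store):
    """Conditional likelihood vector of a subtree (Felsenstein pruning, one site).

    A leaf is a nucleotide string; an inner node is the tuple
    (name, left, t_left, right, t_right). Names are assumed to identify
    a subtree uniquely, including its branch lengths.
    """
    if isinstance(node, str):
        return [1.0 if s == node else 0.0 for s in STATES]
    name, left, t_left, right, t_right = node
    if name in store.slots:
        store.slots.move_to_end(name)  # mark as recently used
        return store.slots[name]
    vec = [1.0] * 4
    for child, t in ((left, t_left), (right, t_right)):
        same, diff = jc_prob(t)
        cv = clv(child, store)  # recomputed here if it was evicted earlier
        total = sum(cv)
        for i in range(4):
            vec[i] *= same * cv[i] + diff * (total - cv[i])
    store.computations += 1
    store.slots[name] = vec
    if len(store.slots) > store.capacity:
        store.slots.popitem(last=False)  # evict the least recently used vector
    return vec

def site_likelihood(tree, store):
    """Likelihood of one site, with uniform base frequencies at the root."""
    return sum(0.25 * v for v in clv(tree, store))

# Two trees that share the subtrees x and y but differ in the root branch lengths.
x = ("x", "A", 0.1, "C", 0.2)
y = ("y", "A", 0.1, ("z", "G", 0.3, "G", 0.1), 0.2)
tree_a = ("root_a", x, 0.05, y, 0.05)
tree_b = ("root_b", x, 0.10, y, 0.02)

full = BoundedVectorStore(capacity=10)   # every vector fits in memory
small = BoundedVectorStore(capacity=1)   # only one vector is kept at a time

lik_full = [site_likelihood(t, full) for t in (tree_a, tree_b)]
lik_small = [site_likelihood(t, small) for t in (tree_a, tree_b)]

# Same likelihoods, but the small store computes 8 vectors instead of 5.
print(lik_full == lik_small, full.computations, small.computations)
```

In this toy setting the bounded store pays for its smaller footprint with three extra vector computations. The 15% figure quoted in the abstract refers to the authors' own experiments and is not reproduced by this sketch.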
Pages: 2490-2493
Page count: 4