Cophylogeny Reconstruction via an Approximate Bayesian Computation

Cited by: 26
Authors
Baudet, C. [1, 2, 3, 4]
Donati, B. [1, 2, 3, 4, 5]
Sinaimeri, B. [1, 2, 3, 4]
Crescenzi, P. [5]
Gautier, C. [1, 2, 3, 4]
Matias, C. [6, 7]
Sagot, M.-F. [1, 2, 3, 4]
Affiliations
[1] INRIA Grenoble Rhone Alpes, F-38330 Montbonnot St Martin, France
[2] Univ Lyon, F-69000 Lyon, France
[3] Univ Lyon 1, F-69622 Villeurbanne, France
[4] CNRS, Lab Biometrie & Biol Evolut, UMR5558, F-69622 Villeurbanne, France
[5] Univ Florence, Dipartimento Sistemi & Informat, I-50134 Florence, Italy
[6] Univ Evry, UMR CNRS 8071, Lab Stat & Genome, Evry, France
[7] Univ Evry, USC INRA, Evry, France
Funding
European Research Council
Keywords
approximate Bayesian computation; cophylogeny; host/parasite systems; likelihood-free inference; RECONCILIATION; HOST; IDENTIFICATION; DUPLICATIONS; ALGORITHMS; DIVERSITY; EVOLUTION; LINEAGE; HISTORY; GENES
DOI
10.1093/sysbio/syu129
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Despite an increasingly vast literature on cophylogenetic reconstructions for studying host-parasite associations, understanding the common evolutionary history of such systems remains a problem that is far from being solved. Most algorithms for host-parasite reconciliation use an event-based model, where the events include in general (a subset of) cospeciation, duplication, loss, and host switch. All known parsimonious event-based methods then assign a cost to each type of event in order to find a reconstruction of minimum cost. The main problem with this approach is that the cost of the events strongly influences the reconciliation obtained. Some earlier approaches attempt to avoid this problem by finding a Pareto set of solutions and hence by considering event costs under some minimization constraints. To deal with this problem, we developed an algorithm, called Coala, for estimating the frequency of the events based on an approximate Bayesian computation approach. The benefits of this method are 2-fold: (i) it provides more confidence in the set of costs to be used in a reconciliation, and (ii) it allows estimation of the frequency of the events in cases where the data set consists of trees with a large number of taxa. We evaluate our method on simulated and on biological data sets. We show that in both cases, for the same pair of host and parasite trees, different sets of frequencies for the events lead to equally probable solutions. Moreover, often these solutions differ greatly in terms of the number of inferred events. It appears crucial to take this into account before attempting any further biological interpretation of such reconciliations. More generally, we also show that the set of frequencies can vary widely depending on the input host and parasite trees. Indiscriminately applying a standard vector of costs may thus not be a good strategy.
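The likelihood-free idea in the abstract can be illustrated with a minimal rejection-style approximate Bayesian computation sketch. This is not Coala: the abstract does not describe its simulator, summary statistics, or distance, so everything below is an assumption made for illustration. The toy "simulator" only draws event counts from a vector of event probabilities, the prior is a uniform Dirichlet, the distance is an L1 distance between event frequencies, and all function names are hypothetical.

```python
import random

# Event types named in the abstract; the order is arbitrary.
EVENTS = ("cospeciation", "duplication", "host_switch", "loss")


def sample_prior(rng):
    """Draw event probabilities from a uniform Dirichlet (assumed prior)."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in EVENTS]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_event_counts(probs, n_events, rng):
    """Toy stand-in for a cophylogeny simulator: draw n_events event labels."""
    counts = [0] * len(EVENTS)
    for idx in rng.choices(range(len(EVENTS)), weights=probs, k=n_events):
        counts[idx] += 1
    return counts


def distance(counts_a, counts_b):
    """L1 distance between event frequencies (assumed summary distance)."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    return sum(abs(a / total_a - b / total_b) for a, b in zip(counts_a, counts_b))


def abc_rejection(observed_counts, n_draws=2000, keep_fraction=0.05, seed=0):
    """Keep the prior draws whose simulated data lie closest to the observation."""
    rng = random.Random(seed)
    n_events = sum(observed_counts)
    scored = []
    for _ in range(n_draws):
        probs = sample_prior(rng)
        simulated = simulate_event_counts(probs, n_events, rng)
        scored.append((distance(simulated, observed_counts), probs))
    scored.sort(key=lambda pair: pair[0])
    n_keep = max(1, round(n_draws * keep_fraction))
    return [probs for _, probs in scored[:n_keep]]


def posterior_mean(accepted):
    """Average each event probability over the accepted draws."""
    return [sum(p[i] for p in accepted) / len(accepted) for i in range(len(EVENTS))]


if __name__ == "__main__":
    observed = [70, 10, 15, 5]  # made-up event counts, for illustration only
    accepted = abc_rejection(observed)
    for name, value in zip(EVENTS, posterior_mean(accepted)):
        print(f"{name}: {value:.2f}")
```

The accepted draws approximate a posterior over event probabilities rather than a single cost vector, which is the kind of output the abstract argues should be inspected before interpreting a reconciliation.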
Pages: 416-431
Page count: 16