A New Orthology Assessment Method for Phylogenomic Data: Unrooted Phylogenetic Orthology

Cited by: 53
Authors
Ballesteros, Jesus A. [1]
Hormiga, Gustavo [1]
Affiliations
[1] George Washington Univ, Dept Biol Sci, Washington, DC 20052 USA
Funding
National Science Foundation (USA);
Keywords
Markov cluster; protein homology; spiders; Araneae; transcriptomics; genomics; COMPARATIVE TRANSCRIPTOMICS; QUALITY ASSESSMENT; TREE; EVOLUTION; GENOMICS; SYSTEMATICS; PREDICTION; SEQUENCES; ALGORITHM; ALIGNMENT;
DOI
10.1093/molbev/msw069
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Current sequencing technologies are making available unprecedented amounts of genetic data for a large variety of species, including nonmodel organisms. Although many phylogenomic surveys spend considerable time finding orthologs from the wealth of sequence data, these results do not transcend the original study, and after being processed for specific phylogenetic purposes these orthologs do not become stable orthology hypotheses. We describe a procedure to detect and document the phylogenetic distribution of orthologs, allowing researchers to use this information to guide the selection of loci best suited to test specific evolutionary questions. At the core of this pipeline is a new phylogenetic orthology method that neither is affected by the position of the root nor requires explicit assignment of outgroups. We discuss the properties of this new orthology assessment method and exemplify its utility for phylogenomics using a small insect dataset. In addition, we apply the pipeline to identify and document stable orthologs for the group of orb-weaving spiders (Araneoidea) using RNAseq data. The scripts used in this study, along with sample files and additional documentation, are available at https://github.com/ballesterus/UPhO.
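The root-independent idea in the abstract can be illustrated with a minimal sketch. It is not the UPhO implementation. It assumes that an unrooted gene-family tree is given as an adjacency dictionary, that leaf names follow a hypothetical `species|sequence_id` convention, and that a candidate orthogroup is the set of leaves on one side of a branch in which no species occurs twice. The function names and the `min_taxa` parameter are illustrative, and in-paralog handling is omitted.

```python
from collections import Counter


def leaves_on_side(adj, start, blocked):
    """Return the leaves reachable from `start` without passing through `blocked`."""
    seen, stack, leaves = {blocked, start}, [start], set()
    while stack:
        node = stack.pop()
        if len(adj[node]) == 1:  # degree-1 nodes are leaves
            leaves.add(node)
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return leaves


def species_of(leaf):
    # Assumed naming convention: "species|sequence_id"
    return leaf.split("|")[0]


def candidate_orthogroups(adj, min_taxa=3):
    """Evaluate both sides of every branch; no root or outgroup is needed."""
    candidates = set()
    for u in adj:
        for v in adj[u]:
            side = leaves_on_side(adj, u, v)
            counts = Counter(species_of(leaf) for leaf in side)
            if len(side) >= min_taxa and max(counts.values()) == 1:
                candidates.add(frozenset(side))
    # Keep only the most inclusive candidates
    return [c for c in candidates if not any(c < other for other in candidates)]


# Toy gene family with one duplication across species A, B and C
adj = {
    "A|1": ["n1"], "B|1": ["n1"], "C|1": ["n2"],
    "A|2": ["n4"], "B|2": ["n4"], "C|2": ["n3"],
    "n1": ["A|1", "B|1", "n2"],
    "n2": ["n1", "C|1", "n3"],
    "n3": ["n2", "C|2", "n4"],
    "n4": ["n3", "A|2", "B|2"],
}
groups = candidate_orthogroups(adj, min_taxa=3)
```

Because each branch is tested from both of its ends, the result does not depend on where the tree would be rooted; in this toy tree the two copies of the duplicated gene are recovered as separate groups.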
Pages: 2117-2134
Page count: 18