2x genomes - depth does matter

Cited by: 43
Authors
Milinkovitch, Michel C. [1]
Helaers, Raphael [2]
Depiereux, Eric [2]
Tzika, Athanasia C. [1, 3]
Gabaldon, Toni [4]
Affiliations
[1] Dept Zool & Anim Biol, LANE, CH-1211 Geneva 4, Switzerland
[2] Fac Univ Notre Dame Paix, Dept Biol, B-5000 Namur, Belgium
[3] Univ Libre Bruxelles, Dept Ecol & Evolutionary Biol, B-1050 Brussels, Belgium
[4] CRG, Barcelona 08003, Spain
Funding
Swiss National Science Foundation
Keywords
TREE; IDENTIFICATION; DUPLICATION; ALGORITHM; SEQUENCE; TIME
DOI
10.1186/gb-2010-11-2-r16
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Given the availability of full genome sequences, mapping gene gains, duplications, and losses during evolution should theoretically be straightforward. However, this endeavor suffers from overemphasis on detecting conserved genome features, which in turn has led to sequencing multiple eutherian genomes with low coverage rather than fewer genomes with high coverage and more even distribution in the phylogeny. Although limitations associated with analysis of low-coverage genomes are recognized, they have not been quantified.
Results: Here, using recently developed comparative genomic application systems, we evaluate the impact of low-coverage genomes on inferences pertaining to gene gains and losses when analyzing eukaryote genome evolution through gene duplication. We demonstrate that, when performing inference of genome content evolution, low-coverage genomes generate not only a massive number of false gene losses, but also striking artifacts in gene duplication inference, especially at the most recent common ancestor of low-coverage genomes. We show that the artifactual gains are caused by the low coverage of genome sequence per se rather than by the increased taxon sampling in a biased portion of the species tree.
Conclusions: We argue that it will remain difficult to differentiate artifacts from true changes in modes and tempo of genome evolution until there is better homogeneity in both taxon sampling and high-coverage sequencing. This is important for broadening the utility of full genome data to the community of evolutionary biologists, whose interests go well beyond widely conserved physiologies and developmental patterns as they seek to understand the generative mechanisms underlying biological diversity.
Pages: 12