Considering Transposable Element Diversification in De Novo Annotation Approaches

Cited by: 324
Authors
Flutre, Timothee [1]
Duprat, Elodie [2]
Feuillet, Catherine [3]
Quesneville, Hadi [1]
Affiliations
[1] INRA Ctr Versailles Grignon, Unite Rech Genom Info, UR 1164, Versailles, France
[2] Univ Paris Diderot, Inst Mineral & Phys Milieux Condenses, IPGP, UPMC,CNRS,UMR 7590, Paris, France
[3] INRA Domaine Crouel, UMR 1095, Clermont Ferrand, France
Keywords
MULTIPLE SEQUENCE ALIGNMENT; GENOME SEQUENCE; TANDEM REPEATS; IDENTIFICATION; CLASSIFICATION; DNA; EVOLUTION; TRANSPOSITION; FAMILIES; EFFICIENT;
DOI
10.1371/journal.pone.0016526
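Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]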
Discipline classification codes
07; 0710; 09
Abstract
Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have become available, making possible comparative studies of TE dynamics at an unprecedented scale. Several methods have been proposed for the de novo identification of TEs in sequenced genomes. Most begin with the detection of genomic repeats, but the subsequent steps for defining TE families differ. High-quality TE annotations are available for the Drosophila melanogaster and Arabidopsis thaliana genome sequences, providing a solid basis for the benchmarking of such methods. We compared the performance of specific algorithms for the clustering of interspersed repeats and found that only a particular combination of algorithms detected TE families with good recovery of the reference sequences. We then applied a new procedure for reconciling the different clustering results and classifying TE sequences. The whole approach was implemented in a pipeline using the REPET package. Finally, we show that our combined approach highlights the dynamics of well-defined TE families by making it possible to identify structural variations among their copies. This approach makes it possible to annotate TE families and to study their diversification in a single analysis, improving our understanding of TE dynamics at the whole-genome scale and for diverse species.
Pages: 15