Identification of transposable elements using multiple alignments of related genomes

Cited by: 38
Authors
Caspi, A [1]
Pachter, L
Affiliations
[1] Univ Calif San Francisco, Univ Calif Berkeley, Joint Grad Grp Bioengn, Portland, OR 97210 USA
[2] Univ Calif Berkeley, Dept Math, Berkeley, CA 94720 USA
DOI
10.1101/gr.4361206
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Accurate genome-wide cataloging of transposable elements (TEs) will facilitate our understanding of mobile DNA evolution, expose the genomic effects of TEs on the host genome, and improve the quality of assembled genomes. Using the availability of several nearly complete Drosophila genomes and developments in whole genome alignment methods, we introduce a large-scale comparative method for identifying repetitive mobile DNA regions. These regions are highly enriched for transposable elements. Our method has two main features distinguishing it from other repeat-finding methods. First, rather than relying on sequence similarity to determine the location of repeats, the genomic artifacts of the transposition mechanism itself are systematically tracked in the context of multiple alignments. Second, we can derive bounds on the age of each repeat instance based on the phylogenetic species tree. We report results obtained using both complete and draft sequences of four closely related Drosophila genomes and validate our results with manually curated TE annotations in the Drosophila melanogaster euchromatin. We show the utility of our findings in exploring both transposable elements and their host genomes: In the study of TEs, we offer predictions for novel families, annotate new insertions of known families, and show data that support the hypothesis that all known TE families in D. melanogaster were recently active; in the study of the host, we show how our findings can be used to determine shifts in the eu-heterochromatin junction in the pericentric chromosome regions.
Pages: 260-270
Page count: 11