The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes

Cited by: 15
Authors
Estill, James C. [1 ]
Bennetzen, Jeffrey L. [2 ]
Affiliations
[1] Univ Georgia, Dept Plant Biol, Athens, GA 30602 USA
[2] Univ Georgia, Dept Genet, Athens, GA 30602 USA
Keywords
DE-NOVO IDENTIFICATION; DATABASE; PREDICTION; ALIGNMENT; SEQUENCE; PROGRAM; FAMILIES; VISUALIZATION; RESOURCE; BROWSER;
DOI
10.1186/1746-4811-5-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: High-quality annotation of the genes and transposable elements in complex genomes requires a human-curated integration of multiple sources of computational evidence. This evidence includes results from a variety of ab initio prediction programs as well as homology-based searches. Most of these programs operate on a single contiguous sequence at a time, and they generate results in a diverse array of readable formats that must be translated to a standardized file format. These translated results must then be concatenated into a single source and presented in an integrated form for human curation.
Results: We have designed, implemented, and assessed a Perl-based workflow named DAWGPAWS for the generation of computational results for human curation of the genes and transposable elements in plant genomes. The use of DAWGPAWS was found to accelerate annotation of 80-200 kb wheat DNA inserts in bacterial artificial chromosome (BAC) vectors by approximately twenty-fold and also to significantly improve the quality of the annotation in terms of completeness and accuracy.
Conclusion: The DAWGPAWS genome annotation pipeline fills an important need in the annotation of plant genomes by generating computational evidence in a high-throughput manner, translating these results to a common file format, and facilitating the human curation of these computational results. We have verified the value of DAWGPAWS by using this pipeline to annotate the genes and transposable elements in 220 BAC insertions from the hexaploid wheat genome (Triticum aestivum L.). DAWGPAWS can be applied to annotation efforts in other plant genomes with minor modifications of program-specific configuration files, and the modular design of the workflow facilitates integration into existing pipelines.
Pages: 11