The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes

Cited by: 15
Authors
Estill, James C. [1]
Bennetzen, Jeffrey L. [2]
Affiliations
[1] Univ Georgia, Dept Plant Biol, Athens, GA 30602 USA
[2] Univ Georgia, Dept Genet, Athens, GA 30602 USA
Keywords
DE-NOVO IDENTIFICATION; DATABASE; PREDICTION; ALIGNMENT; SEQUENCE; PROGRAM; FAMILIES; VISUALIZATION; RESOURCE; BROWSER;
DOI
10.1186/1746-4811-5-8
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: High quality annotation of the genes and transposable elements in complex genomes requires a human-curated integration of multiple sources of computational evidence. These evidences include results from a diversity of ab initio prediction programs as well as homology-based searches. Most of these programs operate on a single contiguous sequence at a time, and the results are generated in a diverse array of readable formats that must be translated to a standardized file format. These translated results must then be concatenated into a single source, and then presented in an integrated form for human curation.
Results: We have designed, implemented, and assessed a Perl-based workflow named DAWGPAWS for the generation of computational results for human curation of the genes and transposable elements in plant genomes. The use of DAWGPAWS was found to accelerate annotation of 80-200 kb wheat DNA inserts in bacterial artificial chromosome (BAC) vectors by approximately twenty-fold and to also significantly improve the quality of the annotation in terms of completeness and accuracy.
Conclusion: The DAWGPAWS genome annotation pipeline fills an important need in the annotation of plant genomes by generating computational evidences in a high throughput manner, translating these results to a common file format, and facilitating the human curation of these computational results. We have verified the value of DAWGPAWS by using this pipeline to annotate the genes and transposable elements in 220 BAC insertions from the hexaploid wheat genome (Triticum aestivum L.). DAWGPAWS can be applied to annotation efforts in other plant genomes with minor modifications of program-specific configuration files, and the modular design of the workflow facilitates integration into existing pipelines.
Pages: 11