The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes

Citations: 15
Authors
Estill, James C. [1 ]
Bennetzen, Jeffrey L. [2 ]
Affiliations
[1] Univ Georgia, Dept Plant Biol, Athens, GA 30602 USA
[2] Univ Georgia, Dept Genet, Athens, GA 30602 USA
Keywords
DE-NOVO IDENTIFICATION; DATABASE; PREDICTION; ALIGNMENT; SEQUENCE; PROGRAM; FAMILIES; VISUALIZATION; RESOURCE; BROWSER;
DOI
10.1186/1746-4811-5-8
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: High quality annotation of the genes and transposable elements in complex genomes requires a human-curated integration of multiple sources of computational evidence. These evidences include results from a diversity of ab initio prediction programs as well as homology-based searches. Most of these programs operate on a single contiguous sequence at a time, and the results are generated in a diverse array of readable formats that must be translated to a standardized file format. These translated results must then be concatenated into a single source, and then presented in an integrated form for human curation.
Results: We have designed, implemented, and assessed a Perl-based workflow named DAWGPAWS for the generation of computational results for human curation of the genes and transposable elements in plant genomes. The use of DAWGPAWS was found to accelerate annotation of 80-200 kb wheat DNA inserts in bacterial artificial chromosome (BAC) vectors by approximately twenty-fold and to also significantly improve the quality of the annotation in terms of completeness and accuracy.
Conclusion: The DAWGPAWS genome annotation pipeline fills an important need in the annotation of plant genomes by generating computational evidences in a high throughput manner, translating these results to a common file format, and facilitating the human curation of these computational results. We have verified the value of DAWGPAWS by using this pipeline to annotate the genes and transposable elements in 220 BAC insertions from the hexaploid wheat genome (Triticum aestivum L.). DAWGPAWS can be applied to annotation efforts in other plant genomes with minor modifications of program-specific configuration files, and the modular design of the workflow facilitates integration into existing pipelines.
Pages: 11
Related papers
50 in total
  • [31] Deciphering the organelle genomes and transcriptomes of a common ornamental plant Ligustrum quihoui reveals multiple fragments of transposable elements in the mitogenome
    Yu, Xiaolei
    Jiang, Weiling
    Tan, Wei
    Zhang, Xiaoying
    Tian, Xiaoxuan
    INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES, 2020, 165 (165) : 1988 - 1999
  • [32] Resolution of structural variation in diverse mouse genomes reveals chromatin remodeling due to transposable elements
    Ferraj, Ardian
    Audano, Peter A.
    Balachandran, Parithi
    Czechanski, Anne
    Flores, Jacob I.
    Radecki, Alexander A.
    Mosur, Varun
    Gordon, David S.
    Walawalkar, Isha A.
    Eichler, Evan E.
    Reinholdt, Laura G.
    Beck, Christine R.
    CELL GENOMICS, 2023, 3 (05):
  • [33] Annotation of transposable elements in the transcriptome of the Neotropical brown stink bug Euschistus heros and its chromosomal distribution
    Dionisio, Jaqueline Fernanda
    Pezenti, Larissa Forim
    de Souza, Rogerio Fernandes
    Sosa-Gomez, Daniel Ricardo
    da Rosa, Renata
    MOLECULAR GENETICS AND GENOMICS, 2023, 298 (06) : 1377 - 1388
  • [34] MITE Tracker: an accurate approach to identify miniature inverted-repeat transposable elements in large genomes
    Crescente, Juan Manuel
    Zavallo, Diego
    Helguera, Marcelo
    Vanzetti, Leonardo Sebastian
    BMC BIOINFORMATICS, 2018, 19
  • [35] Transcriptional Activity, Chromosomal Distribution and Expression Effects of Transposable Elements in Coffea Genomes
    Lopes, Fabricio R.
    Jjingo, Daudi
    da Silva, Carlos R. M.
    Andrade, Alan C.
    Marraccini, Pierre
    Teixeira, Joao B.
    Carazzolle, Marcelo F.
    Pereira, Goncalo A. G.
    Pereira, Luiz Filipe P.
    Vanzela, Andre L. L.
    Wang, Lu
    King Jordan, I.
    Carareto, Claudia M. A.
    PLOS ONE, 2013, 8 (11):
  • [36] Genomic Plasticity Mediated by Transposable Elements in the Plant Pathogenic Fungus Colletotrichum higginsianum
    Tsushima, Ayako
    Gan, Pamela
    Kumakura, Naoyoshi
    Narusaka, Mari
    Takano, Yoshitaka
    Narusaka, Yoshihiro
    Shirasu, Ken
    GENOME BIOLOGY AND EVOLUTION, 2019, 11 (05) : 1487 - 1500
  • [37] detectMITE: A novel approach to detect miniature inverted repeat transposable elements in genomes
    Ye, Congting
    Ji, Guoli
    Liang, Chun
    SCIENTIFIC REPORTS, 2016, 6
  • [38] Scanning of Transposable Elements and Analyzing Expression of Transposase Genes of Sweet Potato [Ipomoea batatas]
    Yan, Lang
    Gu, Ying-Hong
    Tao, Xiang
    Lai, Xian-Jun
    Zhang, Yi-Zheng
    Tan, Xue-Mei
    Wang, Haiyan
    PLOS ONE, 2014, 9 (03):
  • [39] Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs
    Lerat, E.
    HEREDITY, 2010, 104 (06) : 520 - 533
  • [40] Semi-Supervised Pipeline for Autonomous Annotation of SARS-CoV-2 Genomes
    Beck, Kristen L.
    Seabolt, Edward
    Agarwal, Akshay
    Nayar, Gowri
    Bianco, Simone
    Krishnareddy, Harsha
    Ngo, Timothy A.
    Kunitomi, Mark
    Mukherjee, Vandana
    Kaufman, James H.
    VIRUSES-BASEL, 2021, 13 (12):