Accurate annotation of protein-coding genes in mitochondrial genomes

Cited by: 29
Authors
Al Arab, Marwa [1 ,2 ,8 ]
zu Siederdissen, Christian Hoener [1 ,2 ,3 ]
Tout, Kifah [8 ]
Sahyoun, Abdullah H. [1 ,2 ,8 ,9 ]
Stadler, Peter F. [1 ,2 ,3 ,4 ,5 ,6 ,7 ]
Bernt, Matthias [1 ,10 ]
Affiliations
[1] Univ Leipzig, Dept Comp Sci, Bioinformat Grp, Hartelstr 16-18, D-04107 Leipzig, Germany
[2] Univ Leipzig, Interdisciplinary Ctr Bioinformat, Hartelstr 16-18, D-04107 Leipzig, Germany
[3] Univ Vienna, Inst Theoret Chem, Wahringerstr 17, A-1090 Vienna, Austria
[4] Max Planck Inst Math Sci, Inselstr 22, D-04103 Leipzig, Germany
[5] Fraunhofer Inst Zelltherapie & Immunol, Perlickstr 1, D-04103 Leipzig, Germany
[6] Univ Copenhagen, Ctr Noncoding RNA Technol & Hlth, Gronnegardsvej 3, DK-1870 Frederiksberg C, Denmark
[7] Santa Fe Inst, 1399 Hyde Pk Rd, Santa Fe, NM 87501 USA
[8] Lebanese Univ, Doctoral Sch Sci & Technol, AZM Ctr Biotechnol Res, Tripoli, Lebanon
[9] Johannes Gutenberg Univ Mainz gGmbH, Univ Med Ctr, TRON Translat Oncol, Mainz, Germany
[10] Univ Leipzig, Parallel Comp & Complex Syst Grp, Dept Comp Sci, Augustuspl 10, D-04103 Leipzig, Germany
Keywords
Protein coding genes; Metazoa; Mitochondrial DNA; Annotation; Hidden Markov models; AUTOMATIC ANNOTATION; SEQUENCE; PHYLOGENY; DNA; TRANSCRIPTS; ALIGNMENTS; DATABASE; TURTLES; BIRDS; CODE;
DOI
10.1016/j.ympev.2016.09.024
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
Mitochondrial genome sequences are available in large numbers, and new sequences are being published at an increasing pace. Fast, automatic, consistent, and high-quality annotations are a prerequisite for downstream analyses. We therefore present an automated pipeline for fast de novo annotation of mitochondrial protein-coding genes. The annotation is based on enhanced, phylogeny-aware hidden Markov models (HMMs). Using an approximation of the phylogeny, the pipeline builds taxon-specific enhanced multiple sequence alignments (MSAs) of already annotated sequences, together with the corresponding HMMs. The MSAs are enhanced by fixing unannotated frameshifts, purging wrong sequences, and removing non-conserved columns from both ends. A comparison with reference annotations highlights the high quality of the results. The frameshift correction method predicts a large number of frameshifts, many of which were previously unknown. A detailed analysis of the frameshifts in nad3 of the Archosauria-Testudines group has been conducted. (C) 2016 Elsevier Inc. All rights reserved.
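One of the enhancement steps named in the abstract, removing non-conserved columns from both ends of an MSA, can be illustrated with a minimal sketch. This is not the authors' implementation: the conservation score (fraction of rows sharing the most common non-gap residue), the `threshold` parameter, and the function names `column_conservation` and `trim_alignment_ends` are all assumptions made for illustration.

```python
from collections import Counter


def column_conservation(column, gap="-"):
    """Fraction of rows that share the most common non-gap residue.

    Assumed scoring scheme; gaps count against conservation.
    """
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def trim_alignment_ends(rows, threshold=0.5):
    """Drop leading and trailing columns scoring below `threshold`.

    Interior columns are kept even if poorly conserved; only the two
    ends of the alignment are trimmed.
    """
    if not rows:
        return []
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all alignment rows must have equal length")

    scores = [column_conservation(col) for col in zip(*rows)]

    # Advance from the left until the first sufficiently conserved column.
    start = 0
    while start < width and scores[start] < threshold:
        start += 1

    # Retreat from the right until the last sufficiently conserved column.
    end = width
    while end > start and scores[end - 1] < threshold:
        end -= 1

    return [r[start:end] for r in rows]


if __name__ == "__main__":
    # Toy protein alignment (hypothetical data).
    msa = ["-AMTKL-Q", "G-MTKLW-", "--MTRL--", "S-MTKL-A"]
    for row in trim_alignment_ends(msa):
        print(row)
```

On the toy alignment, the two ragged columns at each end fall below the assumed threshold and are removed, while the partly variable interior column (K/R) is retained. A real pipeline would likely use a more careful conservation measure and sequence weighting.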
Pages: 209-216
Number of pages: 8