Accurate annotation of protein-coding genes in mitochondrial genomes

Cited by: 30
Authors
Al Arab, Marwa [1, 2, 8]
Hoener zu Siederdissen, Christian [1, 2, 3]
Tout, Kifah [8]
Sahyoun, Abdullah H. [1, 2, 8, 9]
Stadler, Peter F. [1, 2, 3, 4, 5, 6, 7]
Bernt, Matthias [1, 10]
Affiliations
[1] Univ Leipzig, Dept Comp Sci, Bioinformat Grp, Hartelstr 16-18, D-04107 Leipzig, Germany
[2] Univ Leipzig, Interdisciplinary Ctr Bioinformat, Hartelstr 16-18, D-04107 Leipzig, Germany
[3] Univ Vienna, Inst Theoret Chem, Wahringerstr 17, A-1090 Vienna, Austria
[4] Max Planck Inst Math Sci, Inselstr 22, D-04103 Leipzig, Germany
[5] Fraunhofer Inst Zelltherapie & Immunol, Perlickstr 1, D-04103 Leipzig, Germany
[6] Univ Copenhagen, Ctr Noncoding RNA Technol & Hlth, Gronnegardsvej 3, DK-1870 Frederiksberg C, Denmark
[7] Santa Fe Inst, 1399 Hyde Pk Rd, Santa Fe, NM 87501 USA
[8] Lebanese Univ, Doctoral Sch Sci & Technol, AZM Ctr Biotechnol Res, Tripoli, Lebanon
[9] Johannes Gutenberg Univ Mainz gGmbH, Univ Med Ctr, TRON Translat Oncol, Mainz, Germany
[10] Univ Leipzig, Parallel Comp & Complex Syst Grp, Dept Comp Sci, Augustuspl 10, D-04103 Leipzig, Germany
Keywords
Protein coding genes; Metazoa; Mitochondrial DNA; Annotation; Hidden Markov models; AUTOMATIC ANNOTATION; SEQUENCE; PHYLOGENY; DNA; TRANSCRIPTS; ALIGNMENTS; DATABASE; TURTLES; BIRDS; CODE;
DOI
10.1016/j.ympev.2016.09.024
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Mitochondrial genome sequences are available in large numbers, and new sequences are being published at an increasing pace. Fast, automatic, consistent, and high-quality annotations are a prerequisite for downstream analyses. We therefore present an automated pipeline for fast de novo annotation of mitochondrial protein-coding genes. The annotation is based on enhanced phylogeny-aware hidden Markov models (HMMs). Using an approximation of the phylogeny, the pipeline builds taxon-specific enhanced multiple sequence alignments (MSAs) of already annotated sequences, together with the corresponding HMMs. The MSAs are enhanced by fixing unannotated frameshifts, purging wrong sequences, and removing non-conserved columns from both ends. A comparison with reference annotations highlights the high quality of the results. The frameshift correction method predicts a large number of frameshifts, many of which were previously unknown. A detailed analysis of the frameshifts in nad3 of the Archosauria-Testudines group has been conducted. (C) 2016 Elsevier Inc. All rights reserved.
Pages: 209-216
Number of pages: 8