Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data

Cited by: 62
Authors
Al-Nakeeb, Kosai [1]
Petersen, Thomas Nordahl [1]
Sicheritz-Ponten, Thomas [1]
Affiliations
[1] Tech Univ Denmark, Dept Bio & Hlth Informat, Bldg 208, DK-2800 Lyngby, Denmark
Keywords
Mitochondrial DNA; K-mer; Next-generation sequencing; De novo assembly; ALIGNMENT; CELL
DOI
10.1186/s12859-017-1927-y
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Whole-genome sequencing (WGS) projects provide short-read nucleotide sequences from nuclear and, depending on the source, organelle DNA. Mitochondrial DNA is present in animals and fungi, while plants contain DNA from both mitochondria and chloroplasts. Current techniques for separating organelle reads from nuclear reads in WGS data require full reference sequences or partial seed sequences for assembly. Results: Norgal (de Novo ORGAneLle extractor) avoids this requirement by identifying a high-frequency subset of k-mers that are predominantly of mitochondrial origin and performing a de novo assembly on the subset of reads that contain these k-mers. The method was applied to WGS data from a panda, a brown alga (seaweed), a butterfly and a filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences ranging from 98.5 to 99.5%. We also assembled the chloroplasts of grape vines and cucumbers using Norgal together with seed-based de novo assemblers. Conclusion: Norgal is a pipeline that can extract and assemble full or partial mitochondrial and chloroplast genomes from WGS short reads without prior knowledge. The program is available at: https://bitbucket.org/kosaidtu/norgal.
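The read-selection idea in the abstract (keep only reads that contain unusually frequent k-mers, then assemble that subset) can be illustrated with a minimal Python sketch. This is not Norgal's implementation: the function names, the toy reads, and the fixed `min_count` threshold are all hypothetical, reverse complements are ignored, and the assembly step is left out. How Norgal actually chooses its frequency cutoff is not stated in the abstract.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def select_high_depth_reads(reads, k, min_count):
    """Return the reads that contain at least one k-mer seen min_count or more times.

    min_count is a hypothetical, hand-picked cutoff used only for illustration.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    frequent = {km for km, n in counts.items() if n >= min_count}
    return [r for r in reads if any(km in frequent for km in kmers(r, k))]


# Toy data: three overlapping "organelle" reads sequenced at 5x depth,
# plus three single-copy "nuclear" reads.
organelle_reads = ["ACGTTGCA", "CGTTGCAA", "GTTGCAAG"]
nuclear_reads = ["TTTTCCCC", "GGGGAAAA", "CACACTCT"]
reads = organelle_reads * 5 + nuclear_reads

selected = select_high_depth_reads(reads, k=5, min_count=5)
print(len(selected), "of", len(reads), "reads kept")
```

In this toy case the high-copy reads are kept and the single-copy reads are discarded; a real dataset would need a cutoff derived from the observed depth distribution and a downstream assembler.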
Pages: 7
Related papers
18 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]  
AQUADRO CF, 1983, GENETICS, V103, P287
[3]  
Bushnell B, BBMap short read aligner, and other bioinformatic tools
[4]  
CAMACHO C, 2009, BMC BIOINFORMATICS, V10, DOI 10.1186/1471-2105-10-421
[5]  
Cormode G, 2004, IMPROVED DATA STREAM, P29
[6]   NOVOPlasty: de novo assembly of organelle genomes from whole genome data [J].
Dierckxsens, Nicolas ;
Mardulyn, Patrick ;
Smits, Guillaume .
NUCLEIC ACIDS RESEARCH, 2017, 45 (04)
[7]   Complete mitochondrial genome of the Oriental Hornet, Vespa orientalis F. (Hymenoptera: Vespidae) [J].
Haddad, Nizar Jamal ;
Al-Nakeeb, Kosai ;
Petersen, Bent ;
Dalen, Love ;
Blom, Nikolaj ;
Sicheritz-Ponten, Thomas .
MITOCHONDRIAL DNA PART B-RESOURCES, 2017, 2 (01) :139-140
[8]   Reconstructing mitochondrial genomes directly from genomic next-generation sequencing reads-a baiting and iterative mapping approach [J].
Hahn, Christoph ;
Bachmann, Lutz ;
Chevreux, Bastien .
NUCLEIC ACIDS RESEARCH, 2013, 41 (13) :e129
[9]  
Heldt H.-W., 2011, 20 PLANT CELL HAS 3, V4th ed., P487, DOI 10.1016/B978-0-12-384986-1.00020-X
[10]   Quake: quality-aware detection and correction of sequencing errors [J].
Kelley, David R. ;
Schatz, Michael C. ;
Salzberg, Steven L. .
GENOME BIOLOGY, 2010, 11 (11)