mInDel: a high-throughput and efficient pipeline for genome-wide InDel marker development

Cited by: 25
Authors
Lv, Yuanda [1]
Liu, Yuhe [2]
Zhao, Han [1]
Affiliations
[1] Jiangsu Acad Agr Sci, Inst Agr Biotechnol, Nanjing 210014, Jiangsu, Peoples R China
[2] Univ Illinois, Dept Crop Sci, Urbana, IL 61801 USA
Source
BMC GENOMICS | 2016 / Vol. 17
Keywords
Next-generation sequencing (NGS); Insertions and deletions (InDels); Genome-wide marker discovery; Marker platform; Marker-assisted breeding;
DOI
10.1186/s12864-016-2614-5
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Rich in genetic information and cost-effective to genotype, the Insertion-Deletion (InDel) molecular marker system is an important tool for genetics, genomics and marker-assisted breeding. The advent of next-generation sequencing (NGS) revolutionized the speed and throughput of sequence data generation and enabled genome-wide identification of insertion and deletion variation. However, current NGS-based InDel mining tools, such as SAMtools, GATK and Atlas2, all rely on a reference genome for variant calling, which hinders their application to unsequenced organisms. In addition, they mostly report short InDels, which limits their use on gel-based genotyping platforms. To address these issues, an enhanced platform is needed that identifies longer InDels and develops markers in the absence of a reference genome.
Results: Here we present mInDel (multiple InDel), a next-generation variant calling tool specifically designed for InDel marker discovery. The software takes in raw sequence reads, assembles them into contigs de novo, and identifies InDel polymorphisms using a sliding window alignment of the assembled contigs, which gives it a distinct advantage when a reference genome is unavailable. By providing an option to combine multiple discovered InDels into a single output marker, mInDel is amenable to gel-based genotyping platforms, where markers with large polymorphisms are preferred. We demonstrated the usability and performance of this software through a case study using a set of maize NGS data, and experimentally validated the accuracy of the markers generated by mInDel.
Conclusions: mInDel is a novel and practical tool that enables rapid genome-wide InDel marker discovery. Its independence from a reference genome and its flexibility with respect to downstream genotyping platforms will allow a broad range of applications across genetics research and plant breeding.
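The idea of combining several nearby InDels into one larger, gel-resolvable polymorphism can be illustrated with a minimal sketch. This is not mInDel's implementation: the function names `find_indels` and `combine_indels`, the `window` and `min_size` parameters, and the toy sequences are all hypothetical, and the input is assumed to be an already-computed gapped pairwise alignment of two contigs.

```python
from typing import List, Tuple


def find_indels(aln_a: str, aln_b: str) -> List[Tuple[int, int]]:
    """Return (column, signed_length) for each gap run in a gapped pairwise alignment.

    Positive length: bases present in A but absent in B.
    Negative length: bases present in B but absent in A.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    indels = []
    i, n = 0, len(aln_a)
    while i < n:
        if aln_a[i] == "-" or aln_b[i] == "-":
            gap_in_b = aln_b[i] == "-"
            gapped = aln_b if gap_in_b else aln_a
            j = i
            while j < n and gapped[j] == "-":
                j += 1  # extend to the end of this gap run
            length = j - i
            indels.append((i, length if gap_in_b else -length))
            i = j
        else:
            i += 1
    return indels


def combine_indels(
    indels: List[Tuple[int, int]], window: int = 150, min_size: int = 8
) -> List[Tuple[int, int, int]]:
    """Group InDels that start within `window` columns of the group's first InDel.

    Returns (first_column, last_column, net_size) for each group whose net
    size difference is at least `min_size` bases (hypothetical gel threshold).
    """
    groups: List[List[Tuple[int, int]]] = []
    for col, size in indels:
        if groups and col - groups[-1][0][0] <= window:
            groups[-1].append((col, size))
        else:
            groups.append([(col, size)])
    markers = [(g[0][0], g[-1][0], sum(s for _, s in g)) for g in groups]
    return [m for m in markers if abs(m[2]) >= min_size]


# Toy alignment (hypothetical data): two deletions in B relative to A.
ALN_A = "ACGTTTACGGACCCCTTGA"
ALN_B = "ACG---ACGGACC--TTGA"
```

In this toy case, neither the 3-base nor the 2-base InDel would pass a 5-base threshold on its own, but `combine_indels(find_indels(ALN_A, ALN_B), window=20, min_size=5)` reports them as one candidate marker with a net 5-base difference. Note that InDels of opposite sign within a group partially cancel, since only the net amplicon length difference is visible on a gel.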
Pages: 5