Whole-Genome Sequence Accuracy Is Improved by Replication in a Population of Mutagenized Sorghum

Cited by: 32
Authors
Addo-Quaye, Charles [1 ,3 ]
Tuinstra, Mitch [2 ]
Carraro, Nicola [2 ]
Weil, Clifford [2 ]
Dilkes, Brian P. [1 ]
Affiliations
[1] Purdue Univ, Dept Biochem, 170 S Univ Ave, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Agron, W Lafayette, IN 47907 USA
[3] Lewis Clark State Coll, Div Nat Sci & Math, Lewiston, ID 83501 USA
Funding
National Science Foundation (USA);
Keywords
accuracy; EMS; mutagenesis; mutants; mutations; polymorphisms; rare variants; SNP; sorghum; SINGLE NUCLEOTIDE POLYMORPHISMS; CHEMICAL MUTAGENESIS; INDUCED MUTATIONS; GENETIC SCREENS; MUTANTS; BICOLOR; SPECIFICITY; DISCOVERY; IDENTIFICATION; TECHNOLOGIES;
DOI
10.1534/g3.117.300301
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
The accurate detection of induced mutations is critical for both forward and reverse genetics studies. Experimental chemical mutagenesis induces relatively few single base changes per individual. In a complex eukaryotic genome, false positive detection of mutations can occur at or above this mutagenesis rate. We demonstrate here, using a population of ethyl methanesulfonate (EMS)-treated Sorghum bicolor BTx623 individuals, that using replication to detect false positive-induced variants in next-generation sequencing (NGS) data permits higher throughput variant detection with greater accuracy. We used a lower sequence coverage depth (average of 7x) from 586 independently mutagenized individuals and detected 5,399,493 homozygous single nucleotide polymorphisms (SNPs). Of these, 76% originated from only 57,872 genomic positions prone to false positive variant calling. These positions are characterized by high copy number paralogs where the error-prone SNP positions are at copies containing a variant at the SNP position. The ability of short stretches of homology to generate these error-prone positions suggests that incompletely assembled or poorly mapped repeated sequences are one driver of these error-prone positions. Removal of these false positives left 1,275,872 homozygous and 477,531 heterozygous EMS-induced SNPs, which, congruent with the mutagenic mechanism of EMS, were >98% G:C to A:T transitions. Through this analysis, we generated a collection of sequence indexed mutants of sorghum. This collection contains 4035 high-impact homozygous mutations in 3637 genes and 56,514 homozygous missense mutations in 23,227 genes. Each line contains, on average, 2177 annotated homozygous SNPs per genome, including seven likely gene knockouts and 96 missense mutations. The number of mutations in a transcript was linearly correlated with the transcript length and also the G+C count, but not with the GC/AT ratio. 
Analysis of the detected mutagenized positions identified CG-rich patches, and flanking sequences strongly influenced EMS-induced mutation rates. This method for detecting false positive-induced mutations is generally applicable to any organism, is independent of the choice of in silico variant-calling algorithm, and is most valuable when the true mutation rate is likely to be low, such as in laboratory-induced mutations or somatic mutation detection in medicine.
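The replication idea in the abstract can be illustrated with a minimal sketch. It assumes variant calls are already available as simple tuples, and it treats any genomic position called in more than one independently mutagenized line as error-prone. The function name `split_by_recurrence`, the tuple layout, and the `max_lines` threshold are hypothetical choices made for illustration; the abstract does not state the authors' actual cutoff or pipeline.

```python
from collections import defaultdict


def split_by_recurrence(calls, max_lines=1):
    """Separate variant calls into kept calls and error-prone sites.

    calls: iterable of (line_id, chrom, pos, ref, alt) tuples (assumed layout).
    max_lines: a site called in more than this many independent lines is
               treated as error-prone (assumed threshold, not from the paper).
    """
    calls = list(calls)  # the calls are traversed twice below

    # Record which mutagenized lines carry a call at each genomic position.
    lines_at_site = defaultdict(set)
    for line_id, chrom, pos, _ref, _alt in calls:
        lines_at_site[(chrom, pos)].add(line_id)

    # Independent EMS mutations should rarely hit the same position twice,
    # so recurrence across lines is taken as a sign of a false positive.
    error_prone = {
        site for site, lines in lines_at_site.items() if len(lines) > max_lines
    }

    kept = [c for c in calls if (c[1], c[2]) not in error_prone]
    return kept, error_prone


# Toy input: one position recurs in three lines, two positions are unique.
example_calls = [
    ("M001", "Chr01", 1000, "G", "A"),
    ("M002", "Chr01", 1000, "G", "A"),
    ("M003", "Chr01", 1000, "G", "A"),
    ("M001", "Chr02", 500, "C", "T"),
    ("M002", "Chr03", 42, "G", "A"),
]
kept, error_prone = split_by_recurrence(example_calls)
```

In this toy input the recurrent Chr01 position is flagged and only the two line-specific calls are kept. The filter uses only positions and line identities, which is in keeping with the abstract's point that the approach does not depend on the variant-calling algorithm that produced the calls.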
Pages: 1079-1094
Page count: 16