A novel post hoc method for detecting index switching finds no evidence for increased switching on the Illumina HiSeq X

Cited by: 17
Authors
Owens, Gregory L. [1, 2]
Todesco, Marco [1, 2]
Drummond, Emily B. M. [1, 2]
Yeaman, Sam [3]
Rieseberg, Loren H. [1, 2]
Affiliations
[1] Univ British Columbia, Dept Bot, Vancouver, BC, Canada
[2] Univ British Columbia, Beaty Biodivers Ctr, Vancouver, BC, Canada
[3] Univ Calgary, Dept Biol Sci, Calgary, AB, Canada
Keywords
barcode; bioinformatics/phyloinformatics; genomics/proteomics; index hopping; sequencing; GENOME; CONTAMINATION; EVOLUTION
DOI
10.1111/1755-0998.12713
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
High-throughput sequencing using the Illumina HiSeq platform is a pervasive and critical molecular ecology resource, and has provided the data underlying many recent advances. A recent study has suggested that "index switching," where reads are misattributed to the wrong sample, may be higher in new versions of the HiSeq platform. This has the potential to invalidate both published and in-progress work across the field. Here, we test for evidence of index switching in an exemplar whole-genome shotgun data set sequenced on both the Illumina HiSeq 2500, which should not have the problem, and the Illumina HiSeq X, which may. We leverage unbalanced heterozygotes, which may be produced by index switching, and ask whether the undersequenced allele is more likely to be found in other samples in the same lane than expected based on the allele frequency. Although we validate the sensitivity of this method using simulations, we find that neither the HiSeq 2500 nor the HiSeq X has evidence of index switching. This suggests that, thankfully, index switching may not be a ubiquitous problem in HiSeq X sequence data. Lastly, we provide scripts for applying our method so that index switching can be tested for in other data sets.
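The test described in the abstract can be illustrated with a short sketch. This is not the authors' published script; it is a minimal illustration under stated assumptions. The names `Call`, `is_unbalanced_het`, and `switching_signal`, the 0.2 minor-read fraction and depth-10 thresholds, and the use of samples from other lanes to estimate allele frequency are all hypothetical choices made for this example. For each unbalanced heterozygote, it records whether the undersequenced allele appears in any lane-mate and compares that with the probability expected from the allele frequency alone.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Call:
    """Read counts for one sample at one biallelic site (hypothetical layout)."""
    ref_reads: int
    alt_reads: int


def is_unbalanced_het(call: Call, max_minor_fraction: float = 0.2,
                      min_depth: int = 10) -> bool:
    """True if both alleles are seen but one is strongly undersequenced.
    Both thresholds are illustrative assumptions, not values from the paper."""
    depth = call.ref_reads + call.alt_reads
    minor = min(call.ref_reads, call.alt_reads)
    if depth < min_depth or minor == 0:
        return False
    return minor / depth <= max_minor_fraction


def allele_copies(call: Call, allele: str) -> int:
    """Crude read-based genotype: 0, 1, or 2 copies of `allele` ('ref' or 'alt')."""
    target = call.alt_reads if allele == "alt" else call.ref_reads
    other = call.ref_reads if allele == "alt" else call.alt_reads
    if target == 0:
        return 0
    return 2 if other == 0 else 1


def switching_signal(sites: List[Dict[str, Call]],
                     lanes: Dict[str, str]) -> Tuple[int, float, int]:
    """Return (observed, expected, tested) summed over unbalanced heterozygotes.

    observed: cases where the undersequenced allele is present in a lane-mate.
    expected: summed probability of that event given the allele frequency,
              estimated here from samples sequenced in other lanes (assumption).
    """
    observed, expected, tested = 0, 0.0, 0
    for site in sites:
        for sample, call in site.items():
            if not is_unbalanced_het(call):
                continue
            allele = "alt" if call.alt_reads < call.ref_reads else "ref"
            mates = [s for s in site if s != sample and lanes[s] == lanes[sample]]
            others = [s for s in site if lanes[s] != lanes[sample]]
            if not mates or not others:
                continue
            freq = sum(allele_copies(site[s], allele) for s in others) / (2 * len(others))
            # Chance that at least one of the 2 * len(mates) lane-mate
            # chromosomes carries the allele, assuming independent draws.
            expected += 1 - (1 - freq) ** (2 * len(mates))
            observed += any(allele_copies(site[s], allele) > 0 for s in mates)
            tested += 1
    return observed, expected, tested
```

Under this sketch, an observed count well above the expected sum across many sites would be consistent with index switching, while similar values would not. A real analysis would also need a significance test and filtering for mapping and genotyping artifacts, which are omitted here.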
Pages: 169-175
Page count: 7