Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms

Cited by: 113
Authors
Costello, Maura [1]
Fleharty, Mark [1]
Abreu, Justin [1]
Farjoun, Yossi [1]
Ferriera, Steven [1]
Holmes, Laurie [1]
Granger, Brian [1]
Green, Lisa [1]
Howd, Tom [1]
Mason, Tamara [1]
Vicente, Gina [1]
Dasilva, Michael [1]
Brodeur, Wendy [1]
DeSmet, Timothy [1]
Dodge, Sheila [1]
Lennon, Niall J. [1]
Gabriel, Stacey [1]
Affiliations
[1] Broad Institute of MIT and Harvard, Broad Genomics, 320 Charles St, Cambridge, MA 02141, USA
Keywords
Next generation sequencing; Massively parallel sequencing; Illumina sequencing; Index swapping; Index hopping; Multiplexing; Barcodes; Index; Indexes; Exclusion amplification; DNA; Cancer
DOI
10.1186/s12864-018-4703-0
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: Here we present an in-depth characterization of the mechanism of sequencer-induced sample contamination due to index swapping, a phenomenon that affects Illumina sequencers employing patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeq X, HiSeq 4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.
Results: Leveraging data collected over a two-year period, we demonstrate the widespread prevalence of index swapping in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, demonstrating that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross-contamination by utilizing non-redundant dual indexing for complete filtering of index-swapped reads, and share the sequences for 96 non-combinatorial dual indexes we have validated across various library preparation methods and sequencer models. Finally, using computational methods, we provide greater insight into the mechanism of index swapping.
Conclusions: Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2 to 6% in all sequencing runs on HiSeq X, HiSeq 4000/3000, and NovaSeq. Utilizing non-redundant dual indexing allows for the removal (flagging/filtering) of these swapped reads and eliminates swapping-induced sample contamination, which is critical for sensitive applications such as RNA-seq, single cell, blood biopsy using circulating tumor DNA, or clinical sequencing.
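The filtering idea described in the abstract can be illustrated with a minimal sketch. It assumes a non-redundant design in which every i7 and every i5 index belongs to exactly one sample, so any read whose two indexes are both known but do not form an expected pair can be flagged as swapped. The index sequences, sample names, and the helper functions `classify_read` and `swap_rate` are hypothetical, chosen for illustration only; they are not the 96 validated indexes or the authors' pipeline.

```python
from collections import Counter

# Hypothetical sample sheet: each i7 and each i5 index is used in exactly one
# pair (non-redundant dual indexing). Sequences are made up for illustration.
EXPECTED_PAIRS = {
    ("ATCACGAA", "TTGGCCAA"): "sample_A",
    ("CGATGTCC", "GGAATTCC"): "sample_B",
    ("TTAGGCGG", "CCTTAAGG"): "sample_C",
}


def classify_read(i7, i5, expected=EXPECTED_PAIRS):
    """Return the sample name, 'swapped', or 'unknown' for one index pair."""
    if (i7, i5) in expected:
        return expected[(i7, i5)]
    known_i7 = {pair[0] for pair in expected}
    known_i5 = {pair[1] for pair in expected}
    # Both indexes are valid, but they belong to different samples:
    # this is the signature of an index swap.
    if i7 in known_i7 and i5 in known_i5:
        return "swapped"
    # At least one index is not on the sample sheet (e.g. a sequencing error).
    return "unknown"


def swap_rate(reads):
    """Fraction of swapped reads among reads with two recognized indexes."""
    counts = Counter(classify_read(i7, i5) for i7, i5 in reads)
    swapped = counts["swapped"]
    assigned = sum(
        n for label, n in counts.items() if label not in ("swapped", "unknown")
    )
    total = swapped + assigned
    return swapped / total if total else 0.0
```

With a combinatorial design, where the same i5 is shared by several samples, a swapped read can land on another valid pair and be silently assigned to the wrong sample; the unique pairing assumed here is what makes the swap detectable.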
Pages: 10
Related papers
18 in total
[1]   Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples [J].
Cibulskis, Kristian ;
Lawrence, Michael S. ;
Carter, Scott L. ;
Sivachenko, Andrey ;
Jaffe, David ;
Sougnez, Carrie ;
Gabriel, Stacey ;
Meyerson, Matthew ;
Lander, Eric S. ;
Getz, Gad .
NATURE BIOTECHNOLOGY, 2013, 31 (03) :213-219
[2]   A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries [J].
Fisher, Sheila ;
Barry, Andrew ;
Abreu, Justin ;
Minie, Brian ;
Nolan, Jillian ;
Delorey, Toni M. ;
Young, Geneva ;
Fennell, Timothy J. ;
Allen, Alexander ;
Ambrogio, Lauren ;
Berlin, Aaron M. ;
Blumenstiel, Brendan ;
Cibulskis, Kristian ;
Friedrich, Dennis ;
Johnson, Ryan ;
Juhn, Frank ;
Reilly, Brian ;
Shammas, Ramy ;
Stalker, John ;
Sykes, Sean M. ;
Thompson, Jon ;
Walsh, John ;
Zimmer, Andrew ;
Zwirko, Zac ;
Gabriel, Stacey ;
Nicol, Robert ;
Nusbaum, Chad .
GENOME BIOLOGY, 2011, 12 (01)
[3]   Griffiths JA, 2017, bioRxiv, DOI 10.1101/177048
[4]  
Illumina, 2017, EFF IND MIS MULT DOW
[5]  
Illumina Inc, 2017, ILL NOVASEQ SPEC SHE
[6]  
Illumina Inc., 2017, ILL HISEQX SER SPEC
[7]   Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data [J].
Jun, Goo ;
Flickinger, Matthew ;
Hetrick, Kurt N. ;
Romm, Jane M. ;
Doheny, Kimberly F. ;
Abecasis, Goncalo R. ;
Boehnke, Michael ;
Kang, Hyun Min .
AMERICAN JOURNAL OF HUMAN GENETICS, 2012, 91 (05) :839-848
[8]   Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform [J].
Kircher, Martin ;
Sawyer, Susanna ;
Meyer, Matthias .
NUCLEIC ACIDS RESEARCH, 2012, 40 (01) :e3
[9]   Larsson AJ, 2017, bioRxiv, DOI 10.1101/176537
[10]   A novel post hoc method for detecting index switching finds no evidence for increased switching on the Illumina HiSeq X [J].
Owens, Gregory L. ;
Todesco, Marco ;
Drummond, Emily B. M. ;
Yeaman, Sam ;
Rieseberg, Loren H. .
MOLECULAR ECOLOGY RESOURCES, 2018, 18 (01) :169-175