Analysis of error profiles in deep next-generation sequencing data

Cited by: 181
Authors
Ma, Xiaotu [1]
Shao, Ying [1]
Tian, Liqing [1]
Flasch, Diane A. [1]
Mulder, Heather L. [1]
Edmonson, Michael N. [1]
Liu, Yu [1]
Chen, Xiang [1]
Newman, Scott [1]
Nakitandwe, Joy [2]
Li, Yongjin [1]
Li, Benshang [3]
Shen, Shuhong [3]
Wang, Zhaoming [1, 4]
Shurtleff, Sheila [2]
Robison, Leslie L. [4]
Levy, Shawn [5]
Easton, John [1]
Zhang, Jinghui [1]
Affiliations
[1] St Jude Childrens Res Hosp, Dept Computat Biol, 332 N Lauderdale St, Memphis, TN 38105 USA
[2] St Jude Childrens Res Hosp, Dept Pathol, 332 N Lauderdale St, Memphis, TN 38105 USA
[3] Shanghai Jiao Tong Univ, Shanghai Childrens Med Ctr, Key Lab Pediat Hematol & Oncol, Minist Hlth,Dept Hematol & Oncol,Sch Med, Shanghai 200127, Peoples R China
[4] St Jude Childrens Res Hosp, Dept Epidemiol & Canc Control, 332 N Lauderdale St, Memphis, TN 38105 USA
[5] HudsonAlpha Inst Biotechnol, Huntsville, AL 35806 USA
Keywords
Deep sequencing; Error rate; Substitution; Subclonal; Detection; Hotspot mutation; CLONAL HEMATOPOIESIS; MUTATIONAL PROCESSES; DNA; RISK; SIGNATURES; LANDSCAPE; GENOME; GENES; AGE
DOI
10.1186/s13059-019-1659-6
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: Sequencing errors are key confounding factors for detecting low-frequency genetic variants that are important for cancer molecular diagnosis, treatment, and surveillance using deep next-generation sequencing (NGS). However, there is a lack of comprehensive understanding of the errors introduced at various steps of a conventional NGS workflow, such as sample handling, library preparation, PCR enrichment, and sequencing. In this study, we use current NGS technology to systematically investigate these questions.

Results: By evaluating read-specific error distributions, we discover that the substitution error rate can be computationally suppressed to 10^-5 to 10^-4, which is 10- to 100-fold lower than the rate generally considered achievable (10^-3) in the current literature. We then quantify substitution errors attributable to sample handling, library preparation, enrichment PCR, and sequencing by using multiple deep sequencing datasets. We find that error rates differ by nucleotide substitution type, ranging from 10^-5 for A>C/T>G, C>A/G>T, and C>G/G>C changes to 10^-4 for A>G/T>C changes. Furthermore, C>T/G>A errors exhibit strong sequence context dependency, sample-specific effects dominate elevated C>A/G>T errors, and target-enrichment PCR led to a 6-fold increase in the overall error rate. We also find that more than 70% of hotspot variants can be detected at 0.1 to 0.01% frequency with current NGS technology by applying in silico error suppression.

Conclusions: We present the first comprehensive analysis of sequencing error sources in conventional NGS workflows. The error profiles revealed by our study highlight new directions for further improving NGS analysis accuracy both experimentally and computationally, ultimately enhancing the precision of deep sequencing.
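The per-substitution-type rates quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the input format (a reference base plus per-base read counts at sites assumed to be homozygous reference), the function names, and the choice of denominator (total depth at sites whose reference base matches the substitution class) are all assumptions made for illustration. Complementary changes are pooled because they cannot be told apart without strand information, which is why the abstract writes them as pairs such as A>C/T>G.

```python
from collections import defaultdict

# Watson-Crick complements, used to pool a change with its reverse-strand twin.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return a strand-collapsed label such as 'C>T/G>A'."""
    pair = f"{ref}>{alt}"
    comp = f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"
    # List the change whose reference base is A or C first, as in the abstract.
    return f"{pair}/{comp}" if ref in "AC" else f"{comp}/{pair}"


def error_rates(site_counts):
    """Tally substitution error rates per strand-collapsed class.

    site_counts: iterable of (ref_base, {base: read_count}) tuples
    (a hypothetical input format), one per site assumed to carry no true variant.
    """
    errors = defaultdict(int)  # non-reference reads per class
    depth = defaultdict(int)   # reads that could have shown that class
    for ref, counts in site_counts:
        total = sum(counts.values())
        for alt in "ACGT":
            if alt == ref:
                continue
            label = substitution_class(ref, alt)
            errors[label] += counts.get(alt, 0)
            depth[label] += total
    return {label: errors[label] / depth[label] for label in depth if depth[label]}


# Toy counts, invented for illustration only.
sites = [
    ("A", {"A": 99990, "G": 10}),
    ("T", {"T": 99990, "C": 10}),
    ("C", {"C": 100000}),
]
rates = error_rates(sites)
print(rates["A>G/T>C"])
```

With these toy counts, 20 mismatching reads out of 200,000 at A and T sites give an A>G/T>C rate on the order of 10^-4, the magnitude the abstract reports for that class.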
Pages: 15