A solid quality-control analysis of AB SOLiD short-read sequencing data

Cited by: 6
Authors
Castellana, Stefano
Romani, Marta
Valente, Enza Maria
Mazza, Tommaso
Affiliations
[1] The Bioinformatics Unit, CSS-Mendel Institute
[2] Neurogenetics Unit, CSS-Mendel Institute
Keywords
genetics; next generation sequencing; sequencing quality control; GENERATION; ERROR;
DOI
10.1093/bib/bbs048
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Next generation sequencers have greatly improved our ability to mine polymorphisms and mutations out of entire genomes or portions of them. The reliability of their outputs, though, has proved to depend strongly on the sequencing chemistry and to deeply affect the quality of downstream analyses. We focus here on the two-base color code chemistry of AB SOLiD sequencers and propose a comprehensive quality-control methodology and software pipeline. We use existing and custom tools to detect common flaws in short reads caused by sequencing errors and chemical hitches, and to purge the reads of them. We apply these tools to a cohort of SOLiD 4 runs and measure their joint efficacy in terms of the resulting ability to detect the greatest possible number of true variants.
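The two-base color code mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper's pipeline; the function names `encode` and `decode` are hypothetical, and for simplicity the leading base is the first base of the sequence itself rather than the primer base that real SOLiD reads carry. Each color (0 to 3) encodes the transition between two adjacent bases, which is why a single miscalled color alters every base decoded after it.

```python
# Illustrative sketch of SOLiD two-base (di-base) color encoding.
# Assumption: A, C, G, T are mapped to 0..3 so that the color of a
# base pair equals the XOR of the two base codes, which reproduces the
# standard color table (e.g. AA=0, AC=1, AG=2, AT=3).
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"


def encode(seq: str) -> str:
    """Return the leading base followed by one color per adjacent base pair."""
    colors = [str(_CODE[a] ^ _CODE[b]) for a, b in zip(seq, seq[1:])]
    return seq[0] + "".join(colors)


def decode(color_read: str) -> str:
    """Rebuild the base sequence from a leading base and its colors."""
    base = color_read[0]
    bases = [base]
    for color in color_read[1:]:
        # Each color tells how to move from the previous base to the next one.
        base = _BASES[_CODE[base] ^ int(color)]
        bases.append(base)
    return "".join(bases)


if __name__ == "__main__":
    read = encode("ATGGCA")
    print(read)            # A31031
    print(decode(read))    # ATGGCA
    # A single wrong color (second color 1 -> 2) changes all later bases.
    print(decode("A32031"))  # ATCCGT
```

Because one color error propagates through the rest of a naively decoded read, quality control for this chemistry is typically carried out on the color calls themselves before conversion to bases.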
Pages: 684-695
Number of pages: 12