A solid quality-control analysis of AB SOLiD short-read sequencing data

Cited by: 6
Authors
Castellana, Stefano
Romani, Marta
Valente, Enza Maria
Mazza, Tommaso
Affiliations
[1] The Bioinformatics Unit, CSS-Mendel Institute
[2] Neurogenetics Unit, CSS-Mendel Institute
Keywords
genetics; next generation sequencing; sequencing quality control; GENERATION; ERROR;
DOI
10.1093/bib/bbs048
CLC number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Next-generation sequencers have greatly improved our ability to mine polymorphisms and mutations out of entire genomes or portions of them. The reliability of their output, though, has proved to depend strongly on the sequencing chemistry and to deeply affect the quality of downstream analyses. We focus here on the two-base color-code chemistry of AB SOLiD sequencers and propose a comprehensive quality-control methodology and software pipeline. We use existing and custom tools to detect and purge short reads of some common flaws caused by sequencing errors and chemical hitches. We apply these tools to a cohort of SOLiD 4 runs and measure their joint efficacy in terms of the resulting ability to detect the greatest possible number of true variants.
Pages: 684-695
Page count: 12
Related papers
30 in total
  • [1] A map of human genome variation from population-scale sequencing
    Altshuler, David
    Durbin, Richard M.
    Abecasis, Goncalo R.
    Bentley, David R.
    Chakravarti, Aravinda
    Clark, Andrew G.
    Collins, Francis S.
    De la Vega, Francisco M.
    Donnelly, Peter
    Egholm, Michael
    Flicek, Paul
    Gabriel, Stacey B.
    Gibbs, Richard A.
    Knoppers, Bartha M.
    Lander, Eric S.
    Lehrach, Hans
    Mardis, Elaine R.
    McVean, Gil A.
    Nickerson, Debbie A.
    Peltonen, Leena
    Schafer, Alan J.
    Sherry, Stephen T.
    Wang, Jun
    Wilson, Richard K.
    Gibbs, Richard A.
    Deiros, David
    Metzker, Mike
    Muzny, Donna
    Reid, Jeff
    Wheeler, David
    Wang, Jun
    Li, Jingxiang
    Jian, Min
    Li, Guoqing
    Li, Ruiqiang
    Liang, Huiqing
    Tian, Geng
    Wang, Bo
    Wang, Jian
    Wang, Wei
    Yang, Huanming
    Zhang, Xiuqing
    Zheng, Huisong
    Lander, Eric S.
    Altshuler, David L.
    Ambrogio, Lauren
    Bloom, Toby
    Cibulskis, Kristian
    Fennell, Tim J.
    Gabriel, Stacey B.
    [J]. NATURE, 2010, 467 (7319) : 1061 - 1073
  • [2] Exome sequencing as a tool for Mendelian disease gene discovery
    Bamshad, Michael J.
    Ng, Sarah B.
    Bigham, Abigail W.
    Tabor, Holly K.
    Emond, Mary J.
    Nickerson, Deborah A.
    Shendure, Jay
    [J]. NATURE REVIEWS GENETICS, 2011, 12 (11) : 745 - 755
  • [3] RETRACTED: Evaluation of next-generation sequencing software in mapping and assembly (Retracted article. See vol. 56, pg. 687, 2011)
    Bao, Suying
    Jiang, Rui
    Kwan, WingKeung
    Wang, BinBin
    Ma, Xu
    Song, You-Qiang
    [J]. JOURNAL OF HUMAN GENETICS, 2011, 56 (06) : 406 - 414
  • [4] Ultra High Throughput Sequencing in Human DNA Variation Detection: A Comparative Study on the NDUFA3-PRPF31 Region
    Benaglio, Paola
    Rivolta, Carlo
    [J]. PLOS ONE, 2010, 5 (09):
  • [5] NGSQC: cross-platform quality analysis pipeline for deep sequencing data
    Dai, Manhong
    Thompson, Robert C.
    Maher, Christopher
    Contreras-Galindo, Rafael
    Kaplan, Mark H.
    Markovitz, David M.
    Omenn, Gil
    Meng, Fan
    [J]. BMC GENOMICS, 2010, 11
  • [6] A framework for variation discovery and genotyping using next-generation DNA sequencing data
    DePristo, Mark A.
    Banks, Eric
    Poplin, Ryan
    Garimella, Kiran V.
    Maguire, Jared R.
    Hartl, Christopher
    Philippakis, Anthony A.
    del Angel, Guillermo
    Rivas, Manuel A.
    Hanna, Matt
    McKenna, Aaron
    Fennell, Tim J.
    Kernytsky, Andrew M.
    Sivachenko, Andrey Y.
    Cibulskis, Kristian
    Gabriel, Stacey B.
    Altshuler, David
    Daly, Mark J.
    [J]. NATURE GENETICS, 2011, 43 (05) : 491 - +
  • [7] A second generation human haplotype map of over 3.1 million SNPs
    Frazer, Kelly A.
    Ballinger, Dennis G.
    Cox, David R.
    Hinds, David A.
    Stuve, Laura L.
    Gibbs, Richard A.
    Belmont, John W.
    Boudreau, Andrew
    Hardenbol, Paul
    Leal, Suzanne M.
    Pasternak, Shiran
    Wheeler, David A.
    Willis, Thomas D.
    Yu, Fuli
    Yang, Huanming
    Zeng, Changqing
    Gao, Yang
    Hu, Haoran
    Hu, Weitao
    Li, Chaohua
    Lin, Wei
    Liu, Siqi
    Pan, Hao
    Tang, Xiaoli
    Wang, Jian
    Wang, Wei
    Yu, Jun
    Zhang, Bo
    Zhang, Qingrun
    Zhao, Hongbin
    Zhao, Hui
    Zhou, Jun
    Gabriel, Stacey B.
    Barry, Rachel
    Blumenstiel, Brendan
    Camargo, Amy
    Defelice, Matthew
    Faggart, Maura
    Goyette, Mary
    Gupta, Supriya
    Moore, Jamie
    Nguyen, Huy
    Onofrio, Robert C.
    Parkin, Melissa
    Roy, Jessica
    Stahl, Erich
    Winchester, Ellen
    Ziaugra, Liuda
    Altshuler, David
    Shen, Yan
    [J]. NATURE, 2007, 449 (7164) : 851 - U3
  • [8] Evaluation of next generation sequencing platforms for population targeted sequencing studies
    Harismendy, Olivier
    Ng, Pauline C.
    Strausberg, Robert L.
    Wang, Xiaoyun
    Stockwell, Timothy B.
    Beeson, Karen Y.
    Schork, Nicholas J.
    Murray, Sarah S.
    Topol, Eric J.
    Levy, Samuel
    Frazer, Kelly A.
    [J]. GENOME BIOLOGY, 2009, 10 (03):
  • [9] The distribution of GC nucleotides and regulatory sequence motifs in genes and their adjacent sequences
    Jaksik, Roman
    Rzeszowska-Wolny, Joanna
    [J]. GENE, 2012, 492 (02) : 375 - 381
  • [10] Revisiting Mendelian disorders through exome sequencing
    Ku, Chee-Seng
    Naidoo, Nasheen
    Pawitan, Yudi
    [J]. HUMAN GENETICS, 2011, 129 (04) : 351 - 370