NGSQC: cross-platform quality analysis pipeline for deep sequencing data

Cited by: 78
Authors
Dai, Manhong [1, 2]
Thompson, Robert C. [1, 2]
Maher, Christopher [3, 5, 6]
Contreras-Galindo, Rafael [4]
Kaplan, Mark H. [4]
Markovitz, David M. [4, 5]
Omenn, Gil [5, 6]
Meng, Fan [1, 2, 5, 6]
Affiliations
[1] Univ Michigan, Dept Psychiat & Mol, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Behav Neurosci Inst, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Dept Pathol, Michigan Ctr Translat Pathol, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Internal Med, Div Infect Dis, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ctr Computat Med & Biol, Ann Arbor, MI 48109 USA
[6] Univ Michigan, Natl Ctr Integrat Biomed Informat, Ann Arbor, MI 48109 USA
Source
BMC GENOMICS | 2010 / Vol. 11
Keywords
GENOME; BEADARRAY;
DOI
10.1186/1471-2164-11-S4-S7
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: While the accuracy and precision of deep sequencing data are significantly better than those of data obtained by the earlier generation of hybridization-based high-throughput technologies, the digital nature of deep sequencing output often leads to unwarranted confidence in its reliability. Results: The NGSQC (Next Generation Sequencing Quality Control) pipeline provides a set of novel quality control measures for quickly detecting a wide variety of quality issues in deep sequencing data derived from two-dimensional surfaces, regardless of the assay technology used. It also enables researchers to determine whether the sequencing data behind their most interesting biological discoveries are caused by sequencing quality issues. Conclusions: Next generation sequencing platforms have their own share of quality issues, and there can be significant lab-to-lab, batch-to-batch and even within-chip/slide variations. NGSQC can help to ensure that biological conclusions, in particular those based on relatively rare sequence alterations, are not caused by low-quality sequencing.
Pages: 9