SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data

Cited by: 5
Authors
Liu, Qian [1,2]
Hu, Qiang [2 ]
Yao, Song [3 ]
Kwan, Marilyn L. [4 ]
Roh, Janise M. [4 ]
Zhao, Hua [5 ]
Ambrosone, Christine B. [3 ]
Kushi, Lawrence H. [4 ]
Liu, Song [2 ]
Zhu, Qianqian [2 ]
Affiliations
[1] Univ Buffalo SUNY, Dept Biostat, Buffalo, NY 14260 USA
[2] Roswell Pk Comprehens Canc Ctr, Dept Biostat & Bioinformat, Buffalo, NY 14263 USA
[3] Roswell Pk Comprehens Canc Ctr, Dept Canc Prevent & Control, Buffalo, NY 14263 USA
[4] Kaiser Permanent Northern Calif, Div Res, Oakland, CA 94612 USA
[5] Univ Texas MD Anderson Canc Ctr, Dept Epidemiol, Houston, TX 77030 USA
Funding
National Institutes of Health (USA);
Keywords
Next-generation sequencing; Quality assessment; 1000 Genomes Project; Whole-exome sequencing; Bioconductor package; FUNCTIONAL IMPACT; GENOME; ASSOCIATION; ANEUPLOIDY; MUTATIONS; TOOL; DNA;
DOI
10.1016/j.gpb.2018.07.006
Chinese Library Classification number
Q3 [Genetics];
Discipline classification codes
071007; 090102;
Abstract
As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have become publicly available. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have a fundamental impact on downstream analysis. However, tools specialized in identifying problematic samples from NGS data are lacking, often because of limitations in sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and is equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC.
Pages: 211-218
Page count: 8