A powerful test for multiple rare variants association studies that incorporates sequencing qualities

Cited by: 19
Authors
Daye, Z. John [2 ]
Li, Hongzhe [2 ]
Wei, Zhi [1 ]
Affiliations
[1] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
[2] Univ Penn, Sch Med, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
Keywords
COMMON DISEASES; GENES; SUSCEPTIBILITY; FRAMEWORK; CONTRIBUTE; ALLELES; VALUES; SNPS;
DOI
10.1093/nar/gks024
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation sequencing data will soon become routinely available for association studies between complex traits and rare variants. Sequencing data, however, are characterized by the presence of sequencing errors at each individual genotype. This makes association studies of rare variants especially challenging, because rare variants, with their low minor allele frequencies, are easily perturbed by genotype errors. In this article, we develop the quality-weighted multivariate score association test (qMSAT), a new procedure that allows powerful association tests between complex traits and multiple rare variants in the presence of sequencing errors. Simulation results based on quality scores from real data show that the qMSAT often outperforms current methods that do not use quality information. In particular, the qMSAT can dramatically increase power over existing methods under moderate sample sizes and relatively low coverage. Moreover, in an obesity data study, we used the qMSAT to identify two functional regions (the MGLL promoter and the MGLL 3'-untranslated region) where rare variants are associated with extreme obesity. Because sequencing data are costly, the qMSAT is especially valuable for large-scale studies involving rare variants, as it can potentially increase power without additional experimental cost. qMSAT is freely available at http://qmsat.sourceforge.net/.
Pages: 12