Maximizing the Power of Principal-Component Analysis of Correlated Phenotypes in Genome-wide Association Studies

Cited by: 133
Authors
Aschard, Hugues [1]
Vilhjalmsson, Bjarni J. [1,2]
Greliche, Nicolas [3,4]
Morange, Pierre-Emmanuel [6]
Tregouet, David-Alexandre [3,4,5]
Kraft, Peter [1]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Program Genet Epidemiol & Stat Genet, Boston, MA 02115 USA
[2] Broad Inst, Med & Populat Genet Program, Cambridge, MA 02142 USA
[3] UPMC Univ Paris 06, UMR S 1166, Sorbonne Univ, F-75005 Paris, France
[4] INSERM, UMR S 1166, Genom & Physiopathol Cardiovasc Dis, F-75013 Paris, France
[5] Inst Cardiometabolism & Nutr ICAN, F-75013 Paris, France
[6] Aix Marseille Univ, INSERM, UMR S 1062, F-13385 Marseille, France
Keywords
POPULATION; PLEIOTROPY; GENOTYPE; TRAITS; LOCUS
DOI
10.1016/j.ajhg.2014.03.016
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Many human traits are highly correlated. This correlation can be leveraged to improve the power of genetic association tests to identify markers associated with one or more of the traits. Principal-component analysis (PCA) is a useful tool that has been widely used for the multivariate analysis of correlated variables. PCA is usually applied as a dimension-reduction method: the few top principal components (PCs) explaining most of the total trait variance are tested for association with a predictor of interest, and the remaining components are not analyzed. In this study, we review the theoretical basis of PCA and describe the behavior of PCA when testing for association between a SNP and correlated traits. We then use simulation to compare the power of various PCA-based strategies when analyzing up to 100 correlated traits. We show that, contrary to widespread practice, testing only the top PCs often has low power, whereas combining signal across all PCs can have greater power. This power gain is primarily due to increased power to detect genetic variants with opposite effects on positively correlated traits and variants that are exclusively associated with a single trait. Relative to other methods, the combined-PC approach has close to optimal power in all scenarios considered while offering more flexibility and more robustness to potential confounders. Finally, we apply the proposed PCA strategy to a genome-wide association study of five correlated coagulation traits, in which we identify two candidate SNPs that were not found by the standard approach.
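The idea of combining signal across all PCs, rather than testing only the top ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `combined_pc_statistic`, the use of the trait correlation matrix, and the per-PC statistic n·r² (a score-type approximation to a 1-df chi-square) are assumptions made here for illustration. Because PCs are uncorrelated in the sample, their per-PC statistics can be summed and compared with a chi-square distribution with k degrees of freedom, where k is the number of traits.

```python
import numpy as np

def combined_pc_statistic(phenotypes, genotype):
    """Illustrative sketch: sum per-PC association statistics for one SNP.

    phenotypes: (n, k) array of k correlated traits for n individuals.
    genotype:   length-n array of allele counts (0, 1, 2) for one SNP.
    Returns (per_pc_chi2, total_chi2, df). Under the null hypothesis the
    total is approximately chi-square with df = k (an assumption of this
    sketch, relying on the PCs being uncorrelated).
    """
    Y = np.asarray(phenotypes, dtype=float)
    g = np.asarray(genotype, dtype=float)
    n, k = Y.shape

    # Standardize each trait so that PCA acts on the correlation matrix.
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

    # Eigendecomposition of the trait correlation matrix; sort PCs by
    # decreasing eigenvalue (variance explained).
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    pcs = Z @ eigvecs[:, order]          # (n, k) PC scores, all k retained

    # Per-PC statistic: n * squared correlation between the PC and the SNP.
    gc = g - g.mean()
    r = (pcs * gc[:, None]).sum(axis=0) / (
        np.sqrt((pcs ** 2).sum(axis=0)) * np.sqrt((gc ** 2).sum())
    )
    per_pc_chi2 = n * r ** 2

    # Combine signal across ALL PCs, not only the top ones.
    return per_pc_chi2, float(per_pc_chi2.sum()), k
```

Inspecting `per_pc_chi2` alongside the total shows which components carry the signal; for a variant that affects only one of several positively correlated traits, much of it can sit in low-variance PCs that a top-PC-only analysis would discard.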
Pages: 662-676
Number of pages: 15