Efficient Adaptively Weighted Analysis of Secondary Phenotypes in Case-Control Genome-Wide Association Studies

Cited by: 21
Authors
Li, Huilin [1]
Gail, Mitchell H. [2]
Affiliations
[1] NYU, Sch Med, Dept Populat Hlth, Div Biostat, New York, NY 10016 USA
[2] NCI, Div Canc Epidemiol & Genet, Biostat Branch, NIH, Rockville, MD USA
Keywords
Adaptively weighted analysis; Case-control study; Genome-wide association study; Maximum likelihood; Secondary phenotype;
DOI
10.1159/000338943
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
We propose and compare methods of analysis for detecting associations between genotypes of a single nucleotide polymorphism (SNP) and a dichotomous secondary phenotype (X), when the data arise from a case-control study of a primary dichotomous phenotype (D), which is not rare. We considered both a dichotomous genotype (G) as in recessive or dominant models and an additive genetic model based on the number of minor alleles present. To estimate the log odds ratio beta(1) relating X to G in the general population, one needs to understand the conditional distribution [D | X, G] in the general population. For the most general model, [D | X, G], one needs external data on P(D = 1) to estimate beta(1). We show that for this 'full model', the maximum likelihood (FM) corresponds to a previously proposed weighted logistic regression (WL) approach if G is dichotomous. For the additive model, WL yields results numerically close, but not identical, to those of the maximum likelihood FM. Efficiency can be gained by assuming that [D | X, G] is a logistic model with no interaction between X and G (the 'reduced model'). However, the resulting maximum likelihood (RM) can be misleading in the presence of interactions. We therefore propose an adaptively weighted approach (AW) that captures the efficiency of RM but is robust to the occasional SNP that might interact with the secondary phenotype to affect the risk of the primary disease. We study the robustness of FM, WL, RM and AW to misspecification of P(D = 1). In principle, one should be able to estimate beta(1) without external information on P(D = 1) under the reduced model. However, our simulations show that the resulting inference is unreliable. Therefore, in practice one needs to introduce external information on P(D = 1), even in the absence of interactions between X and G. Copyright (C) 2012 S. Karger AG, Basel
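Illustrative sketch (not part of the record): the abstract's point that external knowledge of P(D = 1) lets one recover the population association between X and G can be illustrated with a simple inverse-probability-of-sampling weighting for dichotomous X and G. This is an assumption-laden reading in the spirit of a weighted logistic regression, not the authors' implementation. The function name `weighted_log_odds_ratio`, the `counts[d][x][g]` layout, and the `prevalence` argument are hypothetical. Because a logistic regression of X on a single dichotomous G is saturated, its weighted slope equals the log odds ratio of the weighted 2x2 table, which is what the code computes.

```python
import math


def weighted_log_odds_ratio(counts, prevalence):
    """Weighted log odds ratio relating X to G (both coded 0/1).

    counts[d][x][g] is the number of sampled subjects with D=d, X=x, G=g.
    prevalence is an externally supplied value of P(D = 1) (an assumption).
    """
    n_controls = sum(counts[0][x][g] for x in (0, 1) for g in (0, 1))
    n_cases = sum(counts[1][x][g] for x in (0, 1) for g in (0, 1))

    # Weights proportional to the inverse sampling fractions of
    # controls (D=0) and cases (D=1).
    weight = {0: (1.0 - prevalence) / n_controls, 1: prevalence / n_cases}

    # Weighted X-by-G table that approximates the population distribution.
    table = [
        [weight[0] * counts[0][x][g] + weight[1] * counts[1][x][g]
         for g in (0, 1)]
        for x in (0, 1)
    ]
    return math.log(table[1][1] * table[0][0] / (table[1][0] * table[0][1]))
```

When cases and controls are sampled in equal numbers and `prevalence` is 0.5, the weights are equal and the result reduces to the log odds ratio of the pooled table; for other prevalence values, cases are down- or up-weighted accordingly.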
Pages: 159-173
Number of pages: 15