A simple Bayesian mixture model with a hybrid procedure for genome-wide association studies

Cited by: 9
Authors
Wei, Yu-Chung [1, 2, 3]
Wen, Shu-Hui [4]
Chen, Pei-Chun [1, 2, 5]
Wang, Chih-Hao [6]
Hsiao, Chuhsing K. [1, 2, 5]
Affiliations
[1] Natl Taiwan Univ, Dept Publ Hlth, Inst Epidemiol, Taipei 100, Taiwan
[2] Natl Taiwan Univ, Res Ctr Gene Environm & Human Hlth, Taipei 100, Taiwan
[3] Natl Chiao Tung Univ, Inst Stat, Hsinchu, Taiwan
[4] Tzu Chi Univ, Coll Med, Dept Publ Hlth, Hualien, Taiwan
[5] Natl Taiwan Univ, Res Ctr Med Excellence, Taipei 100, Taiwan
[6] Fu Jen Catholic Univ, Coll Med, Cardinal Tien Hosp, Dept Cardiol, Taipei, Taiwan
Funding
Wellcome Trust (UK);
Keywords
Bayesian inference; GWAS; mixture model; WTCCC; FALSE DISCOVERY; RHEUMATOID-ARTHRITIS; POSITIVE REPORT; P-VALUES; PROBABILITY; GENE; EPIDEMIOLOGY; MICROARRAY; SCAN;
DOI
10.1038/ejhg.2010.51
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Genome-wide association studies often face the undesirable result of either failing to detect any influential markers at all because of a stringent level for testing error corrections or encountering difficulty in quantifying the importance of markers by their P-values. Advocates of estimation procedures prefer to estimate the proportion of association rather than test significance to avoid overinterpretation. Here, we adopt a Bayesian hierarchical mixture model to estimate directly the proportion of influential markers, and then proceed to a selection procedure based on the Bayes factor (BF). This mixture model is able to accommodate different sources of dependence in the data through only a few parameters. Specifically, we focus on a standardized risk measure of unit variance so that fewer parameters are involved in inference. The expected value of this measure follows a mixture distribution with a mixing probability of association, and it is robust to minor allele frequencies. Furthermore, to select promising markers, we use the magnitude of the BF to represent the strength of evidence in support of the association between markers and disease. We demonstrate this procedure both with simulations and with SNP data from studies on rheumatoid arthritis, coronary artery disease, and Crohn's disease obtained from the Wellcome Trust Case-Control Consortium. This Bayesian procedure outperforms other existing methods in terms of accuracy, power, and computational efficiency. The R code that implements this method is available at http://homepage.ntu.edu.tw/~ckhsiao/Bmix/Bmix.htm.
European Journal of Human Genetics (2010) 18, 942-947; doi:10.1038/ejhg.2010.51; published online 21 April 2010
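The two-stage idea in the abstract (estimate the proportion of associated markers, then rank markers by Bayes factor) can be illustrated with a deliberately simplified sketch. This is not the authors' hierarchical model and not their R code. It assumes each marker has a standardized statistic z with unit variance whose mean is 0 with probability 1 - p and drawn from N(0, tau2) with probability p, so that marginally z follows N(0, 1) or N(0, 1 + tau2). The function names, the fixed value of tau2, the EM estimator, and the simulated data are all hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def bayes_factor(z, tau2):
    """BF for association: marginal N(0, 1 + tau2) versus null N(0, 1)."""
    return normal_pdf(z, 1.0 + tau2) / normal_pdf(z, 1.0)


def estimate_mixing_probability(zs, tau2, n_iter=200):
    """EM estimate of the mixing probability p, with tau2 held fixed (assumption)."""
    p = 0.5  # arbitrary starting value
    for _ in range(n_iter):
        total = 0.0
        for z in zs:
            alt = p * normal_pdf(z, 1.0 + tau2)
            null = (1.0 - p) * normal_pdf(z, 1.0)
            total += alt / (alt + null)  # posterior probability of association
        p = total / len(zs)  # M-step: average responsibility
    return p


def simulate_z(n, p_true, tau2, seed=0):
    """Simulate standardized statistics under the assumed two-component model."""
    rng = random.Random(seed)
    zs = []
    for _ in range(n):
        associated = rng.random() < p_true
        mu = rng.gauss(0.0, math.sqrt(tau2)) if associated else 0.0
        zs.append(rng.gauss(mu, 1.0))
    return zs


TAU2 = 9.0  # hypothetical prior variance of the effect for associated markers
zs = simulate_z(n=5000, p_true=0.05, tau2=TAU2)
p_hat = estimate_mixing_probability(zs, TAU2)

# Rank markers by strength of evidence and keep the ten strongest.
top = sorted(range(len(zs)), key=lambda i: bayes_factor(zs[i], TAU2), reverse=True)[:10]
print(f"estimated proportion of associated markers: {p_hat:.3f}")
print("indices of the ten markers with the largest BF:", top)
```

In this toy setup the Bayes factor depends on z only through its absolute value, so the ranking matches a ranking by |z|. The value of the BF is that its magnitude can be read directly as strength of evidence, which is the use the abstract describes.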
Pages: 942-947
Page count: 6