A simple Bayesian mixture model with a hybrid procedure for genome-wide association studies

Times cited: 9
Authors
Wei, Yu-Chung [1, 2, 3]
Wen, Shu-Hui [4]
Chen, Pei-Chun [1, 2, 5]
Wang, Chih-Hao [6]
Hsiao, Chuhsing K. [1, 2, 5]
Affiliations
[1] Natl Taiwan Univ, Dept Publ Hlth, Inst Epidemiol, Taipei 100, Taiwan
[2] Natl Taiwan Univ, Res Ctr Gene Environm & Human Hlth, Taipei 100, Taiwan
[3] Natl Chiao Tung Univ, Inst Stat, Hsinchu, Taiwan
[4] Tzu Chi Univ, Coll Med, Dept Publ Hlth, Hualien, Taiwan
[5] Natl Taiwan Univ, Res Ctr Med Excellence, Taipei 100, Taiwan
[6] Fu Jen Catholic Univ, Coll Med, Cardinal Tien Hosp, Dept Cardiol, Taipei, Taiwan
Funding
Wellcome Trust (UK)
Keywords
Bayesian inference; GWAS; mixture model; WTCCC; FALSE DISCOVERY; RHEUMATOID-ARTHRITIS; POSITIVE REPORT; P-VALUES; PROBABILITY; GENE; EPIDEMIOLOGY; MICROARRAY; SCAN;
DOI
10.1038/ejhg.2010.51
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Genome-wide association studies often face the undesirable result of either failing to detect any influential markers at all because of a stringent level for testing error corrections or encountering difficulty in quantifying the importance of markers by their P-values. Advocates of estimation procedures prefer to estimate the proportion of association rather than test significance to avoid overinterpretation. Here, we adopt a Bayesian hierarchical mixture model to estimate directly the proportion of influential markers, and then proceed to a selection procedure based on the Bayes factor (BF). This mixture model is able to accommodate different sources of dependence in the data through only a few parameters. Specifically, we focus on a standardized risk measure of unit variance so that fewer parameters are involved in inference. The expected value of this measure follows a mixture distribution with a mixing probability of association, and it is robust to minor allele frequencies. Furthermore, to select promising markers, we use the magnitude of the BF to represent the strength of evidence in support of the association between markers and disease. We demonstrate this procedure both with simulations and with SNP data from studies on rheumatoid arthritis, coronary artery disease, and Crohn's disease obtained from the Wellcome Trust Case-Control Consortium. This Bayesian procedure outperforms other existing methods in terms of accuracy, power, and computational efficiency. The R code that implements this method is available at http://homepage.ntu.edu.tw/~ckhsiao/Bmix/Bmix.htm. European Journal of Human Genetics (2010) 18, 942-947; doi:10.1038/ejhg.2010.51; published online 21 April 2010
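The two-step idea in the abstract (estimate the proportion of associated markers from a mixture, then rank markers by Bayes factor) can be illustrated with a minimal sketch. This is not the authors' R implementation, and the model here is an assumption made for illustration: each marker has a unit-variance statistic z whose mean is 0 under no association and drawn from N(0, tau2) under association, with tau2 treated as known. The function names (`bayes_factor`, `estimate_proportion`, `simulate`) and the use of EM in place of a full hierarchical Bayesian fit are likewise hypothetical choices.

```python
import math
import random


def bayes_factor(z, tau2):
    """BF of 'associated' (z ~ N(0, 1 + tau2)) against 'null' (z ~ N(0, 1)).

    Assumed toy model, not the paper's exact likelihood.
    """
    log_bf = 0.5 * z * z * tau2 / (1.0 + tau2) - 0.5 * math.log(1.0 + tau2)
    return math.exp(log_bf)


def estimate_proportion(zs, tau2, n_iter=100):
    """EM estimate of the mixing probability p, holding tau2 fixed."""
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each marker is associated.
        post = []
        for z in zs:
            bf = bayes_factor(z, tau2)
            post.append(p * bf / (p * bf + (1.0 - p)))
        # M-step: the updated p is the mean posterior membership.
        p = sum(post) / len(post)
    return p


def simulate(n_markers, p, tau2, seed=0):
    """Draw unit-variance statistics from the assumed two-component mixture."""
    rng = random.Random(seed)
    zs = []
    for _ in range(n_markers):
        mean = rng.gauss(0.0, math.sqrt(tau2)) if rng.random() < p else 0.0
        zs.append(rng.gauss(mean, 1.0))
    return zs


if __name__ == "__main__":
    zs = simulate(n_markers=5000, p=0.05, tau2=9.0, seed=1)
    p_hat = estimate_proportion(zs, tau2=9.0)
    # Rank markers by the strength of evidence for association.
    top = sorted(range(len(zs)), key=lambda i: bayes_factor(zs[i], 9.0), reverse=True)[:10]
    print("estimated proportion of associated markers:", round(p_hat, 4))
    print("indices of the ten markers with the largest BF:", top)
```

Working with one unit-variance statistic per marker keeps the sketch to a single mixing probability and a single variance parameter, which mirrors the abstract's point that standardizing the risk measure reduces the number of parameters involved in inference.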
Pages: 942-947
Page count: 6