GUESS-ing Polygenic Associations with Multiple Phenotypes Using a GPU-Based Evolutionary Stochastic Search Algorithm

Cited by: 50
Authors
Bottolo, Leonardo [1]
Chadeau-Hyam, Marc [2]
Hastie, David I. [2]
Zeller, Tanja [3]
Liquet, Benoit [4,5]
Newcombe, Paul [5]
Yengo, Loic [6,7]
Wild, Philipp S. [8]
Schillert, Arne [9]
Ziegler, Andreas [9]
Nielsen, Sune F. [10,11]
Butterworth, Adam S. [12]
Ho, Weang Kee [12]
Castagne, Raphaele [13]
Munzel, Thomas [14]
Tregouet, David [12]
Falchi, Mario [15]
Cambien, Francois [13]
Nordestgaard, Borge G. [10,11]
Fumeron, Frederic [16,17]
Tybjaerg-Hansen, Anne [11]
Froguel, Philippe [6,7,15,18]
Danesh, John [12]
Petretto, Enrico [19]
Blankenberg, Stefan [3]
Tiret, Laurence [13]
Richardson, Sylvia [5]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Math, London, England
[2] Univ London Imperial Coll Sci Technol & Med, Dept Epidemiol & Biostat, London, England
[3] Univ Heart Ctr Hamburg, Dept Gen & Intervent Cardiol, Hamburg, Germany
[4] Univ Victor Segalen, INSERM, U897, Bordeaux, France
[5] Inst Publ Hlth, MRC Biostat Unit, Cambridge, England
[6] European Genom Inst Diabet, Lille, France
[7] Inst Pasteur, CNRS, UMR 8199, F-59019 Lille, France
[8] Univ Med Ctr Mainz, Ctr Thrombosis & Haemostasis, Mainz, Germany
[9] Med Univ Lubeck, Inst Med Biometry & Stat, D-23538 Lubeck, Germany
[10] Herlev Hosp, Dept Clin Biochem, Copenhagen, Denmark
[11] Univ Copenhagen, Copenhagen Univ Hosp, Copenhagen, Denmark
[12] Univ Cambridge, Dept Publ Hlth & Primary Care, Cambridge, England
[13] Univ Paris 06, INSERM, UMRS 937, Paris, France
[14] Univ Med Ctr Mainz, Dept Med 2, Mainz, Germany
[15] Univ London Imperial Coll Sci Technol & Med, Hammersmith Hosp, Dept Genom Common Dis, Sch Publ Hlth, London, England
[16] INSERM, U695, Paris, France
[17] Univ Paris 07, UFR Med Site Bichat, Paris, France
[18] Univ Lille 2, Lille, France
[19] Univ London Imperial Coll Sci Technol & Med, Fac Med, Med Res Council Clin Sci Ctr, London, England
Source
PLOS GENETICS | 2013 / Volume 9 / Issue 08
Funding
UK Medical Research Council; Wellcome Trust (UK);
Keywords
GENOME-WIDE ASSOCIATION; BAYESIAN VARIABLE SELECTION; MODEL; LASSO; REGULARIZATION; REGRESSION; SORT1; RISK;
DOI
10.1371/journal.pgen.1003657
Chinese Library Classification number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
Genome-wide association studies (GWAS) have yielded significant advances in defining the genetic architecture of complex traits and disease. Still, a major hurdle of GWAS is narrowing down multiple genetic associations to a few causal variants for functional studies. This becomes critical in multi-phenotype GWAS, where detection and interpretability of complex SNP(s)-trait(s) associations are complicated by complex linkage disequilibrium patterns between SNPs and correlation between traits. Here we propose a computationally efficient algorithm (GUESS) to explore complex genetic-association models and maximize genetic variant detection. We integrated our algorithm with a new Bayesian strategy for multi-phenotype analysis to identify the specific contribution of each SNP to different trait combinations and to study genetic regulation of lipid metabolism in the Gutenberg Health Study (GHS). Despite the relatively small size of GHS (n = 3,175) compared with the largest published meta-GWAS (n > 100,000), GUESS recovered most of the major associations and was better at refining multi-trait associations than alternative methods. Amongst the new findings provided by GUESS, we revealed a strong association of SORT1 with the TG-APOB phenotypic group and of LIPC with the TG-HDL phenotypic group. These associations were overlooked in the larger meta-GWAS and not revealed by competing approaches, and we replicated them in two independent cohorts. Moreover, we demonstrated the increased power of GUESS over alternative multi-phenotype approaches, both Bayesian and non-Bayesian, in a simulation study that mimics real-case scenarios. We showed that our parallel implementation based on Graphics Processing Units outperforms alternative multi-phenotype methods. Beyond multivariate modelling of multiple phenotypes, our Bayesian model employs a flexible hierarchical prior structure for genetic effects that adapts to any correlation structure of the predictors and increases the power to identify associated variants. This provides a powerful tool for the analysis of diverse genomic features, for instance gene expression and exome sequencing data, where complex dependencies are present in the predictor space.
Pages: 17