Reverse GWAS: Using genetics to identify and model phenotypic subtypes

Cited by: 28
Authors
Dahl, Andy [1]
Cai, Na [2, 3]
Ko, Arthur [4]
Laakso, Markku [5, 6]
Pajukanta, Paivi [4]
Flint, Jonathan [7]
Zaitlen, Noah [1]
Affiliations
[1] UCSF, Dept Med, San Francisco, CA 94143 USA
[2] Wellcome Sanger Inst, Cambridge, England
[3] European Bioinformat Inst EMBL EBI, Cambridge, England
[4] UCLA, David Geffen Sch Med, Dept Human Genet, Los Angeles, CA 90095 USA
[5] Univ Eastern Finland, Inst Clin Med, Internal Med, Kuopio, Finland
[6] Kuopio Univ Hosp, Kuopio, Finland
[7] UCLA, Semel Inst Neurosci & Human Behav, Ctr Neurobehav Genet, Los Angeles, CA 90024 USA
Source
PLOS GENETICS | 2019 / Vol. 15 / Issue 04
Funding
National Institutes of Health (USA);
Keywords
RISK PREDICTION; HETEROGENEITY; ASSOCIATION; HERITABILITY; VARIANTS; IDENTIFICATION; THERAPY; GENOME; EXPRESSION; DISORDER;
DOI
10.1371/journal.pgen.1008009
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
Recent and classical work has revealed biologically and medically significant subtypes in complex diseases and traits. However, relevant subtypes are often unknown, unmeasured, or actively debated, making automated statistical approaches to subtype definition valuable. We propose reverse GWAS (RGWAS) to identify and validate subtypes using genetics and multiple traits: while GWAS seeks the genetic basis of a given trait, RGWAS seeks to define trait subtypes with distinct genetic bases. Unlike existing approaches relying on off-the-shelf clustering methods, RGWAS uses a novel decomposition, MFMR, to model covariates, binary traits, and population structure. We use extensive simulations to show that modelling these features can be crucial for power and calibration. We validate RGWAS in practice by recovering a recently discovered stress subtype in major depression. We then show the utility of RGWAS by identifying three novel subtypes of metabolic traits. We biologically validate these metabolic subtypes with SNP-level tests and a novel polygenic test: the former recover known metabolic GxE SNPs; the latter suggests subtypes may explain substantial missing heritability. Crucially, statins, which are widely prescribed and theorized to increase diabetes risk, have opposing effects on blood glucose across metabolic subtypes, suggesting the subtypes have potential translational value.
Author summary
Complex diseases depend on interactions between many known and unknown genetic and environmental factors. However, most studies aggregate these strata and test for associations on average across samples, though biological factors and medical interventions can have dramatically different effects on different people. Further, more-sophisticated models are often infeasible because relevant sources of heterogeneity are not generally known a priori.
We introduce Reverse GWAS to simultaneously split samples into homogeneous subtypes and to learn differences in genetic or treatment effects between subtypes. Unlike existing approaches to computational subtype identification from high-dimensional trait data, RGWAS accounts for covariates, binary disease traits and, especially, population structure, all of which are important features of real genetic datasets. We validate RGWAS by recovering known genetic subtypes of major depression. We demonstrate RGWAS can uncover useful novel subtypes in a metabolic dataset, finding three novel subtypes with both SNP- and polygenic-level heterogeneity. Importantly, we show that RGWAS can uncover subtypes with differential treatment response: statins, a common class of drugs and a potential type 2 diabetes risk factor, may have opposing subtype-specific effects on blood glucose.
Pages: 22
Related papers
76 in total
  • [1] Exploring patterns enriched in a dataset with contrastive principal component analysis
    Abid, Abubakar
    Zhang, Martin J.
    Bagaria, Vivek K.
    Zou, James
    [J]. NATURE COMMUNICATIONS, 2018, 9
  • [2] Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables
    Ahlqvist, Emma
    Storm, Petter
    Karajamaki, Annemari
    Martinell, Mats
    Dorkhan, Mozhgan
    Carlsson, Annelie
    Vikman, Petter
    Prasad, Rashmi B.
    Aly, Dina Mansour
    Almgren, Peter
    Wessman, Ylva
    Shaat, Nael
    Spegel, Peter
    Mulder, Hindrik
    Lindholm, Eero
    Melander, Olle
    Hansson, Ola
    Malmqvist, Ulf
    Lernmark, Ake
    Lahti, Kaj
    Forsen, Tom
    Tuomi, Tiinamaija
    Rosengren, Anders H.
    Groop, Leif
    [J]. LANCET DIABETES & ENDOCRINOLOGY, 2018, 6 (05) : 361 - 369
  • [3] [Anonymous], 2016, HUMAN MOL GENETICS, V13, pE40
  • [4] [Anonymous], BIORXIV
  • [5] Uncovering the Hidden Risk Architecture of the Schizophrenias: Confirmation in Three Independent Genome-Wide Association Studies
    Arnedo, Javier
    Svrakic, Dragan M.
    del Val, Coral
    Romero-Zaliz, Rocio
    Hernandez-Cuervo, Helena
    Fanous, Ayman H.
    Pato, Michele T.
    Pato, Carlos N.
    de Erausquin, Gabriel A.
    Cloninger, C. Robert
    Zwir, Igor
    [J]. AMERICAN JOURNAL OF PSYCHIATRY, 2015, 172 (02) : 139 - 153
  • [6] Variants Identified in a GWAS Meta-Analysis for Blood Lipids Are Associated with the Lipid Response to Fenofibrate
    Aslibekyan, Stella
    Goodarzi, Mark O.
    Frazier-Wood, Alexis C.
    Yan, Xiaofei
    Irvin, Marguerite R.
    Kim, Eric
    Tiwari, Hemant K.
    Guo, Xiuqing
    Straka, Robert J.
    Taylor, Kent D.
    Tsai, Michael Y.
    Hopkins, Paul N.
    Korenman, Stanley G.
    Borecki, Ingrid B.
    Chen, Yii-Der I.
    Ordovas, Jose M.
    Rotter, Jerome I.
    Arnett, Donna K.
    [J]. PLOS ONE, 2012, 7 (10):
  • [7] Rothwell PM, 2018, LANCET, V392, P387, DOI 10.1016/S0140-6736(18)31133-4
  • [8] Genetic interactions affecting human gene expression identified by variance association mapping
    Brown, Andrew Anand
    Buil, Alfonso
    Vinuela, Ana
    Lappalainen, Tuuli
    Zheng, Hou-Feng
    Richards, John B.
    Small, Kerrin S.
    Spector, Timothy D.
    Dermitzakis, Emmanouil T.
    Durbin, Richard
    [J]. ELIFE, 2014, 3
  • [9] Sparse whole-genome sequencing identifies two loci for major depressive disorder
    Cai, Na
    Bigdeli, Tim B.
    Kretzschmar, Warren
    Li, Yihan
    Liang, Jieqin
    Song, Li
    Hu, Jingchu
    Li, Qibin
    Jin, Wei
    Hu, Zhenfei
    Wang, Guangbiao
    Wang, Linmao
    Qian, Puyi
    Liu, Yuan
    Jiang, Tao
    Lu, Yao
    Zhang, Xiuqing
    Yin, Ye
    Li, Yingrui
    Xu, Xun
    Gao, Jingfang
    Reimers, Mark
    Webb, Todd
    Riley, Brien
    Bacanu, Silviu
    Peterson, Roseann E.
    Chen, Yiping
    Zhong, Hui
    Liu, Zhengrong
    Wang, Gang
    Sun, Jing
    Sang, Hong
    Jiang, Guoqing
    Zhou, Xiaoyan
    Li, Yi
    Li, Yi
    Zhang, Wei
    Wang, Xueyi
    Fang, Xiang
    Pan, Runde
    Miao, Guodong
    Zhang, Qiwen
    Hu, Jian
    Yu, Fengyu
    Du, Bo
    Sang, Wenhua
    Li, Keqing
    Chen, Guibing
    Cai, Min
    Yang, Lijun
    [J]. NATURE, 2015, 523 (7562) : 588 - +
  • [10] Heterogeneity of autoimmune diseases: pathophysiologic insights from genetics and implications for new therapies
    Cho, Judy H.
    Feldman, Marc
    [J]. NATURE MEDICINE, 2015, 21 (07) : 730 - 738