Strategies and issues in the detection of pathway enrichment in genome-wide association studies

Cited by: 95
Authors
Hong, Mun-Gwan [1]
Pawitan, Yudi [1]
Magnusson, Patrik K. E. [1]
Prince, Jonathan A. [1]
Affiliations
[1] Karolinska Inst, Dept Med Epidemiol & Biostat, S-17177 Stockholm, Sweden
Funding
UK Medical Research Council; US National Institutes of Health
Keywords
COMPLEX HUMAN TRAITS; SUSCEPTIBILITY LOCI; GENE; DISEASE; RISK; PRIORITIZATION; POLYMORPHISM; REPLICATION; ANNOTATION; EXPRESSION;
DOI
10.1007/s00439-009-0676-z
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
A fundamental question in human genetics is the degree to which the polygenic character of complex traits derives from polymorphism in genes with similar or with dissimilar functions. The many genome-wide association studies now being performed offer an opportunity to investigate this, and although early attempts are emerging, new tools and modeling strategies still need to be developed and deployed. Towards this goal, we implemented a new algorithm to facilitate the transition from genetic marker lists (principally those generated by PLINK) to pathway analyses of representational gene sets in either threshold or threshold-free downstream applications (e.g. DAVID, GSEA-P, and Ingenuity Pathway Analysis). This was applied to several large genome-wide association studies covering diverse human traits that included type 2 diabetes, Crohn's disease, and plasma lipid levels. Validation of this approach was obtained for plasma HDL levels, where functional categories related to lipid metabolism emerged as the most significant in two independent studies. From analyses of these samples, we highlight and address numerous issues related to this strategy, including appropriate gene-based correction statistics, the utility of imputed versus non-imputed marker sets, and the apparent enrichment of pathways due solely to the positional clustering of functionally related genes. The latter in particular emphasizes the importance of studies that directly tie genetic variation to functional characteristics of specific genes. The freely provided software, which we have called ProxyGeneLD, may resolve an important bottleneck in pathway-based analyses of genome-wide association data. It has allowed us to identify at least one replicable case of pathway enrichment, but also to highlight functional gene clustering as a potentially serious problem that may lead to spurious pathway findings if not corrected.
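The abstract names gene-based correction statistics as one step between a marker list and a ranked gene list, but does not say which statistic is used. As a minimal sketch only, the Python below illustrates one generic option: take the smallest marker p-value per gene and apply a Šidák adjustment for the number of markers assigned to that gene. This is not presented as the ProxyGeneLD algorithm. The function names, the SNP identifiers, the gene labels, and the SNP-to-gene map are all hypothetical, and the adjustment treats markers as independent, which ignores linkage disequilibrium.

```python
from collections import defaultdict


def gene_level_pvalues(snp_pvalues, snp_to_gene):
    """Collapse marker p-values to one adjusted p-value per gene.

    Returns {gene: (min_p, n_markers, adjusted_p)}, where adjusted_p is the
    Sidak-adjusted minimum p-value: 1 - (1 - min_p) ** n_markers.
    Assumption: markers within a gene are treated as independent tests.
    """
    per_gene = defaultdict(list)
    for snp, p in snp_pvalues.items():
        gene = snp_to_gene.get(snp)
        if gene is not None:  # markers with no gene assignment are skipped
            per_gene[gene].append(p)

    stats = {}
    for gene, pvals in per_gene.items():
        min_p = min(pvals)
        n_markers = len(pvals)
        adjusted_p = 1.0 - (1.0 - min_p) ** n_markers
        stats[gene] = (min_p, n_markers, adjusted_p)
    return stats


def rank_genes(stats):
    """Order genes from most to least significant by adjusted p-value."""
    return sorted(stats, key=lambda gene: stats[gene][2])


# Hypothetical input: marker p-values and a marker-to-gene assignment.
snp_pvalues = {"rs_a1": 0.01, "rs_a2": 0.5, "rs_b1": 0.02}
snp_to_gene = {"rs_a1": "GENE_A", "rs_a2": "GENE_A", "rs_b1": "GENE_B"}

gene_stats = gene_level_pvalues(snp_pvalues, snp_to_gene)
ranked = rank_genes(gene_stats)
```

A ranked list like `ranked` is the kind of input a threshold-free enrichment tool expects, while a cutoff on the adjusted p-value would give the gene set for a threshold-based tool. An adjustment that counts correlated markers as one test would be less conservative than this sketch for genes covered by many markers in strong linkage disequilibrium.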
Pages: 289-301
Page count: 13