Gene set analysis of genome-wide association studies: Methodological issues and perspectives

Cited by: 143
Authors
Wang, Lily [1]
Jia, Peilin [2, 3]
Wolfinger, Russell D. [4]
Chen, Xi [1]
Zhao, Zhongming [2, 3, 5]
Affiliations
[1] Vanderbilt Univ, Dept Biostat, Sch Med, Div Canc Biostat, Nashville, TN 37232 USA
[2] Vanderbilt Univ, Dept Biomed Informat, Sch Med, Nashville, TN 37232 USA
[3] Vanderbilt Univ, Dept Psychiat, Sch Med, Nashville, TN 37232 USA
[4] SAS Inst Inc, Cary, NC 27513 USA
[5] Vanderbilt Univ, Dept Canc Biol, Sch Med, Nashville, TN 37232 USA
Keywords
Genome-wide association study; Gene set; Pathway; Gene-set enrichment analysis; Statistical significance; Complex disease; PATHWAY ANALYSIS; ENRICHMENT ANALYSIS; STATISTICAL-METHODS; TRUNCATED PRODUCT; FALSE DISCOVERY; DISEASE; SNP; KNOWLEDGE; COMMON; POLYMORPHISMS;
DOI
10.1016/j.ygeno.2011.04.006
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Recent studies have demonstrated that gene set analysis, which tests disease association with genetic variants in a group of functionally related genes, is a promising approach for analyzing and interpreting genome-wide association study (GWAS) data. These approaches aim to increase power by combining association signals from multiple genes in the same gene set. In addition, gene set analysis can shed more light on the biological processes underlying complex diseases. However, current approaches for gene set analysis are still at an early stage of development, in that analysis results are often prone to sources of bias, including gene set size and gene length, linkage disequilibrium patterns, and the presence of overlapping genes. In this paper, we provide an in-depth review of the gene set analysis procedures, along with parameter choices and the particular methodological challenges at each stage. In addition to providing a survey of recently developed tools, we also classify the analysis methods into larger categories and discuss their strengths and limitations. In the last section, we outline several important areas for improving the analytical strategies in gene set analysis. © 2011 Elsevier Inc. All rights reserved.
Pages: 1 - 8
Number of pages: 8