Meta-analysis for pathway enrichment analysis when combining multiple genomic studies

Cited by: 68
Authors
Shen, Kui [1]
Tseng, George C. [1, 2, 3]
Affiliations
[1] Univ Pittsburgh, Sch Med, Dept Computat Biol, Pittsburgh, PA 15213 USA
[2] Univ Pittsburgh, Grad Sch Publ Hlth, Dept Biostat, Pittsburgh, PA 15261 USA
[3] Univ Pittsburgh, Grad Sch Publ Hlth, Dept Human Genet, Pittsburgh, PA 15261 USA
Funding
National Institutes of Health (USA);
Keywords
FALSE DISCOVERY RATE; GENE-EXPRESSION DATA; BREAST-CANCER; SIGNATURE; SETS;
DOI
10.1093/bioinformatics/btq148
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: Many pathway analysis (or gene set enrichment analysis) methods have been developed to identify enriched pathways under different biological states within a genomic study. As more and more microarray datasets accumulate, meta-analysis methods have also been developed to integrate information among multiple studies. Currently, most meta-analysis methods for combining genomic studies focus on biomarker detection and meta-analysis for pathway analysis has not been systematically pursued.

Results: We investigated two approaches of meta-analysis for pathway enrichment (MAPE) by combining statistical significance across studies at the gene level (MAPE_G) or at the pathway level (MAPE_P). Simulation results showed increased statistical power of meta-analysis approaches compared to a single study analysis and showed complementary advantages of MAPE_G and MAPE_P under different scenarios. We also developed an integrated method (MAPE_I) that incorporates advantages of both approaches. Comprehensive simulations and applications to real data on drug response of breast cancer cell lines and lung cancer tissues were evaluated to compare the performance of three MAPE variations. MAPE_P has the advantage of not requiring gene matching across studies. When MAPE_G and MAPE_P show complementary advantages, the hybrid version of MAPE_I is generally recommended.

Availability: http://www.biostat.pitt.edu/bioinfo/

Contact: ctseng@pitt.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
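
Illustrative sketch (not part of the original record): the difference between combining evidence at the gene level and at the pathway level, as described in the abstract, can be pictured with a small Python example. This is not the authors' implementation. It assumes Fisher's method as the rule for combining p-values and swaps in a simple hypergeometric over-representation test for the enrichment step; the function names, the 0.05 cutoff, and the data layout (one dict of gene to p-value per study) are all hypothetical. MAPE_I is not sketched here.

```python
import math


def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method (assumed rule).

    The statistic -2 * sum(ln p) is chi-square with 2k degrees of freedom;
    for even degrees of freedom the upper tail has a closed form.
    All p-values must be strictly greater than 0.
    """
    half_stat = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, len(pvalues)):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def hypergeom_enrichment(n_total, n_pathway, n_sig, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap)."""
    upper = min(n_pathway, n_sig)
    tail = sum(
        math.comb(n_pathway, i) * math.comb(n_total - n_pathway, n_sig - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / math.comb(n_total, n_sig)


def enrichment_p(gene_p, pathway, alpha=0.05):
    """Over-representation of significant genes (p < alpha) in one pathway."""
    genes = set(gene_p)
    in_pathway = genes & set(pathway)
    sig = {g for g in genes if gene_p[g] < alpha}
    return hypergeom_enrichment(
        len(genes), len(in_pathway), len(sig), len(sig & in_pathway)
    )


def gene_level_meta(studies, pathway, alpha=0.05):
    """Gene-level route: combine each gene across studies, then test once.

    Only genes present in every study are used, so genes must be matched.
    """
    shared = set.intersection(*(set(s) for s in studies))
    combined = {g: fisher_combine([s[g] for s in studies]) for g in shared}
    return enrichment_p(combined, pathway, alpha)


def pathway_level_meta(studies, pathway, alpha=0.05):
    """Pathway-level route: test enrichment per study, then combine.

    Each study is analysed on its own gene list, so no matching is needed.
    """
    return fisher_combine([enrichment_p(s, pathway, alpha) for s in studies])


if __name__ == "__main__":
    # Toy, made-up p-values for two hypothetical studies.
    study_a = {"g1": 0.01, "g2": 0.02, "g3": 0.20, "g4": 0.50, "g5": 0.70, "g6": 0.90}
    study_b = {"g1": 0.03, "g2": 0.04, "g3": 0.04, "g4": 0.60, "g5": 0.40, "g6": 0.80}
    pathway = {"g1", "g2", "g3"}
    print("gene-level:", gene_level_meta([study_a, study_b], pathway))
    print("pathway-level:", pathway_level_meta([study_a, study_b], pathway))
```

In this sketch the pathway-level route never compares gene identifiers between studies, which mirrors the abstract's remark that MAPE_P does not require gene matching across studies.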
Pages: 1316-1323
Page count: 8