Adaptive Elastic-Net Sparse Principal Component Analysis for Pathway Association Testing

Cited by: 7
Authors
Chen, Xi [1]
Affiliations
[1] Vanderbilt Univ, Dept Biostat, Nashville, TN 37232 USA
Keywords
gene expression; microarray; pathway analysis; sparse principal component analysis; GENE SET ENRICHMENT; SINGULAR-VALUE DECOMPOSITION; WIDE EXPRESSION DATA; MICROARRAY DATA; RECEPTOR; SELECTION; THERAPY; LASSO; GAMMA
DOI
10.2202/1544-6115.1697
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Pathway or gene set analysis has become an increasingly popular approach for analyzing high-throughput biological experiments such as microarray gene expression studies. The purpose of pathway analysis is to identify differentially expressed pathways associated with outcomes. Important challenges in pathway analysis are selecting a subset of genes contributing most to association with clinical phenotypes and conducting statistical tests of association for the pathways efficiently. We propose a two-stage analysis strategy: (1) extract latent variables representing activities within each pathway using a dimension reduction approach based on adaptive elastic-net sparse principal component analysis; (2) integrate the latent variables with the regression modeling framework to analyze studies with different types of outcomes such as binary, continuous or survival outcomes. Our proposed approach is computationally efficient. For each pathway, because the latent variables are estimated in an unsupervised fashion without using disease outcome information, in the sample label permutation testing procedure, the latent variables only need to be calculated once rather than for each permutation resample. Using both simulated and real datasets, we show our approach performed favorably when compared with five other currently available pathway testing methods.
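The computational point in the abstract (latent variables are estimated once, without the outcome, and then reused across every label permutation) can be illustrated with a small sketch. This is not the paper's estimator: a soft-thresholded power iteration stands in for adaptive elastic-net sparse PCA, the absolute Pearson correlation stands in for a regression-based test with a continuous outcome, and every name here (`sparse_first_pc`, `shrink`, `permutation_p_value`, the simulated data) is hypothetical. It uses only the Python standard library so that it stays self-contained.

```python
import math
import random


def center_columns(X):
    """Subtract each gene's (column's) mean so the PCA step sees centered data."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def sparse_first_pc(X, shrink=0.3, iters=50):
    """Soft-thresholded power iteration for one sparse loading vector.

    Stand-in for sparse PCA, NOT the adaptive elastic-net estimator.
    `shrink` (fraction of the largest entry removed) is a hypothetical knob.
    """
    n, p = len(X), len(X[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [sum(row[j] * v[j] for j in range(p)) for row in X]        # X v
        w = [sum(X[i][j] * scores[i] for i in range(n)) for j in range(p)]  # X^T X v
        cut = shrink * max(abs(x) for x in w)
        w = [math.copysign(max(abs(x) - cut, 0.0), x) for x in w]           # soft-threshold
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # keep the previous loadings if everything was shrunk away
        v = [x / norm for x in w]
    return v


def abs_correlation(a, b):
    """Absolute Pearson correlation, used here as the association statistic."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return abs(cov) / math.sqrt(va * vb)


def permutation_p_value(scores, y, n_perm=200, seed=0):
    """Permute outcome labels only; the pathway scores are never recomputed."""
    rng = random.Random(seed)
    observed = abs_correlation(scores, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs_correlation(scores, y_perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Simulated pathway (assumption): 10 genes, the first 3 driven by one latent factor.
rng = random.Random(1)
n_samples, n_genes = 40, 10
latent = [rng.gauss(0, 1) for _ in range(n_samples)]
X = [[(2.0 * latent[i] if j < 3 else 0.0) + rng.gauss(0, 1)
      for j in range(n_genes)] for i in range(n_samples)]
y = [latent[i] + rng.gauss(0, 0.5) for i in range(n_samples)]

# Stage 1: unsupervised latent variable, computed once and without using y.
Xc = center_columns(X)
loadings = sparse_first_pc(Xc)
pathway_scores = [sum(row[j] * loadings[j] for j in range(n_genes)) for row in Xc]

# Stage 2: association test by permuting sample labels against the fixed scores.
p_value = permutation_p_value(pathway_scores, y)
print("permutation p-value:", p_value)
```

Because stage 1 never touches `y`, the cost of the permutation loop is one cheap statistic per resample rather than one sparse decomposition per resample. For binary or survival outcomes, the correlation statistic would be swapped for a test statistic from the corresponding regression model.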
Pages: 22