A multivariate approach for integrating genome-wide expression data and biological knowledge

Cited by: 99
Authors
Kong, Sek Won
Pu, William T.
Park, Peter J.
Affiliations
[1] Children's Hospital Boston, Informatics Program, Boston, MA 02115, USA
[2] Children's Hospital Boston, Department of Cardiology, Boston, MA 02115, USA
[3] Harvard-Partners Center for Genetics and Genomics, Boston, MA 02115, USA
DOI
10.1093/bioinformatics/btl401
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Several statistical methods that combine analysis of differential gene expression with biological knowledge databases have been proposed for a more rapid interpretation of expression data. However, most such methods are based on a series of univariate statistical tests and do not properly account for the complex structure of gene interactions.
Results: We present a simple yet effective multivariate statistical procedure for assessing the correlation between a subspace defined by a group of genes and a binary phenotype. A subspace is deemed significant if the samples corresponding to different phenotypes are well separated in that subspace. The separation is measured using Hotelling's T² statistic, which captures the covariance structure of the subspace. When the dimension of the subspace is larger than that of the sample space, we project the original data to a smaller orthonormal subspace. We use this method to search through functional pathway subspaces defined by Reactome, KEGG, BioCarta and Gene Ontology. To demonstrate its performance, we apply this method to the data from two published studies, and visualize the results in the principal component space.
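The procedure described in the abstract can be illustrated with a minimal NumPy sketch of the two-sample Hotelling's T² statistic for one gene set. This is not the authors' implementation. The function names `project_to_pcs` and `hotelling_t2`, the `max_dim` parameter, and the rule for how many principal components to keep (at most n - 2, so that the pooled covariance stays invertible) are assumptions made for illustration. The abstract states only that the data are projected to a smaller orthonormal subspace.

```python
import numpy as np


def project_to_pcs(X, max_dim):
    """Project column-centered data onto at most max_dim leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions in gene space.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Drop directions with numerically zero variance.
    k = min(max_dim, int(np.sum(s > 1e-10 * s[0])))
    return Xc @ Vt[:k].T


def hotelling_t2(X, y, max_dim=None):
    """Two-sample Hotelling's T^2.

    X: samples-by-genes matrix restricted to one gene set.
    y: binary phenotype (0/1 or boolean), at least two samples per group.
    max_dim: hypothetical cap on the retained dimension (assumption).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)
    n, p = X.shape
    n1 = int(y.sum())
    n2 = n - n1
    # The pooled covariance has n - 2 degrees of freedom, so keep at most n - 2 dimensions.
    k = n - 2 if max_dim is None else min(max_dim, n - 2)
    if p > k:
        X = project_to_pcs(X, k)
    A, B = X[y], X[~y]
    diff = A.mean(axis=0) - B.mean(axis=0)
    # Pooled within-group covariance matrix.
    S = ((n1 - 1) * np.cov(A, rowvar=False)
         + (n2 - 1) * np.cov(B, rowvar=False)) / (n - 2)
    S = np.atleast_2d(S)
    return float((n1 * n2 / n) * diff @ np.linalg.solve(S, diff))
```

For a single gene the statistic reduces to the square of the pooled-variance two-sample t statistic. A larger value means the two phenotype groups are better separated in the gene-set subspace. Assessing significance (for example by permuting the phenotype labels) is left out of this sketch.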
Pages: 2373-2380
Page count: 8