Gene Set Correlation Analysis and Visualization Using Gene Expression Data

Cited by: 2
Authors
Tsai, Chen-An [1]
Chen, James J. [2]
Affiliations
[1] Natl Taiwan Univ, Dept Agron, Taipei, Taiwan
[2] US FDA, Div Bioinformat & Biostat, Natl Ctr Toxicol Res, Jefferson, AR 72079 USA
Keywords
Gene set enrichment analyses; gene set correlation analysis; co-inertia analysis; covariance; p53 gene expression data; gene set analysis; CO-INERTIA ANALYSIS; MICROARRAY DATA; DIFFERENTIAL COEXPRESSION; MULTIVARIATE-ANALYSIS; NETWORKS; SITE
DOI
10.2174/1574893615999200629124444
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets using prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on the identification of differentially expressed gene sets in a given phenotype.
Objective: In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways.
Methods: We apply co-inertia analysis (CIA) to cross-gene-set comparisons in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. CIA is a multivariate method for identifying trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods.
Results and Conclusion: We also combine between-gene-set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate the integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for the identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
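The CIA objective stated in the abstract (finding pairs of axes that maximize the squared covariance between the projections of two gene sets) can be illustrated with a short sketch. This is not the authors' GSCoA implementation. It is a simplified version that assumes uniform sample and gene weights, in which case the co-inertia axes are the singular vectors of the cross-covariance matrix between the two column-centered expression matrices. The function name `co_inertia`, the returned dictionary keys, and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def co_inertia(X, Y):
    """Simplified co-inertia analysis of two gene sets (assumption: uniform weights).

    X: array of shape (n_samples, p_genes), expression of gene set A.
    Y: array of shape (n_samples, q_genes), expression of gene set B.
    Both matrices must describe the same samples in the same row order.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of samples (rows)")
    n = X.shape[0]

    # Center each gene (column) so that cross-products are covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Cross-covariance matrix between the genes of the two sets (p x q).
    C = Xc.T @ Yc / n

    # The k-th pair of singular vectors gives the k-th pair of co-inertia axes;
    # the covariance between the projections on that pair is the singular value s[k].
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    scores_x = Xc @ U      # sample coordinates for gene set A
    scores_y = Yc @ Vt.T   # sample coordinates for gene set B

    # Total co-inertia is the sum of squared covariances over all axes.
    total = float(np.sum(s ** 2))

    # RV coefficient: total co-inertia normalized to lie in [0, 1].
    Sxx = Xc.T @ Xc / n
    Syy = Yc.T @ Yc / n
    rv = total / float(np.sqrt(np.sum(Sxx ** 2) * np.sum(Syy ** 2)))

    return {
        "singular_values": s,
        "scores_x": scores_x,
        "scores_y": scores_y,
        "total_co_inertia": total,
        "rv": rv,
    }


# Usage on simulated data (hypothetical): 20 samples, gene sets of 5 and 4 genes,
# where the second set partly depends on the first.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
Y_demo = X_demo @ rng.normal(size=(5, 4)) + 0.5 * rng.normal(size=(20, 4))
result = co_inertia(X_demo, Y_demo)
print("squared covariance per axis:", result["singular_values"] ** 2)
print("RV coefficient:", result["rv"])
```

A single pairwise score such as the RV coefficient could then be computed for every pair of gene sets. Assessing its significance (for example by permuting sample labels) is a separate step that this sketch does not cover.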
Pages: 406-421
Page count: 16