An information theoretic exploratory method for learning patterns of conditional gene coexpression from microarray data

Citations: 11
Authors
Boscolo, Riccardo [1]
Liao, James C. [2]
Roychowdhury, Vwani P. [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Chem & Biomol Engn, Los Angeles, CA 90095 USA
Keywords
gene expression data; statistical analysis; information theory; coinformation; entropy
DOI
10.1109/TCBB.2007.1056
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
In this paper, we introduce an exploratory framework for learning patterns of conditional coexpression in gene expression data. The main idea behind the proposed approach is to estimate how the information content shared by a set of M nodes in a network (where each node is associated with an expression profile) varies upon conditioning on a set of L conditioning variables (in the simplest case, represented by a separate set of expression profiles). The method is nonparametric and is based on the concept of statistical coinformation, which, unlike conventional correlation-based techniques, is not restricted in scope to linear conditional dependency patterns. Moreover, such conditional coexpression relationships can potentially indicate regulatory interactions that do not manifest themselves when only pairwise relationships are considered. A moment-based approximation of the coinformation measure is derived that efficiently circumvents the problem of estimating high-dimensional multivariate probability density functions from the data, a task that is usually not viable because of the intrinsic sample size limitations that characterize expression-level measurements. By applying the proposed exploratory method, we analyzed a whole-genome microarray assay of the eukaryote Saccharomyces cerevisiae and were able to learn statistically significant patterns of conditional coexpression. A selection of such interactions that carry a meaningful biological interpretation is discussed.
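As a rough illustration of the quantity the abstract describes, the sketch below compares the information shared by two expression profiles before and after conditioning on a third. It is not the moment-based approximation derived in the paper: it assumes jointly Gaussian data, so every entropy reduces to a covariance log-determinant, and it therefore captures only linear dependence. The function names, the synthetic data, and the sign convention (coinformation taken as I(X;Y) - I(X;Y|Z)) are assumptions made for this example.

```python
import numpy as np

def gaussian_entropy(data):
    """Differential entropy (nats) of a Gaussian fitted to data of shape (n_samples, k)."""
    k = data.shape[1]
    cov = np.atleast_2d(np.cov(data, rowvar=False))  # np.cov returns a 0-d array when k == 1
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (k * np.log(2.0 * np.pi * np.e) + logdet)

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return gaussian_entropy(x) + gaussian_entropy(y) - gaussian_entropy(np.hstack([x, y]))

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (gaussian_entropy(np.hstack([x, z]))
            + gaussian_entropy(np.hstack([y, z]))
            - gaussian_entropy(np.hstack([x, y, z]))
            - gaussian_entropy(z))

def coinformation(x, y, z):
    """Change in shared information upon conditioning: I(X;Y) - I(X;Y|Z)."""
    return mutual_information(x, y) - conditional_mutual_information(x, y, z)

# Synthetic stand-in for expression profiles (hypothetical data, not from the paper):
# x and y are independent, but both drive z, so they become dependent once z is known.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 1))
y = rng.normal(size=(n, 1))
z = x + y + 0.1 * rng.normal(size=(n, 1))

print("I(X;Y)   =", mutual_information(x, y))
print("I(X;Y|Z) =", conditional_mutual_information(x, y, z))
print("coinfo   =", coinformation(x, y, z))
```

In this toy setup the pairwise mutual information is close to zero while the conditional mutual information is large, so the coinformation is negative under the convention used here. This is the kind of relationship that a purely pairwise analysis would miss.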
Pages: 15-24
Page count: 10