A Bayesian Approach to Pathway Analysis by Integrating Gene-Gene Functional Directions and Microarray Data

Cited by: 9
Authors
Yifang Zhao
Ming-Hui Chen
Baikang Pei
David Rowe
Dong-Guk Shin
Wangang Xie
Fang Yu
Lynn Kuo
Affiliations
[1] Department of Statistics, University of Connecticut, Storrs
[2] MYSM School of Medicine, Yale University, New Haven
[3] School of Dental Medicine, University of Connecticut Health Center, Farmington
[4] Computer Science and Engineering, University of Connecticut, Storrs
[5] Abbott Lab, Chicago
[6] Department of Biostatistics, University of Nebraska Medical Center, Omaha
Funding
U.S. National Institutes of Health
Keywords
Bayesian belief network; Bayesian model selection; KEGG pathways; Microarray data; Prior construction; Symmetric Kullback-Leibler divergence
DOI
10.1007/s12561-011-9046-1
Abstract
Many statistical methods have been developed to screen for differentially expressed genes associated with specific phenotypes in microarray data. However, it remains a major challenge to synthesize the observed expression patterns with abundant biological knowledge for a more complete understanding of the biological functions among genes. Various methods, including clustering analysis on genes, neural networks, Bayesian networks, and pathway analysis, have been developed toward this goal. In most of these procedures, the activation and inhibition relationships among genes have hardly been utilized in the modeling steps. We propose two novel Bayesian models to integrate the microarray data with the putative pathway structures obtained from the KEGG database and the directional gene-gene interactions reported in the medical literature. We define the symmetric Kullback-Leibler divergence of a pathway and use it to identify the pathway(s) most supported by the microarray data. A Markov chain Monte Carlo sampling algorithm is given for posterior computation in the hierarchical model. The proposed method is shown to select the most supported pathway in an illustrative example. Finally, we apply the methodology to a real microarray data set to understand the gene expression profile of the osteoblast lineage at defined stages of differentiation. We observe that our method correctly identifies the pathways that are reported to play essential roles in modulating bone mass. © 2011 International Chinese Statistical Association.
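The abstract does not give the pathway-specific form of the symmetric Kullback-Leibler divergence, so the following is only a minimal sketch of the generic quantity, assuming two discrete distributions on the same support and the sum convention KL(p||q) + KL(q||p) (some authors use the average instead). The function names `kl_divergence` and `symmetric_kl` are illustrative and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) for two discrete distributions on the same support.

    p and q are equal-length sequences of probabilities. Terms with
    p_i == 0 contribute nothing; if p_i > 0 while q_i == 0 the
    divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def symmetric_kl(p, q):
    """Symmetric KL divergence, assumed here to be KL(p||q) + KL(q||p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    print(symmetric_kl(p, q))
```

Unlike the ordinary Kullback-Leibler divergence, this quantity does not depend on the order of its arguments, which is what makes it usable as a single score for comparing candidate models.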
Pages: 105-131
Number of pages: 26
Related papers
29 in total
[1] Chen M.-H., Shao Q.-M., Ibrahim J.G., Monte Carlo Methods in Bayesian Computation, (2000)
[2] Chen M.-H., Huang L., Ibrahim J.G., Kim S., Bayesian variable selection and computation for generalized linear models with conjugate priors, Bayesian Anal, 3, pp. 585-614, (2008)
[3] Curtis R.K., Oresic M., Vidal-Puig A., Pathways to the analysis of microarray data, Trends Biotechnol, 23, 8, pp. 429-435, (2005)
[4] Efron B., Tibshirani R., On testing the significance of sets of genes, Ann Appl Stat, 1, pp. 107-129, (2007)
[5] Ellis B., Wong W.H., Learning causal Bayesian network structures from experimental data, J Am Stat Assoc, 103, pp. 778-789, (2008)
[6] Fletcher R., Reeves C.M., Function minimization by conjugate gradients, Comput J, 7, pp. 148-154, (1964)
[7] Friedman N., Linial M., Nachman I., Pe'er D., Using Bayesian networks to analyze expression data, J Comput Biol, 7, 3-4, pp. 601-620, (2000)
[8] Geweke J., Evaluating the accuracy of sampling-based approaches to calculating posterior moments, Bayesian Statistics 4, (1992)
[9] Hartmann C., A Wnt canon orchestrating osteoblastogenesis, Trends Cell Biol, 16, 3, pp. 151-158, (2006)
[10] Hartemink A., Gifford D.K., Jaakkola T.S., Young R.A., Bayesian methods for elucidating genetic regulatory networks, IEEE Intell Syst Biol, 17, 2, pp. 37-43, (2002)