Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data

Cited by: 19
Authors
Gao, Shouguo [1 ,2 ]
Wang, Xujing [1 ,2 ]
Affiliations
[1] Univ Alabama Birmingham, Dept Phys, Birmingham, AL 35294 USA
[2] Univ Alabama Birmingham, Comprehens Diabet Ctr, Birmingham, AL 35294 USA
Source
BMC BIOINFORMATICS | 2011 / Vol. 12
Keywords
SACCHAROMYCES-CEREVISIAE; REGULATORY NETWORKS; CELL-CYCLE; YEAST; FRAMEWORK; ONTOLOGY; DESIGN;
DOI
10.1186/1471-2105-12-359
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Bayesian network (BN) modeling is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data by itself suffers from high noise and lack of statistical power. Incorporating prior biological knowledge can improve performance. Because each type of prior knowledge may on its own be incomplete or limited by quality issues, it is desirable to integrate multiple sources of prior knowledge and exploit their consensus. Results: We introduce a new method to incorporate the quantitative information from multiple sources of prior knowledge. It first uses a naive Bayesian classifier to assess the likelihood of functional linkage between gene pairs based on prior knowledge. In this study we included co-citation in PubMed and semantic similarity in Gene Ontology annotation. A candidate network edge reservoir is then created in which the copy number of each edge is proportional to the estimated likelihood of linkage between the two corresponding genes. In network simulation, a Markov chain Monte Carlo sampling algorithm is adopted that samples from this reservoir at each iteration to generate new candidate networks. We evaluated the new algorithm using both simulated and real gene expression data, including data from a yeast cell cycle study and a mouse pancreas development/growth study. Incorporating prior knowledge led to an approximately 2-fold increase in the number of known transcription regulations recovered, without a significant change in the false positive rate. In contrast, without the prior knowledge, BN modeling is not always better than random selection, demonstrating the need to supplement gene expression data with additional information in network modeling. Conclusions: Our new development provides a statistical means to utilize the quantitative information in prior biological knowledge in the BN modeling of gene expression data, which significantly improves performance.
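The reservoir idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `scale` parameter, the rule of keeping at least one copy per edge, and the example gene pairs and likelihood ratios are all hypothetical, and the BN scoring and MCMC acceptance step are omitted. It only shows how naive Bayes evidence might be combined per gene pair and how proposal frequency then follows copy number.

```python
import random
from collections import Counter


def naive_bayes_posterior(prior_odds, likelihood_ratios):
    """Combine evidence sources, assumed conditionally independent,
    into a posterior probability of functional linkage."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


def build_reservoir(edge_scores, scale=100):
    """Build a list in which each edge appears about scale * score times.

    Keeping at least one copy per edge is an assumption of this sketch,
    so that low-scoring edges can still be proposed.
    """
    reservoir = []
    for edge, score in edge_scores.items():
        copies = max(1, round(scale * score))
        reservoir.extend([edge] * copies)
    return reservoir


def propose_edge(reservoir, rng):
    """Draw one candidate edge; edges with more copies are drawn more often."""
    return rng.choice(reservoir)


# Hypothetical likelihood ratios for two evidence sources
# (co-citation, GO semantic similarity) for two made-up gene pairs.
evidence = {
    ("geneA", "geneB"): [3.0, 2.0],   # both sources support linkage
    ("geneC", "geneD"): [0.5, 0.5],   # both sources argue against linkage
}
scores = {edge: naive_bayes_posterior(1.0, lrs) for edge, lrs in evidence.items()}
reservoir = build_reservoir(scores, scale=100)

rng = random.Random(0)  # fixed seed for reproducibility
draws = Counter(propose_edge(reservoir, rng) for _ in range(10_000))
print(draws)
```

In a full sampler, each proposed edge would be added to, removed from, or reversed in the current network, and the move accepted or rejected according to the network score on the expression data.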
Pages: 13
Related papers
50 in total
  • [1] Integration of Epigenetic Data in Bayesian Network Modeling of Gene Regulatory Network
    Zheng, Jie
    Chaturvedi, Iti
    Rajapakse, Jagath C.
    PATTERN RECOGNITION IN BIOINFORMATICS, 2011, 7036 : 87 - 96
  • [2] Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions
    de Campos, Luis M.
    Cano, Andres
    Castellano, Javier G.
    Moral, Serafin
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2019, 18 (03)
  • [3] Perturbation Detection Through Modeling of Gene Expression on a Latent Biological Pathway Network: A Bayesian Hierarchical Approach
    Pham, Lisa M.
    Carvalho, Luis
    Schaus, Scott
    Kolaczyk, Eric D.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (513) : 73 - 92
  • [4] miXGENE tool for learning from heterogeneous gene expression data using prior knowledge
    Holec, Matej
    Gologuzov, Valentin
    Klema, Jiri
    2014 IEEE 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS), 2014, : 247 - 250
  • [5] Combining Expression Data and Knowledge Ontology for Gene Clustering and Network Reconstruction
    Lee, Wei-Po
    Lin, Chung-Hsun
    COGNITIVE COMPUTATION, 2016, 8 (02) : 217 - 227
  • [6] Integrating biological knowledge based on functional annotations for biclustering of gene expression data
    Nepomuceno, Juan A.
    Troncoso, Alicia
    Nepomuceno-Chamorro, Isabel A.
    Aguilar-Ruiz, Jesus S.
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2015, 119 (03) : 163 - 180
  • [7] Classification by integrating plant stress response gene expression data with biological knowledge
    Meng, Jun
    Li, Rui
    Luan, Yushi
    MATHEMATICAL BIOSCIENCES, 2015, 266 : 65 - 72
  • [8] Integrating Epigenetic Prior in Dynamic Bayesian Network for Gene Regulatory Network Inference
    Chen, Haifen
    Maduranga, D. A. K.
    Mundra, Piyushkumar A.
    Zheng, Jie
    PROCEEDINGS OF THE 2013 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB), 2013, : 76 - 82
  • [9] Leveraging additional knowledge to support coherent bicluster discovery in gene expression data
    Visconti, Alessia
    Cordero, Francesca
    Pensa, Ruggero G.
    INTELLIGENT DATA ANALYSIS, 2014, 18 (05) : 837 - 855
  • [10] Bayesian Fourier clustering of gene expression data
    Kim, Jaehee
    Kyung, Minjung
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2017, 46 (08) : 6475 - 6494