Dynamic Bayesian Network Learning to Infer Sparse Models From Time Series Gene Expression Data

Cited by: 10
Authors
Ajmal, Hamda B. [1 ]
Madden, Michael G. [1 ]
Affiliations
[1] Natl Univ Ireland, Sch Comp Sci, Galway H91 TK33, Ireland
Keywords
Gene expression; Data models; Biological system modeling; Bayes methods; Biology; Computational modeling; Regulation; Computational biology; bioinformatics; Bayesian networks; gene regulatory networks; gene expression; INFORMATION CRITERIA; REGULATORY NETWORKS; MUTUAL INFORMATION; LINEAR-MODELS; SELECTION; TRANSCRIPTION; MICROARRAY; GENERATION; CHALLENGES; DIMENSION;
DOI
10.1109/TCBB.2021.3092879
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
One of the key challenges in systems biology is to derive gene regulatory networks (GRNs) from complex, high-dimensional, sparse data. Bayesian networks (BNs) and dynamic Bayesian networks (DBNs) have been widely applied to infer GRNs from gene expression data. GRNs are typically sparse, but traditional BN structure learning approaches used to elucidate GRNs often produce many spurious (false positive) edges. We present two new BN scoring functions, which extend the Bayesian Information Criterion (BIC) score with additional penalty terms, and use them in conjunction with DBN structure search methods to find a graph structure that maximises the proposed scores. Our BN scoring functions infer networks with fewer spurious edges than the BIC score does. The proposed methods are evaluated extensively on autoregressive and DREAM4 benchmarks. We found that they significantly improve the precision of the learned graphs relative to the BIC score. The proposed methods are also evaluated on three real time-series gene expression datasets. The results demonstrate that our algorithms are able to learn sparse graphs from high-dimensional time-series data. The implementation of these algorithms is open source and is available in the form of an R package on GitHub at https://github.com/HamdaBinteAjmal/DBN4GRN, along with the documentation and tutorials.
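
To make the idea of "BIC with an additional penalty term" concrete, the following is a minimal sketch of a BIC-style score for one node of a linear-Gaussian DBN, where the candidate parents of a gene at time t are gene values at time t-1. It is written in Python for illustration only; the paper's implementation is an R package, and the abstract does not state the form of the two proposed penalties. The function name `gaussian_bic`, the `extra_penalty` argument (a flat cost per parent), and the simulated data are all assumptions made here and should not be read as the authors' scoring functions.

```python
import numpy as np


def gaussian_bic(y, X, extra_penalty=0.0):
    """BIC-style score (higher is better) for regressing y on parent columns X.

    extra_penalty is a hypothetical per-parent cost added on top of BIC;
    it is an illustrative stand-in, not the penalty proposed in the paper.
    """
    n = len(y)
    n_parents = X.shape[1]
    # Design matrix: intercept plus one column per candidate parent.
    design = np.column_stack([np.ones(n), X]) if n_parents else np.ones((n, 1))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = max(resid @ resid / n, 1e-12)  # maximum-likelihood noise variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = design.shape[1] + 1  # regression coefficients plus the variance
    return loglik - 0.5 * k * np.log(n) - extra_penalty * n_parents


# Simulated first-order autoregressive data (assumed, for illustration):
# gene 0 at time t depends on gene 1 at time t-1; gene 2 is unrelated noise.
rng = np.random.default_rng(0)
T = 200
data = rng.normal(size=(T, 3))
for t in range(1, T):
    data[t, 0] = 0.9 * data[t - 1, 1] + 0.1 * rng.normal()

y = data[1:, 0]            # target gene at time t
lagged = data[:-1]         # all genes at time t-1
X_none = lagged[:, []]     # empty parent set
X_true = lagged[:, [1]]    # true parent only
X_spur = lagged[:, [1, 2]] # true parent plus a spurious one

for name, X in [("none", X_none), ("true", X_true), ("true+spurious", X_spur)]:
    print(name, gaussian_bic(y, X), gaussian_bic(y, X, extra_penalty=2.0))
```

A structure search would call such a score once per node and candidate parent set and keep the highest-scoring set. Raising the per-parent cost widens the gap between a parent set and any superset of it, which is the general mechanism by which a heavier penalty discourages spurious edges.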
Pages: 2794-2805
Page count: 12