Learning complex dependency structure of gene regulatory networks from high dimensional microarray data with Gaussian Bayesian networks

Cited by: 3
Authors
Graafland, Catharina E. [1 ]
Gutierrez, Jose M. [1 ]
Affiliations
[1] Univ Cantabria, CSIC, Inst Fis Cantabria, Ave Los Castros, E-39005 Santander, Spain
Keywords
GRAPHICAL MODELS;
DOI
10.1038/s41598-022-21957-z
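Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code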
07 ; 0710 ; 09 ;
Abstract
Reconstructing Gene Regulatory Networks (GRNs) from gene expression data with Probabilistic Network Models (PNMs) is an open problem. Gene expression datasets consist of thousands of genes with relatively small sample sizes (i.e., they are large-p-small-n). Moreover, dependencies of various orders coexist in the datasets: on the one hand, transcription-factor-encoding genes act like hubs and regulate target genes; on the other hand, target genes show local dependencies. In the field of Undirected Network Models (UNMs), a subclass of PNMs, the Glasso algorithm has been proposed to deal with high-dimensional microarray datasets by forcing sparsity. To overcome the problem of the complex structure of interactions, modifications of the default Glasso algorithm have been developed that integrate the expected dependency structure into the UNMs beforehand. In this work we advocate the use of a simple score-based Hill Climbing (HC) algorithm that learns Gaussian Bayesian networks based on directed acyclic graphs. We compare HC with Glasso and its variants in the UNM framework on their capability to reconstruct GRNs from microarray data, using the synthetic benchmark dataset from the DREAM5 challenge and real-world data from the Escherichia coli genome. We conclude that dependencies in complex data are learned best by the HC algorithm, which represents them most accurately and efficiently, simultaneously modelling the strong local connections and the weaker but significant global connections that coexist in the gene expression dataset. The HC algorithm adapts intrinsically to the complex dependency structure of the dataset, without forcing a specific structure in advance.
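To make the idea of score-based hill climbing over directed acyclic graphs concrete, the following is a minimal sketch and not the authors' implementation. It assumes a BIC score under a linear Gaussian model for each node and uses only edge additions and removals (edge reversal, common in HC implementations, is omitted for brevity). The function names `node_bic`, `creates_cycle`, and `hill_climb` are hypothetical, and the naive rescoring would be far too slow for thousands of genes.

```python
import numpy as np

def node_bic(X, j, parents):
    """BIC contribution of node j given its parents (linear Gaussian model)."""
    n = X.shape[0]
    y = X[:, j]
    # Design matrix: intercept plus one column per parent.
    A = np.column_stack([np.ones(n)] + [X[:, p] for p in sorted(parents)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = max(resid @ resid / n, 1e-12)  # ML estimate of residual variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = A.shape[1] + 1  # regression coefficients plus the variance
    return loglik - 0.5 * k * np.log(n)

def creates_cycle(parents, u, v):
    """True if v is an ancestor of u, i.e. adding u -> v would close a cycle."""
    stack, seen = [u], set()
    while stack:
        node = stack.pop()
        if node == v:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def hill_climb(X, max_iter=1000):
    """Greedy search: apply the single best edge addition or removal until no gain."""
    p = X.shape[1]
    parents = {j: set() for j in range(p)}
    score = {j: node_bic(X, j, parents[j]) for j in range(p)}
    for _ in range(max_iter):
        best_gain, best_move = 1e-9, None
        for u in range(p):
            for v in range(p):
                if u == v:
                    continue
                if u in parents[v]:
                    new_pa = parents[v] - {u}      # candidate: remove u -> v
                elif not creates_cycle(parents, u, v):
                    new_pa = parents[v] | {u}      # candidate: add u -> v
                else:
                    continue
                # The score is decomposable, so only node v needs rescoring.
                gain = node_bic(X, v, new_pa) - score[v]
                if gain > best_gain:
                    best_gain, best_move = gain, (v, new_pa)
        if best_move is None:
            break  # local optimum reached
        v, new_pa = best_move
        parents[v] = new_pa
        score[v] = node_bic(X, v, new_pa)
    return parents, sum(score.values())
```

Calling `hill_climb(X)` on an `n × p` data matrix returns a dictionary mapping each node to its parent set, together with the total BIC. No sparsity pattern or hub structure is specified in advance: the penalty term in the score alone decides which edges are kept.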
Pages: 18