Learning massive interpretable gene regulatory networks of the human brain by merging Bayesian networks

Cited by: 7
Authors
Bernaola, Niko [1]
Michiels, Mario [2]
Larranaga, Pedro [1]
Bielza, Concha [1]
Affiliations
[1] Univ Politecn Madrid, Dept Inteligencia Artificial, Computat Intelligence Grp, Madrid, Spain
[2] Hosp Univ HM Puerta Del Sur, Ctr Integral Neurociencias Abarca Campal, Madrid, Spain
Keywords
MARKOV BLANKET INDUCTION; FEATURE-SELECTION; CAUSAL DISCOVERY; LOCAL CAUSAL;
DOI
10.1371/journal.pcbi.1011443
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
We present the Fast Greedy Equivalence Search (FGES)-Merge, a new method for learning the structure of gene regulatory networks by merging locally learned Bayesian networks, based on the fast greedy equivalence search algorithm. The method is competitive with the state of the art in terms of the Matthews correlation coefficient, which takes into account both precision and recall, while also improving upon it in terms of speed, scaling up to tens of thousands of variables and being able to use empirical knowledge about the topological structure of gene regulatory networks. To showcase the ability of our method to scale to massive networks, we apply it to learning the gene regulatory network for the full human genome using data from samples of different brain structures (from the Allen Human Brain Atlas). Furthermore, this Bayesian network model should predict interactions between genes in a way that is clear to experts, following the current trends in explainable artificial intelligence. To achieve this, we also present a new open-access visualization tool that facilitates the exploration of massive networks and can aid in finding nodes of interest for experimental tests.
In this study, we have developed a faster and more scalable method, the Fast Greedy Equivalence Search (FGES)-Merge, to understand how genes interact and regulate each other. We adapted it specifically for massive gene regulatory networks, which can have tens of thousands of genes. Our method is not only competitive with the current best methods in terms of accuracy but also outperforms them in terms of speed. This is crucial when working with large-scale data such as the human genome.
To make our findings clear and usable for fellow scientists, we also created an open-access visualization tool. This tool allows for exploring vast networks and identifying nodes of interest for further research.
In our test cases, we used the FGES-Merge method to learn the gene regulatory network of the entire human brain, using data from various brain structures. Our work provides a significant step towards predicting gene interactions accurately, on a large scale, and more quickly than before. This can guide future biological research by letting scientists test the interactions our method predicts, thereby furthering our collective understanding of gene functions.
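As a reading aid for the evaluation metric named in the abstract, the following is a minimal sketch of how the Matthews correlation coefficient can be computed when a learned network is compared against a reference network. It assumes that edges are compared as undirected pairs over a fixed node set; that choice, the function names `edge_confusion` and `mcc`, and the toy edge lists are illustrative assumptions and are not taken from the paper. The standard definition uses all four cells of the confusion matrix: MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

```python
import math


def edge_confusion(true_edges, predicted_edges, nodes):
    """Count TP, FP, FN, TN over all unordered node pairs.

    Assumption: edges are treated as undirected, so (a, b) == (b, a).
    """
    true_set = {frozenset(e) for e in true_edges}
    pred_set = {frozenset(e) for e in predicted_edges}
    n = len(nodes)
    total_pairs = n * (n - 1) // 2  # every possible undirected edge
    tp = len(true_set & pred_set)
    fp = len(pred_set - true_set)
    fn = len(true_set - pred_set)
    tn = total_pairs - tp - fp - fn
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy networks over four genes.
nodes = ["A", "B", "C", "D"]
true_edges = [("A", "B"), ("B", "C"), ("C", "D")]
predicted_edges = [("B", "A"), ("B", "C"), ("A", "D")]

counts = edge_confusion(true_edges, predicted_edges, nodes)
score = mcc(*counts)
print(counts, round(score, 4))
```

Because gene regulatory networks are sparse, true negatives vastly outnumber true edges, which is one reason a balanced score such as MCC is often preferred over plain accuracy in this setting.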
Pages: 25