GAEM: Genetic Algorithm based Expectation-Maximization for inferring Gene Regulatory Networks from incomplete data

Cited by: 1
Authors
Niloofar, Parisa [1]
Aghdam, Rosa [2, 4]
Eslahchi, Changiz [3, 4]
Affiliations
[1] Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, Odense
[2] Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI
[3] Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University
[4] School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM)
Funding
U.S. National Science Foundation
Keywords
Bayesian network; Conditional Mutual Information; Expectation-Maximization; Gene Regulatory Network; Genetic algorithm; Missing values
DOI
10.1016/j.compbiomed.2024.109238
Abstract
In bioinformatics, inferring the structure of a Gene Regulatory Network (GRN) from incomplete gene expression data is a difficult task. One popular method for inferring the structure of GRNs is the Path Consistency Algorithm based on Conditional Mutual Information (PCA-CMI). Although PCA-CMI excels at extracting GRN skeletons, it struggles with missing values in datasets. As a result, applying PCA-CMI to infer GRNs requires a preprocessing step for data imputation. In this paper, we present the GAEM algorithm, which uses an iterative approach combining a Genetic Algorithm with Expectation-Maximization to infer the structure of a GRN from incomplete gene expression datasets. GAEM learns the GRN structure from the incomplete dataset and iteratively updates the imputed values based on the learnt GRN until the convergence criteria are met. We evaluate the performance of this algorithm under various missingness mechanisms (ignorable and nonignorable) and missingness percentages (5%, 15%, and 40%). The traditional approach to handling missing values in gene expression datasets estimates them first and then constructs the GRN. Our methodology differs in that both the missing values and the GRN are updated iteratively until convergence. Results on the DREAM3 dataset suggest that GAEM is the more reliable method overall: especially for smaller network sizes, it outperforms methods that impute the incomplete dataset first and then learn the GRN structure from the imputed data. We have implemented the GAEM algorithm in the GAEM R package, which is available at the following GitHub repository: https://github.com/parniSDU/GAEM. © 2024
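The alternating scheme described in the abstract (learn a network from the currently imputed data, re-impute the missing values from that network, repeat until nothing changes) can be illustrated with a minimal Python sketch. This is not the GAEM implementation and does not reflect the API of the GAEM R package. As stated assumptions, the structure learner here is a simple correlation threshold standing in for the genetic-algorithm and CMI-based search, and the imputation step is a least-squares regression on each gene's current neighbours standing in for the Expectation-Maximization update. All function names, the `threshold` value, and the convergence rule are hypothetical.

```python
import numpy as np

def learn_skeleton(data, threshold=0.3):
    """Stand-in structure learner (assumption): link genes whose |Pearson r| > threshold."""
    corr = np.corrcoef(data, rowvar=False)
    adj = np.abs(corr) > threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

def reimpute(data, missing, adj):
    """Stand-in imputation step (assumption): regress each gene on its current neighbours."""
    updated = data.copy()
    for j in range(data.shape[1]):
        rows = missing[:, j]
        nbrs = np.flatnonzero(adj[j])
        if not rows.any() or nbrs.size == 0:
            continue  # nothing to impute, or no neighbours to predict from
        # Fit on samples where gene j was observed.
        x_obs = np.column_stack([np.ones((~rows).sum()), data[~rows][:, nbrs]])
        coef, *_ = np.linalg.lstsq(x_obs, data[~rows, j], rcond=None)
        # Predict only the originally missing entries of gene j.
        x_mis = np.column_stack([np.ones(rows.sum()), data[rows][:, nbrs]])
        updated[rows, j] = x_mis @ coef
    return updated

def iterative_impute_and_learn(incomplete, max_iter=50, tol=1e-6):
    """Alternate between structure learning and imputation until both stabilise."""
    missing = np.isnan(incomplete)
    # Initial fill: per-gene mean of the observed values.
    data = np.where(missing, np.nanmean(incomplete, axis=0), incomplete)
    adj = learn_skeleton(data)
    for _ in range(max_iter):
        new_data = reimpute(data, missing, adj)
        new_adj = learn_skeleton(new_data)
        change = np.max(np.abs(new_data - data))
        converged = change < tol and np.array_equal(new_adj, adj)
        data, adj = new_data, new_adj
        if converged:
            break
    return data, adj

# Toy usage: samples in rows, genes in columns, NaN marks a missing value.
rng = np.random.default_rng(0)
g0 = rng.normal(size=200)
full = np.column_stack([g0, g0 + 0.1 * rng.normal(size=200), rng.normal(size=200)])
incomplete = full.copy()
incomplete[rng.random(full.shape) < 0.1] = np.nan
imputed, skeleton = iterative_impute_and_learn(incomplete)
```

The sketch differs from the impute-then-learn baseline mentioned in the abstract only in the loop: a baseline would stop after the initial mean fill and a single call to the structure learner, whereas here the imputed values are revised each time the learnt network changes.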