GAEM: Genetic Algorithm based Expectation-Maximization for inferring Gene Regulatory Networks from incomplete data

Cited by: 1
Authors
Niloofar, Parisa [1]
Aghdam, Rosa [2, 4]
Eslahchi, Changiz [3, 4]
Affiliations
[1] Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, Odense
[2] Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI
[3] Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University
[4] School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM)
Funding
U.S. National Science Foundation
Keywords
Bayesian network; Conditional Mutual Information; Expectation-Maximization; Gene Regulatory Network; Genetic algorithm; Missing values;
DOI
10.1016/j.compbiomed.2024.109238
Abstract
In bioinformatics, inferring the structure of a Gene Regulatory Network (GRN) from incomplete gene expression data is a difficult task. One popular method for inferring the structure of GRNs is the Path Consistency Algorithm based on Conditional Mutual Information (PCA-CMI). Although PCA-CMI excels at extracting GRN skeletons, it struggles with missing values in datasets. As a result, applying PCA-CMI to infer GRNs from incomplete data requires a preprocessing step for data imputation. In this paper, we present the GAEM algorithm, which combines a Genetic Algorithm with Expectation-Maximization in an iterative approach to infer the GRN structure from incomplete gene expression datasets. GAEM learns the GRN structure from the incomplete dataset and iteratively updates the imputed values based on the learnt GRN until the convergence criteria are met. We evaluate the performance of this algorithm under various missingness mechanisms (ignorable and nonignorable) and missingness percentages (5%, 15%, and 40%). The traditional approach to handling missing values in gene expression datasets estimates them first and then constructs the GRN. Our methodology differs in that both the missing values and the GRN are updated iteratively until convergence. Results on the DREAM3 dataset indicate that GAEM is more reliable overall than methods that first impute the incomplete dataset and then learn the GRN structure from the imputed data, especially for smaller network sizes. We have implemented the GAEM algorithm in the GAEM R package, which is available in the following GitHub repository: https://github.com/parniSDU/GAEM. © 2024
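The alternation described in the abstract (learn a skeleton, re-impute the missing values from it, repeat until the values stop changing) can be illustrated with a minimal Python sketch. This is not the GAEM R package and not the authors' algorithm: it contains no Genetic Algorithm and no full Expectation-Maximization step. It assumes Gaussian data, for which conditional mutual information can be computed from covariance determinants, prunes edges using zero- and first-order conditioning only, and re-imputes by least-squares regression on the current neighbours. All function names, the threshold, and the tolerance are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_cmi(cov, i, j, cond=()):
    """CMI(i; j | cond) under a Gaussian assumption, from covariance log-determinants."""
    def logdet(idx):
        if len(idx) == 0:
            return 0.0
        _, value = np.linalg.slogdet(cov[np.ix_(idx, idx)])
        return value

    c = list(cond)
    return 0.5 * (logdet([i] + c) + logdet([j] + c) - logdet(c) - logdet([i, j] + c))


def learn_skeleton(data, threshold=0.05):
    """Simplified PC-style skeleton: zero-order MI test, then first-order CMI pruning."""
    n_genes = data.shape[1]
    cov = np.cov(data, rowvar=False)
    adj = np.zeros((n_genes, n_genes), dtype=bool)
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            adj[i, j] = adj[j, i] = gaussian_cmi(cov, i, j) > threshold
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            if not adj[i, j]:
                continue
            # Condition on each single gene adjacent to both i and j.
            common = np.flatnonzero(adj[i] & adj[j])
            if any(gaussian_cmi(cov, i, j, (k,)) <= threshold for k in common):
                adj[i, j] = adj[j, i] = False
    return adj


def impute_from_skeleton(data, missing, adj):
    """Re-estimate missing entries of each gene by regressing it on its neighbours."""
    updated = data.copy()
    for g in range(data.shape[1]):
        rows = missing[:, g]
        neighbours = np.flatnonzero(adj[g])
        observed = ~rows
        if not rows.any() or neighbours.size == 0 or observed.sum() < 2:
            continue  # nothing to impute, or nothing to impute from
        design = np.column_stack([np.ones(data.shape[0]), data[:, neighbours]])
        coef, *_ = np.linalg.lstsq(design[observed], data[observed, g], rcond=None)
        updated[rows, g] = design[rows] @ coef
    return updated


def iterative_infer(data_with_nan, threshold=0.05, tol=1e-4, max_iter=50):
    """Alternate skeleton learning and imputation until imputed values stabilise."""
    missing = np.isnan(data_with_nan)
    col_means = np.nanmean(data_with_nan, axis=0)
    data = np.where(missing, col_means, data_with_nan)  # initial mean imputation
    adj = learn_skeleton(data, threshold)
    for _ in range(max_iter):
        new_data = impute_from_skeleton(data, missing, adj)
        change = float(np.max(np.abs(new_data - data)))
        data = new_data
        adj = learn_skeleton(data, threshold)
        if change < tol:
            break
    return adj, data
```

Calling `iterative_infer` on a samples-by-genes array with `NaN` marking missing entries returns a boolean adjacency matrix and the completed data. The observed entries are never modified; only the positions flagged in `missing` are updated between iterations.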