A multi-approaches-guided genetic algorithm with application to operon prediction

Cited by: 34
Authors
Wang, Shuqin
Wang, Yan
Du, Wei
Sun, Fangxun
Wang, Xiumei
Zhou, Chunguang
Liang, Yanchun [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Minist Educ, Key Lab Symbol Computat & Knowledge Educ, Changchun 130012, Peoples R China
[2] NE Normal Univ, Sch Math & Stat, Minist Educ, Key Lab Appl Stat, Changchun 130024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
genetic algorithm; operon; entropy; COG function; microarray; metabolic pathway
DOI
10.1016/j.artmed.2007.07.010
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Objective: The prediction of operons is critical to the reconstruction of regulatory networks at the whole-genome level. Multiple genome features have been used for predicting operons. However, in the literature these features are usually all handled with a single method. The aim of this paper is to develop a combined method for operon prediction that preprocesses each genome feature with a method suited to it, so that the unique characteristics of each feature are exploited. Methods: A novel multi-approach-guided genetic algorithm for operon prediction is presented. We apply different methods to intergenic distance, cluster of orthologous groups (COG) gene functions, metabolic pathway, and microarray expression data. A novel local-entropy-minimization method is proposed to partition intergenic distance. Our program can be applied to other newly sequenced genomes by transferring the knowledge obtained from Escherichia coli data. We calculate the log-likelihood for COG gene functions and the Pearson correlation coefficient for microarray expression data. The genetic algorithm is used to integrate the four types of data. Results: The proposed method is tested on the E. coli K12 genome, the Bacillus subtilis genome, and the Pseudomonas aeruginosa PAO1 genome. The prediction accuracies for these three genomes are 85.9987%, 88.296%, and 81.2384%, respectively. Conclusion: Simulated experimental results demonstrate that, within the genetic algorithm, preprocessing the genome data with multiple approaches ensures the effective utilization of different biological characteristics. Experimental results also show that the proposed method is applicable to predicting operons in prokaryotes. (C) 2007 Elsevier B.V. All rights reserved.
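The abstract states that the Pearson correlation coefficient is computed for microarray expression data. As a rough illustration of that one step only, the sketch below scores adjacent gene pairs by the correlation of their expression profiles. The function names `pearson` and `adjacent_pair_correlations`, the input layout, and the choice to return 0.0 for a constant profile are assumptions made for this example; the record does not describe the authors' implementation, and the expression values are made-up toy numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption for this sketch: a constant profile carries no
        # co-expression signal, so the pair is scored as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def adjacent_pair_correlations(profiles):
    """Score each pair of neighbouring genes.

    `profiles` is a list of (gene_name, expression_vector) tuples given in
    genome order (a hypothetical layout chosen for this example).
    """
    return [
        (g1, g2, pearson(e1, e2))
        for (g1, e1), (g2, e2) in zip(profiles, profiles[1:])
    ]


# Toy, made-up expression values across four conditions.
toy_profiles = [
    ("geneA", [1.0, 2.0, 3.0, 4.0]),
    ("geneB", [2.0, 4.0, 6.0, 8.0]),
    ("geneC", [4.0, 3.0, 2.0, 1.0]),
]
pair_scores = adjacent_pair_correlations(toy_profiles)
```

A high correlation between neighbouring genes would be one piece of evidence that they are co-transcribed; in the paper this score is only one of the four data types that the genetic algorithm integrates.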
Pages: 151-159
Number of pages: 9