iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition

Cited by: 478
Authors
Lin, Hao [1, 4]
Deng, En-Ze [1]
Ding, Hui [1]
Chen, Wei [2, 3, 4]
Chou, Kuo-Chen [4, 5]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Life Sci & Technol, Ctr Bioinformat, Minist Educ,Key Lab Neuroinformat, Chengdu 610054, Peoples R China
[2] Hebei United Univ, Sch Sci, Dept Phys, Tangshan 063000, Peoples R China
[3] Hebei United Univ, Sch Sci, Ctr Genom & Computat Biol, Tangshan 063000, Peoples R China
[4] Gordon Life Sci Inst, Belmont, MA USA
[5] King Abdulaziz Univ, CEGMR, Jeddah 21413, Saudi Arabia
Keywords
AMINO-ACID-COMPOSITION; MEMBRANE-PROTEIN TYPES; SUBCELLULAR LOCATION PREDICTION; SUPPORT VECTOR MACHINES; PHYSICOCHEMICAL PROPERTIES; GENERAL-FORM; STRUCTURAL CLASS; DNA; PSEAAC; RECOGNITION;
DOI
10.1093/nar/gku1019
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010; 081704;
Abstract
The sigma 54 promoters are unique in prokaryotic genomes and are responsible for transcribing carbon- and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the sigma 54 promoters. Here, a predictor called 'iPro54-PseKNC' was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called 'pseudo k-tuple nucleotide composition', which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC. For the convenience of the vast majority of experimental scientists, a step-by-step protocol guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics, which are presented in the paper only for completeness. We also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites is governed by the gamma distribution, which may provide a fundamental physical principle for studying the sigma 54 promoters.
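As a rough illustration of the feature encoding mentioned in the abstract, the sketch below computes only the plain k-tuple nucleotide composition of a DNA sequence, i.e. the normalized frequencies of all 4^k k-mers. This is an assumption-laden simplification: the full pseudo k-tuple nucleotide composition also appends sequence-order correlation terms, which the abstract does not specify and which are not reproduced here. The function name `ktuple_composition` and the default `k=2` are hypothetical choices made for this example, not taken from the paper.

```python
from itertools import product


def ktuple_composition(seq, k=2):
    """Return normalized k-mer frequencies of a DNA sequence.

    Hypothetical helper: covers only the plain k-tuple composition,
    not the correlation terms of the full PseKNC vector.
    """
    seq = seq.upper()
    # All 4**k possible k-mers in a fixed (lexicographic) order.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k across the sequence.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


vec = ktuple_composition("ACGT", k=2)
print(len(vec))
```

With `k=2` the vector has 16 components that sum to 1 for any sequence containing at least one unambiguous dinucleotide; a vector of this kind is what a downstream feature-selection step or classifier would consume.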
Pages: 12961-12972
Page count: 12