iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition

Cited by: 279
Authors
Feng, Peng-Mian [1]
Chen, Wei [2, 3, 4]
Lin, Hao [5]
Chou, Kuo-Chen [4, 6]
Affiliations
[1] Hebei United Univ, Sch Publ Hlth, Tangshan 063000, Peoples R China
[2] Hebei United Univ, Sch Sci, Dept Phys, Tangshan 063000, Peoples R China
[3] Hebei United Univ, Ctr Genom & Computat Biol, Tangshan 063000, Peoples R China
[4] Gordon Life Sci Inst, Belmont, MA 02478 USA
[5] Univ Elect Sci & Technol China, Sch Life Sci & Technol, Ctr Bioinformat, Key Lab Neuroinformat,Minist Educ, Chengdu 610054, Peoples R China
[6] King Abdulaziz Univ, CEGMR, Jeddah 21413, Saudi Arabia
Keywords
Heat shock protein; Reduced amino acid alphabet; n-Peptide composition; PseAAC; SVM; Web server; SUPPORT VECTOR MACHINES; PREDICTING SUBCELLULAR-LOCALIZATION; GENERAL-FORM; PHYSICOCHEMICAL PROPERTIES; STRUCTURAL CLASS; SIGNAL PEPTIDES; CHOUS PSEAAC; HEAT-SHOCK-PROTEIN-70; ATTRIBUTES; MODE;
DOI
10.1016/j.ab.2013.05.024
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Heat shock proteins (HSPs) are a type of functionally related proteins present in all living organisms, both prokaryotes and eukaryotes. They play essential roles in protein-protein interactions such as folding and assisting in the establishment of proper protein conformation and prevention of unwanted protein aggregation. Their dysfunction may cause various life-threatening disorders, such as Parkinson's, Alzheimer's, and cardiovascular diseases. Based on their functions, HSPs are usually classified into six families: (i) HSP20 or sHSP, (ii) HSP40 or J-class proteins, (iii) HSP60 or GroEL/ES, (iv) HSP70, (v) HSP90, and (vi) HSP100. Although considerable progress has been achieved in discriminating HSPs from other proteins, it is still a big challenge to identify HSPs among their six different functional types according to their sequence information alone. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop a high-throughput computational tool in this regard. To take up such a challenge, a predictor called iHSP-PseRAAAC has been developed by incorporating the reduced amino acid alphabet information into the general form of pseudo amino acid composition. One of the remarkable advantages of introducing the reduced amino acid alphabet is being able to avoid the notorious dimension disaster or overfitting problem in statistical prediction. It was observed that the overall success rate achieved by iHSP-PseRAAAC in identifying the functional types of HSPs among the aforementioned six types was more than 87%, which was derived by the jackknife test on a stringent benchmark dataset in which none of the HSPs included has ≥40% pairwise sequence identity to any other in the same subset. It has not escaped our notice that the reduced amino acid alphabet approach can also be used to investigate other protein classification problems.
As a user-friendly web server, iHSP-PseRAAAC is accessible to the public at http://lin.uestc.edu.cn/server/iHSP-PseRAAAC. (C) 2013 Elsevier Inc. All rights reserved.
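The abstract's central idea, describing a protein by its n-peptide composition over a reduced amino acid alphabet, can be illustrated with a minimal Python sketch. The six-cluster grouping in `CLUSTERS` and the function names are assumptions chosen for illustration only; the record does not state which clustering, which value of n, or which additional features the authors used. The sketch shows why the feature dimension shrinks: with six clusters, dipeptide composition has 6^2 = 36 components instead of 20^2 = 400 for the full alphabet.

```python
from collections import Counter
from itertools import product

# Hypothetical six-cluster grouping of the 20 standard amino acids.
# This is NOT taken from the paper; it only illustrates the idea.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]

# Map each residue to the label ("0".."5") of its cluster.
RESIDUE_TO_CLUSTER = {
    aa: str(label) for label, group in enumerate(CLUSTERS) for aa in group
}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet.

    Residues outside the 20 standard amino acids are skipped.
    """
    return "".join(
        RESIDUE_TO_CLUSTER[aa] for aa in seq.upper() if aa in RESIDUE_TO_CLUSTER
    )


def n_peptide_composition(seq, n=2):
    """Return the frequency of every reduced-alphabet n-peptide in seq."""
    reduced = reduce_sequence(seq)
    labels = [str(label) for label in range(len(CLUSTERS))]
    keys = ["".join(p) for p in product(labels, repeat=n)]
    total = len(reduced) - n + 1  # number of overlapping windows
    if total <= 0:
        return dict.fromkeys(keys, 0.0)
    counts = Counter(reduced[i:i + n] for i in range(total))
    return {key: counts[key] / total for key in keys}


if __name__ == "__main__":
    features = n_peptide_composition("MKVLAAGDE", n=2)
    print(len(features))  # feature dimension for n = 2
```

A vector like `features` could then be passed to a classifier such as an SVM, which the keywords list names; how the authors combined it with the general form of PseAAC is not described in this record.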
Pages: 118-125
Page count: 8