Using increment of diversity to predict mitochondrial proteins of malaria parasite: integrating pseudo-amino acid composition and structural alphabet

Cited by: 27
Authors
Chen, Ying-Li [1, 2]
Li, Qian-Zhong [1]
Zhang, Li-Qing [1, 2, 3]
Affiliations
[1] Inner Mongolia Univ, Sch Phys Sci & Technol, Lab Theoret Biophys, Hohhot, Peoples R China
[2] Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
[3] Virginia Tech, Program Genet Bioinformat & Computat Biol, Blacksburg, VA USA
Funding
National Natural Science Foundation of China; National Science Foundation (USA)
Keywords
Plasmodium falciparum; Mitochondrial proteins; Increment of diversity; Reduced amino acid alphabet; Hydropathy distribution; Support vector machine; Subcellular location; Localization; Recognition; Sequence
DOI
10.1007/s00726-010-0825-7
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification codes
071010; 081704
Abstract
Because of the complexity of the Plasmodium falciparum (PF) genome, predicting the mitochondrial proteins of PF is more difficult than for other species. In this study, the increment of diversity (ID) is developed for the first time to predict mitochondrial proteins, using as feature parameters the n-peptide composition of a reduced amino acid alphabet (RAAA) obtained from the structural alphabet named Protein Blocks. With the 1-peptide composition of the 20-residue N-terminal region as the only input vector, the prediction achieves 86.86% accuracy with a Matthews correlation coefficient (MCC) of 0.69 in the jackknife test. Moreover, by combining the hydropathy distribution along the protein sequence with several reduced amino acid alphabets, the ID model reaches a maximum MCC of 0.82 with 92% accuracy in the jackknife test. When evaluated on an independent dataset, the method performs better than existing methods. The results indicate that ID is a simple and efficient method for predicting mitochondrial proteins of the malaria parasite.
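
The abstract does not spell out the ID calculation, so the following is only a minimal sketch of how an increment-of-diversity classifier is commonly formulated: the measure of diversity D(X) = N log N - sum_i n_i log n_i, with ID(X, Y) = D(X + Y) - D(X) - D(Y), and a query assigned to the class whose profile gives the smallest ID. The five-group reduced alphabet, the function names, and the decision rule shown here are illustrative assumptions, not the groupings or the exact procedure used in the paper.

```python
import math

# Hypothetical 5-group reduced alphabet, for illustration only. The paper
# derives its groupings from Protein Blocks; those are not reproduced here.
REDUCED_GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]
RESIDUE_TO_GROUP = {
    aa: index for index, members in enumerate(REDUCED_GROUPS) for aa in members
}


def composition(sequence, n_terminal=20):
    """Count reduced-alphabet letters (1-peptides) in the N-terminal region."""
    counts = [0] * len(REDUCED_GROUPS)
    for aa in sequence[:n_terminal].upper():
        group = RESIDUE_TO_GROUP.get(aa)
        if group is not None:  # skip non-standard residue codes
            counts[group] += 1
    return counts


def diversity(counts):
    """Measure of diversity: D(X) = N log N - sum_i n_i log n_i."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); smaller means more similar."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def predict(query_counts, class_profiles):
    """Assign the query to the class profile with the smallest ID (assumed rule)."""
    return min(
        class_profiles,
        key=lambda label: increment_of_diversity(query_counts, class_profiles[label]),
    )
```

In use, each class profile would be the summed composition over that class's training sequences, and `predict(composition(seq), profiles)` would return the label of the closest profile. Two sources with proportional counts give an ID of zero, which is why the smallest value is read as the best match.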
Pages: 1309-1316
Number of pages: 8