A novel protein structural classes prediction method based on predicted secondary structure

Cited by: 51
Authors
Ding, Shuyan [1 ,2 ]
Zhang, Shengli [3 ]
Li, Yang [2 ]
Wang, Tianming [1 ]
Affiliations
[1] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China
[2] Dalian Nationalities Univ, Sch Sci, Dalian 116600, Peoples R China
[3] Xidian Univ, Dept Math, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Protein structural classes; Support vector machine; Feature selection; AMINO-ACID-COMPOSITION; SEQUENCES; CLASSIFIER; LOCATION; HOMOLOGY; PROGRESS; IMPACT;
DOI
10.1016/j.biochi.2012.01.022
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Knowledge of structural classes plays an important role in understanding protein folding patterns. In this paper, features are extracted from the predicted secondary structure sequence and the corresponding E-H sequence. An 11-dimensional feature vector is then selected using a wrapper feature selection algorithm with a support vector machine (SVM). Of the 11 selected features, 4 are newly designed to model the differences between the alpha/beta and alpha + beta classes, and the other 7 were proposed by previous researchers. A total of 5 datasets are used to design and test the proposed method. The results show that the proposed method achieves prediction accuracies competitive with existing methods (SCPRED, RKS-PPSC and MODAS), and that the 4 new features are essential for differentiating the alpha/beta and alpha + beta classes. A standalone version of the proposed method is written in Java and can be downloaded from http://web.xidian.edu.cn/slzhang/paper.html. (C) 2012 Elsevier Masson SAS. All rights reserved.
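The abstract does not define the individual features, so the following is only a minimal sketch of the general idea of deriving features from a predicted secondary structure string. It assumes a three-state alphabet (H for helix, E for strand, C for coil) and assumes that the E-H sequence is obtained by dropping coil states. The function names and the three features shown are hypothetical illustrations, not the 11 features selected in the paper.

```python
from itertools import groupby


def eh_sequence(ss: str) -> str:
    """Drop coil states, keeping only helix (H) and strand (E) symbols.

    Assumption: this is how an E-H sequence is formed; the abstract
    does not give the exact definition.
    """
    return "".join(c for c in ss if c in "HE")


def collapse_segments(eh: str) -> str:
    """Reduce each run of identical symbols to a single symbol.

    Note: runs that were separated only by coil are merged, because
    the coil states have already been removed.
    """
    return "".join(symbol for symbol, _ in groupby(eh))


def illustrative_features(ss: str) -> dict:
    """Compute three hypothetical features from a secondary structure string."""
    n = len(ss)
    if n == 0:
        raise ValueError("secondary structure string must not be empty")
    segments = collapse_segments(eh_sequence(ss))
    # Number of switches between helix and strand segments.
    transitions = max(len(segments) - 1, 0)
    return {
        "helix_content": ss.count("H") / n,
        "strand_content": ss.count("E") / n,
        # Frequent H/E alternation is typical of interleaved folds.
        "eh_transitions_per_segment": transitions / len(segments) if segments else 0.0,
    }
```

A transition-based feature of this kind is one plausible way to separate proteins whose helices and strands alternate along the chain from proteins in which they are largely segregated. In a wrapper approach, candidate features like these would be scored by the cross-validated accuracy of the classifier (here an SVM) rather than by a filter statistic.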
Pages: 1166-1171
Page count: 6