A Novel Approach of Protein Secondary Structure Prediction by SVM Using PSSM Combined by Sequence Features

Cited by: 0
Authors
Chen, Yehong [1]
Cheng, Jinyong [2]
Liu, Yihui [2]
Park, Pil Seong [3]
Affiliations
[1] Qilu Univ Technol, Sch Graph Commun & Packaging, Jinan, Shandong, Peoples R China
[2] Qilu Univ Technol, Sch Informat, Jinan, Shandong, Peoples R China
[3] Univ Suwon, Dept Comp Sci, Suwon, South Korea
Source
PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 1 | 2018 / Vol. 15
Funding
National Natural Science Foundation of China;
Keywords
Protein secondary structure prediction; SVM; Position specific scoring matrices; Sequence feature; Amino acid scale; ProtScale;
DOI
10.1007/978-3-319-56994-9_74
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge of protein secondary structure is a useful step toward predicting the 3D structure of a protein. This paper describes in detail a support vector machine (SVM) based method for secondary structure prediction. Protein sequence data are encoded in a hybrid representation that combines the Position-Specific Scoring Matrix (PSSM), the Hydrophobicity Sequence Feature (HSF), and the Structural Sequence Feature (SSF). Protein sequences are taken from the CB513 dataset, the corresponding PSSM profiles are generated with the PSI-BLAST program, and the sequence features are computed from amino acid scales provided by the ExPASy website (http://web.expasy.org/protscale/). As a baseline, the PSSM profiles alone are used as input to the SVM-PSSM classifier for secondary structure prediction. To construct more accurate classifiers, more than 40 sequence features (SFs) are then examined as additional input vectors to the SVM-PSSM classifier for feature selection. The most accurate classifier in this study is built from a combination of the PSSM and a few relevant sequence features. The experimental results show that relevant sequence features extracted from hydrophobicity indices and structural conformational parameters can improve the SVM-PSSM classifier for the prediction of protein secondary structure elements. The proposed final SVM-PSSM-SF method achieved an overall accuracy of 78%.
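The hybrid representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `window_features`, the window length, the zero padding at the sequence ends, and the all-zero toy PSSM are assumptions made for illustration. The Kyte-Doolittle hydropathy scale is used here only as one example of a ProtScale amino acid scale; the abstract does not say which scales were finally selected.

```python
# Kyte-Doolittle hydropathy values, one of the amino acid scales listed on ProtScale.
# Using this particular scale is an assumption; the abstract does not name the chosen scales.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(pssm, sequence, window=3):
    """Build one feature vector per residue (hypothetical helper).

    For every position inside a sliding window centred on the residue,
    the vector holds that position's PSSM row followed by its scale value.
    Window positions that fall outside the sequence are zero-padded.
    """
    if len(pssm) != len(sequence):
        raise ValueError("PSSM must have one row per residue")
    half = window // 2
    n_cols = len(pssm[0])
    vectors = []
    for center in range(len(sequence)):
        vec = []
        for pos in range(center - half, center + half + 1):
            if 0 <= pos < len(sequence):
                vec.extend(pssm[pos])           # PSSM part
                vec.append(KD[sequence[pos]])   # sequence-feature part
            else:
                vec.extend([0.0] * (n_cols + 1))  # padding beyond the ends
        vectors.append(vec)
    return vectors


# Toy input: a made-up 5-residue sequence and a placeholder all-zero PSSM
# (a real profile would come from PSI-BLAST and have 20 scores per residue).
sequence = "MKVLA"
pssm = [[0.0] * 20 for _ in sequence]
features = window_features(pssm, sequence, window=3)
```

Each resulting vector, paired with the residue's secondary structure label, could then be passed to an SVM library for training. Kernel choice and parameters are not given in the abstract and are left open here.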
Pages: 1074-1084
Number of pages: 11