A Novel Approach of Protein Secondary Structure Prediction by SVM Using PSSM Combined by Sequence Features

Citations: 0
Authors
Chen, Yehong [1]
Cheng, Jinyong [2]
Liu, Yihui [2]
Park, Pil Seong [3]
Affiliations
[1] Qilu Univ Technol, Sch Graph Commun & Packaging, Jinan, Shandong, Peoples R China
[2] Qilu Univ Technol, Sch Informat, Jinan, Shandong, Peoples R China
[3] Univ Suwon, Dept Comp Sci, Suwon, South Korea
Source
PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 1 | 2018 / Vol. 15
Funding
National Natural Science Foundation of China
Keywords
Protein secondary structure prediction; SVM; Position specific scoring matrices; Sequence feature; Amino acid scale; ProtScale;
DOI
10.1007/978-3-319-56994-9_74
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge of protein secondary structure is a useful step toward predicting the 3D structure of a protein. This paper describes in detail a support vector machine (SVM) based method for secondary structure prediction. Protein sequence data are given a hybrid representation that combines the Position-specific Scoring Matrix (PSSM), the Hydrophobicity Sequence Feature (HSF), and the Structural Sequence Feature (SSF). Protein sequences are taken from the CB513 dataset, the corresponding PSSM profiles are generated with the PSI-BLAST program, and the sequence features are computed from amino acid scales provided by the Expasy website (http://web.expasy.org/protscale/). PSSM profiles alone serve as the input to the baseline SVM-PSSM classifier. To build more accurate classifiers, more than 40 sequence features (SFs) are then examined as additional input vectors to the SVM-PSSM classifier for feature selection. The most accurate classifier in this study combines the PSSM with a few relevant sequence features. The experimental results show that relevant sequence features derived from the hydrophobicity index and structural conformational parameters can improve the SVM-PSSM classifier for predicting protein secondary structure elements. The final proposed SVM-PSSM-SF method achieved an overall accuracy of 78%.
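The hybrid representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sliding-window encoding, the window half-width of 6, the zero padding at the sequence ends, and the choice of the Kyte-Doolittle hydropathy scale (one of the scales listed on ProtScale) are all assumptions made for illustration, and the function names `window_features` and `q3_accuracy` are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here as an example amino acid scale.
# The abstract does not say which ProtScale scales were selected (assumption).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(pssm, sequence, half_window=6):
    """Return one feature vector per residue.

    pssm: list of rows, one per residue, each a list of PSSM scores.
    sequence: amino acid string of the same length as pssm.
    Each vector concatenates, for every position in the window, the PSSM
    row and one scale value. Positions past the termini are zero padded.
    """
    n = len(sequence)
    n_cols = len(pssm[0])
    features = []
    for i in range(n):
        vec = []
        for j in range(i - half_window, i + half_window + 1):
            if 0 <= j < n:
                vec.extend(pssm[j])          # PSSM part
                vec.append(KD[sequence[j]])  # sequence-feature part
            else:
                vec.extend([0.0] * (n_cols + 1))
        features.append(vec)
    return features


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state (H, E or C) is correct."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

Vectors built this way could be passed, with their per-residue labels, to any SVM implementation. A per-residue accuracy such as the one computed by `q3_accuracy` is the usual way an overall figure like the reported 78% is expressed, although the abstract does not state the exact metric.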
Pages: 1074-1084
Number of pages: 11