A Novel Approach of Protein Secondary Structure Prediction by SVM Using PSSM Combined by Sequence Features

Citations: 0
Authors
Chen, Yehong [1]
Cheng, Jinyong [2]
Liu, Yihui [2]
Park, Pil Seong [3]
Affiliations
[1] Qilu Univ Technol, Sch Graph Commun & Packaging, Jinan, Shandong, Peoples R China
[2] Qilu Univ Technol, Sch Informat, Jinan, Shandong, Peoples R China
[3] Univ Suwon, Dept Comp Sci, Suwon, South Korea
Funding
National Natural Science Foundation of China;
Keywords
Protein secondary structure prediction; SVM; Position specific scoring matrices; Sequence feature; Amino acid scale; ProtScale;
DOI
10.1007/978-3-319-56994-9_74
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge of protein secondary structure is a useful step toward predicting the 3D structure of a protein. This paper describes in detail a support vector machine (SVM) based method for secondary structure prediction. Protein sequence data are encoded in a hybrid representation that combines the Position-Specific Scoring Matrix (PSSM), the Hydrophobicity Sequence Feature (HSF), and the Structural Sequence Feature (SSF). Protein sequences are taken from the CB513 dataset, the corresponding PSSM profiles are generated with the PSI-BLAST program, and the sequence features are computed from amino acid scales provided by the Expasy website (http://web.expasy.org/protscale/). PSSM profiles serve as the baseline input to the SVM-PSSM classifier. To build more accurate classifiers, more than 40 sequence features (SFs) are examined as additional input vectors to the SVM-PSSM classifier for feature selection. The most accurate classifier in this study combines the PSSM with a few relevant sequence features. The experimental results show that relevant sequence features extracted from hydrophobicity indices and structural conformational parameters can improve the SVM-PSSM classifier for predicting protein secondary structure elements. The proposed final SVM-PSSM-SF method achieved an overall accuracy of 78%.
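The hybrid representation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the window size, the padding rule, or which of the 40+ scales were finally selected, so everything below is an assumption made for illustration: the function name `window_features`, the window of 3 residues, zero-padding at the chain termini, and the choice of the Kyte-Doolittle hydropathy scale (one of the scales listed by ProtScale) as the single sequence feature. The toy PSSM is invented data, not PSI-BLAST output.

```python
# Kyte-Doolittle hydropathy values per amino acid (assumed here as the
# example hydrophobicity scale; the paper's selected scales are not stated).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(pssm, sequence, window=3):
    """Build one feature vector per residue.

    Each vector holds the flattened PSSM rows of a window centred on the
    residue (zero-padded past the termini), followed by the mean
    hydropathy of the residues that actually fall inside the window.
    """
    if len(pssm) != len(sequence):
        raise ValueError("PSSM must have one row per residue")
    half = window // 2
    n_cols = len(pssm[0])
    features = []
    for i in range(len(sequence)):
        vector = []
        hydropathy = []
        for j in range(i - half, i + half + 1):
            if 0 <= j < len(sequence):
                vector.extend(pssm[j])
                hydropathy.append(KYTE_DOOLITTLE[sequence[j]])
            else:
                # Window position lies outside the chain: pad with zeros.
                vector.extend([0.0] * n_cols)
        vector.append(sum(hydropathy) / len(hydropathy))
        features.append(vector)
    return features


# Toy input: a 4-residue sequence with an invented 4 x 20 PSSM.
sequence = "ACDK"
pssm = [[float(i)] * 20 for i in range(len(sequence))]
feats = window_features(pssm, sequence, window=3)
```

Vectors of this form, paired with per-residue secondary structure labels, could then be passed to any SVM implementation; the kernel and hyperparameters would need to be tuned separately and are not specified in the abstract.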
Pages: 1074-1084
Page count: 11