Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

Cited by: 15
Authors
Liang, Yunyun [1]
Liu, Sanyang [1]
Zhang, Shengli [1]
Affiliations
[1] Xidian Univ, Sch Math & Stat, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
AMINO-ACID-COMPOSITION; PRINCIPAL COMPONENT ANALYSIS; SUPPORT VECTOR MACHINE; PSI-BLAST; ACCURATE PREDICTION; FOLD RECOGNITION; REPRESENTATION; LOCALIZATION; HOMOLOGY;
DOI
10.1155/2015/370756
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Predicting protein structural classes for low-similarity sequences is useful for understanding the fold patterns, regulation, functions, and interactions of proteins. Feature extraction is central to structural class prediction and mainly draws on the protein primary sequence, the predicted secondary structure sequence, and the position-specific scoring matrix (PSSM). Prediction based solely on the PSSM has recently played a key role in improving prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP, which fuses the consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on the PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted. A 700-dimensional (700D) feature vector is constructed, and its dimension is reduced to 224D by principal component analysis (PCA). To verify the performance of the method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison with existing PSSM-based methods shows that our method achieves favorable and competitive performance. It offers an important complement to other PSSM-based methods for predicting protein structural classes for low-similarity sequences.
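The segmented autocovariance idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, the number of segments, or the lag range, so everything below is an assumption: the function names `autocovariance_features` and `segmented_features` are hypothetical, the PSSM is taken to be an L x 20 array, and a standard column-wise autocovariance is used in place of the authors' actual definition.

```python
import numpy as np

def autocovariance_features(pssm, max_lag):
    """Column-wise autocovariance of a PSSM for lags 1..max_lag.

    Assumed definition (not taken from the paper): for each column j and lag g,
    average (P[i, j] - mean_j) * (P[i + g, j] - mean_j) over i = 0..L-g-1.
    """
    pssm = np.asarray(pssm, dtype=float)
    length = pssm.shape[0]
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    centered = pssm - pssm.mean(axis=0)
    feats = []
    for lag in range(1, max_lag + 1):
        # Pair each residue with the residue `lag` positions downstream.
        prod = centered[:-lag] * centered[lag:]
        feats.append(prod.sum(axis=0) / (length - lag))
    return np.concatenate(feats)

def segmented_features(pssm, n_segments, max_lag):
    """Split the PSSM row-wise into segments and concatenate per-segment features."""
    pssm = np.asarray(pssm, dtype=float)
    segments = np.array_split(pssm, n_segments, axis=0)
    return np.concatenate(
        [autocovariance_features(seg, max_lag) for seg in segments]
    )
```

With a 20-column PSSM, `n_segments` segments, and `max_lag` lags, this sketch yields `n_segments * max_lag * 20` values per protein. The 700D and 224D figures in the abstract come from the authors' own feature set and PCA step, which this sketch does not reproduce.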
Pages: 9
Related papers
52 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] PRINCIPLES THAT GOVERN FOLDING OF PROTEIN CHAINS
    ANFINSEN, CB
    [J]. SCIENCE, 1973, 181 (4096) : 223 - 230
  • [3] Bahar I, 1997, PROTEINS, V29, P172, DOI 10.1002/(SICI)1097-0134(199710)29:2<172::AID-PROT5>3.3.CO;2-D
  • [5] Prediction of protein structural classes by neural network
    Cai, YD
    Zhou, GP
    [J]. BIOCHIMIE, 2000, 82 (08) : 783 - 785
  • [6] Prediction of protein structural classes by support vector machines
    Cai, YD
    Liu, XJ
    Xu, XB
    Chou, KC
    [J]. COMPUTERS & CHEMISTRY, 2002, 26 (03) : 293 - 296
  • [7] Prediction of protein structural class with Rough Sets
    Cao, YF
    Liu, S
    Zhang, LD
    Qin, J
    Wang, J
    Tang, KX
    [J]. BMC BIOINFORMATICS, 2006, 7 (1)
  • [8] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [9] Using pseudo-amino acid composition and support vector machine to predict protein structural class
    Chen, Chao
    Tian, Yuan-Xin
    Zou, Xiao-Yong
    Cai, Pei-Xiang
    Mo, Jin-Yuan
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2006, 243 (03) : 444 - 448
  • [10] Prediction of protein structural class using novel evolutionary collocation-based sequence representation
    Chen, Ke
    Kurgan, Lukasz A.
    Ruan, Jishou
    [J]. JOURNAL OF COMPUTATIONAL CHEMISTRY, 2008, 29 (10) : 1596 - 1604