Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates

Cited by: 240
Authors
Yang, Yuedong [1, 2]
Faraggi, Eshel [1, 2]
Zhao, Huiying [1, 2]
Zhou, Yaoqi [1, 2]
Affiliations
[1] Indiana Univ Purdue Univ, Sch Informat, Indianapolis, IN 46202 USA
[2] Indiana Univ Sch Med, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
Funding
National Institutes of Health (USA)
Keywords
SECONDARY STRUCTURE PREDICTION; SOLVENT ACCESSIBILITY; SEQUENCE-PROFILE; ALIGNMENT; PROGRAM; SERVER; IDENTIFICATION; SUBSTITUTION; SUPERFAMILY; INFORMATION
DOI
10.1093/bioinformatics/btr350
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: In recent years, the development of single-method fold-recognition servers has lagged behind that of consensus and multiple-template techniques. However, a good consensus prediction relies on the accuracy of the individual methods. This article reports our efforts to further improve a single-method fold-recognition technique called SPARKS by changing the alignment scoring function and incorporating the SPINE-X techniques, which provide improved prediction of secondary structure, backbone torsion angles and solvent-accessible surface area.

Results: The new method, called SPARKS-X, was tested with the SALIGN benchmark for alignment accuracy, the Lindahl and SCOP benchmarks for fold recognition, and the CASP9 blind test for structure prediction. The method is compared with several state-of-the-art techniques such as HHPRED and BoostThreader. The results show that SPARKS-X is one of the best single-method fold-recognition techniques. We further note that incorporating multiple templates and refinement in model building will likely further improve SPARKS-X.
Pages: 2076-2082
Page count: 7
Related papers
51 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Automated server predictions in CASP7
    Battey, James N. D.
    Kopp, Jurgen
    Bordoli, Lorenza
    Read, Randy J.
    Clarke, Neil D.
    Schwede, Torsten
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2007, 69 : 68 - 82
  • [3] Exploring the extremes of sequence/structure space with ensemble fold recognition in the program Phyre
    Bennett-Lovsey, Riccardo M.
    Herbert, Alex D.
    Sternberg, Michael J. E.
    Kelley, Lawrence A.
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2008, 70 (03) : 611 - 625
  • [4] Protein-structure prediction by recombination of fragments
    Bujnicki, JM
    [J]. CHEMBIOCHEM, 2006, 7 (01) : 19 - 27
  • [5] A machine learning information retrieval approach to protein fold recognition
    Cheng, Jianlin
    Baldi, Pierre
    [J]. BIOINFORMATICS, 2006, 22 (12) : 1456 - 1463
  • [6] Automated prediction of CASP-5 structures using the Robetta server
    Chivian, D
    Kim, DE
    Malmström, L
    Bradley, P
    Robertson, T
    Murphy, P
    Strauss, CEM
    Bonneau, R
    Rohl, CA
    Baker, D
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2003, 53 : 524 - 533
  • [7] Characterizing the Existing and Potential Structural Space of Proteins by Large-Scale Multiple Loop Permutations
    Dai, Liang
    Zhou, Yaoqi
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 2011, 408 (03) : 585 - 595
  • [8] Structure-based evaluation of sequence comparison and fold recognition alignment accuracy
    Domingues, FS
    Lackner, P
    Andreeva, A
    Sippl, MJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 2000, 297 (04) : 1003 - 1013
  • [9] Achieving 80% ten-fold cross-validated accuracy for secondary structure prediction by large-scale training
    Dor, Ofer
    Zhou, Yaoqi
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2007, 66 (04) : 838 - 845
  • [10] FARAGGI E, 2011, SPINE X GOING 80 ACC