Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates

Cited by: 244
Authors
Yang, Yuedong [1, 2]
Faraggi, Eshel [1, 2]
Zhao, Huiying [1, 2]
Zhou, Yaoqi [1, 2]
Affiliations
[1] Indiana Univ Purdue Univ, Sch Informat, Indianapolis, IN 46202 USA
[2] Indiana Univ Sch Med, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
Funding
US National Institutes of Health;
Keywords
SECONDARY STRUCTURE PREDICTION; SOLVENT ACCESSIBILITY; SEQUENCE-PROFILE; ALIGNMENT; PROGRAM; SERVER; IDENTIFICATION; SUBSTITUTION; SUPERFAMILY; INFORMATION;
DOI
10.1093/bioinformatics/btr350
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: In recent years, the development of single-method fold-recognition servers has lagged behind consensus and multiple-template techniques. However, a good consensus prediction relies on the accuracy of the individual methods. This article reports our efforts to further improve a single-method fold-recognition technique called SPARKS by changing the alignment scoring function and incorporating the SPINE-X techniques, which improve the prediction of secondary structure, backbone torsion angles and solvent accessible surface area. Results: The new method, called SPARKS-X, was tested with the SALIGN benchmark for alignment accuracy, the Lindahl and SCOP benchmarks for fold recognition, and the CASP 9 blind test for structure prediction. The method is compared to several state-of-the-art techniques such as HHPRED and BoostThreader. The results show that SPARKS-X is one of the best single-method fold-recognition techniques. We further note that incorporating multiple templates and refinement in model building will likely improve SPARKS-X further.
Pages: 2076-2082
Number of pages: 7