Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field

Cited by: 691
Authors
Xu, Dong [1]
Zhang, Yang [1, 2]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Biol Chem, Ann Arbor, MI 48109 USA
Funding
National Science Foundation (USA)
Keywords
hydrogen bonding; Monte Carlo simulation; protein folding; protein structure prediction; solvent accessibility; statistical potential; STRUCTURE PREDICTION; SECONDARY STRUCTURE; ENERGY MINIMIZATION; GLOBULAR-PROTEINS; MEAN FORCE; I-TASSER; POTENTIALS; ALGORITHM; ROSETTA; TARGETS;
DOI
10.1002/prot.24065
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1–20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third cases of short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field. Proteins 2012; © 2012 Wiley Periodicals, Inc.
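The abstract names replica-exchange Monte Carlo as the search engine but gives no detail. As a rough illustration of the general technique only, the sketch below runs several Metropolis walkers at different temperatures on a one-dimensional double-well energy and periodically attempts neighbour swaps with the standard acceptance rule min(1, exp[(1/T_i - 1/T_j)(E_i - E_j)]). Everything here is an assumption made for illustration: the function names, the toy energy, the temperature ladder and the move size are hypothetical, and none of it reproduces QUARK's force field, fragment moves or parameters.

```python
import math
import random


def toy_energy(x):
    """Hypothetical double-well energy with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def metropolis_accept(delta_e, temperature, rng):
    """Accept a move with probability min(1, exp(-delta_e / T))."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def swap_probability(e_i, e_j, t_i, t_j):
    """Standard replica-exchange acceptance for swapping replicas i and j."""
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    if delta >= 0:
        return 1.0  # also avoids overflow in exp for large positive delta
    return math.exp(delta)


def run_remc(temperatures, n_sweeps=200, steps_per_sweep=20, seed=0):
    """Run a toy replica-exchange simulation; return (best_x, best_energy)."""
    rng = random.Random(seed)
    states = [rng.uniform(-2.0, 2.0) for _ in temperatures]
    best_x = min(states, key=toy_energy)
    for _ in range(n_sweeps):
        # Independent Metropolis sampling within each replica.
        for k, temp in enumerate(temperatures):
            for _ in range(steps_per_sweep):
                trial = states[k] + rng.uniform(-0.5, 0.5)
                delta_e = toy_energy(trial) - toy_energy(states[k])
                if metropolis_accept(delta_e, temp, rng):
                    states[k] = trial
                    if toy_energy(trial) < toy_energy(best_x):
                        best_x = trial
        # Attempt to exchange conformations between neighbouring temperatures.
        for k in range(len(temperatures) - 1):
            p = swap_probability(
                toy_energy(states[k]), toy_energy(states[k + 1]),
                temperatures[k], temperatures[k + 1],
            )
            if rng.random() < p:
                states[k], states[k + 1] = states[k + 1], states[k]
    return best_x, toy_energy(best_x)


if __name__ == "__main__":
    x, e = run_remc([0.1, 0.3, 1.0, 3.0])
    print(f"lowest-energy state found: x={x:.3f}, E={e:.5f}")
```

The high-temperature replicas cross the barrier between the two wells easily, and swaps let those conformations migrate down to the low-temperature replicas, which is the usual motivation for replica exchange over a single fixed-temperature walk.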
Pages: 1715-1735
Page count: 21
Related papers
58 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   PRINCIPLES THAT GOVERN FOLDING OF PROTEIN CHAINS [J].
ANFINSEN, CB .
SCIENCE, 1973, 181 (4096) :223-230
[3]   Assessment of CASP8 structure predictions for template free targets [J].
Ben-David, Moshe ;
Noivirt-Brik, Orly ;
Paz, Aviv ;
Prilusky, Jaime ;
Sussman, Joel L. ;
Levy, Yaakov .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2009, 77 :50-65
[4]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[5]  
Bonneau R, 2001, PROTEINS, P119
[6]   Toward high-resolution de novo structure prediction for small proteins [J].
Bradley, P ;
Misura, KMS ;
Baker, D .
SCIENCE, 2005, 309 (5742) :1868-1871
[7]   CHARMM - A PROGRAM FOR MACROMOLECULAR ENERGY, MINIMIZATION, AND DYNAMICS CALCULATIONS [J].
BROOKS, BR ;
BRUCCOLERI, RE ;
OLAFSON, BD ;
STATES, DJ ;
SWAMINATHAN, S ;
KARPLUS, M .
JOURNAL OF COMPUTATIONAL CHEMISTRY, 1983, 4 (02) :187-217
[8]   Cyclic coordinate descent: A robotics algorithm for protein loop closure [J].
Canutescu, AA ;
Dunbrack, RL .
PROTEIN SCIENCE, 2003, 12 (05) :963-972
[9]  
CASE DA, 1997, AMBER 5 0
[10]   LMProt: An efficient algorithm for Monte Carlo sampling of protein conformational space [J].
da Silva, RA ;
Degrève, L ;
Caliri, A .
BIOPHYSICAL JOURNAL, 2004, 87 (03) :1567-1577