TopScore: Using Deep Neural Networks and Large Diverse Data Sets for Accurate Protein Model Quality Assessment

Cited by: 23
Authors
Mulnaes, Daniel [1]
Gohlke, Holger [1,2,3]
Affiliations
[1] Heinrich Heine Univ Dusseldorf, Inst Pharmaceut & Med Chem, Dept Math & Nat Sci, Univ Str 1, D-40225 Dusseldorf, Germany
[2] Forschungszentrum Julich, John Neumann Inst Comp NIC, JSC, Julich, Germany
[3] Forschungszentrum Julich, Inst Complex Syst Struct Biochem ICS 6, Julich, Germany
Keywords
ABSOLUTE QUALITY; MEAN FORCE; RECOGNITION; PREDICTION; ALIGNMENT; PCONS;
DOI
10.1021/acs.jctc.8b00690
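Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code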
070304 ; 081704 ;
Abstract
The value of protein models obtained with automated protein structure prediction depends primarily on their accuracy. Protein model quality assessment is thus critical to select the model that can best answer biologically relevant questions from an ensemble of predictions. However, despite many advances in the field, different methods capture different types of errors, begging the question of which method to use. We introduce TopScore, a meta Model Quality Assessment Program (meta-MQAP) that uses deep neural networks to combine scores from 15 different primary predictors to predict accurate residue-wise and whole-protein error estimates. The predictions on six large independent data sets are highly correlated to superposition-independent errors in the model, achieving a Pearson's R²_all of 0.93 and 0.78 for whole-protein and residue-wise error predictions, respectively. This is a significant improvement over any of the investigated primary MQAPs, demonstrating that much can be gained by optimally combining different methods and using different and very large data sets.
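For readers unfamiliar with the reported metric, the following is a minimal sketch of how a squared Pearson correlation between predicted and observed model errors can be computed. It is not taken from the paper: the function name `pearson_r_squared` and the sample values are illustrative assumptions, and the sketch does not reproduce how the authors pooled models across data sets.

```python
def pearson_r_squared(predicted, observed):
    """Return the squared Pearson correlation of two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sums of centered products; the 1/n factors cancel in the ratio.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return (cov * cov) / (var_p * var_o)


# Hypothetical per-model error estimates (illustrative values only).
predicted_errors = [0.10, 0.35, 0.50, 0.80]
observed_errors = [0.12, 0.30, 0.55, 0.75]
r2 = pearson_r_squared(predicted_errors, observed_errors)
```

A value close to 1 means the predicted errors track the observed errors almost linearly; the sign of the correlation is lost by squaring, so it should be checked separately if it matters.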
Pages: 6117-6126
Page count: 10