Blind assessment of monomeric AlphaFold2 protein structure models with experimental NMR data

Cited by: 13
Authors
Li, Ethan H. [1]
Spaman, Laura E. [1]
Tejero, Roberto [1]
Huang, Yuanpeng Janet [1]
Ramelot, Theresa A. [1]
Fraga, Keith J. [1]
Prestegard, James H. [2]
Kennedy, Michael A. [3]
Montelione, Gaetano T. [1]
Affiliations
[1] Rensselaer Polytech Inst, Ctr Biotechnol & Interdisciplinary Sci, Dept Chem & Chem Biol, Troy, NY 12180 USA
[2] Univ Georgia, Complex Carbohydrate Res Ctr, Athens, GA 30602 USA
[3] Miami Univ, Dept Chem & Biochem, Oxford, OH 45056 USA
Funding
National Institutes of Health (USA);
Keywords
AlphaFold2; Artificial intelligence; Deep learning; Protein NMR; Model validation; NMR data; STRUCTURE VALIDATION; X-RAY; SEQUENCE; PREDICTIONS; REFINEMENT; ALGORITHMS; PRECISION; SOFTWARE; GEOMETRY; FOLD;
DOI
10.1016/j.jmr.2023.107481
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Recent advances in molecular modeling of protein structures are changing the field of structural biology. AlphaFold-2 (AF2), an AI system developed by DeepMind, Inc., utilizes attention-based deep learning to predict models of protein structures with high accuracy relative to structures determined by X-ray crystallography and cryo-electron microscopy (cryoEM). Comparing AF2 models to structures determined using solution NMR data, both high similarities and distinct differences have been observed. Since AF2 was trained on X-ray crystal and cryoEM structures, we assessed how accurately AF2 can model small, monomeric, solution protein NMR structures which (i) were not used in the AF2 training data set, and (ii) did not have homologous structures in the Protein Data Bank at the time of AF2 training. We identified nine open-source protein NMR data sets for such "blind" targets, including chemical shift, raw NMR FID data, NOESY peak lists, and (for 1 case) 15N-1H residual dipolar coupling data. For these nine small (70-108 residues) monomeric proteins, we generated AF2 prediction models and assessed how well these models fit to these experimental NMR data, using several well-established NMR structure validation tools. In most of these cases, the AF2 models fit the NMR data nearly as well, or sometimes better than, the corresponding NMR structure models previously deposited in the Protein Data Bank. These results provide benchmark NMR data for assessing new NMR data analysis and protein structure prediction methods. They also document the potential for using AF2 as a guiding tool in protein NMR data analysis, and more generally for hypothesis generation in structural biology research. © 2023 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 11