3D-equivariant graph neural networks for protein model quality assessment

Cited by: 24
Authors
Chen, Chen [1]
Chen, Xiao [1]
Morehead, Alex [1]
Wu, Tianqi [1]
Cheng, Jianlin [1]
Affiliations
[1] Univ Missouri, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA
Keywords
PREDICTION;
DOI
10.1093/bioinformatics/btad030
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Quality assessment (QA) of predicted protein tertiary structure models plays an important role in ranking and using them. Recent end-to-end deep learning methods for protein structure prediction generate highly confident tertiary structures for most proteins. Because these models have better quality and different properties than the models predicted by traditional tertiary structure prediction methods, it is important to explore corresponding QA strategies to evaluate and select them.

Results: We develop EnQA, a novel graph-based 3D-equivariant neural network method that is equivariant to rotation and translation of 3D objects. It estimates the accuracy of protein structural models by leveraging structural features acquired from the state-of-the-art tertiary structure prediction method, AlphaFold2. We train and test the method on both traditional model datasets (e.g. the datasets of the Critical Assessment of Techniques for Protein Structure Prediction) and a new dataset of high-quality structural models predicted only by AlphaFold2 for proteins whose experimental structures were released recently. Our approach achieves state-of-the-art performance on protein structural models predicted by both traditional protein structure prediction methods and the latest end-to-end deep learning method, AlphaFold2. It performs even better than the model QA scores provided by AlphaFold2 itself. The results illustrate that the 3D-equivariant graph neural network is a promising approach to the evaluation of protein structural models. Integrating AlphaFold2 features with other complementary sequence and structural features is important for improving protein model QA.

Availability and implementation: The source code is available at .

Supplementary information: Available at Bioinformatics online.
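
The equivariance property named in the abstract can be checked numerically. The sketch below is not EnQA's architecture: it assumes a generic EGNN-style coordinate update on a fully connected graph, and the names `egnn_coordinate_update`, `random_rotation`, and the fixed `weight` gate are hypothetical stand-ins for learned components. It only illustrates that rotating and translating the input and then updating gives the same result as updating first and then applying the same rotation and translation.

```python
import numpy as np


def egnn_coordinate_update(coords, weight=0.1):
    """One EGNN-style coordinate update (illustrative, not the EnQA layer)."""
    # Pairwise difference vectors x_i - x_j, shape (n, n, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    # Squared distances are invariant to rotation and translation.
    sq_dist = np.sum(diff ** 2, axis=-1, keepdims=True)
    # Hypothetical stand-in for a learned edge function: one scalar per edge.
    gate = weight / (1.0 + sq_dist)
    # Each node moves along a gated sum of relative vectors (self term is zero).
    return coords + np.sum(diff * gate, axis=1)


def random_rotation(rng):
    """Draw a proper 3D rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # flip column signs; q stays orthogonal
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # force determinant +1 (rotation, not reflection)
    return q


rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))  # five toy "residue" positions
rotation = random_rotation(rng)
translation = rng.normal(size=3)

# Transform the input first, then update.
moved_then_updated = egnn_coordinate_update(coords @ rotation.T + translation)
# Update first, then apply the same transform to the output.
updated_then_moved = egnn_coordinate_update(coords) @ rotation.T + translation

is_equivariant = np.allclose(moved_then_updated, updated_then_moved)
```

The update depends on the input only through relative vectors and distances, which is why `is_equivariant` holds for any rotation and translation; a network built from such layers predicts quality scores that do not depend on how the structural model is oriented in space.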
Pages: 7