Protein model accuracy estimation empowered by deep learning and inter-residue distance prediction in CASP14

Cited by: 10
Authors
Chen, Xiao [1]
Liu, Jian [1]
Guo, Zhiye [1]
Wu, Tianqi [1]
Hou, Jie [2]
Cheng, Jianlin [1]
Affiliations
[1] Univ Missouri, Dept Elect Engn & Comp Sci, Columbia, MO 65201 USA
[2] St Louis Univ, Dept Comp Sci, St Louis, MO 63103 USA
Keywords
QUALITY ASSESSMENT; SINGLE; REGIONS; SERVER;
DOI
10.1038/s41598-021-90303-6
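Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code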
07 ; 0710 ; 09 ;
Abstract
Inter-residue contact prediction and deep learning showed promise for improving the estimation of protein model accuracy (EMA) in the 13th Critical Assessment of Protein Structure Prediction (CASP13). To further leverage improved inter-residue distance predictions for EMA, during the 2020 CASP14 experiment we integrated several new inter-residue distance features with existing model quality assessment features in several deep learning methods to predict the quality of protein structural models. Evaluated on selecting the best model from the models of CASP14 targets, our three multi-model EMA predictors (MULTICOM-CONSTRUCT, MULTICOM-AI, and MULTICOM-CLUSTER) achieved average losses of 0.073, 0.079, and 0.081, respectively, in terms of the global distance test score (GDT-TS). The three methods ranked first, second, and third among all 68 CASP14 predictors. MULTICOM-DEEP, the single-model EMA predictor, ranked within the top 10 among all single-model EMA methods according to GDT-TS loss. The results demonstrate that inter-residue distance features are valuable inputs for deep learning to predict the quality of protein structural models. However, larger training datasets and better ways of leveraging inter-residue distance information are needed to fully exploit its potential.
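The GDT-TS loss reported in the abstract can be illustrated with a minimal sketch. It assumes the usual definition used in EMA evaluation: for each target, the loss is the true GDT-TS of the best model in the pool minus the true GDT-TS of the model that the predictor ranks first, averaged over targets. The function names, the 0-1 score scale, and the example numbers below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def gdt_ts_loss(predicted, true_gdt_ts):
    """Loss for one target.

    predicted   -- dict: model name -> predicted quality score
    true_gdt_ts -- dict: model name -> true GDT-TS (assumed 0-1 scale)

    Returns the best true GDT-TS in the pool minus the true GDT-TS of
    the model ranked first by the predicted scores.
    """
    if not predicted or predicted.keys() != true_gdt_ts.keys():
        raise ValueError("both dicts must cover the same non-empty model set")
    # Model the predictor would select as the best one.
    top_model = max(predicted, key=predicted.get)
    return max(true_gdt_ts.values()) - true_gdt_ts[top_model]


def average_loss(targets):
    """Mean per-target loss over an iterable of (predicted, true) pairs."""
    return mean(gdt_ts_loss(pred, true) for pred, true in targets)


# Hypothetical data for two targets (not CASP14 values).
example_targets = [
    # The predictor picks m2, but m1 is truly the best model.
    ({"m1": 0.62, "m2": 0.71, "m3": 0.55},
     {"m1": 0.70, "m2": 0.65, "m3": 0.50}),
    # The predictor picks the truly best model, so the loss is zero.
    ({"a": 0.90, "b": 0.10},
     {"a": 0.80, "b": 0.30}),
]

if __name__ == "__main__":
    print(f"average GDT-TS loss: {average_loss(example_targets):.3f}")
```

A loss of zero means the predictor always selected the best available model, so lower values are better; the averaged losses of 0.073 to 0.081 quoted above are on this kind of scale.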
Pages: 12