Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14

Times cited: 47
Authors
Zheng, Wei [1]
Li, Yang [1,2]
Zhang, Chengxin [1]
Zhou, Xiaogen [1]
Pearce, Robin [1]
Bell, Eric W. [1]
Huang, Xiaoqiang [1]
Zhang, Yang [1,3]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Univ Michigan, Dept Biol Chem, Ann Arbor, MI 48109 USA
Funding
U.S. National Science Foundation;
Keywords
ab initio folding; CASP14; deep learning; domain partition; multiple sequence alignment; protein structure prediction; residue-residue distance prediction; FOLD-RECOGNITION; I-TASSER; SIMILARITY; SERVER;
DOI
10.1002/prot.26193
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
In this article, we report 3D structure prediction results by two of our best server groups ("Zhang-Server" and "QUARK") in CASP14. These two servers were built based on the D-I-TASSER and D-QUARK algorithms, which integrated four newly developed components into the classical protein folding pipelines, I-TASSER and QUARK, respectively. The new components include: (a) a new multiple sequence alignment (MSA) collection tool, DeepMSA2, which is extended from the DeepMSA program; (b) a contact-based domain boundary prediction algorithm, FUpred, to detect protein domain boundaries; (c) a residual convolutional neural network-based method, DeepPotential, to predict multiple spatial restraints by co-evolutionary features derived from the MSA; and (d) optimized spatial restraint energy potentials to guide the structure assembly simulations. For 37 FM targets, the average TM-scores of the first models produced by D-I-TASSER and D-QUARK were 96% and 112% higher than those constructed by I-TASSER and QUARK, respectively. The data analysis indicates noticeable improvements produced by each of the four new components, especially for the newly added spatial restraints from DeepPotential and the well-tuned force field that combines spatial restraints, threading templates, and generic knowledge-based potentials. However, challenges still exist in the current pipelines. These include difficulties in modeling multi-domain proteins due to low accuracy in inter-domain distance prediction and modeling protein domains from oligomer complexes, as the co-evolutionary analysis cannot distinguish inter-chain and intra-chain distances. Specifically tuning the deep learning-based predictors for multi-domain targets and protein complexes may be helpful to address these issues.
Pages: 1734-1751
Number of pages: 18