Protein inter-residue contact and distance prediction by coupling complementary coevolution features with deep residual networks in CASP14

Cited by: 17
Authors
Li, Yang [1, 2]
Zhang, Chengxin [2]
Zheng, Wei [2]
Zhou, Xiaogen [2]
Bell, Eric W. [2]
Yu, Dong-Jun [1, 2]
Zhang, Yang [2]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[2] Univ Michigan, Dept Comp Med & Bioinformat, Ann Arbor, MI 48107 USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
CASP; coevolution; contact-map prediction; deep learning; protein structure prediction; MUTATIONS; EVOLUTIONARY; INFORMATION;
DOI
10.1002/prot.26211
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
This article reports and analyzes the results of protein contact and distance prediction by our methods in the 14th Critical Assessment of techniques for protein Structure Prediction (CASP14). A new deep learning-based contact/distance predictor was employed, built on an ensemble of two complementary coevolution features coupled with deep residual networks. We also improved our multiple sequence alignment (MSA) generation protocol with wholesale meta-genome sequence databases. On 22 CASP14 free modeling (FM) targets, the proposed model achieved a top-L/5 long-range precision of 63.8% and a mean distance bin error of 1.494. Based on the predicted distance potentials, 11 of the 22 FM targets and all 14 FM/template-based modeling (TBM) targets had correctly predicted folds (TM-score >0.5), suggesting that our approach can provide reliable distance potentials for ab initio protein folding.
Pages: 1911-1921
Page count: 11