DeepHomo2.0: improved protein-protein contact prediction of homodimers by transformer-enhanced deep learning

Cited by: 13
Authors
Lin, Peicong [1]
Yan, Yumeng [1]
Huang, Sheng-You [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Phys, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
protein-protein interaction; homo-oligomers; residue-residue contact prediction; deep learning; transformer features; CRYO-EM; RESIDUE CONTACTS; COEVOLUTION; IDENTIFICATION; PRINCIPLES; SYMMETRY; DOCKING;
DOI
10.1093/bib/bbac499
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Protein-protein interactions play an important role in many biological processes. Although structure prediction for monomeric proteins has made great progress with the advent of advanced deep learning algorithms such as AlphaFold, structure prediction for protein-protein complexes remains an open question. Taking advantage of the Transformer model of ESM-MSA, we have developed a deep learning-based model, named DeepHomo2.0, to predict protein-protein interactions of homodimeric complexes by leveraging direct-coupling analysis (DCA) and Transformer features of sequences together with the structure features of monomers. DeepHomo2.0 was extensively evaluated on diverse test sets and compared with eight state-of-the-art methods, including protein language model-based, DCA-based and machine learning-based methods. DeepHomo2.0 achieved a high precision of >70% with experimental monomer structures and >60% with predicted monomer structures for the top 10 predicted contacts on the test sets, and outperformed the other eight methods. Moreover, even the version that does not use structure information, named DeepHomoSeq, still achieved a good precision of >55% for the top 10 predicted contacts. Integrating the predicted contacts into protein docking significantly improved the structure prediction of realistic Critical Assessment of Protein Structure Prediction (CASP) homodimeric complexes. DeepHomo2.0 and DeepHomoSeq are available at http://huanglab.phys.hust.edu.cn/DeepHomo2/.
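The abstract reports precision for the top 10 predicted contacts. As a rough illustration of how such a top-k precision is commonly computed, the sketch below ranks residue pairs by predicted score and counts how many of the k best are true contacts. It is not the authors' evaluation code: the function name `top_k_precision`, the toy 4-residue matrices, and the choice to count each unordered pair (i, j) once (assuming a symmetric inter-chain map for a homodimer) are all assumptions made for this example.

```python
def top_k_precision(scores, contacts, k=10):
    """Fraction of the k highest-scoring residue pairs that are true contacts.

    scores   : L x L nested list of predicted inter-chain contact scores
    contacts : L x L nested list of booleans (True = observed contact)
    Assumption: both maps are symmetric, so only pairs with i <= j are ranked.
    """
    n = len(scores)
    # Collect (score, label) for each unordered residue pair.
    pairs = [(scores[i][j], contacts[i][j])
             for i in range(n) for j in range(i, n)]
    if not pairs or k < 1:
        raise ValueError("need at least one residue pair and k >= 1")
    # Highest scores first; the sort is stable, so ties keep their index order.
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[:k]
    return sum(1 for _, is_contact in top if is_contact) / len(top)


# Toy example (hypothetical data): 4 residues, symmetric score and contact maps.
scores = [
    [0.7, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.8, 0.0],
    [0.0, 0.8, 0.0, 0.1],
    [0.9, 0.0, 0.1, 0.0],
]
contacts = [
    [True,  False, False, True],
    [False, False, False, False],
    [False, False, False, False],
    [True,  False, False, False],
]

# The two best pairs are (0, 3), a true contact, and (1, 2), not a contact.
print(top_k_precision(scores, contacts, k=2))  # 0.5
```

With k=10 and maps built from a real prediction and a reference structure, the same function would give the kind of top-10 precision quoted in the abstract, provided the contact definition matches the one used in the paper.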
Pages: 14