Improved the heterodimer protein complex prediction with protein language models

Cited by: 13
Authors
Chen, Bo [1]
Xie, Ziwei [2]
Qiu, Jiezhong [3]
Ye, Zhaofeng [4]
Xu, Jinbo [2]
Tang, Jie [5]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Knowledge Engn Grp KEG, Beijing, Peoples R China
[2] Toyota Technol Inst Chicago, Chicago, IL 60637 USA
[3] Zhejiang Lab, Res Ctr Intelligent Comp Platforms, Hangzhou, Peoples R China
[4] Tsinghua Univ, Sch Med, Beijing, Peoples R China
[5] Tsinghua Univ, Dept Comp Sci, Beijing 100084, Peoples R China
Keywords
protein complex structure prediction; protein language model; AlphaFold-Multimer; SERVER; PRINCIPLES; ZDOCK
DOI
10.1093/bib/bbad221
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
AlphaFold-Multimer has greatly improved protein complex structure prediction, but its accuracy also depends on the quality of the multiple sequence alignment (MSA) formed by the interacting homologs (i.e. interologs) of the complex under prediction. Here we propose a novel method, ESMPair, that identifies interologs of a complex using protein language models. We show that ESMPair generates better interologs than the default MSA generation method in AlphaFold-Multimer. Our method yields better complex structure prediction than AlphaFold-Multimer by a large margin (+10.7% in terms of the Top-5 best DockQ), especially when the predicted complex structures have low confidence. We further show that combining several MSA generation methods can yield even better complex structure prediction accuracy than AlphaFold-Multimer (+22% in terms of the Top-5 best DockQ). By systematically analyzing the factors that affect our algorithm, we find that the diversity of the MSA of interologs significantly affects prediction accuracy. Moreover, we show that ESMPair performs particularly well on complexes in eukaryotes.
Pages: 13