Novel machine learning approaches revolutionize protein knowledge

Cited by: 39
Authors
Bordin, Nicola [1]
Dallago, Christian [2,3]
Heinzinger, Michael [2,4]
Kim, Stephanie [5,6]
Littmann, Maria [2]
Rauer, Clemens [1]
Steinegger, Martin [5,6]
Rost, Burkhard [2,7,8]
Orengo, Christine [1]
Affiliations
[1] UCL, Inst Struct & Mol Biol, Gower St, London WC1E 6BT, England
[2] Tech Univ Munich TUM, Dept Informat Bioinformat & Computat Biol i12, Boltzmannstr 3, D-85748 Munich, Germany
[3] VantAI, 151 42nd St, New York, NY 10036 USA
[4] TUM, Grad Sch, Ctr Doctoral Studies Informat & its Applicat CeDoS, Boltzmannstr 11, D-85748 Garching, Germany
[5] Seoul Natl Univ, Sch Biol Sci, Seoul, South Korea
[6] Seoul Natl Univ, Artificial Intelligence Inst, Seoul, South Korea
[7] Inst Adv Study TUM IAS, Lichtenbergstr 2A, D-85748 Munich, Germany
[8] TUM Sch Life Sci Weihenstephan TUM WZW, Alte Akad 8, Freising Weihenstephan, Germany
Funding
Wellcome Trust (UK); Biotechnology and Biological Sciences Research Council (UK); National Research Foundation, Singapore
Keywords
STRUCTURAL CLASSIFICATION; STRUCTURE ALIGNMENT; DATABASE SEARCH; PREDICTION; BREAKTHROUGH; RESIDUES; FAMILIES; LANGUAGE; LIFE;
DOI
10.1016/j.tibs.2022.11.001
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.
Pages: 345-359
Page count: 15
Related papers
122 in total
[1]   Real-time structure search and structure classification for AlphaFold protein models [J].
Aderinwale, Tunde ;
Bharadwaj, Vijay ;
Christoffer, Charles ;
Terashi, Genki ;
Zhang, Zicong ;
Jahandideh, Rashidedin ;
Kagaya, Yuki ;
Kihara, Daisuke .
COMMUNICATIONS BIOLOGY, 2022, 5 (01)
[2]   Ahdritz G, 2022, bioRxiv, DOI 10.1101/2022.11.20.517210
[3]   A structural biology community assessment of AlphaFold2 applications [J].
Akdel, Mehmet ;
Pires, Douglas E. V. ;
Porta Pardo, Eduard ;
Janes, Jurgen ;
Zalevsky, Arthur O. ;
Meszaros, Balint ;
Bryant, Patrick ;
Good, Lydia L. ;
Laskowski, Roman A. ;
Pozzati, Gabriele ;
Shenoy, Aditi ;
Zhu, Wensi ;
Kundrotas, Petras ;
Serra, Victoria Ruiz ;
Rodrigues, Carlos H. M. ;
Dunham, Alistair S. ;
Burke, David ;
Borkakoti, Neera ;
Velankar, Sameer ;
Frost, Adam ;
Basquin, Jerome ;
Lindorff-Larsen, Kresten ;
Bateman, Alex ;
Kajava, Andrey V. ;
Valencia, Alfonso ;
Ovchinnikov, Sergey ;
Durairaj, Janani ;
Ascher, David B. ;
Thornton, Janet M. ;
Davey, Norman E. ;
Stein, Amelie ;
Elofsson, Arne ;
Croll, Tristan I. ;
Beltrao, Pedro .
NATURE STRUCTURAL & MOLECULAR BIOLOGY, 2022, 29 (11) :1056-+
[4]   Alderson TR, 2022, bioRxiv, DOI 10.1101/2022.02.18.481080
[5]   Unified rational protein engineering with sequence-based deep representation learning [J].
Alley, Ethan C. ;
Khimulya, Grigory ;
Biswas, Surojit ;
AlQuraishi, Mohammed ;
Church, George M. .
NATURE METHODS, 2019, 16 (12) :1315-+
[6]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[7]   De novo protein design by deep network hallucination [J].
Anishchenko, Ivan ;
Pellock, Samuel J. ;
Chidyausiku, Tamuka M. ;
Ramelot, Theresa A. ;
Ovchinnikov, Sergey ;
Hao, Jingzhou ;
Bafna, Khushboo ;
Norn, Christoffer ;
Kang, Alex ;
Bera, Asim K. ;
DiMaio, Frank ;
Carter, Lauren ;
Chow, Cameron M. ;
Montelione, Gaetano T. ;
Baker, David .
NATURE, 2021, 600 (7889) :547-+
[8]   Origins of coevolution between residues distant in protein 3D structures [J].
Anishchenko, Ivan ;
Ovchinnikov, Sergey ;
Kamisetty, Hetunandan ;
Baker, David .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2017, 114 (34) :9122-9127
[9]  
[Anonymous], 1965, Electronics Magazine
[10]   RUPEE: A fast and accurate purely geometric protein structure search [J].
Ayoub, Ronald ;
Lee, Yugyung .
PLOS ONE, 2019, 14 (03)