Highly accurate protein structure prediction with AlphaFold

Cited by: 20508
Authors
Jumper, John [1]
Evans, Richard [1]
Pritzel, Alexander [1]
Green, Tim [1]
Figurnov, Michael [1]
Ronneberger, Olaf [1]
Tunyasuvunakool, Kathryn [1]
Bates, Russ [1]
Zidek, Augustin [1]
Potapenko, Anna [1]
Bridgland, Alex [1]
Meyer, Clemens [1]
Kohl, Simon A. A. [1]
Ballard, Andrew J. [1]
Cowie, Andrew [1]
Romera-Paredes, Bernardino [1]
Nikolov, Stanislav [1]
Jain, Rishub [1]
Adler, Jonas [1]
Back, Trevor [1]
Petersen, Stig [1]
Reiman, David [1]
Clancy, Ellen [1]
Zielinski, Michal [1]
Steinegger, Martin [2, 3]
Pacholska, Michalina [1]
Berghammer, Tamas [1]
Bodenstein, Sebastian [1]
Silver, David [1]
Vinyals, Oriol [1]
Senior, Andrew W. [1]
Kavukcuoglu, Koray [1]
Kohli, Pushmeet [1]
Hassabis, Demis [1]
Affiliations
[1] DeepMind, London, England
[2] Seoul Natl Univ, Sch Biol Sci, Seoul, South Korea
[3] Seoul Natl Univ, Artificial Intelligence Inst, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
NEURAL-NETWORKS; POTENTIALS; CONTACTS; FORCE;
DOI
10.1038/s41586-021-03819-2
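Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract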
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort(1-4), the structures of around 100,000 unique proteins have been determined(5), but this represents a small fraction of the billions of known protein sequences(6,7). Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the 'protein folding problem'(8), has been an important open research problem for more than 50 years(9). Despite recent progress(10-14), existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14)(15), demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
Pages: 583 / +
Number of pages: 10
Related papers
84 in total
  • [1] A further leap of improvement in tertiary structure prediction in CASP13 prompts new routes for future assessments
    Abriata, Luciano A.
    Tamo, Giorgio E.
    Dal Peraro, Matteo
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2019, 87 (12) : 1100 - 1112
  • [2] Unified rational protein engineering with sequence-based deep representation learning
    Alley, Ethan C.
    Khimulya, Grigory
    Biswas, Surojit
    AlQuraishi, Mohammed
    Church, George M.
    [J]. NATURE METHODS, 2019, 16 (12) : 1315 - +
  • [3] End-to-End Differentiable Learning of Protein Structure
    AlQuraishi, Mohammed
    [J]. CELL SYSTEMS, 2019, 8 (04) : 292 - +
  • [4] CORRELATION OF COORDINATED AMINO-ACID SUBSTITUTIONS WITH FUNCTION IN VIRUSES RELATED TO TOBACCO MOSAIC-VIRUS
    ALTSCHUH, D
    LESK, AM
    BLOOMER, AC
    KLUG, A
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1987, 193 (04) : 693 - 707
  • [5] PRINCIPLES THAT GOVERN FOLDING OF PROTEIN CHAINS
    ANFINSEN, CB
    [J]. SCIENCE, 1973, 181 (4096) : 223 - 230
  • [6] [Anonymous], 2019, BUILDING MACHINE LEA, P59, DOI 10.1007/978-1-4842-4470-8_7
  • [7] Ashish A. M. A., 2015, TENSORFLOW LARGE SCA
  • [8] How cryo-EM is revolutionizing structural biology
    Bai, Xiao-Chen
    McMullan, Greg
    Scheres, Sjors H. W.
    [J]. TRENDS IN BIOCHEMICAL SCIENCES, 2015, 40 (01) : 49 - 57
  • [9] UniProt: the universal protein knowledgebase in 2021
    Bateman, Alex
    Martin, Maria-Jesus
    Orchard, Sandra
    Magrane, Michele
    Agivetova, Rahat
    Ahmad, Shadab
    Alpi, Emanuele
    Bowler-Barnett, Emily H.
    Britto, Ramona
    Bursteinas, Borisas
    Bye-A-Jee, Hema
    Coetzee, Ray
    Cukura, Austra
    Da Silva, Alan
    Denny, Paul
    Dogan, Tunca
    Ebenezer, ThankGod
    Fan, Jun
    Castro, Leyla Garcia
    Garmiri, Penelope
    Georghiou, George
    Gonzales, Leonardo
    Hatton-Ellis, Emma
    Hussein, Abdulrahman
    Ignatchenko, Alexandr
    Insana, Giuseppe
    Ishtiaq, Rizwan
    Jokinen, Petteri
    Joshi, Vishal
    Jyothi, Dushyanth
    Lock, Antonia
    Lopez, Rodrigo
    Luciani, Aurelien
    Luo, Jie
    Lussi, Yvonne
    Mac-Dougall, Alistair
    Madeira, Fabio
    Mahmoudy, Mahdi
    Menchi, Manuela
    Mishra, Alok
    Moulang, Katie
    Nightingale, Andrew
    Oliveira, Carla Susana
    Pundir, Sangya
    Qi, Guoying
    Raj, Shriya
    Rice, Daniel
    Lopez, Milagros Rodriguez
    Saidi, Rabie
    Sampson, Joseph
    [J]. NUCLEIC ACIDS RESEARCH, 2021, 49 (D1) : D480 - D489
  • [10] PROTEIN MODELING Protein storytelling through physics
    Brini, Emiliano
    Simmerling, Carlos
    Dill, Ken
    [J]. SCIENCE, 2020, 370 (6520) : 1056 - +