AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences

Cited by: 623
Authors
Varadi, Mihaly [1]
Bertoni, Damian [1]
Magana, Paulyna [1]
Paramval, Urmila [1]
Pidruchna, Ivanna [1]
Radhakrishnan, Malarvizhi [1]
Tsenkov, Maxim [1]
Nair, Sreenath [1]
Mirdita, Milot [2]
Yeo, Jingi [2]
Kovalevskiy, Oleg [3]
Tunyasuvunakool, Kathryn [3]
Laydon, Agata [3]
Zidek, Augustin [3]
Tomlinson, Hamish [3]
Hariharan, Dhavanthi [3]
Abrahamson, Josh [3]
Green, Tim [3]
Jumper, John [3]
Birney, Ewan [1]
Steinegger, Martin [2]
Hassabis, Demis [3]
Velankar, Sameer [1]
Affiliations
[1] European Bioinformat Inst, European Mol Biol Lab, Hinxton, England
[2] Seoul Natl Univ, Sch Biol Sci, Seoul, South Korea
[3] Google DeepMind, London, England
Funding
National Research Foundation of Singapore
Keywords
DOI
10.1093/nar/gkad1011
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.
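As a rough illustration of the programmatic access mentioned in the abstract, the Python sketch below builds a request URL for one UniProt accession and, optionally, fetches the JSON metadata. The abstract does not give the endpoint layout: the `/api/prediction/<accession>` path, the constant `API_ROOT`, and the helper names `prediction_url` and `fetch_prediction` are assumptions made for this example and should be checked against the database's own API documentation.

```python
import json
import urllib.request

# Assumed base path for the programmatic endpoints; the abstract only
# gives the site URL (https://alphafold.ebi.ac.uk).
API_ROOT = "https://alphafold.ebi.ac.uk/api"


def prediction_url(accession: str) -> str:
    """Build the (assumed) metadata endpoint URL for one UniProt accession."""
    accession = accession.strip().upper()
    # Accessions are plain alphanumeric strings; reject anything else so
    # that stray characters never end up in the request path.
    if not accession.isalnum():
        raise ValueError(f"unexpected accession format: {accession!r}")
    return f"{API_ROOT}/prediction/{accession}"


def fetch_prediction(accession: str, timeout: float = 30.0):
    """Download and decode the JSON metadata. Requires network access."""
    with urllib.request.urlopen(prediction_url(accession), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `prediction_url("P69905")` only formats a string, so it can be used offline to prepare batches of requests; `fetch_prediction` performs the actual HTTP call and will fail if the assumed path is wrong or the network is unavailable.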
Pages: D368-D375
Number of pages: 8
Related papers
27 in total
[1]   Ahdritz G., 2022, bioRxiv
[2]   Accurate prediction of protein structures and interactions using a three-track neural network [J].
Baek, Minkyung ;
DiMaio, Frank ;
Anishchenko, Ivan ;
Dauparas, Justas ;
Ovchinnikov, Sergey ;
Lee, Gyu Rie ;
Wang, Jue ;
Cong, Qian ;
Kinch, Lisa N. ;
Schaeffer, R. Dustin ;
Millan, Claudia ;
Park, Hahnbeom ;
Adams, Carson ;
Glassman, Caleb R. ;
DeGiovanni, Andy ;
Pereira, Jose H. ;
Rodrigues, Andria V. ;
van Dijk, Alberdina A. ;
Ebrecht, Ana C. ;
Opperman, Diederik J. ;
Sagmeister, Theo ;
Buhlheller, Christoph ;
Pavkov-Keller, Tea ;
Rathinaswamy, Manoj K. ;
Dalwadi, Udit ;
Yip, Calvin K. ;
Burke, John E. ;
Garcia, K. Christopher ;
Grishin, Nick V. ;
Adams, Paul D. ;
Read, Randy J. ;
Baker, David .
SCIENCE, 2021, 373 (6557) :871-+
[3]   Clustering predicted structures at the scale of the known protein universe [J].
Barrio-Hernandez, Inigo ;
Yeo, Jingi ;
Janes, Jurgen ;
Mirdita, Milot ;
Gilchrist, Cameron L. M. ;
Wein, Tanita ;
Varadi, Mihaly ;
Velankar, Sameer ;
Beltrao, Pedro ;
Steinegger, Martin .
NATURE, 2023, 622 (7983) :637-+
[4]   UniProt: the Universal Protein Knowledgebase in 2023 [J].
Bateman, Alex ;
Martin, Maria-Jesus ;
Orchard, Sandra ;
Magrane, Michele ;
Ahmad, Shadab ;
Alpi, Emanuele ;
Bowler-Barnett, Emily H. ;
Britto, Ramona ;
Cukura, Austra ;
Denny, Paul ;
Dogan, Tunca ;
Ebenezer, ThankGod ;
Fan, Jun ;
Garmiri, Penelope ;
Gonzales, Leonardo Jose da Costa ;
Hatton-Ellis, Emma ;
Hussein, Abdulrahman ;
Ignatchenko, Alexandr ;
Insana, Giuseppe ;
Ishtiaq, Rizwan ;
Joshi, Vishal ;
Jyothi, Dushyanth ;
Kandasaamy, Swaathi ;
Lock, Antonia ;
Luciani, Aurelien ;
Lugaric, Marija ;
Luo, Jie ;
Lussi, Yvonne ;
MacDougall, Alistair ;
Madeira, Fabio ;
Mahmoudy, Mahdi ;
Mishra, Alok ;
Moulang, Katie ;
Nightingale, Andrew ;
Pundir, Sangya ;
Qi, Guoying ;
Raj, Shriya ;
Raposo, Pedro ;
Rice, Daniel L. ;
Saidi, Rabie ;
Santos, Rafael ;
Speretta, Elena ;
Stephenson, James ;
Totoo, Prabhat ;
Turner, Edward ;
Tyagi, Nidhi ;
Vasudev, Preethi ;
Warner, Kate ;
Watkins, Xavier ;
Zellner, Hermann .
NUCLEIC ACIDS RESEARCH, 2023, 51 (D1) :D523-D531
[5]   AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms [J].
Bordin, Nicola ;
Sillitoe, Ian ;
Nallapareddy, Vamsi ;
Rauer, Clemens ;
Lam, Su Datt ;
Waman, Vaishali P. ;
Sen, Neeladri ;
Heinzinger, Michael ;
Littmann, Maria ;
Kim, Stephanie ;
Velankar, Sameer ;
Steinegger, Martin ;
Rost, Burkhard ;
Orengo, Christine .
COMMUNICATIONS BIOLOGY, 2023, 6 (01)
[6]   Novel machine learning approaches revolutionize protein knowledge [J].
Bordin, Nicola ;
Dallago, Christian ;
Heinzinger, Michael ;
Kim, Stephanie ;
Littmann, Maria ;
Rauer, Clemens ;
Steinegger, Martin ;
Rost, Burkhard ;
Orengo, Christine .
TRENDS IN BIOCHEMICAL SCIENCES, 2023, 48 (04) :345-359
[7]   BLAST+: architecture and applications [J].
Camacho, Christiam ;
Coulouris, George ;
Avagyan, Vahram ;
Ma, Ning ;
Papadopoulos, Jason ;
Bealer, Kevin ;
Madden, Thomas L. .
BMC BIOINFORMATICS, 2009, 10
[8]   Ensembl 2022 [J].
Cunningham, Fiona ;
Allen, James E. ;
Allen, Jamie ;
Alvarez-Jarreta, Jorge ;
Amode, M. Ridwan ;
Armean, Irina M. ;
Austine-Orimoloye, Olanrewaju ;
Azov, Andrey G. ;
Barnes, If ;
Bennett, Ruth ;
Berry, Andrew ;
Bhai, Jyothish ;
Bignell, Alexandra ;
Billis, Konstantinos ;
Boddu, Sanjay ;
Brooks, Lucy ;
Charkhchi, Mehrnaz ;
Cummins, Carla ;
Fioretto, Luca Da Rin ;
Davidson, Claire ;
Dodiya, Kamalkumar ;
Donaldson, Sarah ;
El Houdaigui, Bilal ;
El Naboulsi, Tamara ;
Fatima, Reham ;
Giron, Carlos Garcia ;
Genez, Thiago ;
Martinez, Jose Gonzalez ;
Guijarro-Clarke, Cristina ;
Gymer, Arthur ;
Hardy, Matthew ;
Hollis, Zoe ;
Hourlier, Thibaut ;
Hunt, Toby ;
Juettemann, Thomas ;
Kaikala, Vinay ;
Kay, Mike ;
Lavidas, Ilias ;
Le, Tuan ;
Lemos, Diana ;
Marugan, Jose Carlos ;
Mohanan, Shamika ;
Mushtaq, Aleena ;
Naven, Marc ;
Ogeh, Denye N. ;
Parker, Anne ;
Parton, Andrew ;
Perry, Malcolm ;
Pilizota, Ivana ;
Prosovetskaia, Irina .
NUCLEIC ACIDS RESEARCH, 2022, 50 (D1) :D988-D995
[9]   Structure of cytoplasmic ring of nuclear pore complex by integrative cryo-EM and AlphaFold [J].
Fontana, Pietro ;
Dong, Ying ;
Pi, Xiong ;
Tong, Alexander B. ;
Hecksel, Corey W. ;
Wang, Longfei ;
Fu, Tian-Min ;
Bustamante, Carlos ;
Wu, Hao .
SCIENCE, 2022, 376 (6598) :1178-+
[10]   De novo protein design by inversion of the AlphaFold structure prediction network [J].
Goverde, Casper A. ;
Wolf, Benedict ;
Khakzad, Hamed ;
Rosset, Stephane ;
Correia, Bruno E. .
PROTEIN SCIENCE, 2023, 32 (06)