RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning

Cited by: 342
Authors
Burley, Stephen K. [1, 2, 3, 4, 5]
Bhikadiya, Charmi [5]
Bi, Chunxiao [5]
Bittrich, Sebastian [5]
Chao, Henry [1, 2]
Chen, Li [1, 2]
Craig, Paul A. [6]
Crichlow, Gregg V. [1, 2]
Dalenberg, Kenneth [1, 2]
Duarte, Jose M. [5]
Dutta, Shuchismita [1, 2, 3]
Fayazi, Maryam [1, 2]
Feng, Zukang [1, 2]
Flatt, Justin W. [1, 2]
Ganesan, Sai [7]
Ghosh, Sutapa [1, 2]
Goodsell, David S. [1, 2, 3, 8]
Green, Rachel Kramer [1, 2]
Guranovic, Vladimir [1, 2]
Henry, Jeremy [5]
Hudson, Brian P. [1, 2]
Khokhriakov, Igor [5]
Lawson, Catherine L. [1, 2]
Liang, Yuhe [1, 2]
Lowe, Robert [1, 2]
Peisach, Ezra [1, 2]
Persikova, Irina [1, 2]
Piehl, Dennis W. [1, 2]
Rose, Yana [5]
Sali, Andrej
Segura, Joan [5]
Sekharan, Monica [1, 2]
Shao, Chenghua [1, 2]
Vallat, Brinda [1, 2]
Voigt, Maria [1, 2]
Webb, Ben
Westbrook, John D. [1, 2, 3]
Whetstone, Shamara [1, 2]
Young, Jasmine Y. [1, 2]
Zalevsky, Arthur
Zardecki, Christine [1, 2]
Affiliations
[1] Rutgers State Univ, Res Collaboratory Struct Bioinformat Prot Data Ban, Piscataway, NJ 08854 USA
[2] Rutgers State Univ, Inst Quantitat Biomed, Piscataway, NJ 08854 USA
[3] Rutgers Canc Inst New Jersey, New Brunswick, NJ 08901 USA
[4] Rutgers State Univ, Dept Chem & Chem Biol, Piscataway, NJ 08854 USA
[5] Univ Calif San Diego, San Diego Supercomp Ctr, Res Collaboratory Struct Bioinformat Prot Data Ban, La Jolla, CA 92093 USA
[6] Rochester Inst Technol, Sch Chem & Mat Sci, Rochester, NY 14623 USA
[7] Univ Calif San Francisco, Quantitat Biosci Inst, Dept Bioengn & Therapeut Sci, Dept Pharmaceut Chem, Res Collaboratory Struct Bioi, San Francisco, CA 94158 USA
[8] Scripps Res Inst, Dept Integrat Struct & Computat Biol, La Jolla, CA 92037 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
STRUCTURE PREDICTION; BIOLOGICAL MACROMOLECULES; RESOURCE; INFORMATION; VALIDATION; BIOTECHNOLOGY; VISUALIZATION; ANNOTATION; DEPOSITION; FAMILIES;
DOI
10.1093/nar/gkac1077
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ~200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
Pages: D488-D508
Page count: 21