The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest

Cited by: 1630
Authors
Szklarczyk, Damian [1, 2]
Kirsch, Rebecca [3]
Koutrouli, Mikaela [3]
Nastou, Katerina [3]
Mehryary, Farrokh [4]
Hachilif, Radja [1, 2]
Gable, Annika L. [1, 2]
Fang, Tao [1, 2]
Doncheva, Nadezhda T. [3]
Pyysalo, Sampo [4]
Bork, Peer [5, 6, 7, 8]
Jensen, Lars J. [3]
von Mering, Christian [1, 2]
Affiliations
[1] Univ Zurich, Dept Mol Life Sci, CH-8057 Zurich, Switzerland
[2] SIB Swiss Inst Bioinformat, CH-1015 Lausanne, Switzerland
[3] Univ Copenhagen, Novo Nordisk Fdn Ctr Prot Res, DK-2200 Copenhagen N, Denmark
[4] Univ Turku, Dept Comp, TurkuNLP Lab, Turku 20014, Finland
[5] European Mol Biol Lab, Struct & Computat Biol Unit, D-69117 Heidelberg, Germany
[6] Yonsei Univ, Yonsei Frontier Lab YFL, Seoul 03722, South Korea
[7] Max Delbruck Ctr Mol Med, D-13125 Berlin, Germany
[8] Univ Wurzburg, Dept Bioinformat, Biozentrum, D-97074 Wurzburg, Germany
Funding
Academy of Finland;
Keywords
IDENTIFICATION; ANNOTATION; RESOURCE; DISEASE; CONTEXT; MODULES;
DOI
10.1093/nar/gkac1000
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein-protein interactions, both physical interactions and functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins, (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data, and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now fully available also for user-submitted genomes.
Pages: D638-D646
Page count: 9