Blue Brain Nexus: An open, secure, scalable system for knowledge graph management and data-driven science

Cited by: 1
Authors
Sy, Mohameth Francois [1]
Roman, Bogdan [1]
Kerrien, Samuel [1]
Mendez, Didac Montero [1]
Genet, Henry [1]
Wajerowicz, Wojciech [1]
Dupont, Michael [1]
Lavriushev, Ian [1]
Machon, Julien [1]
Pirman, Kenneth [1]
Mana, Dhanesh Neela [1]
Stafeeva, Natalia [1]
Kaufmann, Anna-Kristin [1]
Lu, Huanxiang [1]
Lurie, Jonathan [1]
Fonta, Pierre-Alexandre [1]
Martinez, Alejandra Garcia Rojas [1]
Ulbrich, Alexander D. [1]
Lindqvist, Carolina [1]
Jimenez, Silvia [1]
Rotenberg, David [2]
Markram, Henry [1]
Hill, Sean L. [1, 2, 3]
Affiliations
[1] Ecole Polytechn Federale Lausanne EPFL, Blue Brain Project, Biotech Campus, Geneva, Switzerland
[2] Krembil Ctr Neuroinformat, Ctr Addict & Mental Hlth CAMH, Toronto, ON, Canada
[3] Univ Toronto, Dept Psychiat, Neurosci & Clin Translat, Toronto, ON, Canada
Keywords
Knowledge graph; Data science; Data management; Distributed system; Data-driven science; BIG DATA; ONTOLOGY
DOI
10.3233/SW-222974
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Modern data-driven science often consists of iterative cycles of data discovery, acquisition, preparation, analysis, model building and validation leading to knowledge discovery as well as dissemination at scale. The unique challenges of building and simulating the whole rodent brain in the Swiss EPFL Blue Brain Project (BBP) required a solution to managing large-scale highly heterogeneous data, and tracking their provenance to ensure quality, reproducibility and attribution throughout these iterative cycles. Here, we describe Blue Brain Nexus (BBN), an ecosystem of open source, domain agnostic, scalable, extensible data and knowledge graph management systems built by BBP to address these challenges. BBN builds on open standards and interoperable semantic web technologies to enable the creation and management of secure RDF-based knowledge graphs validated by W3C SHACL. BBN supports a spectrum of (meta)data modeling and representation formats including JSON and JSON-LD as well as more formally specified SHACL-based schemas enabling domain model-driven runtime API. With its streaming event-based architecture, BBN supports asynchronous building and maintenance of multiple extensible indices to ensure high performance search capabilities and enable analytics. We present four use cases and applications of BBN to large-scale data integration and dissemination challenges in computational modeling, neuroscience, psychiatry and open linked data.
Pages: 697-727
Page count: 31