Exploiting Distributed, Heterogeneous and Sensitive Data Stocks while Maintaining the Owner's Data Sovereignty

Cited by: 25
Authors
Lablans, M. [1]
Kadioglu, D. [1]
Muscholl, M. [1]
Ueckert, F. [1]
Affiliations
[1] Univ Med Ctr Mainz, D-55131 Mainz, Germany
Keywords
Collaborative research; biobank; disease registries; federation; distributed search; data protection; data sovereignty; data integration; SYSTEMS; MODEL
DOI
10.3414/ME14-01-0137
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Background: To achieve statistical significance in medical research, biological or data samples from several bio- or databanks often need to be complemented by those of other institutions. For that purpose, IT-based search services have been established to locate datasets matching a given set of criteria in databases distributed across several institutions. However, previous approaches require data owners to disclose information about their samples, raising a barrier for their participation in the network.
Objective: To devise a method to search distributed databases for datasets matching a given set of criteria while fully maintaining their owner's data sovereignty.
Methods: As a modification to traditional federated search services, we propose the decentral search, which allows the data owner a high degree of control. Relevant data are loaded into local bridgeheads, each under their owner's sovereignty. Researchers can formulate criteria sets along with a project proposal using a central search broker, which then notifies the bridgeheads. The criteria are, however, treated as an inquiry rather than a query: Instead of responding with results, bridgeheads notify their owner and wait for his/her decision regarding whether and what to answer based on the criteria set, the matching datasets and the specific project proposal. Without the owner's explicit consent, no data leaves his/her institution.
Results: The decentral search has been deployed in one of the six German Centers for Health Research, comprised of eleven university hospitals. In the process, compliance with German data protection regulations has been confirmed. The decentral search also marks the centerpiece of an open source registry software toolbox aiming to build a national registry of rare diseases in Germany.
Conclusions: While the sacrifice of real-time answers impairs some use-cases, it leads to several beneficial side effects: improved data protection due to data parsimony, tolerance for incomplete data schema mappings and flexibility with regard to patient consent. Most importantly, as no datasets ever leave their institution, owners can reject projects without facing potential peer pressure. By its lower barrier for participation, a decentral search service is likely to attract a larger number of partners and to bring a researcher into contact with the right potential partners.
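The inquiry-rather-than-query flow described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, method names, record fields, and the owner-decision callback are all hypothetical and chosen only to show that the broker receives nothing until an owner explicitly releases an answer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Inquiry:
    """Criteria set plus project proposal, as entered at the central broker."""
    criteria: dict
    project_proposal: str


class Bridgehead:
    """Local node under its owner's control (hypothetical interface)."""

    def __init__(self, site: str, records: list,
                 owner_decision: Callable[[Inquiry, list], Optional[str]]):
        self.site = site
        self.records = records
        # Stand-in for the human owner: returns an answer, or None to decline.
        self.owner_decision = owner_decision
        self.pending = []

    def notify(self, inquiry: Inquiry) -> None:
        # An inquiry is only queued; nothing is sent back to the broker here.
        self.pending.append(inquiry)

    def owner_review(self) -> list:
        """Called when the owner works through the queue; returns released answers."""
        answers = []
        for inquiry in self.pending:
            # Matching happens locally, inside the owner's institution.
            matches = [r for r in self.records
                       if all(r.get(k) == v for k, v in inquiry.criteria.items())]
            answer = self.owner_decision(inquiry, matches)
            if answer is not None:
                answers.append((self.site, answer))
        self.pending.clear()
        return answers


class SearchBroker:
    """Central component that only distributes inquiries (hypothetical interface)."""

    def __init__(self, bridgeheads: list):
        self.bridgeheads = bridgeheads

    def publish(self, inquiry: Inquiry) -> None:
        for bridgehead in self.bridgeheads:
            bridgehead.notify(inquiry)


def share_count(inquiry: Inquiry, matches: list) -> Optional[str]:
    # Example policy: disclose only the number of matches, and only if any exist.
    return f"{len(matches)} matching datasets" if matches else None


def decline(inquiry: Inquiry, matches: list) -> Optional[str]:
    # Example policy: the owner rejects the project; nothing leaves the site.
    return None


site_a = Bridgehead("A", [{"diagnosis": "X"}, {"diagnosis": "Y"}], share_count)
site_b = Bridgehead("B", [{"diagnosis": "X"}], decline)
broker = SearchBroker([site_a, site_b])
broker.publish(Inquiry({"diagnosis": "X"}, "Hypothetical project proposal"))
```

In this sketch `publish` returns nothing, mirroring the loss of real-time answers noted in the Conclusions. An answer exists only after `owner_review` runs at a site, and a declining site is indistinguishable from one that has not yet responded.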
Pages: 346-352
Page count: 7