Asynchronous Data Provenance for Research Data in a Distributed System

Cited by: 0
Authors
Heinrichs, Benedikt [1]
Politze, Marius [1]
Affiliations
[1] Rhein Westfal TH Aachen, IT Ctr, Seffenter Weg 23, Aachen, Germany
Source
ICEIS: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS - VOL 2 | 2021
Keywords
Research Data Management; Data Provenance; Distributed Systems
DOI
10.5220/0010478003610367
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Many provenance systems assume either that they directly orchestrate the data flow or that logs describing it are available. This works well until these assumptions no longer hold. The Coscine platform lets researchers connect to different storage providers and annotate their stored data with discipline-specific metadata. These storage providers, however, do not inform the platform of externally induced changes, for example changes made by the user. Therefore, this paper focuses on the need for data provenance that is not directly produced and has to be deduced after the fact. An approach is proposed for creating and handling such asynchronous data provenance; it makes use of change indicators to deduce whether a data entity has been modified. A representation for describing such asynchronous data provenance in the Resource Description Framework (RDF) is discussed. Finally, a prototypical implementation of the approach in the Coscine use case is described, and the future steps for the approach and prototype are detailed.
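The abstract does not say which change indicators the paper uses. As a minimal sketch of the general idea only, the Python below assumes three hypothetical indicators (content size, provider-reported modification time, and a SHA-256 checksum) and compares the state last seen by the platform with the state currently found on the storage provider. The names `Snapshot`, `take_snapshot`, `changed_indicators`, and `deduce_event` are illustrative and are not taken from the paper or from Coscine, and the returned dictionary is a stand-in, not the RDF representation the paper discusses.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """State of a data entity as observed at one point in time (hypothetical)."""
    size: int        # content length in bytes
    modified: float  # modification timestamp reported by the storage provider
    checksum: str    # SHA-256 hex digest of the content


def take_snapshot(content: bytes, modified: float) -> Snapshot:
    """Build a snapshot from the entity's content and reported timestamp."""
    return Snapshot(len(content), modified, hashlib.sha256(content).hexdigest())


def changed_indicators(old: Snapshot, new: Snapshot) -> List[str]:
    """Return the names of the indicators that differ between two snapshots."""
    names = ("size", "modified", "checksum")
    return [name for name in names if getattr(old, name) != getattr(new, name)]


def deduce_event(entity_id: str, old: Optional[Snapshot], new: Snapshot) -> Optional[dict]:
    """Deduce, after the fact, whether the entity was created or modified.

    Returns None when no indicator suggests a change.
    """
    if old is None:
        # The platform has never seen this entity before.
        return {"entity": entity_id, "kind": "created", "indicators": []}
    changed = changed_indicators(old, new)
    if not changed:
        return None
    return {"entity": entity_id, "kind": "modified", "indicators": changed}


if __name__ == "__main__":
    seen = take_snapshot(b"measurement v1", modified=1000.0)
    found = take_snapshot(b"measurement v2 (edited)", modified=2000.0)
    print(deduce_event("file-1", seen, found))
```

In this sketch a periodic job would call `deduce_event` for each known entity, which is why the resulting provenance is asynchronous: the platform records the change some time after it actually happened on the storage provider.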
Pages: 361-367
Page count: 7