Opinion paper: Data provenance challenges in biomedical research

Cited by: 5
Authors
Baum B. [1]
Bauer C.R. [1]
Franke T. [1]
Kusch H. [1]
Parciak M. [1]
Rottmann T. [1]
Umbach N. [1]
Sax U. [1, 2]
Affiliations
[1] Department of Medical Informatics, University Medical Center Göttingen, Göttingen
[2] WG Infrastructure for Translational Research, Department of Medical Informatics, University Medical Center Göttingen, Göttingen
Source
IT - Information Technology | 2017, Vol. 59, No. 4
Keywords
Data collection; Data curation; Data provenance; Information storage and retrieval; Medical informatics application
DOI
10.1515/itit-2016-0031
Abstract
In this opinion paper we provide an overview of some challenges concerning data provenance in biomedical research. We reflect on the current literature and describe examples of implicit or explicit provenance aspects in some standard data types in translational research. Furthermore, we assess the need for further data provenance standardization in biomedical informatics. Basic data provenance should record the origin of the data and the transformation steps applied to it, and should support replication and presentation of the data. Even though usable concepts for documenting data provenance can be found in other fields as early as 2005, their penetration rate in biomedical projects and in the biomedical literature is quite low. Awareness of the necessity of basic data provenance has to be raised, and the education of data managers has to be further improved. © 2017 De Gruyter Oldenbourg. All rights reserved.
Pages: 191-196
Number of pages: 5