Provenance Network Analytics: An approach to data analytics using data provenance

Authors
Trung Dong Huynh
Mark Ebden
Joel Fischer
Stephen Roberts
Luc Moreau
Affiliations
[1] University of Southampton, Electronics and Computer Science
[2] University of Oxford, Information Engineering, Department of Engineering Science
[3] University of Nottingham, Mixed Reality Lab., School of Computer Science
[4] King’s College London, Department of Informatics
Source
Data Mining and Knowledge Discovery | 2018 / Volume 32
Keywords
Data provenance; Data analytics; Network metrics; Graph classification
DOI
Not available
Abstract
Provenance network analytics is a novel data analytics approach that helps infer properties of data, such as quality or importance, from their provenance. Instead of analysing application data, which are typically domain-dependent, it analyses the data’s provenance as represented using the World Wide Web Consortium’s domain-agnostic PROV data model. Specifically, the approach proposes a number of network metrics for provenance data and applies established machine learning techniques over such metrics to build predictive models for some key properties of data. Applying this method to the provenance of real-world data from three different applications, we show that it can successfully identify the owners of provenance documents, assess the quality of crowdsourced data, and identify instructions from chat messages in an alternate-reality game with high levels of accuracy. By so doing, we demonstrate the different ways the proposed provenance network metrics can be used in analysing data, providing the foundation for provenance-based data analytics.
Pages: 708-735
Page count: 27
Related papers
50 in total
  • [1] Provenance Network Analytics: An approach to data analytics using data provenance
    Huynh, Trung Dong
    Ebden, Mark
    Fischer, Joel
    Roberts, Stephen
    Moreau, Luc
    DATA MINING AND KNOWLEDGE DISCOVERY, 2018, 32 (03) : 708 - 735
  • [2] Implementing Data Provenance in Health Data Analytics Software
    Xu, Shen
    Fairweather, Elliot
    Rogers, Toby
    Curcin, Vasa
    PROVENANCE AND ANNOTATION OF DATA AND PROCESSES, IPAW 2018, 2018, 11017 : 173 - 176
  • [3] A Data Provenance Visualization Approach
    Yazici, Ilkay Melek
    Karabulut, Erkan
    Aktas, Mehmet S.
    2018 14TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG), 2018, : 84 - 91
  • [4] A novel visualization approach for data provenance
    Yazici, Ilkay Melek
    Aktas, Mehmet S.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (09)
  • [5] A Proposed Approach for Provenance Data Gathering
    Sembay, Marcio Jose
    de Macedo, Douglas Dyllon Jeronimo
    Dutra, Moises Lima
    MOBILE NETWORKS & APPLICATIONS, 2021, 26 (01) : 304 - 318
  • [6] A Proposed Approach for Provenance Data Gathering
    Márcio José Sembay
    Douglas Dyllon Jeronimo de Macedo
    Moisés Lima Dutra
    Mobile Networks and Applications, 2021, 26 : 304 - 318
  • [7] Big Data Provenance Using Blockchain for Qualitative Analytics via Machine Learning
    Khan, Kashif Mehboob
    Haider, Warda
    Khan, Najeed Ahmed
    Saleem, Darakhshan
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2023, 29 (05) : 446 - 469
  • [8] Data Provenance Assurance In Cloud Using Blockchain
    Shetty, Sachin
    Red, Val
    Kamhoua, Charles
    Kwiat, Kevin
    Njilla, Laurent
    DISRUPTIVE TECHNOLOGIES IN SENSORS AND SENSOR SYSTEMS, 2017, 10206
  • [9] Data Provenance for healthcare: a blockchain-based approach
    D'Antonio, Salvatore
    Uccello, Federica
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 1655 - 1660
  • [10] Data Provenance in Environmental Monitoring
    da Silva, Daniel L.
    Batista, Andre
    Correa, Pedro L. P.
    PROCEEDINGS 2016 IEEE 13TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SENSOR SYSTEMS (MASS 2016), 2016, : 337 - 342