Prov-Dominoes: An approach for knowledge discovery from provenance data

Cited by: 0
Authors
Alencar, Victor [1]
Kohwalter, Troy [2]
Braganholo, Vanessa [2]
Da Silva Junior, Jose Ricardo [3, 4]
Murta, Leonardo [2]
Affiliations
[1] CASNAV, Brazilian Navy, Rio De Janeiro, RJ, Brazil
[2] Univ Fed Fluminense, Inst Computacao, Niteroi, RJ, Brazil
[3] IFRJ, Dept Computacao, Niteroi, RJ, Brazil
[4] Inst Fed Rio Janeiro, Niteroi, RJ, Brazil
Keywords
Knowledge discovery; Data analysis; Provenance; GPU computing; Visualization; Model
DOI
10.1016/j.eswa.2023.123030
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Provenance has become increasingly relevant to understanding, auditing, and reproducing computational tasks. Provenance analysis can often overwhelm users because of the large volume of data, the multiple relationships among data, and the implicit information buried in the data. Existing provenance analysis tools either rely on visual exploration (which is overwhelming for large provenance graphs) or do not support the exploration of implicit provenance data, such as the inferences of the PROV Data Model Constraints. To fill this gap, we introduce Prov-Dominoes, a tool designed to enable interactive knowledge discovery on provenance data. Prov-Dominoes promotes the provenance relationships among entities, activities, and agents into first-class elements represented by domino tiles. It allows users to combine and compose such domino tiles visually and interactively, using the GPU. The benefits of Prov-Dominoes are threefold: first, it uses matrices to display provenance data, which are more compact than graphs; second, it allows users to easily explore implicit information; third, it can efficiently process large datasets using GPUs. We evaluated Prov-Dominoes in distinct case studies that show the tool in action. We also evaluated the performance of sequential combinations executed in Prov-Dominoes on provenance data with thousands of relations, comparing their execution on GPU and CPU. The results showed that, for a large dataset, the GPU was more than a hundred times faster than the CPU.
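The idea of treating provenance relations as matrices that can be combined might be illustrated with a small sketch. It assumes that combining two tiles corresponds to multiplying their matrices, which the abstract does not state explicitly. The sample entities, activities, and agents, as well as the helper names `relation_matrix` and `combine`, are hypothetical and not taken from the tool. The relation names `wasGeneratedBy` and `wasAssociatedWith` come from the PROV Data Model.

```python
# Hypothetical sample data, chosen only for illustration.
entities = ["report", "chart"]
activities = ["plot", "compile"]
agents = ["alice", "bob"]


def relation_matrix(rows, cols, pairs):
    """Build a 0/1 matrix with a 1 wherever (row, col) is a recorded relation."""
    row_index = {name: i for i, name in enumerate(rows)}
    col_index = {name: j for j, name in enumerate(cols)}
    matrix = [[0] * len(cols) for _ in rows]
    for r, c in pairs:
        matrix[row_index[r]][col_index[c]] = 1
    return matrix


def combine(a, b):
    """Assumed tile combination: a plain matrix product.

    Entry (i, j) counts the intermediate elements k that link i to j.
    """
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Entity x activity tile (wasGeneratedBy) and activity x agent tile
# (wasAssociatedWith).
was_generated_by = relation_matrix(
    entities, activities, [("chart", "plot"), ("report", "compile")]
)
was_associated_with = relation_matrix(
    activities, agents, [("plot", "alice"), ("compile", "bob")]
)

# Combining the two tiles yields an entity x agent tile: a link that is
# not recorded directly but is implied through the shared activity.
entity_agent = combine(was_generated_by, was_associated_with)
```

In this sketch, a nonzero entry of `entity_agent` marks an entity connected to an agent through at least one activity, which is the kind of implicit information the abstract refers to. A product of this form is also a natural candidate for GPU execution on large matrices, though the sketch runs on the CPU only.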
Pages: 17