Prov-Dominoes: An approach for knowledge discovery from provenance data

Cited by: 0

Authors:

Alencar, Victor [1]

Kohwalter, Troy [2]

Braganholo, Vanessa [2]

Da Silva Junior, Jose Ricardo [3,4]

Murta, Leonardo [2]

Affiliations:

[1] CASNAV, Brazilian Navy, Rio De Janeiro, RJ, Brazil

[2] Univ Fed Fluminense, Inst Computacao, Niteroi, RJ, Brazil

[3] IFRJ, Dept Computacao, Niteroi, RJ, Brazil

[4] Inst Fed Rio Janeiro, Niteroi, RJ, Brazil

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Volume 245

Keywords:

Knowledge discovery; Data analysis; Provenance; GPU computing; VISUALIZATION; MODEL

DOI:

10.1016/j.eswa.2023.123030

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Provenance has become increasingly relevant to understanding, auditing, and reproducing computational tasks. The provenance analysis processes can often be overwhelming to the user due to the large volume of data, the multiple relationships among data, and the implicit information buried in the data. Existing provenance analysis tools use either visual exploration (which is overwhelming for large provenance graphs) or do not support the exploration of implicit provenance data, such as the inferences of the PROV Data Model Constraints. To fill in this gap, we introduce Prov-Dominoes, a tool designed to interactively enable knowledge discovery on provenance data. Prov-Dominoes promotes the provenance relationships among entities, activities, and agents into first-class elements represented by domino tiles. It allows users to combine and compose such domino tiles visually and interactively, using GPU. The benefits of Prov-Dominoes are three-fold: first, it uses matrices to display provenance data, which is more compact than graphs; second, it allows users to easily explore implicit information; third, it is capable of efficiently processing large datasets using GPUs. We evaluated Prov-Dominoes over distinct case studies, allowing the observation of Prov-Dominoes in action. We also evaluated the performance of sequential combinations executed in Prov-Dominoes when dealing with provenance data with thousands of relations, contrasting their executions in GPU and CPU. The results showed that, for a large dataset, GPU was more than a hundred times faster than CPU.


Pages: 17

