A Data Quality Framework for Graph-Based Virtual Data Integration Systems

Cited by: 0
Authors
Li, Yalei [1]
Nadal, Sergi [1]
Romero, Oscar [1]
Affiliations
[1] Univ Politecn Catalunya BarcelonaTech, Barcelona, Spain
Source
ADVANCES IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2022 | 2022, Vol. 13389
Keywords
Data Quality; Data integration; Denial constraints; APPROXIMATE; DISCOVERY;
DOI
10.1007/978-3-031-15740-0_9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data Quality (DQ) plays a critical role in data integration. Up to now, DQ has mostly been addressed from a single-database perspective. Popular DQ frameworks rely on Integrity Constraints (ICs) to enforce valid application semantics, which led to the Denial Constraint (DC) formalism, which models a broad range of ICs found in real-world applications. Yet current approaches are rather monolithic: they consider a single database and do not suit data integration scenarios. In this paper, we address DQ for data integration systems. Specifically, we extend virtual data integration systems to elicit DCs from the disparate data sources to be integrated, using state-of-the-art DC techniques, and propagate them to the integrated schema (global DCs). Then, we propose a method to manage global DCs and identify (i) minimal DCs and (ii) potential clashes between them.
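For readers unfamiliar with the formalism, the following is a minimal illustrative sketch of a denial constraint, not the paper's implementation. It assumes a DC is a conjunction of predicates over a pair of tuples and that a pair violates the DC when every predicate holds. The names `violations`, `is_redundant`, the sample rows, and the constraints are all hypothetical. The redundancy check uses a simplified, purely syntactic notion of minimality: a DC whose predicate set strictly contains that of another DC is implied by it.

```python
from itertools import permutations
import operator

# Comparison operators allowed in a predicate (an assumption of this sketch).
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt, ">": operator.gt}


def violations(rows, dc):
    """Return ordered index pairs (i, j) for which every predicate of dc holds.

    dc is a list of (attr_of_t1, op, attr_of_t2) triples, read as
    NOT (t1.a1 op1 t2.b1 AND t1.a2 op2 t2.b2 AND ...).
    """
    found = []
    for (i, t1), (j, t2) in permutations(enumerate(rows), 2):
        if all(OPS[op](t1[a], t2[b]) for a, op, b in dc):
            found.append((i, j))
    return found


def is_redundant(dc, others):
    """True if some other DC uses a strict subset of dc's predicates.

    Forbidding a smaller conjunction also forbids every larger one,
    so the larger DC adds nothing (syntactic check only).
    """
    return any(set(other) < set(dc) for other in others)


# Hypothetical sample data.
rows = [
    {"zip": "08034", "city": "Barcelona"},
    {"zip": "08034", "city": "Barcelona"},
    {"zip": "08034", "city": "Girona"},
]

# The functional dependency zip -> city written as a DC:
# NOT (t1.zip = t2.zip AND t1.city != t2.city)
fd_as_dc = [("zip", "=", "zip"), ("city", "!=", "city")]

# A stricter hypothetical DC: no two tuples may share a zip code at all.
unique_zip = [("zip", "=", "zip")]

print(violations(rows, fd_as_dc))          # pairs that involve the Girona row
print(is_redundant(fd_as_dc, [unique_zip]))
```

The pairwise scan is quadratic in the number of rows, which is fine for illustration but is not how DC discovery or checking is done at scale.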
Pages: 104-117
Number of pages: 14
Related papers
50 in total
  • [1] A Knowledge Graph-Based Data Integration Framework Applied to Battery Data Management
    Kalayci, Tahir Emre
    Bricelj, Bor
    Lah, Marko
    Pichler, Franz
    Scharrer, Matthias K.
    Rubesa-Zrim, Jelena
    SUSTAINABILITY, 2021, 13 (03) : 1 - 17
  • [2] Graph-based Data Integration and Business Intelligence with BIIIG
    Petermann, Andre
    Junghanns, Martin
    Muller, Robert
    Rahm, Erhard
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (13) : 1577 - 1580
  • [3] A knowledge graph-based data harmonization framework for secondary data reuse
    Abad-Navarro, Francisco
    Martinez-Costa, Catalina
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 243
  • [4] Graph-based Management of Neuroscience data: Representation, Integration and Analysis
    Gulnes, Maren Parnas
    Soylu, Ahmet
    Roman, Dumitru
    ERCIM NEWS, 2021, (125) : 44 - 45
  • [5] Graph-based sequence annotation using a data integration approach
    Pesch, Robert
    Lysenko, Artem
    Hindle, Matthew
    Hassani-Pak, Keywan
    Thiele, Ralf
    Rawlings, Christopher
    Koehler, Jacob
    Taubert, Jan
    JOURNAL OF INTEGRATIVE BIOINFORMATICS, 2008, 5 (02)
  • [6] Flexible data integration and curation using a graph-based approach
    Croset, Samuel
    Rupp, Joachim
    Romacker, Martin
    BIOINFORMATICS, 2016, 32 (06) : 918 - 925
  • [7] A framework for quality evaluation in data integration systems
    Akoka, J.
    Berti-Equille, L.
    Boucelma, O.
    Bouzeghoub, M.
    Comyn-Wattiau, I.
    Cosquer, M.
    Goasdoue-Thion, V.
    Kedad, Z.
    Nugier, S.
    Peralta, V.
    Sisaid-Cherfi, S.
    ICEIS 2007: PROCEEDINGS OF THE NINTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS: INFORMATION SYSTEMS ANALYSIS AND SPECIFICATION, 2007, : 170 - +
  • [8] Modeling recurring concepts in data streams: a graph-based framework
    Ahmadi, Zahra
    Kramer, Stefan
    KNOWLEDGE AND INFORMATION SYSTEMS, 2018, 55 (01) : 15 - 44
  • [9] A multiobjective evolutionary programming framework for graph-based data mining
    Shelokar, Prakash
    Quirin, Arnaud
    Cordon, Oscar
    INFORMATION SCIENCES, 2013, 237 : 118 - 136
  • [10] Modeling recurring concepts in data streams: a graph-based framework
    Zahra Ahmadi
    Stefan Kramer
    Knowledge and Information Systems, 2018, 55 : 15 - 44