Contextual Data Cleaning with Ontology Functional Dependencies

Cited by: 0
Authors
Zheng, Zheng [1]
Zheng, Longtao [2]
Alipourlangouri, Morteza [1]
Chiang, Fei [1]
Golab, Lukasz [3]
Szlichta, Jaroslaw [4]
Baskaran, Sridevi [1]
Affiliations
[1] McMaster University, 1280 Main Street, Hamilton, ON, L8S 4K1, Canada
[2] University of Science and Technology of China, No. 96, JinZhai Road, Baohe District, Hefei, Anhui, 230026, China
[3] University of Waterloo, 200 University Ave W, Waterloo, ON, N2L 3G1, Canada
[4] Ontario Tech University, 2000 Simcoe St N, Oshawa, ON, L1G 0C, Canada
Keywords
Cleaning; Semantics
DOI
Not available
Chinese Library Classification number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Functional Dependencies define attribute relationships based on syntactic equality, and when used in data cleaning, they erroneously label syntactically different but semantically equivalent values as errors. We explore dependency-based data cleaning with Ontology Functional Dependencies (OFDs), which express semantic attribute relationships such as synonyms defined by an ontology. We study the theoretical foundations of OFDs, including sound and complete axioms and a linear-time inference procedure. We then propose an algorithm for discovering OFDs (exact ones and ones that hold with some exceptions) from data that uses the axioms to prune the search space. Toward enabling OFDs as data quality rules in practice, we study the problem of finding minimal repairs to a relation and ontology with respect to a set of OFDs. We demonstrate the effectiveness of our techniques on real datasets and show that OFDs can significantly reduce the number of false positive errors in data cleaning techniques that rely on traditional Functional Dependencies. © 2022 Association for Computing Machinery.
Related papers
50 in total
  • [31] RECOGNITION OF FUNCTIONAL DEPENDENCIES USING METEOROLOGICAL DATA
    VAPNIK, VN
    ROMANOV, LN
    IZVESTIYA AKADEMII NAUK SSSR FIZIKA ATMOSFERY I OKEANA, 1978, 14 (02): : 131 - 137
  • [32] Efficient Discovery of Functional Dependencies on Massive Data
    Wan, Xiaolong
    Han, Xixian
    Wang, Jinbao
    Li, Jianzhong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (01) : 107 - 121
  • [33] Ontologies and Functional Dependencies for Data Integration and Reconciliation
    Bakhtouchi, Abdelghani
    Bellatreche, Ladjel
    Ait-Ameur, Yamine
    ADVANCES IN CONCEPTUAL MODELING: RECENT DEVELOPMENTS AND NEW DIRECTIONS, 2011, 6999 : 98 - +
  • [34] Approximate Temporal Functional Dependencies on Clinical Data
    Mantovani, Matteo
    2017 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI), 2017, : 328 - 328
  • [35] Mining relaxed functional dependencies from data
    Loredana Caruccio
    Vincenzo Deufemia
    Giuseppe Polese
    Data Mining and Knowledge Discovery, 2020, 34 : 443 - 477
  • [36] Functional Dependencies Unleashed for Scalable Data Exchange
    Bonifati, Angela
    Ileana, Ioana
    Linardi, Michele
    28TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT (SSDBM 2016), 2016,
  • [37] Mining relaxed functional dependencies from data
    Caruccio, Loredana
    Deufemia, Vincenzo
    Polese, Giuseppe
    DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 34 (02) : 443 - 477
  • [38] A functional dependencies checking method in relational data
    Zhong P.
    Li Z.-H.
    Chen Q.
    1600, Science Press (40): : 207 - 222
  • [39] Threshold Functional Dependencies for Time Series Data
    Ji, Mingyue
    Wei, Xiukun
    Miao, Dongjing
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2020, 2020, 12115 : 164 - 174
  • [40] Semi-Automatic Ontology Construction by Exploiting Functional Dependencies and Association Rules
    Cagliero, Luca
    Cerquitelli, Tania
    Garza, Paolo
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2011, 7 (02) : 1 - 22