Contextual Data Cleaning with Ontology Functional Dependencies

Cited by: 0
Authors
Zheng, Zheng [1]
Zheng, Longtao [2]
Alipourlangouri, Morteza [1]
Chiang, Fei [1]
Golab, Lukasz [3]
Szlichta, Jaroslaw [4]
Baskaran, Sridevi [1]
Affiliations
[1] McMaster University, 1280 Main Street, Hamilton, ON, L8S 4K1, Canada
[2] University of Science and Technology of China, No. 96, JinZhai Road, Baohe District, Anhui, Hefei, 230026, China
[3] University of Waterloo, 200 University Ave W, Waterloo, ON, N2L 3G1, Canada
[4] Ontario Tech University, 2000 Simcoe St N, Oshawa, ON, L1G 0C, Canada
Keywords
Cleaning; Semantics
DOI
Not available
Chinese Library Classification number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Functional Dependencies define attribute relationships based on syntactic equality, and when used in data cleaning, they erroneously label syntactically different but semantically equivalent values as errors. We explore dependency-based data cleaning with Ontology Functional Dependencies (OFDs), which express semantic attribute relationships such as synonyms defined by an ontology. We study the theoretical foundations of OFDs, including sound and complete axioms and a linear-time inference procedure. We then propose an algorithm for discovering OFDs (exact ones and ones that hold with some exceptions) from data that uses the axioms to prune the search space. Toward enabling OFDs as data quality rules in practice, we study the problem of finding minimal repairs to a relation and ontology with respect to a set of OFDs. We demonstrate the effectiveness of our techniques on real datasets and show that OFDs can significantly reduce the number of false positive errors in data cleaning techniques that rely on traditional Functional Dependencies. © 2022 Association for Computing Machinery.
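To make the contrast in the abstract concrete, the following is a minimal sketch of how a synonym check might differ from a plain Functional Dependency check. It is not the authors' algorithm. It assumes one common reading of a synonym OFD X -> Y: tuples that agree on X must have Y-values that share at least one sense in the ontology. The toy ontology, the sample rows, and the names `ONTOLOGY`, `senses`, `group_by`, `fd_violations`, and `ofd_violations` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology: each sense maps to the set of values
# that are treated as synonyms under that sense.
ONTOLOGY = {
    "country:US": {"USA", "United States", "America"},
    "country:CA": {"Canada", "CA"},
    "state:CA": {"California", "CA"},
}

# Hypothetical sample relation.
ROWS = [
    {"code": "US", "country": "USA"},
    {"code": "US", "country": "United States"},
    {"code": "CA", "country": "Canada"},
    {"code": "CA", "country": "California"},
]


def senses(value):
    """Return every sense under which `value` appears in the ontology."""
    return {sense for sense, names in ONTOLOGY.items() if value in names}


def group_by(rows, lhs):
    """Partition rows into groups that agree on all attributes in `lhs`."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[attr] for attr in lhs)].append(row)
    return groups


def fd_violations(rows, lhs, rhs):
    """Traditional FD check: flag groups whose rhs values differ syntactically."""
    return [
        key
        for key, group in group_by(rows, lhs).items()
        if len({row[rhs] for row in group}) > 1
    ]


def ofd_violations(rows, lhs, rhs):
    """Assumed synonym check: flag groups whose rhs values share no sense."""
    violations = []
    for key, group in group_by(rows, lhs).items():
        values = {row[rhs] for row in group}
        if len(values) == 1:
            continue  # syntactically equal values are trivially consistent
        common = set.intersection(*(senses(value) for value in values))
        if not common:
            violations.append(key)
    return violations
```

On the sample rows, the plain FD check flags both groups, while the synonym check keeps only the group that mixes "Canada" and "California"; the "USA" / "United States" group is the kind of false positive the abstract describes.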