Contextual Data Cleaning with Ontology Functional Dependencies

Cited by: 2
Authors
Zheng, Zheng [1]
Zheng, Longtao [2]
Alipourlangouri, Morteza [1]
Chiang, Fei [1]
Golab, Lukasz [3]
Szlichta, Jaroslaw [4]
Baskaran, Sridevi [1]
Affiliations
[1] McMaster Univ, 1280 Main St W, Hamilton, ON L8S 4K1, Canada
[2] Univ Sci & Technol China, 96 JinZhai Rd, Hefei 230026, Anhui, Peoples R China
[3] Univ Waterloo, 200 Univ Ave W, Waterloo, ON N2L 3G1, Canada
[4] Ontario Tech Univ, 2000 Simcoe St N, Oshawa, ON L1G 0C, Canada
Keywords
Data cleaning; ontology functional dependencies; efficient discovery; model
DOI
10.1145/3524303
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Functional Dependencies define attribute relationships based on syntactic equality, and when used in data cleaning, they erroneously label syntactically different but semantically equivalent values as errors. We explore dependency-based data cleaning with Ontology Functional Dependencies (OFDs), which express semantic attribute relationships such as synonyms defined by an ontology. We study the theoretical foundations of OFDs, including sound and complete axioms and a linear-time inference procedure. We then propose an algorithm for discovering OFDs (exact ones and ones that hold with some exceptions) from data that uses the axioms to prune the search space. Toward enabling OFDs as data quality rules in practice, we study the problem of finding minimal repairs to a relation and ontology with respect to a set of OFDs. We demonstrate the effectiveness of our techniques on real datasets and show that OFDs can significantly reduce the number of false positive errors in data cleaning techniques that rely on traditional Functional Dependencies.
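The contrast drawn in the abstract between syntactic and semantic equality can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy ontology `SENSES`, the helper names, and the sample rows are all hypothetical, and the check assumes a simplified reading of a synonym OFD in which tuples that agree on the left-hand side must have right-hand-side values sharing at least one common sense.

```python
from collections import defaultdict

# Hypothetical toy ontology: each sense maps to the names that denote it.
SENSES = {
    "US": {"United States", "USA", "America"},
    "UK": {"United Kingdom", "UK", "Britain"},
}

def senses_of(value):
    """Senses whose name set contains value; an unknown value is its own sense."""
    found = {s for s, names in SENSES.items() if value in names}
    return found or {("literal", value)}

def group_by(rows, lhs):
    """Partition rows by their values on the left-hand-side attributes."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in lhs)].append(row)
    return groups

def fd_violations(rows, lhs, rhs):
    """LHS groups whose RHS values are not syntactically identical."""
    groups = group_by(rows, lhs)
    return sorted(k for k, g in groups.items() if len({r[rhs] for r in g}) > 1)

def ofd_violations(rows, lhs, rhs):
    """LHS groups whose RHS values share no common sense (assumed OFD reading)."""
    bad = []
    for key, g in group_by(rows, lhs).items():
        common = set.intersection(*(senses_of(r[rhs]) for r in g))
        if not common:
            bad.append(key)
    return sorted(bad)

# Hypothetical sample relation.
rows = [
    {"code": "US", "country": "United States"},
    {"code": "US", "country": "USA"},
    {"code": "GB", "country": "United Kingdom"},
    {"code": "GB", "country": "France"},
]

print(fd_violations(rows, ["code"], "country"))   # both groups are flagged
print(ofd_violations(rows, ["code"], "country"))  # only the GB group is flagged
```

In this sketch the plain FD check flags the `US` group because "United States" and "USA" differ as strings, while the ontology-aware check accepts them as synonyms and reports only the `GB` group, which is the kind of false positive the abstract describes.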
Pages: 26
Related papers
50 in total
  • [21] Data Cleaning and Query Answering with Matching Dependencies and Matching Functions
    Bertossi, Leopoldo
    Kolahi, Solmaz
    Lakshmanan, Laks V. S.
    THEORY OF COMPUTING SYSTEMS, 2013, 52 (03) : 441 - 482
  • [22] Rule mining for automatic ontology based data cleaning
    Brueggemann, Stefan
    PROGRESS IN WWW RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2008, 4976 : 522 - 527
  • [23] Contextual dependencies and gender strategy
    Bednar, PM
    INFORMATION SYSTEMS RESEARCH: RELEVANT THEORY AND INFORMED PRACTICE, 2004, : 681 - 686
  • [24] Contextual dependencies in predictive learning
    Dibbets, P
    Maes, JHR
    Boermans, K
    Vossen, JMH
    MEMORY, 2001, 9 (01) : 29 - 38
  • [25] Improving XML Data Quality with Functional Dependencies
    Tan, Zijing
    Zhang, Liyong
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT I, 2011, 6587 : 450 - 465
  • [26] Managing merged data by vague functional dependencies
    Lu, A
    Ng, W
    CONCEPTUAL MODELING - ER 2004, PROCEEDINGS, 2004, 3288 : 259 - 272
  • [27] Conditional functional dependencies for capturing data inconsistencies
    Fan, Wenfei
    Geerts, Floris
    Jia, Xibei
    Kementsietsidis, Anastasios
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2008, 33 (02):
  • [28] FUNCTIONAL-DEPENDENCIES IN HIERARCHICAL STRUCTURES OF DATA
    YEMELCHENKOV, YP
    TSALENKO, MS
    LECTURE NOTES IN COMPUTER SCIENCE, 1991, 495 : 258 - 275
  • [29] Fast Detection of Functional Dependencies in XML Data
    Shi, Hang
    Amagasa, Toshiyuki
    Kitagawa, Hiroyuki
    DATABASE AND XML TECHNOLOGIES, 2010, 6309 : 113 - +
  • [30] Functional dependencies for selecting views in data cubes
    Garnaud, Eve
    Maabout, Sofian
    Mosbah, Mohamed
    JOURNAL OF DECISION SYSTEMS, 2012, 21 (01) : 71 - 91