Method for the Assessment of Semantic Accuracy Using Rules Identified by Conditional Functional Dependencies

Cited by: 1
Authors
Santana, Vanusa S. [1]
Lopes, Fabio S. [2]
Affiliations
[1] Inst Technol Res, Grad Master Program IPT, Sao Paulo, Brazil
[2] Univ Presbiteriana Mackenzie, Sao Paulo, Brazil
Source
METADATA AND SEMANTIC RESEARCH, MTSR 2019 | 2019 / Vol. 1057
Keywords
Data quality; Conditional functional dependence (CFD); Data quality assessment
DOI
10.1007/978-3-030-36599-8_25
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data is a central resource of organizations, which makes data quality essential for their intellectual growth. Quality is a multifaceted concept and, in general, refers to fitness for use. The foundation of a quality evaluation is therefore a set of quality rules derived from business criteria. However, it may be impossible to specify these rules manually. Conditional Functional Dependencies (CFDs) make it possible to identify context-dependent quality rules automatically. This paper presents a method for assessing data quality that uses the CFD concept to extract quality rules and identify inconsistencies. In the proposed method, the quality of the database is evaluated in the semantic accuracy dimension. The method combines the knowledge discovery process with data quality assessment and lists the activities that lead to the quantification of semantic accuracy. An instance of the method is demonstrated by applying it to air quality monitoring data. The evaluation showed that the CFD rules were able to reflect some atmospheric phenomena, yielding interesting context-dependent rules. The transaction patterns, which may be unknown to users, can be used as input for the evaluation and monitoring of data quality.
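As an illustration of the idea summarized in the abstract, the following minimal Python sketch checks records against constant CFD rules and reports the share of records that violate none of them. It is not the authors' method: the rule format, the function names (`violates`, `semantic_accuracy`), the sample records, and the choice of "fraction of non-violating records" as the accuracy measure are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
# A constant CFD, written here (hypothetically) as a pair of patterns:
# if a row matches every attribute in the condition, it must also
# match every attribute in the consequent.
Rule = Tuple[Dict[str, str], Dict[str, str]]


def violates(row: Row, rule: Rule) -> bool:
    """Return True when the rule applies to the row but the row breaks it."""
    condition, consequent = rule
    applies = all(row.get(attr) == val for attr, val in condition.items())
    return applies and any(row.get(attr) != val for attr, val in consequent.items())


def semantic_accuracy(rows: List[Row], rules: List[Rule]) -> float:
    """Fraction of rows that violate none of the rules (assumed measure)."""
    if not rows:
        return 1.0
    bad = sum(1 for row in rows if any(violates(row, rule) for rule in rules))
    return 1 - bad / len(rows)


# Hypothetical rule: records with country "BR" and area code "11"
# are expected to have city "Sao Paulo".
rules: List[Rule] = [({"country": "BR", "area_code": "11"}, {"city": "Sao Paulo"})]

# Hypothetical records; the third one breaks the rule.
rows: List[Row] = [
    {"country": "BR", "area_code": "11", "city": "Sao Paulo"},
    {"country": "BR", "area_code": "21", "city": "Rio de Janeiro"},
    {"country": "BR", "area_code": "11", "city": "Curitiba"},
    {"country": "PT", "area_code": "11", "city": "Lisboa"},
]

accuracy = semantic_accuracy(rows, rules)
print(f"semantic accuracy: {accuracy:.2f}")
```

In this sketch the rules are written by hand. In the setting the abstract describes, they would instead be discovered from the data, and only then used to flag inconsistent records.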
Pages: 284-297
Page count: 14