Knowledge Based Data Cleaning for Data Warehouse Quality

Cited by: 0
Authors
Bradji, Louardi [1 ,2 ]
Boufaida, Mahmoud [2 ]
Affiliations
[1] Univ Tebessa, Tebessa 12002, Algeria
[2] Mentouri Univ Constantine, LIRE lab, Constantine 25017, Algeria
Source
DIGITAL INFORMATION PROCESSING AND COMMUNICATIONS, PT 2 | 2011 / Vol. 189
Keywords
Data Cleaning; Data Quality; Data Warehouse; Knowledge;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper describes an approach for improving the quality of data warehouses and operational databases by using knowledge. The benefit of this approach is threefold. First, incorporating knowledge into data cleaning helps meet the user's demands, and the data cleaning can then be expanded and modified. The knowledge, which can be extracted automatically or manually, is stored in a repository so that it can be used and validated through an appropriate process. Second, cleaned data are propagated back to their original sources so that the user can validate them, since data cleaning can produce values that are valid but incorrect. In addition, the mutual coherence of the data is ensured. Third, user interaction with the data cleaning process is taken into account in order to control it. The proposed approach is based on the idea that data quality should be assured at both the sources and the target of the data.
Pages: 373 / +
Page count: 3
Related papers
20 in total
  • [1] Berti-Equille L., 2004, REV NATL TECHNOLOGIE
  • [2] Berti-Equille L., 2009, INT C DAT MIN ICDM 2
  • [3] Favre C, 2007, LECT NOTES COMPUT SC, V4654, P13
  • [4] Biological data cleaning: A case study
    Herbert, Katherine G.
    Wang, Jason T.L.
    [J]. International Journal of Information Quality, 2007, 1 (01) : 60 - 82
  • [5] Huanzhuo Ye, 2010, 2010 2nd International Conference on Computer Engineering and Technology (ICCET), P158, DOI 10.1109/ICCET.2010.5485262
  • [6] Kedad Z., 2002, J INGENIERIE SYSTEME, V7, P39
  • [7] Kororoas A., 2007, INFORM QUALITY MANAG, P221
  • [8] Le Pape C., 2005, Journal of Digital Information Management, V3, P82
  • [9] Exploiting missing clinical data in Bayesian network modeling for predicting medical problems
    Lin, Jau-Huei
    Haug, Peter J.
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2008, 41 (01) : 1 - 14
  • [10] Luebbers D., 2003, Proc. 29th Int. Conf. Very large data bases, V29, P548