A Methodology for Prior Management of Temporal Data Quality in a Data Mining Process

Cited by: 0
Authors
Diop, Mouhamed [1]
Camara, Mamadou Samba [1]
Fall, Ibrahima [1]
Bah, Alassane [1]
Affiliations
[1] Univ Cheikh Anta Diop Dakar UCAD, ESP, UMMISCO UCAD, UMI 209, Dakar, Senegal
Source
2017 INTELLIGENT SYSTEMS AND COMPUTER VISION (ISCV) | 2017
Keywords
Data Mining; Data Quality; Temporal Data; CRISP-DM; Software Engineering; Data warehousing; KNOWLEDGE DISCOVERY; MISSING DATA; MODELS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In Data Mining (DM) projects, particularly in the Data Understanding and Data Preparation phases, several techniques from the literature are used to detect and handle data quality problems such as missing data, outliers, inconsistent data, or time-variant data. However, the main limitation in applying these techniques is the complexity caused by a lack of anticipation in the detection and resolution of data quality problems. To address this, a DM process model designed for the prior management of data quality was recently proposed. Its distinctive feature is that it links the DM process and the Software Engineering (SE) process by running them in parallel. However, the authors of that work [1] only specified what should be done, not how it should be done. The present work improves on that DM process model by adding a methodology that gives concrete guidance on how to combine the SE and DM processes to anticipate and manage data quality problems that can arise during the mining process. This work specifically addresses the case of temporal data. The main contribution of this methodology is a concrete definition of how to anticipate and automate all the activities necessary to remove temporal data quality problems in a mining process.
Pages: 8
Related papers
34 in total
  • [1] [Anonymous], TEMPORAL SPATIOTEMPO
  • [2] [Anonymous], 24744 ISOIEC
  • [3] [Anonymous], LECT NOTES ELECT ENG
  • [4] [Anonymous], LECT NOTES COMPUTER
  • [5] [Anonymous], 1996, ADV KNOWLEDGE DISCOV
  • [6] [Anonymous], MANAGING TIME DATABA
  • [7] [Anonymous], CRISP DM 1 0 STEP BY
  • [8] [Anonymous], THESIS
  • [9] [Anonymous], IEEE 7 INT C RES CHA
  • [10] Antunes C.M., 2001, KDD workshop on temporal data mining, V1, P13