An AI Planning System for Data Cleaning

Cited by: 4
Authors
Boselli, Roberto [1, 2]
Cesarini, Mirko [1, 2]
Mercorio, Fabio [1, 2]
Mezzanzanica, Mario [1, 2]
Affiliations
[1] Univ Milano Bicocca, Dept Stat & Quantitat Methods, Milan, Italy
[2] Univ Milano Bicocca, CRISP Res Ctr, Milan, Italy
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2017, PT III | 2017 / Vol. 10536
Keywords
AI planning; Data quality; Data cleaning; ETL; CHECKING;
DOI
10.1007/978-3-319-71273-4_29
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data cleaning is a crucial and error-prone activity in KDD that can have unpredictable effects on data analytics, affecting the believability of the whole KDD process. In this paper we describe how a bridge between the AI Planning and Data Quality communities has been made by expressing both the data quality and data cleaning tasks in terms of AI planning. We also report a real-life application of our approach.
Pages: 349 - 353
Page count: 5
Related papers
50 in total
  • [41] Cleaning genotype data
    Broman, KW
    GENETIC EPIDEMIOLOGY, 1999, 17 : S79 - S83
  • [42] Possibilistic Data Cleaning
    Koehler, Henning
    Link, Sebastian
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (12) : 5939 - 5950
  • [43] A data preparation framework for cleaning electronic health records and assessing cleaning outcomes for secondary analysis
    Miao, Zhuqi
    Sealey, Meghan D.
    Sathyanarayanan, Shrieraam
    Delen, Dursun
    Zhu, Lan
    Shepherd, Scott
    INFORMATION SYSTEMS, 2023, 111
  • [44] Towards Transparent Data Cleaning: The Data Cleaning Model Explorer (DCM/X)
    Parulian, Nikolaus N.
    Ludascher, Bertram
    2021 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2021), 2021, : 326 - 327
  • [45] Data cleaning process for HIV-indicator data extracted from DHIS2 national reporting system: a case study of Kenya
    Gesicho, Milka Bochere
    Were, Martin Chieng
    Babic, Ankica
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (01)
  • [46] Cleaning Framework for BigData - AN INTERACTIVE APPROACH FOR DATA CLEANING
    Liu, Hong
    Kumar, Ashwin T. K.
    Thomas, Johnson P.
    Hou, Xiaofei
    PROCEEDINGS 2016 IEEE SECOND INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE 2016), 2016, : 174 - 181
  • [47] bdc: A toolkit for standardizing, integrating and cleaning biodiversity data
    Ribeiro, Bruno R.
    Elias Velazco, Santiago Jose
    Guidoni-Martins, Karlo
    Tessarolo, Geiziane
    Jardim, Lucas
    Bachman, Steven P.
    Loyola, Rafael
    METHODS IN ECOLOGY AND EVOLUTION, 2022, 13 (07) : 1421 - 1428
  • [48] An Incorrect Data Detection Method for Big Data Cleaning of Machinery Condition Monitoring
    Xu, Xuefang
    Lei, Yaguo
    Li, Zeda
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2020, 67 (03) : 2326 - 2336
  • [49] Research on Data Cleaning Method Based on SNM Algorithm
    Zhang, Ningning
    Guo, Aizhang
    Sun, Tao
    2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017, : 2639 - 2643
  • [50] A Universal Data Cleaning Framework Based on User Model
    Huang Yu
    Zhang Xiao-yi
    Yuan Zhen
    Jiang Guo-quan
    2009 ISECS INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT, VOL II, 2009, : 200 - 202