An AI Planning System for Data Cleaning

Cited by: 4
Authors
Boselli, Roberto [1, 2]
Cesarini, Mirko [1, 2]
Mercorio, Fabio [1, 2]
Mezzanzanica, Mario [1, 2]
Affiliations
[1] Univ Milano Bicocca, Dept Stat & Quantitat Methods, Milan, Italy
[2] Univ Milano Bicocca, CRISP Res Ctr, Milan, Italy
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2017, PT III | 2017 / Vol. 10536
Keywords
AI planning; Data quality; Data cleaning; ETL; CHECKING;
DOI
10.1007/978-3-319-71273-4_29
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data cleaning is a crucial and error-prone activity in KDD that can have unpredictable effects on data analytics, affecting the believability of the whole KDD process. In this paper we describe how a bridge between the AI Planning and Data Quality communities has been built by expressing both the data quality and the data cleaning tasks in terms of AI planning. We also report a real-life application of our approach.
Pages: 349 - 353
Page count: 5
Related papers
50 in total
  • [1] Cleaning data with Llunatic
    Geerts, Floris
    Mecca, Giansalvatore
    Papotti, Paolo
    Santoro, Donatello
    VLDB JOURNAL, 2020, 29 (04): 867 - 892
  • [2] Cleaning data with Llunatic
    Floris Geerts
    Giansalvatore Mecca
    Paolo Papotti
    Donatello Santoro
    The VLDB Journal, 2020, 29 : 867 - 892
  • [3] A Data Fusion and Data Cleaning System for Smart Grids Big Data
    Lv, Zhining
    Deng, Wei
    Zhang, Zhihan
    Guo, Ningxuan
    Yan, Gangfeng
    2019 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2019), 2019: 802 - 807
  • [4] A FRAMEWORK FOR DATA CLEANING IN DATA WAREHOUSES
    Peng, Taoxin
    ICEIS 2008: PROCEEDINGS OF THE TENTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL DISI: DATABASES AND INFORMATION SYSTEMS INTEGRATION, 2008: 473 - 478
  • [5] Design and implementation of extensible system for data cleaning
    Cui Yun-chuan
    Liu Lian-zhang
    PROCEEDINGS OF 2006 CHINESE CONTROL AND DECISION CONFERENCE, 2006: 829 - 832
  • [6] CrowdCleaner: A Data Cleaning System Based on Crowdsourcing
    Ye, Chen
    Wang, Hongzhi
    Li, Keli
    Chen, Qian
    Chen, Jianhua
    Song, Jiangduo
    Yuan, Weidong
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 657 - 661
  • [7] Research on the Technology of Data Cleaning in Big Data
    Feng, Fu-jun
    Yao, Jun-ping
    Li, Xiao-jun
    2018 2ND INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, MODELING AND SIMULATION (AMMS 2018), 2018, 305 : 176 - 181
  • [8] Visual Cleaning of Genotype Data
    Kennedy, Jessie
    Graham, Martin
    Paterson, Trevor
    Law, Andy
    2013 IEEE SYMPOSIUM ON BIOLOGICAL DATA VISUALIZATION (BIOVIS), 2013: 105 - 112
  • [9] Ontological Deep Data Cleaning
    Woodfield, Scott N.
    Seeger, Spencer
    Litster, Samuel
    Liddle, Stephen W.
    Grace, Brenden
    Embley, David W.
    CONCEPTUAL MODELING, ER 2018, 2018, 11157 : 100 - 108
  • [10] A Review of Data Cleaning Methods for Web Information System
    Wang, Jinlin
    Wang, Xing
    Yang, Yuchen
    Zhang, Hongli
    Fang, Binxing
    CMC-COMPUTERS MATERIALS & CONTINUA, 2020, 62 (03): 1053 - 1075