A case-based reasoning system for recommendation of data cleaning algorithms in classification and regression tasks

Cited by: 27
Authors
Camilo Corrales, David [1,2]
Ledezma, Agapito [1]
Carlos Corrales, Juan [2]
Affiliations
[1] Univ Carlos III Madrid, Dept Informat, Madrid 28911, Spain
[2] Univ Cauca, Grp Ingn Telemat, Sector Tulcan, Popayan, Colombia
Keywords
Case-based reasoning; Classification; Regression; Conceptual framework; Knowledge discovery; Support; Similarity; Selection; CBR
DOI
10.1016/j.asoc.2020.106180
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent advances in information technologies (social networks, mobile applications, the Internet of Things, etc.) generate a deluge of digital data, but converting these data into useful information for business decisions is a growing challenge. Exploiting this massive amount of data through the knowledge discovery (KD) process involves identifying valid, novel, potentially useful, and understandable patterns in a huge volume of data. However, preparing the data is a non-trivial refinement task that requires technical expertise in data cleaning methods and algorithms. Consequently, using a suitable data analysis technique is a headache for inexpert users. To address these problems, we propose a case-based reasoning (CBR) system to recommend data cleaning algorithms for classification and regression tasks. In our approach, we represent the problem space by the meta-features of the dataset, its attributes, and the target variable. The solution space contains the data cleaning algorithms used for each dataset. We represent the cases through a Data Cleaning Ontology. The case retrieval mechanism is composed of a filter phase and a similarity phase. In the first phase, we define two filter approaches, based on clustering and quartile analysis, which retrieve a reduced number of relevant cases. The second phase scores the similarity between the new case and the cases retrieved by the filters, and ranks them accordingly. The proposed retrieval mechanism was evaluated by a panel of judges, who scored the similarity between a query case and every case in the case base (ground truth). Measured against the judges' ranking, the retrieval mechanism reaches an average precision of 94.5% in the top 3 (P@3), 84.55% in the top 7 (P@7), and 78.35% in the top 10 (P@10). (C) 2020 Elsevier B.V. All rights reserved.
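The two-phase retrieval and the P@k measure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the similarity function, the meta-features, or the filter details, so the quartile filter on a single feature, the distance-based similarity, the feature names (`n_instances`, `missing_ratio`), the case identifiers, and the listed cleaning steps are all assumptions made for illustration.

```python
import math
from statistics import quantiles

# Hypothetical case base: meta-features (problem) and cleaning steps (solution).
CASE_BASE = [
    {"id": "c1", "features": {"n_instances": 100, "missing_ratio": 0.30}, "solution": ["imputation"]},
    {"id": "c2", "features": {"n_instances": 200, "missing_ratio": 0.00}, "solution": ["outlier removal"]},
    {"id": "c3", "features": {"n_instances": 300, "missing_ratio": 0.05}, "solution": ["imputation"]},
    {"id": "c4", "features": {"n_instances": 400, "missing_ratio": 0.20}, "solution": ["deduplication"]},
    {"id": "c5", "features": {"n_instances": 500, "missing_ratio": 0.10}, "solution": ["imputation"]},
    {"id": "c6", "features": {"n_instances": 600, "missing_ratio": 0.00}, "solution": ["outlier removal"]},
    {"id": "c7", "features": {"n_instances": 700, "missing_ratio": 0.15}, "solution": ["deduplication"]},
    {"id": "c8", "features": {"n_instances": 800, "missing_ratio": 0.25}, "solution": ["imputation"]},
]


def quartile_bin(value, cuts):
    """Return the quartile index (0 to 3) of value given three cut points."""
    return sum(value > c for c in cuts)


def similarity(a, b):
    """Assumed similarity: 1 / (1 + Euclidean distance) over the keys of a.

    Features are left unscaled to keep the sketch short; real meta-features
    would normally be normalized first.
    """
    dist = math.sqrt(sum((a[f] - b[f]) ** 2 for f in a))
    return 1.0 / (1.0 + dist)


def retrieve(query, case_base, filter_feature, k=3):
    """Two-phase retrieval: quartile filter, then similarity ranking."""
    # Phase 1 (filter): keep cases in the same quartile as the query.
    values = [c["features"][filter_feature] for c in case_base]
    cuts = quantiles(values, n=4)  # three quartile cut points
    q_bin = quartile_bin(query[filter_feature], cuts)
    candidates = [
        c for c in case_base
        if quartile_bin(c["features"][filter_feature], cuts) == q_bin
    ]
    # Phase 2 (similarity): rank the surviving cases against the query.
    ranked = sorted(
        candidates,
        key=lambda c: similarity(query, c["features"]),
        reverse=True,
    )
    return ranked[:k]


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved cases that the judges marked relevant."""
    top = retrieved_ids[:k]
    return sum(1 for r in top if r in relevant_ids) / k
```

Calling `retrieve({"n_instances": 320, "missing_ratio": 0.1}, CASE_BASE, "n_instances")` returns the cases that share the query's quartile, most similar first; their `solution` lists would then be the recommended cleaning steps. The filter trades recall for speed: a relevant case in a neighbouring quartile is never scored.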
Pages: 13