Decision tree: Compatibility of techniques for handling missing values at training and testing

Authors
Gavankar S. [1]
Sawarkar S. [1]
Affiliation
[1] Department of Computer Engineering, Datta Meghe College of Engineering, Mumbai University, Navi Mumbai
Source
2016 / UK Simulation Society, Clifton Lane, Nottingham, NG11 8NS, United Kingdom / Issue 17
Keywords
Compatibility; Data mining; Decision tree; Induction; Missing values; Testing data; Training data
DOI
10.5013/IJSSST.a.17.34.10
Abstract
Data mining relies on large amounts of data to build a learning model, so the quality of that data is very important. One important data-quality problem is the presence of missing values at both training and testing time. Many methods have been proposed to deal with missing values in training data, and many of them resort to imputation techniques. However, very few methods deal with missing values at testing/prediction time. In this paper, we discuss and summarize various strategies for dealing with this problem at both training and testing time. We also present an analysis of the compatibility between the various methods used at training and at testing. Our analysis indicates that the known-value strategy at testing time performs best when combined with the various missing-value handling techniques for training data, followed by C4.5. © 2016, UK Simulation Society. All rights reserved.
Pages: 10.1 - 10.7