Directly Constructing Multiple Features for Classification with Missing Data using Genetic Programming with Interval Functions

Cited by: 2
Authors
Tran, Cao Truong [1]
Zhang, Mengjie [1]
Andreae, Peter [1]
Xue, Bing [1]
Affiliation
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Source
PROCEEDINGS OF THE 2016 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'16 COMPANION) | 2016
Keywords
Missing Data; Classification; Feature Construction; Genetic Programming; Interval Functions;
DOI
10.1145/2908961.2909002
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Missing values are a common issue in many industrial and real-world datasets. Genetic programming-based multiple feature construction (GPMFC) is a promising recent filter approach to constructing multiple features for classification using genetic programming (GP). GPMFC has been demonstrated to improve classification performance and reduce the complexity of many decision trees and rule-based classifiers, but it cannot work with missing data. To deal with missing data, this paper proposes IGPMFC, an extension of GPMFC that uses interval functions as the GP function set to directly construct multiple features for classification with missing data. Empirical results on five datasets and four classifiers show that IGPMFC can substantially improve the performance and reduce the complexity of the classifiers when faced with missing data.
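
The abstract does not spell out how interval functions let a GP tree evaluate an instance with missing values. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a missing value is replaced by the interval spanning the feature's observed minimum and maximum, that a known value becomes a degenerate interval, and that the function set consists of interval addition, subtraction, and multiplication. The names `Interval`, `to_interval`, and `constructed_feature` are hypothetical, and the tree `(x1 + x2) * x3` is an invented example of an evolved feature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] used in place of a single number."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other: "Interval") -> "Interval":
        # The product's bounds are the extremes of the four endpoint products.
        products = [
            self.lo * other.lo,
            self.lo * other.hi,
            self.hi * other.lo,
            self.hi * other.hi,
        ]
        return Interval(min(products), max(products))


def to_interval(value: Optional[float], feature_min: float, feature_max: float) -> Interval:
    """Assumption: a missing value (None) becomes the feature's full observed range."""
    if value is None:
        return Interval(feature_min, feature_max)
    return Interval(value, value)


def constructed_feature(x1: Interval, x2: Interval, x3: Interval) -> Interval:
    """Hypothetical evolved GP tree: (x1 + x2) * x3."""
    return (x1 + x2) * x3


# A complete instance yields a degenerate interval (a single value).
complete = constructed_feature(
    to_interval(2.0, 0.0, 5.0),
    to_interval(3.0, 0.0, 1.0),
    to_interval(4.0, 0.0, 10.0),
)

# An instance whose second feature is missing still yields an output interval.
incomplete = constructed_feature(
    to_interval(2.0, 0.0, 5.0),
    to_interval(None, 0.0, 1.0),
    to_interval(4.0, 0.0, 10.0),
)
```

Under these assumptions, the constructed feature is defined for every instance, and a downstream classifier would need some convention for consuming the interval output, such as using its bounds or midpoint; the abstract does not say which.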
Pages: 69-70
Page count: 2