Feature Extraction and Selection for Parsimonious Classifiers With Multiobjective Genetic Programming

Cited by: 35
Authors
Nag, Kaustuv [1]
Pal, Nikhil R. [2]
Affiliations
[1] Indian Inst Informat Technol Guwahati, Dept Comp Sci & Engn, Gauhati 781015, India
[2] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, India
Keywords
Classification; ensemble; feature extraction (FE); feature selection (FS); multiobjective genetic programming (MOGP); FITNESS FUNCTIONS; CLASSIFICATION; EVOLUTIONARY; PREDICTION; DISCOVERY; CANCER; RULES; TESTS
DOI
10.1109/TEVC.2019.2927526
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates whether genetic programming can select and extract linearly separable features when the evolutionary process is explicitly guided toward that goal, and proposes an integrated system for doing so. We decompose a c-class problem into c binary classification problems and evolve c sets of binary classifiers using steady-state multiobjective genetic programming with three objectives, all of which are minimized. Each binary classifier consists of a binary tree and a linear support vector machine (SVM). The features extracted by the feature nodes and some of the function nodes of the tree are used to train the SVM, and the SVM's decision is taken as the decision of the corresponding classifier. During crossover and mutation, the SVM weights are used to determine the usefulness of the corresponding nodes. We also use a fitness function based on Golub's index to select useful features. To discard less frequently used features, we employ unfitness functions for the feature nodes. We compare our method with 34 classification systems on 18 datasets. The proposed method performs better in 432 of the 570 comparison cases (75.79%). These results confirm that the proposed method achieves our objectives.
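As a rough illustration of two ingredients named in the abstract, the sketch below decomposes a multiclass problem into "one class vs. the rest" binary problems and scores each feature with Golub's index in its usual signal-to-noise form, |mean_pos - mean_neg| / (std_pos + std_neg). This is not the paper's fitness function, which is only described as being based on this index. The function names `golub_index` and `one_vs_rest_scores`, the use of the population standard deviation, the handling of the zero-spread case, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def golub_index(values, labels, positive):
    """Signal-to-noise score of one feature for the problem 'positive vs. rest'."""
    pos = [v for v, y in zip(values, labels) if y == positive]
    neg = [v for v, y in zip(values, labels) if y != positive]
    # Assumption: population standard deviation; a sample estimate would also work.
    spread = pstdev(pos) + pstdev(neg)
    if spread == 0.0:
        # Assumption: constant groups are perfectly separated or not separated at all.
        return float("inf") if mean(pos) != mean(neg) else 0.0
    return abs(mean(pos) - mean(neg)) / spread


def one_vs_rest_scores(X, y):
    """Score every feature column on each of the c 'class vs. rest' problems."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    return {c: [golub_index(col, y, c) for col in columns] for c in sorted(set(y))}


# Toy data, made up for illustration: 3 classes, 2 features.
X = [[0.0, 5.0], [0.2, 1.0], [1.0, 5.2], [1.2, 0.8], [2.0, 5.1], [2.2, 1.1]]
y = ["a", "a", "b", "b", "c", "c"]
scores = one_vs_rest_scores(X, y)

for label, per_feature in scores.items():
    print(label, [round(s, 3) for s in per_feature])
```

In this toy data the first feature separates class "a" from the rest far better than the second, so it receives the higher score on that binary problem. A higher score suggests a feature is more likely to be linearly separable for that problem when used alone.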
Pages: 454-466
Page count: 13