Genetic Programming Representations for Multi-dimensional Feature Learning in Biomedical Classification

Cited by: 10
Authors
La Cava, William [1]
Silva, Sara [2, 3]
Vanneschi, Leonardo [4]
Spector, Lee [5]
Moore, Jason [1]
Affiliations
[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[2] Univ Lisbon, Fac Ciencias, Dept Informat, BioISI Biosyst & Integrat Sci Inst, P-1749016 Lisbon, Portugal
[3] Univ Coimbra, CISUC, Dept Informat Engn, Coimbra, Portugal
[4] Univ Nova Lisboa, NOVA IMS, P-1070312 Lisbon, Portugal
[5] Hampshire Coll, Sch Cognit Sci, Amherst, MA 01002 USA
Source
APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2017, PT I | 2017 / Vol. 10199
Funding
National Science Foundation (USA);
Keywords
Genetic programming; Feature learning; Classification; MULTICLASS CLASSIFICATION; FEATURE-SELECTION;
DOI
10.1007/978-3-319-55849-3_11
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
We present a new classification method that uses genetic programming (GP) to evolve feature transformations for a deterministic, distance-based classifier. This method, called M4GP, differs from common approaches to classifier representation in GP in that it does not enforce arbitrary decision boundaries and it allows individuals to produce multiple outputs via a stack-based GP system. In comparison to typical methods of classification, M4GP can be advantageous in its ability to produce readable models. We conduct a comprehensive study of M4GP, first in comparison to other GP classifiers, and then in comparison to six common machine learning classifiers. We conduct full hyper-parameter optimization for all of the methods on a suite of 16 biomedical data sets, ranging in size and difficulty. The results indicate that M4GP outperforms other GP methods for classification. M4GP performs competitively with other machine learning methods in terms of the accuracy of the produced models for most problems. M4GP also exhibits the ability to detect epistatic interactions better than the other methods.
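The core idea in the abstract (a stack-based program that emits several transformed features, which are then fed to a deterministic distance-based classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the token format, the operator set, the helper names (`run_program`, `fit_centroids`, `predict`), and the choice of a nearest-centroid rule with Euclidean distance are all assumptions made for illustration. The abstract states only that the classifier is distance-based. The evolutionary search over programs is omitted; the program here is written by hand.

```python
import math
from collections import defaultdict

# Hypothetical program encoding (an assumption, not from the paper):
# a flat postfix token list. An int i pushes input feature x[i];
# a string applies a binary operator to the top two stack items.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def run_program(program, x):
    """Execute the program on one sample; every value left on the
    stack is one output feature, so a program can emit several."""
    stack = []
    for tok in program:
        if isinstance(tok, int):
            stack.append(x[tok])
        elif len(stack) >= 2:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[tok](a, b))
        # an operator with fewer than two arguments is skipped
    return stack

def fit_centroids(program, X, y):
    """Compute one centroid per class in the transformed feature space."""
    groups = defaultdict(list)
    for xi, yi in zip(X, y):
        groups[yi].append(run_program(program, xi))
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

def predict(program, centroids, x):
    """Assign the class whose centroid is nearest (Euclidean distance,
    chosen here as an assumption) to the transformed sample."""
    z = run_program(program, x)
    return min(centroids, key=lambda label: math.dist(z, centroids[label]))

# Toy XOR-like data: the class depends only on the interaction x0 * x1,
# a simple stand-in for an epistatic effect.
X = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [0, 1, 1, 0]
program = [0, 1, "*"]          # single output feature: x0 * x1
centroids = fit_centroids(program, X, y)
predictions = [predict(program, centroids, xi) for xi in X]
```

In this toy setting neither raw input separates the classes on its own, while the single product feature does, so the nearest-centroid rule recovers the labels without any hand-set decision boundary. A program such as `[0, 1, "*", 0]` would leave two values on the stack and therefore yield a two-dimensional feature vector.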
Pages: 158-173
Page count: 16