Feature generation using genetic programming with comparative partner selection for diabetes classification

Cited by: 59
Authors
Aslam, Muhammad Waqar [1]
Zhu, Zhechen [2]
Nandi, Asoke Kumar [2, 3]
Affiliations
[1] Univ Liverpool, Dept Elect Engn & Elect, Liverpool L69 3GJ, Merseyside, England
[2] Brunel Univ, Dept Elect & Comp Engn, Uxbridge UB8 3PH, Middx, England
[3] Univ Jyvaskyla, Dept Math Informat Technol, FI-40014 Jyvaskyla, Finland
Keywords
Pima Indian diabetes; Genetic programming; Comparative partner selection; EXPERT-SYSTEM; DIAGNOSIS; EXTRACTION; DESIGN;
DOI
10.1016/j.eswa.2013.04.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ultimate aim of this research is to facilitate the diagnosis of diabetes, a disease whose prevalence is rising rapidly worldwide. A genetic programming (GP) based method is used for diabetes classification: GP generates new features by combining the existing diabetes features, without prior knowledge of their probability distribution. The proposed method has three stages. In the first stage, feature selection is performed using the t-test, the Kolmogorov-Smirnov test, the Kullback-Leibler divergence test, F-score selection, and GP. The results of these feature selection methods are used to prepare an ordered list of the original features, arranged in decreasing order of importance. Subsets of the original features are then built by adding features one at a time, following the ordered list, using sequential forward selection. In the second stage, GP generates new features from each subset as non-linear combinations of the original features. This stage uses a variation of GP called GP with comparative partner selection (GP-CPS), which utilises the strengths and weaknesses of GP-generated features. In the last stage, the classification performance of the GP-generated features is tested using k-nearest neighbor and support vector machine classifiers. The results, and their comparison with other methods, demonstrate that the proposed method outperforms other recent methods. (c) 2013 Elsevier Ltd. All rights reserved.
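The first stage described in the abstract (rank the original features, then grow subsets one feature at a time along that ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the F-score criterion, in one commonly used two-class form (between-class separation of the means divided by the sum of within-class sample variances), and the paper may define or combine its criteria differently. The function names and the toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(values, labels):
    """F-score of a single feature for binary labels (0 or 1).

    Assumed definition: squared distances of the two class means from the
    overall mean, divided by the sum of the two sample variances.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else float("inf")


def rank_features(rows, labels):
    """Return feature indices in decreasing order of F-score."""
    n_features = len(rows[0])
    scores = [f_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def nested_subsets(order):
    """Grow subsets by adding one feature at a time along the ranking."""
    return [order[:k] for k in range(1, len(order) + 1)]


# Hypothetical toy data: 6 samples, 3 features. Feature 0 separates the
# two classes clearly, feature 1 not at all, feature 2 only weakly.
rows = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.0, 5.5, 0.8],
    [3.2, 4.5, 0.2],
    [2.9, 5.0, 0.7],
]
labels = [0, 0, 0, 1, 1, 1]

order = rank_features(rows, labels)
subsets = nested_subsets(order)
print(order)
print(subsets)
```

In the method summarised above, each such subset would then be passed to GP to generate new features, which are in turn evaluated with the k-nearest neighbor and support vector machine classifiers; those stages are not sketched here.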
Pages: 5402-5412
Number of pages: 11