Particle Swarm Optimization for Feature Selection in Classification: A Multi-Objective Approach

Cited by: 889
Authors
Xue, Bing [1 ]
Zhang, Mengjie [1 ]
Browne, Will N. [1 ]
Affiliations
[1] Victoria Univ Wellington, Evolutionary Computat Res Grp, Wellington 6140, New Zealand
Keywords
Feature selection; multi-objective optimization; particle swarm optimization (PSO); algorithm; ranking
DOI
10.1109/TSMCB.2012.2227469
Chinese Library Classification (CLC)
TP [automation technology; computer technology]
Discipline classification code
0812
Abstract
Classification problems often have a large number of features in the data sets, but not all of them are useful for classification. Irrelevant and redundant features may even reduce classification performance. Feature selection aims to choose a small number of relevant features that achieve similar or even better classification performance than using all features. It has two main conflicting objectives: maximizing the classification performance and minimizing the number of features. However, most existing feature selection algorithms treat the task as a single-objective problem. This paper presents the first study on multi-objective particle swarm optimization (PSO) for feature selection, where the task is to generate a Pareto front of nondominated solutions (feature subsets). We investigate two PSO-based multi-objective feature selection algorithms. The first algorithm introduces the idea of nondominated sorting into PSO to address feature selection problems. The second algorithm applies the ideas of crowding, mutation, and dominance to PSO to search for the Pareto-front solutions. The two multi-objective algorithms are compared with two conventional feature selection methods, a single-objective feature selection method, a two-stage feature selection algorithm, and three well-known evolutionary multi-objective algorithms on 12 benchmark data sets. The experimental results show that the two PSO-based multi-objective algorithms can automatically evolve a set of nondominated solutions. The first algorithm outperforms the two conventional methods, the single-objective method, and the two-stage algorithm, and achieves results comparable to the three well-known multi-objective algorithms in most cases. The second algorithm achieves better results than the first algorithm and all other methods mentioned previously.
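The abstract describes the approach only at a high level. The sketch below is not the paper's NSPSOFS or CMDPSOFS implementation; it is a minimal binary multi-objective PSO for feature selection that keeps an external archive of nondominated feature subsets, illustrating how particles encode feature subsets and how a Pareto front of (classification error, subset size) trade-offs can be accumulated. The constants, the placeholder `evaluate` function, and all helper names are assumptions for illustration; a real run would replace `evaluate` with cross-validated classifier error on the chosen data set.

```python
import math
import random

random.seed(1)

N_FEATURES = 20      # hypothetical problem size
SWARM_SIZE = 30
ITERATIONS = 50
W, C1, C2 = 0.7298, 1.49618, 1.49618  # commonly used inertia/acceleration settings


def evaluate(mask):
    """Return (classification error, number of selected features), both minimized.

    Placeholder objective: pretend the first five features are relevant, so the
    error drops as they are selected and rises slightly with irrelevant ones.
    """
    relevant = sum(mask[:5])
    irrelevant = sum(mask[5:])
    error = max(1.0 - 0.15 * relevant + 0.01 * irrelevant, 0.0)
    return error, sum(mask)


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, mask, objs):
    """Keep an external archive of nondominated (mask, objectives) pairs."""
    if any(dominates(o, objs) or o == objs for _, o in archive):
        return
    archive[:] = [(m, o) for m, o in archive if not dominates(objs, o)]
    archive.append((list(mask), objs))


# Initialize a swarm of binary positions (1 = feature selected).
positions = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(SWARM_SIZE)]
velocities = [[0.0] * N_FEATURES for _ in range(SWARM_SIZE)]
pbest = [(list(x), evaluate(x)) for x in positions]
archive = []
for mask, objs in pbest:
    update_archive(archive, mask, objs)

for _ in range(ITERATIONS):
    for i in range(SWARM_SIZE):
        # Pick a leader at random from the archive; crowding-based leader
        # selection would be a natural refinement.
        leader, _ = random.choice(archive)
        for d in range(N_FEATURES):
            r1, r2 = random.random(), random.random()
            velocities[i][d] = (W * velocities[i][d]
                                + C1 * r1 * (pbest[i][0][d] - positions[i][d])
                                + C2 * r2 * (leader[d] - positions[i][d]))
            # Binary PSO update: select the feature with probability sigmoid(velocity).
            prob = 1.0 / (1.0 + math.exp(-velocities[i][d]))
            positions[i][d] = 1 if random.random() < prob else 0
        objs = evaluate(positions[i])
        if dominates(objs, pbest[i][1]):
            pbest[i] = (list(positions[i]), objs)
        update_archive(archive, positions[i], objs)

# Print the evolved trade-off front: subset size vs. estimated error.
for mask, (err, n_feat) in sorted(archive, key=lambda e: e[1][1]):
    print(f"{n_feat:2d} features  error = {err:.3f}")
```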
Pages: 1656-1671
Number of pages: 16