Selecting Optimal Feature Set in High-Dimensional Data by Swarm Search

Cited by: 16
Authors
Fong, Simon [1]
Zhuang, Yan [1]
Tang, Rui [1]
Yang, Xin-She [2]
Deb, Suash [3]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China
[2] Middlesex Univ, Fac Sci & Technol, London N17 8HR, England
[3] Cambridge Inst Technol, Dept Comp Sci & Engn, Ranchi, Bihar, India
DOI
10.1155/2013/590614
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Selecting the right set of features from high-dimensional data for inducing an accurate classification model is a tough computational challenge. It is almost an NP-hard problem, as the number of feature combinations grows exponentially with the number of features. Unfortunately, in data mining, as well as in other engineering applications and bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since exhaustively trying every possible combination of features by brute force is computationally impractical, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility: any classifier can be integrated into its fitness function, and any metaheuristic algorithm can be plugged in to carry out the heuristic search. Simulation experiments are carried out by testing Swarm Search on several high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experimental results show that Swarm Search is able to attain relatively low classification error rates without shrinking the feature subset to its minimum size.
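The idea in the abstract (a classifier inside the fitness function, a metaheuristic driving the search over feature subsets) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binary particle swarm as the metaheuristic and a nearest-centroid classifier with a hold-out split as the fitness; the names `centroid_error`, `swarm_search`, and `make_toy_data`, the toy data, and all parameter values are hypothetical choices made only for illustration.

```python
import math
import random

def centroid_error(X, y, mask):
    """Hold-out error of a nearest-centroid classifier restricted to `mask`."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # an empty subset cannot classify anything
    half = len(X) // 2
    train, test = range(half), range(half, len(X))
    centroids = {}
    for label in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    wrong = 0
    for i in test:
        pred = min(
            centroids,
            key=lambda c: sum((X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        wrong += pred != y[i]
    return wrong / len(test)

def swarm_search(fitness, n_features, n_particles=10, n_iters=30, seed=0):
    """Binary particle swarm over feature masks; `fitness` returns an error to minimise."""
    rng = random.Random(seed)
    pos = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(n_particles)]
    pos[0] = [True] * n_features  # assumption: seed one particle with the full set
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_err = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = pbest[g][:], pbest_err[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                # Velocity update pulls each bit toward personal and global bests.
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))
                # Sigmoid transfer: velocity becomes the probability of selecting feature j.
                pos[i][j] = rng.random() < 1.0 / (1.0 + math.exp(-vel[i][j]))
            err = fitness(pos[i])
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
    return gbest, gbest_err

def make_toy_data(n_rows=80, n_features=8, seed=1):
    """Synthetic data: only features 0 and 1 carry class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
fitness = lambda mask: centroid_error(X, y, mask)
best_mask, best_err = swarm_search(fitness, n_features=8)
print(best_mask, best_err)
```

Because the search only calls `fitness(mask)`, a different classifier or a different metaheuristic could be swapped in without touching the other part, which is the flexibility the abstract describes. With 8 features there are already 2^8 - 1 = 255 non-empty subsets, so exhaustive search stops being practical quickly as the feature count grows.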
Pages: 18
Related papers
  • [1] Investigation on particle swarm optimisation for feature selection on high-dimensional data: local search and selection bias
    Binh Tran
    Xue, Bing
    Zhang, Mengjie
    Su Nguyen
    CONNECTION SCIENCE, 2016, 28 (03) : 270 - 294
  • [2] Feature-Robust Optimal Transport for High-Dimensional Data
    Petrovich, Mathis
    Liang, Chao
    Sato, Ryoma
    Liu, Yanbin
    Tsai, Yao-Hung Hubert
    Zhu, Linchao
    Yang, Yi
    Salakhutdinov, Ruslan
    Yamada, Makoto
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT V, 2023, 13717 : 291 - 307
  • [3] Multi-view ensemble learning: an optimal feature set partitioning for high-dimensional data classification
    Vipin Kumar
    Sonajharia Minz
    Knowledge and Information Systems, 2016, 49 : 1 - 59
  • [4] Multi-view ensemble learning: an optimal feature set partitioning for high-dimensional data classification
    Kumar, Vipin
    Minz, Sonajharia
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 49 (01) : 1 - 59
  • [5] Extended particle swarm optimization for feature selection of high-dimensional biomedical data
    Al-Shammary, Dhiah
    Albukhnefis, Adil L.
    Alsaeedi, Ali Hakem
    Al-Asfoor, Muntasir
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (10):
  • [6] Feature selection for high-dimensional data
    Destrero A.
    Mosci S.
    De Mol C.
    Verri A.
    Odone F.
    Computational Management Science, 2009, 6 (1) : 25 - 40
  • [7] Feature selection for high-dimensional data
    Bolón-Canedo V.
    Sánchez-Maroño N.
    Alonso-Betanzos A.
    Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75
  • [8] An Asymmetric Chaotic Competitive Swarm Optimization Algorithm for Feature Selection in High-Dimensional Data
    Pichai, Supailin
    Sunat, Khamron
    Chiewchanwattana, Sirapat
    SYMMETRY-BASEL, 2020, 12 (11) : 1 - 13
  • [9] On selecting interacting features from high-dimensional data
    Hall, Peter
    Xue, Jing-Hao
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2014, 71 : 694 - 708
  • [10] Particle swarm optimization in high-dimensional bounded search spaces
    Helwig, Sabine
    Wanka, Rolf
    2007 IEEE SWARM INTELLIGENCE SYMPOSIUM, 2007, : 198 - +