A Novel Clustering-Based Hybrid Feature Selection Approach Using Ant Colony Optimization

Cited by: 6
Authors
Dwivedi, Rajesh [1]
Tiwari, Aruna [1]
Bharill, Neha [2]
Ratnaparkhe, Milind [3]
Affiliations
[1] IIT Indore, Dept Comp Sci, Indore 453552, Madhya Pradesh, India
[2] Mahindra Univ, Ecole Cent Sch Engn, Dept Comp Sci, Hyderabad 500043, Telangana, India
[3] Indian Inst Soybean Res Indore, ICAR, Indore 452001, Madhya Pradesh, India
Keywords
Ant colony optimization; Silhouette index; Laplacian score; K-means clustering; SNP; Protein sequences; ALGORITHM; INDEX;
DOI
10.1007/s13369-023-07719-7
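Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09
Abstract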
Feature selection is an essential task in machine learning, data mining, and pattern recognition, particularly when the number of features is large. It helps improve prediction accuracy, reduce computation time, and produce more comprehensible models. Each feature is either included in the computation or excluded, so n features yield 2^n possible feature subsets. Identifying a relevant feature subset in a reasonable amount of time is therefore an NP-hard problem, although an approximation algorithm can reach a near-optimal solution. However, many feature selection algorithms use a sequential search strategy that adds or removes features one at a time, which tends to trap the search in a local optimum. In this paper, we propose a novel clustering-based hybrid feature selection approach using ant colony optimization that selects features randomly and measures feature quality through K-means clustering, in terms of the silhouette index and the Laplacian score. Random selection allows a better exploration of the feature space, which avoids the problem of being trapped in a locally optimal solution and generates a globally optimal solution. This is verified by comparison with another state-of-the-art method.
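The idea in the abstract (ants pick feature subsets at random, and each subset is scored by how well K-means clusters the data restricted to it) can be illustrated with a small sketch. This is not the authors' algorithm: the pheromone update rule, the fixed subset size, the parameters `ants`, `rounds`, and `rho`, and the helper names `kmeans`, `silhouette`, `select_features`, and `make_data` are all assumptions made for illustration, and the Laplacian score mentioned in the abstract is left out for brevity. Only the silhouette index is used as the quality measure.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on lists of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette index in [-1, 1]; 0.0 when only one cluster is used."""
    clusters = set(labels)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        dists = {c: [] for c in clusters}
        for j, q in enumerate(points):
            if j != i:
                dists[labels[j]].append(math.dist(p, q))
        own = dists.pop(labels[i])
        if not own:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(own) / len(own)                            # cohesion
        b = min(sum(d) / len(d) for d in dists.values())   # separation
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def select_features(data, k=2, subset_size=2, ants=8, rounds=10, rho=0.2, seed=0):
    """ACO-style search (illustrative): pheromone-weighted random subsets."""
    rng = random.Random(seed)
    n = len(data[0])
    pheromone = [1.0] * n
    best_subset, best_score = None, -1.0
    for _ in range(rounds):
        deposit = [0.0] * n
        for _ in range(ants):
            pool, subset = list(range(n)), []
            for _ in range(subset_size):  # weighted draw without replacement
                pick = rng.choices(pool, weights=[pheromone[f] for f in pool])[0]
                pool.remove(pick)
                subset.append(pick)
            subset.sort()
            proj = [[row[f] for f in subset] for row in data]
            score = silhouette(proj, kmeans(proj, k))
            if score > best_score:
                best_subset, best_score = subset, score
            for f in subset:
                deposit[f] += max(score, 0.0)
        # Evaporate old pheromone, then add this round's deposits.
        pheromone = [(1 - rho) * t + d for t, d in zip(pheromone, deposit)]
    return best_subset, best_score


def make_data(seed=1):
    """Toy data: features 0 and 1 separate two groups; 2 and 3 are noise."""
    rng = random.Random(seed)
    rows = []
    for center in (0.0, 5.0):
        for _ in range(15):
            rows.append([center + rng.gauss(0, 0.3), center + rng.gauss(0, 0.3),
                         rng.uniform(0, 5), rng.uniform(0, 5)])
    return rows


if __name__ == "__main__":
    subset, score = select_features(make_data())
    print("selected features:", subset, "silhouette:", round(score, 3))
```

On the toy data, exhaustive search would only need to check 6 two-feature subsets, but with n features the full search space has 2^n subsets, which is why a randomized, pheromone-guided search is attractive. A sketch like this makes no guarantee of reaching a global optimum.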
Pages: 10727-10744
Page count: 18
Related papers
35 in total
  • [1] Linear Cost-sensitive Max-margin Embedded Feature Selection for SVM
    Aram, Khalid Y.
    Lam, Sarah S.
    Khasawneh, Mohammad T.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 197
  • [2] Multi-parent advanced generation inter-cross (MAGIC) populations in rice: progress and potential for genetics research and breeding
    Bandillo, Nonoy
    Raghavan, Chitra
    Muyco, Pauline Andrea
    Sevilla, Ma Anna Lynn
    Lobina, Irish T.
    Dilla-Ermita, Christine Jade
    Tung, Chih-Wei
    McCouch, Susan
    Thomson, Michael
    Mauleon, Ramil
    Singh, Rakesh Kumar
    Gregorio, Glenn
    Redona, Edilberto
    Leung, Hei
    [J]. RICE, 2013, 6
  • [3] Blake C., 1998, UCI repository of machine learning databases
  • [4] A clustering-based feature selection framework for handwritten Indic script classification
    Chatterjee, Iman
    Ghosh, Manosij
    Sing, Pawan Kumar
    Sarkar, Ram
    Nasipuri, Mita
    [J]. EXPERT SYSTEMS, 2019, 36 (06)
  • [5] Hybrid particle swarm optimization with spiral-shaped mechanism for feature selection
    Chen, Ke
    Zhou, Feng-Yu
    Yuan, Xian-Feng
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2019, 128 : 140 - 156
  • [6] Dash M, 2000, LECT NOTES ARTIF INT, V1805, P110
  • [7] CLUSTER SEPARATION MEASURE
    DAVIES, DL
    BOULDIN, DW
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1979, 1 (02) : 224 - 227
  • [8] A fast and elitist multiobjective genetic algorithm: NSGA-II
    Deb, K
    Pratap, A
    Agarwal, S
    Meyarivan, T
    [J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2002, 6 (02) : 182 - 197
  • [9] Genome-wide Association Analysis Tracks Bacterial Leaf Blight Resistance Loci In Rice Diverse Germplasm
    Dilla-Ermita, Christine Jade
    Tandayu, Erwin
    Juanillas, Venice Margarette
    Detras, Jeffrey
    Lozada, Dennis Nicuh
    Dwiyanti, Maria Stefanie
    Cruz, Casiana Vera
    Mbanjo, Edwige Gaby Nkouaya
    Ardales, Edna
    Diaz, Maria Genaleen
    Mendioro, Merlyn
    Thomson, Michael J.
    Kretzschmar, Tobias
    [J]. RICE, 2017, 10
  • [10] Dorigo M., 1997, IEEE Transactions on Evolutionary Computation, V1, P53, DOI 10.1109/4235.585892