Automatic clustering using nature-inspired metaheuristics: A survey

Cited by: 152
Authors
Jose-Garcia, Adan [1]
Gomez-Flores, Wilfrido [1]
Affiliations
[1] Natl Polytech Inst, Ctr Res & Adv Studies, Informat Technol Lab, Ciudad Victoria, Tamaulipas, Mexico
Keywords
Cluster analysis; Automatic clustering; Nature-inspired metaheuristics; Single-objective and multiobjective metaheuristics; GENETIC ALGORITHM; DIFFERENTIAL EVOLUTION; OPTIMIZATION ALGORITHM; PIXEL CLASSIFICATION; VALIDITY MEASURE; TABU SEARCH; PERFORMANCE; INDEXES
DOI
10.1016/j.asoc.2015.12.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In cluster analysis, a fundamental problem is to determine the best estimate of the number of clusters; this is known as the automatic clustering problem. Because of the lack of prior domain knowledge, it is difficult to choose an appropriate number of clusters, especially when the data have many dimensions, when clusters differ widely in shape, size, and density, and when groups overlap. In the late 1990s, the automatic clustering problem gave rise to a new era in cluster analysis with the application of nature-inspired metaheuristics. Since then, researchers have developed several new algorithms in this field. This paper presents an up-to-date review of all major nature-inspired metaheuristic algorithms used thus far for automatic clustering. The main components involved in formulating metaheuristics for automatic clustering are also presented, such as encoding schemes, validity indices, and proximity measures. A total of 65 automatic clustering approaches are reviewed, which are based on single-solution, single-objective, and multiobjective metaheuristics, whose usage percentages are 3%, 69%, and 28%, respectively. Single-objective clustering algorithms are adequate to efficiently group linearly separable clusters. However, a strong tendency toward using multiobjective algorithms to address non-linearly separable problems is found nowadays. Finally, a discussion and some emerging research directions are presented. © 2016 Published by Elsevier B.V.
Pages: 192-213
Page count: 22