Gaussian model-based partitioning using iterated local search

被引:2
作者
Brusco, Michael J. [1 ]
Shireman, Emilie [2 ]
Steinley, Douglas [2 ]
Brudvig, Susan [3 ]
Cradit, J. Dennis [1 ]
机构
[1] Florida State Univ, Tallahassee, FL 32306 USA
[2] Univ Missouri, Columbia, MO USA
[3] Indiana Univ East, Richmond, IN USA
关键词
clustering; model-based partitioning; heuristics; WITHIN-CLUSTER SUMS; BINARY DATA; ALGORITHM; BRANCH; SELECTION; NUMBER; LIKELIHOOD; VARIABLES; CRITERIA;
D O I
10.1111/bmsp.12084
中图分类号
O1 [数学];
学科分类号
0701 ; 070101 ;
摘要
The emergence of Gaussian model-based partitioning as a viable alternative to K-means clustering fosters a need for discrete optimization methods that can be efficiently implemented using model-based criteria. A variety of alternative partitioning criteria have been proposed for more general data conditions that permit elliptical clusters, different spatial orientations for the clusters, and unequal cluster sizes. Unfortunately, many of these partitioning criteria are computationally demanding, which makes the multiple-restart (multistart) approach commonly used for K-means partitioning less effective as a heuristic solution strategy. As an alternative, we propose an approach based on iterated local search (ILS), which has proved effective in previous combinatorial data analysis contexts. We compared multistart, ILS and hybrid multistart-ILS procedures for minimizing a very general model-based criterion that assumes no restrictions on cluster size or within-group covariance structure. This comparison, which used 23 data sets from the classification literature, revealed that the ILS and hybrid heuristics generally provided better criterion function values than the multistart approach when all three methods were constrained to the same 10-min time limit. In many instances, these differences in criterion function values reflected profound differences in the partitions obtained.
引用
收藏
页码:1 / 24
页数:24
相关论文
共 87 条
  • [51] BRANCH AND BOUND CLUSTERING ALGORITHM
    KOONTZ, WLG
    NARENDRA, PM
    FUKUNAGA, K
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 1975, 24 (09) : 908 - 915
  • [52] LICHMAN M., 2013, UCI MACHINE LEARNING
  • [53] EFFECTIVE HEURISTIC ALGORITHM FOR TRAVELING-SALESMAN PROBLEM
    LIN, S
    KERNIGHAN, BW
    [J]. OPERATIONS RESEARCH, 1973, 21 (02) : 498 - 516
  • [54] Lourenço HR, 2010, INT SER OPER RES MAN, V146, P363, DOI 10.1007/978-1-4419-1665-5_12
  • [55] Lourenço HR, 2003, INT SER OPER RES MAN, V57, P321
  • [56] MacQueen J., 1967, P 5 BERK S MATH STAT, DOI DOI 10.1007/S11665-016-2173-6
  • [57] MULTIVARIATE CLUSTERING PROCEDURES WITH VARIABLE METRICS
    MARONNA, R
    JACOVKIS, PM
    [J]. BIOMETRICS, 1974, 30 (03) : 499 - 505
  • [58] MARRIOTT FHC, 1982, BIOMETRIKA, V69, P417, DOI 10.1093/biomet/69.2.417
  • [59] McLachlan G., 2000, WILEY SER PROB STAT, V1, DOI 10.1002/0471721182
  • [60] A STUDY OF STANDARDIZATION OF VARIABLES IN CLUSTER-ANALYSIS
    MILLIGAN, GW
    COOPER, MC
    [J]. JOURNAL OF CLASSIFICATION, 1988, 5 (02) : 181 - 204