Optimal partitioning of a data set based on the p-median model

被引:24
作者
Brusco, Michael J. [1 ]
Koehn, Hans-Friedrich [2 ]
机构
[1] Florida State Univ, Dept Mkt, Tallahassee, FL 32306 USA
[2] Univ Missouri, Columbia, MO 65211 USA
关键词
combinatorial data analysis; cluster analysis; p-median problem; Lagrangian relaxation; branch and bound; heuristics;
D O I
10.1007/s11336-007-9021-4
中图分类号
O1 [数学];
学科分类号
0701 ; 070101 ;
摘要
Although the K-means algorithm for minimizing the within-cluster sums of squared deviations from cluster centroids is perhaps the most common method for applied cluster analyses, a variety of other criteria are available. The p-median model is an especially well-studied clustering problem that requires the selection of p objects to serve as cluster centers. The objective is to choose the cluster centers such that the sum of the Euclidean distances (or some other dissimilarity measure) of objects assigned to each center is minimized. Using 12 data sets from the literature, we demonstrate that a three-stage procedure consisting of a greedy heuristic, Lagrangian relaxation, and a branch-and-bound algorithm can produce globally optimal solutions for p-median problems of nontrivial size (several hundred objects, five or more variables, and up to 10 clusters). We also report the results of an application of the p-median model to an empirical data set from the telecommunications industry.
引用
收藏
页码:89 / 105
页数:17
相关论文
共 46 条
[1]   THE RELAXATION METHOD FOR LINEAR INEQUALITIES [J].
AGMON, S .
CANADIAN JOURNAL OF MATHEMATICS-JOURNAL CANADIEN DE MATHEMATIQUES, 1954, 6 (03) :382-392
[2]  
[Anonymous], TSPLIB
[3]  
AVELLA P, 2003, COMPUTATIONAL STUDY
[4]   Solving the p-median problem with a semi-Lagrangian relaxation [J].
Beltran, C. ;
Tadonki, C. ;
Vial, J. -Ph. .
COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2006, 35 (02) :239-260
[5]   A repetitive branch-and-bound procedure for minimum within-cluster sums of squares partitioning [J].
Brusco, Michael J. .
PSYCHOMETRIKA, 2006, 71 (02) :347-363
[6]   Multicriterion clusterwise regression for joint segmentation settings: An application to customer value [J].
Brusco, MJ ;
Cradit, JD ;
Tashchian, A .
JOURNAL OF MARKETING RESEARCH, 2003, 40 (02) :225-234
[7]   A TREE-SEARCH ALGORITHM FOR THE PARA-MEDIAN PROBLEM [J].
CHRISTOFIDES, N ;
BEASLEY, JE .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1982, 10 (02) :196-204
[8]   LOCATION OF BANK ACCOUNTS TO OPTIMIZE FLOAT - ANALYTIC STUDY OF EXACT AND APPROXIMATE ALGORITHMS [J].
CORNUEJOLS, G ;
FISHER, ML ;
NEMHAUSER, GL .
MANAGEMENT SCIENCE, 1977, 23 (08) :789-810
[9]   An interior point algorithm for minimum sum-of-squares clustering [J].
Du Merle, O ;
Hansen, P ;
Jaumard, B ;
Mladenovic, N .
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2000, 21 (04) :1485-1505
[10]  
DUMERLE O, 2002, 200223 U GEN