Large-Scale Clustering Using Mathematical Programming

Cited by: 0
Authors
Gnagi, Mario [1]
Baumann, Philipp [1]
Affiliation
[1] Univ Bern, Dept Business Adm, Schuetzenmattstr 14, CH-3012 Bern, Switzerland
Source
2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEM) | 2017
Keywords
Big Data and Analytics; Clustering; Mathematical Programming; Sparse-reduced Computation; FORMULATIONS; ALGORITHM
DOI
Not available
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Cluster analysis is a fundamental task in exploratory data analysis with a wide range of applications. Several clustering approaches based on mathematical programming have been proposed in the literature and have been used successfully for small- and medium-scale data sets. However, mathematical programming-based clustering models are rarely used for large-scale data sets because of their long running times. In this paper, we propose a general scaling approach for existing mathematical programming-based clustering models that is based on the idea of replacing identical or nearly identical objects by a small set of representatives. Our computational results indicate that the proposed scaling approach substantially reduces running time with only a minor loss in clustering accuracy.
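The abstract does not say how representatives are chosen, so the following is only a minimal sketch of the general idea, assuming a simple grid-based grouping: objects that fall into the same grid cell are treated as nearly identical and replaced by one weighted representative (the cell centroid). The function names `reduce_to_representatives` and `expand_labels` and the parameter `cell_size` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def reduce_to_representatives(points, cell_size):
    """Replace points sharing a grid cell by one weighted representative.

    Assumption (not from the paper): "nearly identical" means "same grid cell".
    Returns (representatives, weights, assignment), where assignment[i] is the
    index of the representative that stands in for points[i].
    """
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        # Floor each coordinate to its grid cell index.
        key = tuple(int(x // cell_size) for x in point)
        cells[key].append(idx)

    representatives, weights = [], []
    assignment = [None] * len(points)
    for rep_id, members in enumerate(cells.values()):
        dim = len(points[members[0]])
        # Use the centroid of the cell's members as the representative.
        centroid = tuple(
            sum(points[i][d] for i in members) / len(members) for d in range(dim)
        )
        representatives.append(centroid)
        weights.append(len(members))  # how many original objects it replaces
        for i in members:
            assignment[i] = rep_id
    return representatives, weights, assignment


def expand_labels(rep_labels, assignment):
    """Map cluster labels of the representatives back to the original objects."""
    return [rep_labels[rep_id] for rep_id in assignment]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 0.0), (0.05, 0.02), (5.0, 5.0)]
    reps, weights, assignment = reduce_to_representatives(data, cell_size=0.5)
    print(len(reps), weights, assignment)
```

Under this sketch, an existing clustering model would be solved on the (smaller) set of representatives, with the weights entering its objective, and `expand_labels` would then carry the resulting labels back to all original objects.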
Pages: 789-793
Number of pages: 5