Large-Scale Clustering Using Mathematical Programming

Cited by: 0
Authors
Gnagi, Mario [1]
Baumann, Philipp [1]
Affiliations
[1] Univ Bern, Dept Business Adm, Schuetzenmattstr 14, CH-3012 Bern, Switzerland
Source
2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEM) | 2017
Keywords
Big Data and Analytics; Clustering; Mathematical Programming; Sparse-reduced Computation; FORMULATIONS; ALGORITHM;
DOI
Not available
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
Cluster analysis is a fundamental task in exploratory data analysis with a wide range of applications. Several clustering approaches based on mathematical programming have been proposed in the literature and have been used successfully for small- and medium-scale data sets. However, mathematical programming-based clustering models are rarely used for large-scale data sets because of their long running times. In this paper, we propose a general scaling approach for existing mathematical programming-based clustering models that is based on the idea of replacing identical or nearly identical objects with a small set of representatives. Our computational results indicate that the proposed scaling approach substantially reduces running time with a minor loss in clustering accuracy.
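The abstract does not say how the representatives are obtained, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a uniform grid over the feature space: objects that fall into the same grid cell are treated as nearly identical and are replaced by the cell mean, weighted by the number of objects it stands for. The function name `reduce_to_representatives` and the `resolution` parameter are hypothetical.

```python
import numpy as np


def reduce_to_representatives(X, resolution=10):
    """Replace objects sharing a grid cell by one weighted representative.

    Assumption (not from the paper): "nearly identical" means "falls into
    the same cell of a uniform grid with `resolution` intervals per feature".
    Returns the representatives, their weights, and the object-to-representative map.
    """
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features

    # Integer grid coordinates in [0, resolution - 1] for every object.
    cells = np.minimum(((X - lo) / span * resolution).astype(int), resolution - 1)

    # One label per distinct non-empty cell, plus the number of objects in it.
    _, labels, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    labels = labels.reshape(-1)

    # Representative = mean of the objects in the cell.
    reps = np.zeros((counts.size, X.shape[1]))
    np.add.at(reps, labels, X)
    reps /= counts[:, None]
    return reps, counts, labels


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.0, 0.0], [0.01, 0.0], [1.0, 1.0], [1.0, 1.0]]
    reps, weights, labels = reduce_to_representatives(X, resolution=4)
    print(len(reps), weights.tolist(), labels.tolist())
```

Under these assumptions, a clustering model would be solved on `reps` with `weights` as object weights, and `labels` would then map each original object to the cluster of its representative. A coarser grid yields fewer representatives and a smaller model, at the cost of treating more dissimilar objects as identical.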
Pages: 789-793
Number of pages: 5