MR-SNN: Design of Parallel Shared Nearest Neighbor Clustering Algorithm Using MapReduce

Cited by: 0
Authors
Wang, Sujing [1]
Eick, Christoph F. [2]
Affiliations
[1] Lamar Univ, Dept Comp Sci, Beaumont, TX 77710 USA
[2] Univ Houston, Dept Comp Sci, Houston, TX 77204 USA
Source
2017 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA) | 2017
Keywords
clustering; big data analysis; MapReduce; Hadoop;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Shared Nearest Neighbor (SNN) clustering is a well-established density-based clustering algorithm that can find clusters of different sizes, shapes, and densities, and it has been widely adopted in numerous applications. As datasets grow extremely large, storing and processing them on a single machine becomes inefficient or even impossible, so the scalability of clustering algorithms designed for a single machine has to be addressed. In this paper, we improve the traditional SNN clustering algorithm by using high-performance computing clusters and a powerful programming platform (MapReduce) for big data analysis. In particular, we design a MapReduce-based Shared Nearest Neighbor clustering algorithm, called MR-SNN, for big data analysis.
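The abstract does not describe how MR-SNN is structured, so the following is only a minimal sketch of one way the core SNN quantity (the number of k-nearest neighbors two points share) could be phrased as map and reduce steps. It is not the authors' design. Everything here is an assumption for illustration: the function names (`knn_lists`, `map_invert`, `reduce_pairs`, `shuffle`, `snn_similarity`), the brute-force neighbor search, and the in-memory `shuffle` that stands in for a real MapReduce framework such as Hadoop.

```python
from collections import defaultdict
from itertools import combinations
import math


def knn_lists(points, k):
    """Brute-force k-nearest-neighbor lists, keyed by point index (illustrative only)."""
    lists = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by distance, breaking ties by index so the result is deterministic.
        others.sort(key=lambda j: (math.dist(p, points[j]), j))
        lists[i] = others[:k]
    return lists


def map_invert(point_id, neighbors):
    """Map step (hypothetical): emit (neighbor, owner) for each neighbor of a point."""
    for q in neighbors:
        yield q, point_id


def reduce_pairs(neighbor_id, owners):
    """Reduce step (hypothetical): every pair of owners shares this neighbor once."""
    for a, b in combinations(sorted(owners), 2):
        yield (a, b), 1


def shuffle(pairs):
    """In-memory stand-in for the framework's group-by-key shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def snn_similarity(points, k):
    """Return {(a, b): number of k-nearest neighbors shared by points a and b}."""
    knn = knn_lists(points, k)
    # Stage 1: invert the neighbor lists, grouping owners by shared neighbor.
    stage1 = shuffle(kv for pid, nbrs in knn.items() for kv in map_invert(pid, nbrs))
    # Stage 2: emit one count per owner pair, then sum the counts per pair.
    stage2 = shuffle(kv for q, owners in stage1.items() for kv in reduce_pairs(q, owners))
    return {pair: sum(ones) for pair, ones in stage2.items()}


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(snn_similarity(pts, k=2))
```

In this sketch, pairs of points that share no neighbors simply never appear in the result, which keeps the output sparse. A full SNN clustering would go on to derive point densities and core points from these similarities; that part is omitted here because the abstract gives no details about it.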
Pages: 317-320
Number of pages: 4