StreamLeader: A New Stream Clustering Algorithm not Based in Conventional Clustering

Cited by: 0
Authors
Andres-Merino, Jaime [1]
Belanche, Lluis A. [1]
Affiliations
[1] Tech Univ Catalonia, Dept Comp Sci, Jordi Girona 1-3, Barcelona 08034, Spain
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II | 2016 / Vol. 9887
Keywords
Stream algorithms; Clustering; Big Data;
DOI
10.1007/978-3-319-44781-0_25
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Stream clustering algorithms normally require two phases: an online first phase that statistically summarizes the stream while forming special structures, such as micro-clusters, and a second, offline phase that uses a conventional clustering algorithm, taking the micro-clusters as pseudo-points, to deliver the final clustering. This procedure tends to produce oversized or overlapping clusters in medium-to-high dimensional spaces, and typically degrades seriously in noisy data environments. In this paper we introduce STREAMLEADER, a novel stream clustering algorithm suitable for massive data that does not resort to a conventional clustering phase; it is based on the notion of Leader Cluster and on an aggressive noise reduction process. We report extensive systematic testing in which the new algorithm is shown to consistently outperform its contenders in terms of both quality and scalability.
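For orientation, the classical single-pass leader clustering idea that the term "Leader Cluster" alludes to can be sketched as follows. This is not the STREAMLEADER procedure, which the abstract does not specify; the function names `leader_clustering` and `drop_sparse`, the `radius` and `min_points` parameters, and the count-based noise filter are all assumptions made for illustration.

```python
import math


def leader_clustering(stream, radius):
    """Single-pass leader clustering (illustrative sketch, not STREAMLEADER).

    Each incoming point joins the nearest existing leader within `radius`;
    otherwise it becomes the leader of a new cluster.
    """
    leaders = []  # one representative point per cluster
    counts = []   # number of points absorbed by each leader
    for point in stream:
        best, best_dist = None, radius
        for i, leader in enumerate(leaders):
            d = math.dist(point, leader)
            if d <= best_dist:
                best, best_dist = i, d
        if best is None:
            leaders.append(tuple(point))
            counts.append(1)
        else:
            counts[best] += 1
    return leaders, counts


def drop_sparse(leaders, counts, min_points):
    """Hypothetical noise filter: keep leaders with at least `min_points` points."""
    return [(l, c) for l, c in zip(leaders, counts) if c >= min_points]


if __name__ == "__main__":
    points = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (9, 9)]
    leaders, counts = leader_clustering(points, radius=1.0)
    print(drop_sparse(leaders, counts, min_points=2))
```

Because each point is inspected once and only the leaders are kept in memory, a scheme of this kind needs no separate offline clustering pass, which is the property the abstract emphasizes.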
Pages: 208-215
Number of pages: 8