StreamLeader: A New Stream Clustering Algorithm not Based in Conventional Clustering

Authors
Andres-Merino, Jaime [1]
Belanche, Lluis A. [1]
Affiliation
[1] Tech Univ Catalonia, Dept Comp Sci, Jordi Girona 1-3, Barcelona 08034, Spain
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II | 2016, vol. 9887
Keywords
Stream algorithms; Clustering; Big Data
DOI
10.1007/978-3-319-44781-0_25
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stream clustering algorithms normally require two phases: an online first phase that statistically summarizes the stream while forming special structures, such as micro-clusters, and a second, offline phase that uses a conventional clustering algorithm, taking the micro-clusters as pseudo-points, to deliver the final clustering. This procedure tends to produce oversized or overlapping clusters in medium-to-high dimensional spaces, and typically degrades seriously in noisy data environments. In this paper we introduce STREAMLEADER, a novel stream clustering algorithm suitable for massive data that does not resort to a conventional clustering phase; it is based on the notion of Leader Cluster and on an aggressive noise reduction process. We report extensive systematic testing in which the new algorithm is shown to consistently outperform its contenders in terms of both quality and scalability.
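The abstract names the Leader Cluster notion without giving details. As a rough illustration only, the sketch below shows the classical single-pass leader clustering idea (a point joins the nearest existing leader if it lies within a distance threshold, otherwise it becomes a new leader). This is not the STREAMLEADER algorithm from the paper: the function name `leader_clustering`, the `threshold` parameter, and the use of Euclidean distance are assumptions made for the example, and the paper's noise reduction process is not modelled.

```python
import math


def leader_clustering(points, threshold):
    """Classical one-pass leader clustering (illustrative, not STREAMLEADER).

    Each incoming point is assigned to the nearest existing leader if that
    leader is within `threshold` (Euclidean distance); otherwise the point
    itself becomes a new leader. Returns (leaders, assignments).
    """
    leaders = []      # one representative point per cluster
    assignments = []  # cluster index for each input point, in arrival order
    for p in points:
        best_idx, best_dist = None, math.inf
        # Find the closest leader seen so far.
        for i, leader in enumerate(leaders):
            d = math.dist(p, leader)
            if d < best_dist:
                best_idx, best_dist = i, d
        if best_idx is not None and best_dist <= threshold:
            assignments.append(best_idx)
        else:
            # No leader is close enough: start a new cluster at this point.
            leaders.append(tuple(p))
            assignments.append(len(leaders) - 1)
    return leaders, assignments


# Hypothetical toy stream with two well-separated groups.
stream = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1), (0.1, 0.2)]
leaders, assignments = leader_clustering(stream, threshold=1.0)
```

Because every point is examined once and only the leaders are kept, a scheme of this kind needs no separate offline clustering pass, which is the property the abstract emphasizes.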
Pages: 208-215
Number of pages: 8