StreamLeader: A New Stream Clustering Algorithm not Based in Conventional Clustering

Authors
Andres-Merino, Jaime [1]
Belanche, Lluis A. [1]
Affiliation
[1] Tech Univ Catalonia, Dept Comp Sci, Jordi Girona 1-3, Barcelona 08034, Spain
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II | 2016 / Vol. 9887
Keywords
Stream algorithms; Clustering; Big Data
DOI
10.1007/978-3-319-44781-0_25
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stream clustering algorithms normally require two phases: an online first step that statistically summarizes the stream while forming special structures, such as micro-clusters, and a second, offline phase that uses a conventional clustering algorithm, taking the micro-clusters as pseudo-points, to deliver the final clustering. This procedure tends to produce oversized or overlapping clusters in medium-to-high dimensional spaces, and typically degrades seriously in noisy data environments. In this paper we introduce STREAMLEADER, a novel stream clustering algorithm suitable for massive data that does not resort to a conventional clustering phase; it is based on the notion of Leader Cluster and on an aggressive noise reduction process. We report extensive systematic testing in which the new algorithm is shown to consistently outperform its contenders in terms of both quality and scalability.
Pages: 208-215
Page count: 8
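
Illustrative note on the abstract: the record does not describe how STREAMLEADER works internally, so the following is only a minimal sketch of the classic single-pass "leader" clustering idea that the term Leader Cluster alludes to. It is not the authors' algorithm. The function name `leader_clustering`, the `radius` parameter, and the nearest-leader assignment rule are assumptions made for illustration; noise handling and stream summarization are omitted.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def leader_clustering(
    stream: Iterable[Sequence[float]], radius: float
) -> Tuple[List[List[float]], List[int]]:
    """Single-pass leader clustering (hypothetical illustration).

    Each incoming point joins the nearest existing leader if that leader
    lies within `radius`; otherwise the point becomes a new leader.
    Returns the list of leaders and one cluster label per point.
    """
    leaders: List[List[float]] = []
    labels: List[int] = []
    for point in stream:
        # Find the closest leader seen so far.
        best_index, best_dist = -1, math.inf
        for i, leader in enumerate(leaders):
            d = math.dist(point, leader)
            if d < best_dist:
                best_index, best_dist = i, d
        if best_dist <= radius:
            labels.append(best_index)
        else:
            # No leader is close enough: this point starts a new cluster.
            leaders.append(list(point))
            labels.append(len(leaders) - 1)
    return leaders, labels


if __name__ == "__main__":
    points = [(0, 0), (0.5, 0), (5, 5), (5.2, 5.1), (0.1, 0.2)]
    leaders, labels = leader_clustering(points, radius=1.0)
    print(labels)  # [0, 0, 1, 1, 0]
```

Because every point is examined once and only the leaders are kept in memory, a scheme of this kind needs no separate offline clustering phase, which is the property the abstract emphasizes.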