Fat node leading tree for data stream clustering with density peaks

被引:28
作者
Xu, Ji [1 ,3 ,4 ]
Wang, Guoyin [2 ]
Li, Tianrui [1 ]
Deng, Weihui [3 ]
Gou, Guanglei [1 ]
机构
[1] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Computat Intelligence, Chongqing 400065, Peoples R China
[3] Chinese Acad Sci, Chongqing Inst Green & Intelligent Technol, Chongqing 400714, Peoples R China
[4] Guizhou Univ Engn Sci, Sch Informat Engn, Bijie 551700, Peoples R China
基金
中国国家自然科学基金;
关键词
Data stream clustering; Density peaks; Fat node leading tree; Change point;
D O I
10.1016/j.knosys.2016.12.025
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Detecting clusters of arbitrary shape and constantly delivering the results for newly arrived items are two critical challenges in the study of data stream clustering. However, the existing clustering methods could not deal with these two problems simultaneously. In this paper, we employ the density peaks based clustering (DPClust) algorithm to construct a leading tree (LT) and further transform it into a fat node leading tree (FNLT) in a granular computing way. FNLT is a novel interpretable synopsis of the current state of data stream for clustering. New incoming data is blended into the evolving FNLT structure quickly, and thus the clustering result of the incoming data can be delivered on the fly. During the interval between the delivery of the clustering results and the arrival of new data, the FNLT with blended data is granulated as a new FNLT with a constant number of fat nodes. The FNLT of the current data stream is maintained in a real-time fashion by the Blending-Granulating-Fading mechanism. At the same time, the change points are detected using the partial order relation between each pair of the cluster centers and the martingale theory. Compared to several state-of-the-art clustering methods, the presented model shows promising accuracy and efficiency. (C) 2016 Elsevier B.V. All rights reserved.
引用
收藏
页码:99 / 117
页数:19
相关论文
共 36 条
  • [1] Ackermann M. R., 2012, J EXPT ALGORITHMICS, V17, P173
  • [2] Aggarwal CC, 2014, CH CRC DATA MIN KNOW, P1
  • [3] [Anonymous], 2004, P 30 INT C VER LARG
  • [4] [Anonymous], 2003, P 29 INT C VER LARG
  • [5] [Anonymous], 2009, P 26 ANN INT C MACH
  • [6] Basseville I. V., 1993, Detection of Abrupt Changes: Theoryand Application
  • [7] Density-Based Clustering over an Evolving Data Stream with Noise
    Cao, Feng
    Ester, Martin
    Qian, Weining
    Zhou, Aoying
    [J]. PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 328 - +
  • [8] Ester M., 1996, KDD-96 Proceedings. Second International Conference on Knowledge Discovery and Data Mining, P226
  • [9] Clustering by passing messages between data points
    Frey, Brendan J.
    Dueck, Delbert
    [J]. SCIENCE, 2007, 315 (5814) : 972 - 976
  • [10] Gaber MM, 2005, SIGMOD REC, V34, P18, DOI 10.1145/1083784.1083789