MVStream: Multiview Data Stream Clustering

Cited by: 36
Authors
Huang, Ling [1, 2, 3]
Wang, Chang-Dong [1, 2, 3]
Chao, Hong-Yang [1, 3]
Yu, Philip S. [4, 5]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Peoples R China
[3] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou 510006, Peoples R China
[4] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
[5] Tsinghua Univ, Inst Data Sci, Beijing 100084, Peoples R China
Keywords
Clustering algorithms; Shape; Task analysis; Support vector machines; Indexes; Data models; Computer science; Clustering; clusters of arbitrary shapes; data stream; multiview; support vector (SV); ALGORITHM
DOI
10.1109/TNNLS.2019.2944851
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article studies a new problem in data stream clustering, namely, multiview data stream (MVStream) clustering. Although many data stream clustering algorithms have been developed, they are restricted to single-view streaming data, and clustering MVStreams remains largely unsolved. In addition to the many issues encountered in conventional single-view data stream clustering, such as capturing cluster evolution and discovering clusters of arbitrary shapes under limited computational resources, the main challenge of MVStream clustering lies in integrating information from multiple views in a streaming manner while simultaneously abstracting summary statistics from the integrated features. In this article, we propose a novel MVStream clustering algorithm. The main idea is to design a multiview support vector domain description (MVSVDD) model, by which the information from multiple insufficient views can be integrated, and the output support vectors (SVs) are used to abstract the summary statistics of the historical multiview data objects. Based on the MVSVDD model, a new multiview cluster labeling method is designed, whereby clusters of arbitrary shapes can be discovered for each view. By tracking the cluster labels of the SVs in each view, the cluster evolution associated with concept drift can be captured. Since the SVs account for only a small portion of the data objects, the proposed MVStream algorithm is efficient under limited computational resources. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the proposed method.
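The idea of using support vectors as summary statistics for a stream can be illustrated with a small sketch. It is not the MVSVDD model from the article: it is single-view, it uses a hard-margin enclosing ball in Gaussian-kernel space approximated with the Badoiu-Clarkson iteration as a stand-in for solving the SVDD problem, and it omits cluster labeling. The names `rbf`, `support_points`, and `summarize_stream`, and the values of `gamma` and `iters`, are hypothetical choices made for this illustration.

```python
import math

def rbf(a, b, gamma):
    """Gaussian kernel between two points given as tuples."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def support_points(points, gamma=0.01, iters=100):
    """Approximate the smallest enclosing ball in kernel space.

    Badoiu-Clarkson update: repeatedly move the center a shrinking
    step toward the point currently farthest from it. The points that
    end up with non-zero weight play the role of support vectors.
    """
    n = len(points)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    alpha = [0.0] * n
    alpha[0] = 1.0  # start with the center at the first point
    for t in range(1, iters + 1):
        # Since k(x, x) = 1 for every x, the farthest point from the
        # center is the one least similar to the weighted center.
        sim = [sum(alpha[j] * K[i][j] for j in range(n)) for i in range(n)]
        far = min(range(n), key=sim.__getitem__)
        step = 1.0 / (t + 1)
        alpha = [a * (1.0 - step) for a in alpha]
        alpha[far] += step
    return [p for p, a in zip(points, alpha) if a > 0.0]

def summarize_stream(chunks, gamma=0.01):
    """Keep only support points between chunks (assumed summary scheme)."""
    summary = []
    for chunk in chunks:
        # Historical data is represented only by the retained points.
        summary = support_points(summary + list(chunk), gamma)
    return summary

# Toy stream: a 5 x 5 grid followed by a few points inside it.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
inner = [(1.5, 2.5), (2.0, 2.0), (3.5, 0.5)]
summary = summarize_stream([grid, inner])
print(len(grid) + len(inner), "points summarized by", len(summary))
```

With a wide kernel such as the one assumed here, only boundary points are retained, so each new chunk is processed against a handful of stored points rather than the full history. A narrower kernel (larger `gamma`) would retain more points and follow the data shape more closely.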
Pages: 3482-3496
Page count: 15