MVStream: Multiview Data Stream Clustering

Cited by: 36
Authors
Huang, Ling [1 ,2 ,3 ]
Wang, Chang-Dong [1 ,2 ,3 ]
Chao, Hong-Yang [1 ,3 ]
Yu, Philip S. [4 ,5 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Peoples R China
[3] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou 510006, Peoples R China
[4] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
[5] Tsinghua Univ, Inst Data Sci, Beijing 100084, Peoples R China
Keywords
Clustering algorithms; Shape; Task analysis; Support vector machines; Indexes; Data models; Computer science; Clustering; clusters of arbitrary shapes; data stream; multiview; support vector (SV); ALGORITHM;
DOI
10.1109/TNNLS.2019.2944851
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article studies a new problem in data stream clustering, namely, multiview data stream (MVStream) clustering. Although many data stream clustering algorithms have been developed, they are restricted to single-view streaming data, and clustering MVStreams remains largely unsolved. In addition to the many issues encountered in conventional single-view data stream clustering, such as capturing cluster evolution and discovering clusters of arbitrary shapes under limited computational resources, the main challenge of MVStream clustering lies in integrating information from multiple views in a streaming manner while simultaneously abstracting summary statistics from the integrated features. In this article, we propose, for the first time, an MVStream clustering algorithm. The main idea is to design a multiview support vector domain description (MVSVDD) model, by which the information from multiple insufficient views can be integrated, and the output support vectors (SVs) are used to abstract the summary statistics of the historical multiview data objects. Based on the MVSVDD model, a new multiview cluster labeling method is designed, whereby clusters of arbitrary shapes can be discovered for each view. By tracking the cluster labels of SVs in each view, the cluster evolution associated with concept drift can be captured. Since the SVs account for only a small portion of the data objects, the proposed MVStream algorithm is quite efficient under limited computational resources. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the proposed method.
Pages: 3482-3496
Page count: 15
Related papers
50 in total
  • [21] Varying density method for data stream clustering
    Mousavi, Maryam
    Khotanlou, Hassan
    Abu Bakar, Azuraliza
    Vakilian, Mohammadmahdi
    APPLIED SOFT COMPUTING, 2020, 97
  • [22] An Adaptive Density Data Stream Clustering Algorithm
    Ding, Shifei
    Zhang, Jian
    Jia, Hongjie
    Qian, Jun
    COGNITIVE COMPUTATION, 2016, 8 (01) : 30 - 38
  • [23] A Review of Uncertain Data Stream Clustering Algorithms
    Yang, Yue
    Liu, Zhuo
    Xing, Zhidan
    2015 EIGHTH INTERNATIONAL CONFERENCE ON INTERNET COMPUTING FOR SCIENCE AND ENGINEERING (ICICSE), 2015, : 111 - 116
  • [24] Continual Multiview Spectral Clustering via Multilevel Knowledge
    Wang, Kangru
    Wang, Lei
    Zhang, Xiaolin
    Li, Jiamao
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 1555 - 1559
  • [25] Tensorial Multiview Subspace Clustering for Polarimetric Hyperspectral Images
    Chen, Zhengyi
    Zhang, Chunmin
    Mu, Tingkui
    He, Yifan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [26] Data Stream Clustering: A Survey
    Silva, Jonathan A.
    Faria, Elaine R.
    Barros, Rodrigo C.
    Hruschka, Eduardo R.
    de Carvalho, Andre C. P. L. F.
    Gama, Joao
    ACM COMPUTING SURVEYS, 2013, 46 (01)
  • [27] Fuzzy clustering for multiview data by combining latent information
    Wei, Huiqin
    Chen, Long
    Chen, C. L. Philip
    Duan, Junwei
    Han, Ruizhi
    Guo, Li
    APPLIED SOFT COMPUTING, 2022, 126
  • [28] Graph-based unsupervised feature selection and multiview clustering for microarray data
    Swarnkar, Tripti
    Mitra, Pabitra
    JOURNAL OF BIOSCIENCES, 2015, 40 (04) : 755 - 767
  • [29] Clustering data stream with uncertainty using belief function theory and fading function
    Hamidzadeh, Javad
    Ghadamyari, Reyhaneh
    SOFT COMPUTING, 2020, 24 (12) : 8955 - 8974
  • [30] Semi-Supervised Clustering via Cannot Link Relationship for Multiview Data
    Zhu, Zhaorui
    Gao, Quanxue
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8744 - 8755