Equi-Clustream: a framework for clustering time evolving mixed data

Times cited: 6
Authors
Sangam, Ravi Sankar [1 ]
Om, Hari [2 ]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Tadepalligudem 534101, Andhra Pradesh, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Indian Sch Mines, Dhanbad 826004, Jharkhand, India
Keywords
Clustering; Data streams; Time-evolving data; Data mining; DATA STREAMS; ALGORITHM;
DOI
10.1007/s11634-018-0316-3
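Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code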
020208 ; 070103 ; 0714 ;
Abstract
In a data stream environment, most conventional clustering algorithms are not sufficiently efficient, since large volumes of data arrive in a stream and the data points evolve over time. The problems of clustering time-evolving metric data and time-evolving categorical data have each been well explored in recent years, but clustering mixed-type time-evolving data remains challenging because of the gap between the structure of metric and categorical attributes. In this paper, we devise a generalized framework, termed Equi-Clustream, to dynamically cluster mixed-type time-evolving data. It comprises three algorithms: a Hybrid Drifting Concept Detection Algorithm that detects the drifting concept between the current sliding window and the previous sliding window; a Hybrid Data Labeling Algorithm that assigns an appropriate cluster label to each data vector of the current non-drifting window based on the clustering result of the previous sliding window; and a visualization algorithm that analyses the relationship between the clusters at different timestamps and visualizes the evolving trends of the clusters. The efficacy of the proposed framework is shown by experiments on synthetic and real-world datasets.
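The window-to-window idea in the abstract (label the current window against the previous window's clusters, and flag drift when too many points do not fit) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the k-prototypes-style dissimilarity (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches), the outlier-ratio drift test, and all names and parameters (`Prototype`, `build_prototype`, `label_window`, `is_drifting`, `gamma`, `max_dist`, `max_outlier_ratio`) are assumptions chosen purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A mixed-type record: (numeric attributes, categorical attributes).
Record = Tuple[Tuple[float, ...], Tuple[str, ...]]


@dataclass
class Prototype:
    """Hypothetical cluster summary: numeric centroid plus categorical modes."""
    centroid: Tuple[float, ...]
    modes: Tuple[str, ...]


def build_prototype(records: Sequence[Record]) -> Prototype:
    """Summarise one cluster of the previous window."""
    nums = [r[0] for r in records]
    cats = [r[1] for r in records]
    centroid = tuple(sum(col) / len(col) for col in zip(*nums))
    modes = tuple(Counter(col).most_common(1)[0][0] for col in zip(*cats))
    return Prototype(centroid, modes)


def dissimilarity(rec: Record, proto: Prototype, gamma: float = 1.0) -> float:
    """Assumed k-prototypes-style measure, not the paper's own."""
    numeric = sum((x - c) ** 2 for x, c in zip(rec[0], proto.centroid))
    categorical = sum(a != m for a, m in zip(rec[1], proto.modes))
    return numeric + gamma * categorical


def label_window(window: Sequence[Record], prototypes: Sequence[Prototype],
                 max_dist: float, gamma: float = 1.0) -> List[int]:
    """Give each record the index of its nearest prototype, or -1 if none is close."""
    labels = []
    for rec in window:
        dists = [dissimilarity(rec, p, gamma) for p in prototypes]
        best = min(range(len(dists)), key=dists.__getitem__)
        labels.append(best if dists[best] <= max_dist else -1)
    return labels


def is_drifting(labels: Sequence[int], max_outlier_ratio: float = 0.3) -> bool:
    """Assumed drift test: too many records fit no previous cluster."""
    if not labels:
        return False
    return list(labels).count(-1) / len(labels) > max_outlier_ratio


# Clusters found in the previous sliding window (toy data).
cluster0 = [((1.0, 1.0), ("a",)), ((1.2, 0.8), ("a",))]
cluster1 = [((5.0, 5.0), ("b",)), ((5.2, 4.8), ("b",))]
prototypes = [build_prototype(cluster0), build_prototype(cluster1)]

# A current window that resembles the previous one, and one that does not.
stable_window = [((1.1, 1.0), ("a",)), ((5.0, 5.1), ("b",))]
drifted_window = [((9.0, 9.0), ("c",)), ((9.1, 8.9), ("c",)), ((1.0, 1.0), ("a",))]

stable_labels = label_window(stable_window, prototypes, max_dist=2.0)
drifted_labels = label_window(drifted_window, prototypes, max_dist=2.0)

print(stable_labels, is_drifting(stable_labels))
print(drifted_labels, is_drifting(drifted_labels))
```

In this sketch a window judged non-drifting would simply inherit the previous labels, while a drifting window would be re-clustered from scratch; the re-clustering step and the cluster-relationship visualization mentioned in the abstract are left out.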
Pages: 973-995
Number of pages: 23