A Framework for Clustering Categorical Time-Evolving Data

Cited by: 43
Authors
Cao, Fuyuan [1]
Liang, Jiye [1]
Bai, Liang [1]
Zhao, Xingwang [1]
Dang, Chuangyin [2]
Affiliations
[1] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Peoples R China
[2] City Univ Hong Kong, Dept Mfg Engn & Engn Management, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Categorical time-evolving data; clusters relationship analysis; data labeling; drifting-concept detecting; K-MEANS ALGORITHM; ROUGH; UNCERTAINTY; REDUCTION; SETS;
DOI
10.1109/TFUZZ.2010.2050891
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A fundamental assumption often made in unsupervised learning is that the problem is static, i.e., the description of the classes does not change with time. However, many practical clustering tasks involve changing environments, so methods and techniques for analyzing evolving trends in such environments are of increasing interest and importance. Although the problem of clustering numerical time-evolving data is well explored, the problem of clustering categorical time-evolving data remains a challenging issue. In this paper, we propose a generalized clustering framework for categorical time-evolving data, which is composed of three algorithms: a drifting-concept detecting algorithm that detects the difference between the current sliding window and the last sliding window; a data-labeling algorithm that decides the most appropriate cluster label for each object of the current sliding window based on the clustering results of the last sliding window; and a cluster-relationship-analysis algorithm that analyzes the relationship between clustering results at different time stamps. The time-complexity analysis indicates that the proposed algorithms are effective for large datasets. Experiments on a real dataset show that the proposed framework not only accurately detects the drifting concepts but also attains clustering results of better quality. Furthermore, compared with the other framework, the proposed one needs fewer parameters, which is favorable for specific applications.
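The three components named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not give their measures, so the sketch assumes a mean total-variation distance between per-attribute value distributions for drift detection, a frequency-based similarity for labeling, and the same distance for comparing clusters across time stamps. All function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def value_distributions(objects):
    """Per-attribute relative frequencies of the categorical values in `objects`."""
    n = len(objects)
    n_attrs = len(objects[0])
    return [
        {v: c / n for v, c in Counter(obj[j] for obj in objects).items()}
        for j in range(n_attrs)
    ]


def window_distance(objects_a, objects_b):
    """Mean total-variation distance between per-attribute distributions (0 to 1).

    Assumed stand-in for the paper's difference measure between windows.
    """
    p, q = value_distributions(objects_a), value_distributions(objects_b)
    dists = []
    for pj, qj in zip(p, q):
        values = set(pj) | set(qj)
        dists.append(0.5 * sum(abs(pj.get(v, 0.0) - qj.get(v, 0.0)) for v in values))
    return sum(dists) / len(dists)


def detect_drift(prev_window, curr_window, threshold=0.5):
    """Flag a drifting concept when the windows differ by more than `threshold`.

    `threshold` is a hypothetical parameter chosen for illustration.
    """
    return window_distance(prev_window, curr_window) > threshold


def label_window(prev_clusters, curr_window):
    """Give each current object the label of the previous cluster it most resembles.

    `prev_clusters` maps a cluster label to the list of objects in that cluster.
    Similarity is the mean relative frequency of the object's values in the cluster.
    """
    summaries = {lab: value_distributions(objs) for lab, objs in prev_clusters.items()}

    def similarity(obj, lab):
        dist = summaries[lab]
        return sum(dist[j].get(v, 0.0) for j, v in enumerate(obj)) / len(obj)

    return [max(summaries, key=lambda lab: similarity(obj, lab)) for obj in curr_window]


def cluster_relationships(prev_clusters, curr_clusters):
    """Distance between every (previous cluster, current cluster) pair."""
    return {
        (a, b): window_distance(objs_a, objs_b)
        for a, objs_a in prev_clusters.items()
        for b, objs_b in curr_clusters.items()
    }
```

In this sketch, a window would be relabeled with `label_window` when `detect_drift` returns `False` and reclustered from scratch otherwise; `cluster_relationships` then relates the old clusters to the new ones. That control flow is likewise an assumption about how such a framework might be wired together.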
Pages: 872-882
Number of pages: 11
Related papers
50 in total
  • [21] A unified incremental updating framework of attribute reduction for two-dimensionally time-evolving data
    Yang, Xin
    Yang, Yuxuan
    Luo, Junfang
    Liu, Dun
    Li, Tianrui
    INFORMATION SCIENCES, 2022, 601 : 287 - 305
  • [22] Koopman-Based Spectral Clustering of Directed and Time-Evolving Graphs
    Klus, Stefan
    Conrad, Natasa Djurdjevac
    JOURNAL OF NONLINEAR SCIENCE, 2023, 33 (01)
  • [23] A bi-clustering framework for categorical data
    Pensa, RG
    Robardet, C
    Boulicaut, JF
    KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005, 2005, 3721 : 643 - 650
  • [24] Co-clustering of time-evolving news story with transcript and keyframe
    Wu, X
    Ngo, CW
    Li, Q
    2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005, : 117 - 120
  • [25] Improving Community Detection in Time-Evolving Networks Through Clustering Fusion
    Jin, Ran
    Kou, Chunhai
    Liu, Ruijuan
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2015, 15 (02) : 63 - 74
  • [26] Equi-Clustream: a framework for clustering time evolving mixed data
    Ravi Sankar Sangam
    Hari Om
    Advances in Data Analysis and Classification, 2018, 12 : 973 - 995
  • [27] Equi-Clustream: a framework for clustering time evolving mixed data
    Sangam, Ravi Sankar
    Om, Hari
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2018, 12 (04) : 973 - 995
  • [28] Asymptotic results with estimating equations for time-evolving clustered data
    Dumitrescu, Laura
    Schiopu-Kratina, Ioana
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2021, 214 : 41 - 61
  • [29] A Framework for Clustering Massive Text and Categorical Data Streams
    Aggarwal, Charu C.
    Yu, Philip S.
    PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 479 - 483
  • [30] Time-evolving interfaces in a Stokes flow
    Kropinski, MCA
    SCIENTIFIC COMPUTING AND APPLICATIONS, 2001, 7 : 83 - 90