A Framework for Clustering Categorical Time-Evolving Data

Cited by: 43
Authors
Cao, Fuyuan [1]
Liang, Jiye [1]
Bai, Liang [1]
Zhao, Xingwang [1]
Dang, Chuangyin [2]
Affiliations
[1] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Peoples R China
[2] City Univ Hong Kong, Dept Mfg Engn & Engn Management, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Categorical time-evolving data; clusters relationship analysis; data labeling; drifting-concept detecting; K-MEANS ALGORITHM; ROUGH; UNCERTAINTY; REDUCTION; SETS;
DOI
10.1109/TFUZZ.2010.2050891
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A fundamental assumption often made in unsupervised learning is that the problem is static, i.e., the description of the classes does not change with time. However, many practical clustering tasks involve changing environments. Methods and techniques for analyzing evolving trends in changing environments are therefore of increasing interest and importance. Although the problem of clustering numerical time-evolving data is well explored, the problem of clustering categorical time-evolving data remains a challenging issue. In this paper, we propose a generalized clustering framework for categorical time-evolving data, which is composed of three algorithms: a drifting-concept detecting algorithm that detects the difference between the current sliding window and the last sliding window, a data-labeling algorithm that decides the most appropriate cluster label for each object of the current sliding window based on the clustering results of the last sliding window, and a cluster-relationship-analysis algorithm that analyzes the relationship between clustering results at different time stamps. The time-complexity analysis indicates that these proposed algorithms are effective for large datasets. Experiments on a real dataset show that the proposed framework not only accurately detects the drifting concepts but also attains clustering results of better quality. Furthermore, compared with the other framework, the proposed one needs fewer parameters, which is favorable for specific applications.
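The sliding-window idea described in the abstract can be illustrated with a small Python sketch. It is not the authors' method: the similarity measure (mean relative frequency of an object's attribute values within a cluster), the thresholds `min_similarity` and `max_outlier_ratio`, and all function names are assumptions chosen for illustration only. It labels each object of the current window using the clusters of the previous window, then flags a possible concept drift when too many objects fit no previous cluster well.

```python
from collections import Counter

def cluster_profile(objects):
    """Per-attribute value counts and size for one cluster of categorical tuples."""
    n_attrs = len(objects[0])
    counters = [Counter(obj[j] for obj in objects) for j in range(n_attrs)]
    return counters, len(objects)

def similarity(obj, profile):
    """Mean relative frequency of the object's values in the cluster (assumed measure)."""
    counters, size = profile
    return sum(counters[j][v] for j, v in enumerate(obj)) / (size * len(obj))

def label_window(window, profiles):
    """Give each object the label of its most similar previous-window cluster."""
    labels = []
    for obj in window:
        best = max(profiles, key=lambda k: similarity(obj, profiles[k]))
        labels.append((best, similarity(obj, profiles[best])))
    return labels

def drift_detected(labels, min_similarity=0.5, max_outlier_ratio=0.3):
    """Flag drift when the share of poorly fitting objects exceeds the ratio (assumed rule)."""
    outliers = sum(1 for _, s in labels if s < min_similarity)
    return outliers / len(labels) > max_outlier_ratio

# Hypothetical clustering result of the previous sliding window.
previous = {
    "A": [("red", "small"), ("red", "small"), ("red", "large")],
    "B": [("blue", "large"), ("blue", "large")],
}
profiles = {name: cluster_profile(objs) for name, objs in previous.items()}

window_same = [("red", "small"), ("blue", "large")]
window_drift = [("green", "medium"), ("green", "tiny"), ("red", "small")]

print(label_window(window_same, profiles))
print(drift_detected(label_window(window_same, profiles)))
print(drift_detected(label_window(window_drift, profiles)))
```

In this sketch, a window that triggers the drift flag would be re-clustered from scratch, while a window that does not would simply inherit the labels; comparing clusters across time stamps, the third component named in the abstract, is left out.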
Pages: 872 - 882
Page count: 11