Dynamic and Static Enhanced BIRCH for Functional Data Clustering

Authors
Li, Wang [1]
Li, Hanfang [1]
Luo, Youxi [1]
Affiliations
[1] Hubei Univ Technol, Sch Sci, Wuhan 430068, Peoples R China
Keywords
Clustering algorithms; Heuristic algorithms; Clustering methods; Principal component analysis; Feature extraction; Prediction algorithms; Data models; Functional data; clustering; BIRCH; dynamic and static information fusion; HIGH-DIMENSIONAL DATA; K-MEANS; DENSITY; ALGORITHM;
DOI
10.1109/ACCESS.2023.3322929
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurate and efficient clustering of large-scale functional data is of utmost importance in the era of big data. However, current research does not fully account for the differentiability inherent in functional data. To address this challenge, we propose a novel method, Dynamic and Static Enhanced-BIRCH (DSE-BIRCH), which incorporates both constant and derivative features to measure the static and dynamic distances between functional samples simultaneously. To this end, a novel matrix factorization-based approach is introduced to transform constant features, extracted through principal component analysis, into derivative features. Subsequently, these two sets of features are fused to form global clustering features, with different weighting coefficients assigned to each to reflect their respective importance. Finally, an enhanced BIRCH algorithm is employed to handle both static and dynamic constraints, enabling hierarchical clustering from a more comprehensive perspective. The mathematical definition of the algorithm is rigorously provided. The superior empirical performance of our method on publicly available and simulated datasets demonstrates that it captures dynamic information effectively and clusters real-world data accurately. Further experiments involving noise and complexity attest to the algorithm's robustness and efficiency, highlighting its broad potential for applications in complex scenarios involving large-scale functional data.
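The fusion of static and dynamic features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the derivative features here come from finite differences (`np.gradient`) rather than the paper's matrix factorization-based transformation, the BIRCH stage is omitted, and the names `pca_scores`, `fused_features`, and `w_static` are hypothetical. Each feature block is scaled by the square root of its weight so that the squared Euclidean distance between fused vectors equals the weighted sum of the squared static and dynamic distances.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project the centered rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def fused_features(curves, t, n_components=2, w_static=0.5):
    """Weighted concatenation of static (curve) and dynamic (derivative) scores.

    Assumption: derivatives are approximated by finite differences, a simple
    stand-in for the matrix factorization step described in the abstract.
    """
    derivs = np.gradient(curves, t, axis=1)
    static = pca_scores(curves, n_components)
    dynamic = pca_scores(derivs, n_components)
    w_dynamic = 1.0 - w_static
    return np.hstack([np.sqrt(w_static) * static,
                      np.sqrt(w_dynamic) * dynamic])


# Synthetic example: two groups of curves with similar levels but
# different rates of change (low versus high frequency).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
slow = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal((10, t.size))
fast = np.sin(6 * np.pi * t) + 0.05 * rng.standard_normal((10, t.size))
curves = np.vstack([slow, fast])

features = fused_features(curves, t, n_components=2, w_static=0.5)
print(features.shape)
```

The resulting `features` array could then be passed to any BIRCH implementation, for example scikit-learn's `sklearn.cluster.Birch`, in place of the enhanced BIRCH variant the paper proposes.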
Pages: 111448-111465
Page count: 18
Related papers
50 in total
  • [1] Accelerating BIRCH for Clustering Large Scale Streaming Data Using CUDA Dynamic Parallelism
    Dong, Jianqiang
    Wang, Fei
    Yuan, Bo
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2013, 2013, 8206 : 409 - 416
  • [2] Two-Stage Sparse Representation Clustering for Dynamic Data Streams
    Chen, Jie
    Wang, Zhu
    Yang, Shengxiang
    Mao, Hua
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (10) : 6408 - 6420
  • [3] Pseudo-quantile functional data clustering
    Kim, Joonpyo
    Oh, Hee-Seok
    JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 178
  • [4] Functional data clustering: a survey
    Jacques, Julien
    Preda, Cristian
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2014, 8 (03) : 231 - 255
  • [5] Robust Functional Manifold Clustering
    Guo, Yi
    Tierney, Stephen
    Gao, Junbin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (02) : 777 - 787
  • [6] CLUSTERING FUNCTIONAL DATA USING WAVELETS
    Antoniadis, Anestis
    Brossat, Xavier
    Cugliari, Jairo
    Poggi, Jean-Michel
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2013, 11 (01)
  • [7] Manifold Enhanced 2-D Fuzzy Subspace Clustering for Image Data
    Shi, Zhaoyin
    Chen, Long
    Chen, Guang-Yong
    Zhao, Kai
    Chen, C. L. Philip
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (02) : 741 - 752
  • [8] MR-BIRCH: A scalable MapReduce-based BIRCH clustering algorithm
    Li, Yufeng
    Jiang, HaiTian
    Lu, Jiyong
    Li, Xiaozhong
    Sun, Zhiwei
    Li, Min
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (03) : 5295 - 5305
  • [9] Dynamic Sparse Subspace Clustering for Evolving High-Dimensional Data Streams
    Sui, Jinping
    Liu, Zhen
    Liu, Li
    Jung, Alexander
    Li, Xiang
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (06) : 4173 - 4186