Tensor-Based Possibilistic C-Means Clustering

Cited by: 1
Authors
Benjamin, Josephine Bernadette M. [1]
Yang, Miin-Shen [2]
Affiliations
[1] Univ Santo Tomas, Dept Math & Phys, Manila 1008, Philippines
[2] Chung Yuan Christian Univ, Dept Appl Math, Taoyuan 32023, Taiwan
Keywords
Tensors; Clustering algorithms; Phase change materials; Arrays; Linear programming; Heuristic algorithms; Euclidean distance; Clustering; possibilistic C-means (PCM); tensor data; tensor decomposition; tensor distance (TD); tensor-based clustering; tensor-based PCM (TPCM); COMPONENT ANALYSIS; BIG DATA; DECOMPOSITIONS; ALGORITHMS; DISTANCE; SHIFT
DOI
10.1109/TFUZZ.2024.3435730
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Current data acquisition techniques make it possible to gather and store extensive datasets, including multidimensional arrays. Recent research focuses on analyzing large datasets of this kind, whose diverse data points can be represented as tensors, that is, multidimensional arrays. Clustering, a data analysis technique, can reveal latent patterns in such datasets. Traditional clustering algorithms such as k-means, fuzzy c-means (FCM), and possibilistic c-means (PCM) have drawbacks when it comes to efficiently delivering high-quality clustering results for tensor or multidimensional array data. This may be because these algorithms are designed primarily for single-view or low-order array datasets, which makes them less suitable for the complexities of multidimensional arrays. In response to this challenge, this article introduces the tensor-based PCM (TPCM) algorithm. TPCM uses a tensor distance (TD) function as its distance metric in place of the usual Euclidean distance. The TD function evaluates the distance between data points and cluster centers by taking into account the relationships among different coordinates. To further enhance the analysis, the canonical polyadic decomposition (CPD) and PARAFAC2 decomposition techniques are used to restructure heterogeneous data into low-order tensors. Our experiments consider two types of datasets: multiview datasets and tensor datasets. CPD is applied to decompose tensor data, while PARAFAC2, a CPD variant, handles multiview data whose feature space size varies from view to view. Through comprehensive illustrations and evaluations on synthetic and real datasets, we demonstrate the superior performance of TPCM. Experimental results show that TPCM consistently achieves higher clustering performance than most existing clustering algorithms.
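As a rough illustration of the two ingredients the abstract names, the sketch below computes a tensor distance that couples neighboring coordinates and a PCM-style typicality value. It is not the authors' implementation. It assumes the TD takes the form introduced by Liu, Liu, and Chan (IEEE Transactions on Neural Networks, 2010), where entry pairs are weighted by a Gaussian of the distance between their index positions, and it assumes the standard PCM typicality formula u = 1 / (1 + (d^2 / eta)^(1/(m-1))). The function names, `sigma`, `eta`, and `m` are illustrative choices, not taken from the article.

```python
import numpy as np

def tensor_distance_metric(shape, sigma=1.0):
    """Metric matrix G with g_lm = exp(-||p_l - p_m||^2 / (2 sigma^2)) / (2 pi sigma^2),
    where p_l is the index position of entry l in a tensor of the given shape."""
    # Index positions of all entries, listed in row-major (C) order.
    coords = np.indices(shape).reshape(len(shape), -1).T.astype(float)
    # Squared Euclidean distance between every pair of index positions.
    sq = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)

def squared_tensor_distance(x, y, G):
    """Squared TD between two tensors of equal shape: diff^T G diff."""
    diff = (np.asarray(x, float) - np.asarray(y, float)).ravel()  # row-major, matches G
    return float(diff @ G @ diff)

def pcm_typicality(sq_dist, eta, m=2.0):
    """Standard PCM typicality for a squared distance, scale eta > 0, fuzzifier m > 1."""
    return 1.0 / (1.0 + (np.asarray(sq_dist, float) / eta) ** (1.0 / (m - 1.0)))

# Usage: typicality of a 2x3 data tensor with respect to a hypothetical cluster center.
x = np.zeros((2, 3))
center = np.ones((2, 3))
G = tensor_distance_metric(x.shape)
u = pcm_typicality(squared_tensor_distance(x, center, G), eta=1.0)
```

Replacing G with the identity matrix recovers the squared Euclidean distance, which is one way to see how the off-diagonal weights bring relationships among coordinates into the metric.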
Pages: 5939-5950
Page count: 12