Tensor-Based Possibilistic C-Means Clustering

Cited by: 1

Authors:

Benjamin, Josephine Bernadette M. [1]

Yang, Miin-Shen [2]

Affiliations:

[1] Univ Santo Tomas, Dept Math & Phys, Manila 1008, Philippines

[2] Chung Yuan Christian Univ, Dept Appl Math, Taoyuan 32023, Taiwan

Source:

IEEE TRANSACTIONS ON FUZZY SYSTEMS | 2024 / Vol. 32 / No. 10

Keywords:

Tensors; Clustering algorithms; Phase change materials; Arrays; Linear programming; Heuristic algorithms; Euclidean distance; Clustering; possibilistic C-means (PCM); tensor data; tensor decomposition; tensor distance (TD); tensor-based clustering; tensor-based PCM (TPCM); COMPONENT ANALYSIS; BIG DATA; DECOMPOSITIONS; ALGORITHMS; DISTANCE; SHIFT;

DOI:

10.1109/TFUZZ.2024.3435730

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Current data acquisition techniques enable the gathering and storage of extensive datasets, including multidimensional arrays. Recent research focuses on the analysis of large datasets with diverse data points. These multidimensional datasets can be represented as tensors or multidimensional arrays. Clustering, a data analysis technique, can be used to reveal latent patterns in these datasets. Traditional clustering algorithms such as k-means, fuzzy c-means (FCM), and possibilistic c-means (PCM) have drawbacks in efficiently delivering high-quality clustering results for tensor or multidimensional array data. This may stem from the fact that these algorithms are primarily designed for single-view or low-array datasets, rendering them less suitable for the complexities of multidimensional arrays. In response to this challenge, this article introduces the tensor-based PCM (TPCM) algorithm. TPCM uses a tensor distance (TD) function as the distance metric instead of the usual Euclidean distance. The TD function evaluates the distance between data points and cluster centers by considering relationships among different coordinates. To further enhance the analysis, the canonical polyadic decomposition (CPD) and PARAFAC2 decomposition techniques are used to restructure heterogeneous data into low-order tensors. Our experiments consider two types of datasets: multiview datasets and tensor datasets. CPD is applied for tensor data decomposition, while PARAFAC2, a CPD variant, addresses multiview data with varying feature space sizes in each view. Through comprehensive illustrations and evaluations using synthetic and real datasets, we demonstrate the superior performance of TPCM. Experimental results reveal that TPCM consistently achieves higher clustering performance than most existing clustering algorithms.
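The abstract does not give formulas, so the following is only a minimal sketch of how a tensor distance could feed a PCM-style typicality update. It assumes the tensor distance form commonly used in the literature, in which the squared distance is a quadratic form whose weights decay with a Gaussian of the index distance between coordinates, together with the classical PCM typicality 1 / (1 + (d^2 / eta)^(1 / (m - 1))). The function names, the bandwidth `sigma`, and the scale `eta` are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def tensor_distance_sq(x, y, sigma=1.0):
    """Squared tensor distance between two equally shaped arrays.

    Assumed form: sum over l, m of g[l, m] * (x_l - y_l) * (x_m - y_m),
    where g[l, m] is a Gaussian of the distance between the index
    positions of elements l and m.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Index position of every element, shape (n_elements, n_modes),
    # in the same C order that ravel() uses below.
    pos = np.indices(x.shape).reshape(x.ndim, -1).T
    # Squared Euclidean distance between every pair of index positions.
    loc_sq = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    # Metric coefficients: neighbouring coordinates interact more strongly.
    g = np.exp(-loc_sq / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    diff = (x - y).ravel()
    return float(diff @ g @ diff)


def pcm_typicality(d_sq, eta, m=2.0):
    """Classical PCM typicality for squared distance d_sq and scale eta."""
    return 1.0 / (1.0 + (d_sq / eta) ** (1.0 / (m - 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    point = rng.normal(size=(3, 4))    # one second-order data point
    center = rng.normal(size=(3, 4))   # one hypothetical cluster center
    d_sq = tensor_distance_sq(point, center)
    print(d_sq, pcm_typicality(d_sq, eta=1.0))
```

Unlike the Euclidean distance, the quadratic form couples each coordinate with its neighbours in the array, which is one way to read the abstract's phrase "considering relationships among different coordinates". The pairwise weight matrix grows with the square of the number of elements, so this direct form is only practical for small tensors.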


Pages: 5939-5950

Number of pages: 12
