Information Theoretical Importance Sampling Clustering and Its Relationship With Fuzzy C-Means

Cited by: 0
Authors
Zhang, Jiangshe [1]
Ji, Lizhen [1]
Wang, Meng [2]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
[2] Northwest China Grid Co Ltd, Xian 710048, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Optimization; Monte Carlo methods; Clustering algorithms; Distortion; Synthetic data; Load forecasting; Clustering methods; Fuzzy c-means (FCM); importance sampling; information theory; minimax principle; OPTIMIZATION; CLASSIFICATION; ALGORITHM;
DOI
10.1109/TFUZZ.2023.3345874
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A common assumption of most clustering methods is that the training data and future data are drawn from the same distribution. However, this assumption may not hold in many real-world scenarios. In this article, we propose an information theoretical importance sampling based approach for clustering problems (ITISC), which minimizes the worst case of expected distortions under a constraint on distribution deviation. The distribution deviation constraint can be converted to a constraint over a set of weight distributions, derived from importance sampling and centered on the uniform distribution. The objective of the proposed approach is to minimize the loss under maximum degradation; hence, the resulting problem is a constrained minimax optimization problem, which can be reformulated as an unconstrained problem using the Lagrange method. The optimization problem can be solved either by an alternating optimization algorithm or by a general optimization routine from commercially available software. Experimental results on synthetic datasets and a real-world load forecasting problem validate the effectiveness of the proposed model. Furthermore, we demonstrate that fuzzy c-means is a special case of ITISC with the logarithmic distortion, and this observation provides an interesting physical interpretation for the fuzzy exponent m.
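For readers unfamiliar with the fuzzy exponent m mentioned in the abstract, the following is a minimal sketch of the standard fuzzy c-means alternating updates (as in Bezdek's formulation), assuming squared Euclidean distance. It is not the ITISC algorithm, and it does not reproduce the paper's derivation; the function name `fuzzy_c_means` and its parameters are hypothetical names chosen for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM alternating updates (illustrative; not ITISC).

    X: (n, d) data array, c: number of clusters, m > 1: fuzzy exponent.
    Returns (centers, U), where U[i, j] is the membership of point i in cluster j.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Center update: means weighted by memberships raised to the power m.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared Euclidean distances between points and centers, shape (n, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ij is proportional to d_ij^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

In this sketch, values of m close to 1 give nearly hard assignments, while larger m gives softer memberships; the abstract's claim is that ITISC offers an interpretation of this parameter.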
Pages: 2164-2175
Number of pages: 12