An improved clustering algorithm based on finite Gaussian mixture model

Cited by: 15
Authors
He, Zhilin [1]
Ho, Chun-Hsing [2]
Affiliations
[1] Yuncheng Univ, Math & Informat Technol Sch, 1155 Fudan West St, Yuncheng, Shanxi, Peoples R China
[2] No Arizona Univ, Dept Civil Engn Construct Management & Environm E, POB 15600, Flagstaff, AZ 86011 USA
Funding
National Natural Science Foundation of China
Keywords
Gaussian mixture model; EM algorithm; Cluster analysis; Likelihood
DOI
10.1007/s11042-018-6988-z
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The finite Gaussian mixture model (FGMM) is the most commonly used model for describing mixed density distributions in cluster analysis. An important property of the FGMM is that it can approximate any continuous distribution arbitrarily closely, provided the model contains enough components. In FGMM-based cluster analysis, the EM algorithm is usually used to estimate the model parameters; its advantages are stable computation and fast convergence. However, the EM algorithm relies heavily on estimates obtained from the incomplete data and uses no additional information to reduce the uncertainty of the missing data. To address this problem, an EM algorithm based on entropy-penalized maximum likelihood estimation is proposed. The new algorithm constructs a conditional entropy model between the incomplete data and the missing data, and uses the incomplete data to reduce the uncertainty of the missing data. Theoretical analysis and experimental results show that the new algorithm adapts effectively to the FGMM, improves the clustering results, and improves the efficiency of the algorithm.
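For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of standard EM for a one-dimensional, two-component Gaussian mixture. It is not the entropy-penalized algorithm proposed in the paper, whose objective is not given in the abstract. The helper `mean_posterior_entropy` reports the average entropy of the posterior responsibilities, which is one common way to quantify the remaining uncertainty about the missing component labels; using it here is an assumption made for illustration. All function names, the synthetic data, and the starting values are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(data, means, variances, weights):
    """Posterior probability (responsibility) of each component for each point."""
    k = len(means)
    resp = []
    for x in data:
        joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        if total == 0.0:
            # Numerical underflow: fall back to uniform responsibilities.
            resp.append([1.0 / k] * k)
        else:
            resp.append([p / total for p in joint])
    return resp


def em_gmm_1d(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Standard EM for a 1-D Gaussian mixture (illustrative baseline only)."""
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        resp = e_step(data, means, variances, weights)
        # M-step: re-estimate each component from its soft counts.
        for j in range(len(means)):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # avoid collapsing components
    return means, variances, weights, e_step(data, means, variances, weights)


def mean_posterior_entropy(resp):
    """Average entropy of the responsibilities in nats (0 means hard labels)."""
    total = 0.0
    for r in resp:
        total -= sum(p * math.log(p) for p in r if p > 0.0)
    return total / len(resp)


# Synthetic data: two well-separated clusters (hypothetical example values).
rng = random.Random(0)
data = [rng.gauss(-2.0, 0.5) for _ in range(200)]
data += [rng.gauss(3.0, 0.8) for _ in range(200)]

means, variances, weights, resp = em_gmm_1d(
    data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
entropy = mean_posterior_entropy(resp)
print("means:", means)
print("weights:", weights)
print("mean posterior entropy (nats):", entropy)
```

A lower mean posterior entropy indicates that the fitted model assigns points to components with less ambiguity. An entropy-penalized variant would, under this reading, add a term of this kind to the likelihood objective, but the exact form used by the authors should be taken from the paper itself.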
Pages: 24285-24299
Page count: 15