DOCUMENTS CLUSTERING BASED ON MAX-CORRENTROPY NONNEGATIVE MATRIX FACTORIZATION

Cited by: 0
Authors
Li, Le [1]
Yang, Jianjun [2]
Xu, Yang [3]
Qin, Zhen [3]
Zhang, Honggang [3]
Affiliations
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
[2] Univ North Georgia, Dept Comp Sci, Oakwood, GA 30566 USA
[3] Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing, Peoples R China
Source
PROCEEDINGS OF 2014 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 2 | 2014
Funding
National Natural Science Foundation of China;
Keywords
Document clustering; Nonnegative matrix factorization;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Nonnegative matrix factorization (NMF) has been successfully applied to many classification and clustering tasks. Commonly used NMF algorithms mainly aim to minimize the ℓ2 distance or the Kullback-Leibler (KL) divergence, which may not be suitable for nonlinear cases. In this paper, we propose a new decomposition method for document clustering that maximizes the correntropy between the original matrix and the product of two low-rank matrices. This method also allows us to learn new basis vectors of the semantic feature space from data. To our knowledge, no existing work clusters high-dimensional document data by maximizing the correntropy in NMF. Our experimental results show that the proposed method outperforms other NMF variants on the Reuters21578 and TDT2 datasets.
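The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the record does not give their update rules, so the code below assumes a Gaussian kernel, element-wise weights, and a generic half-quadratic scheme in which the weights are recomputed from the current residual and then used in standard weighted multiplicative NMF updates. The names `correntropy`, `mcc_nmf`, `sigma`, and `rank` are hypothetical.

```python
import numpy as np


def correntropy(X, X_hat, sigma):
    """Empirical correntropy (Gaussian kernel) averaged over all matrix entries."""
    return float(np.mean(np.exp(-((X - X_hat) ** 2) / (2.0 * sigma ** 2))))


def mcc_nmf(X, rank, sigma=1.0, n_iter=200, eps=1e-10, seed=0):
    """Sketch of correntropy-maximizing NMF, X ~ W @ H with W, H >= 0.

    Assumption: element-wise Gaussian weights and weighted multiplicative
    updates (a half-quadratic heuristic), not the paper's exact method.
    X is a nonnegative (n_features, n_docs) term-document matrix.
    """
    rng = np.random.default_rng(seed)
    n_features, n_docs = X.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_docs)) + eps
    for _ in range(n_iter):
        # Entries with large residuals get small weights, which limits
        # the influence of outliers on the factorization.
        R = np.exp(-((X - W @ H) ** 2) / (2.0 * sigma ** 2))
        # Weighted multiplicative updates keep W and H nonnegative.
        W *= ((R * X) @ H.T) / ((R * (W @ H)) @ H.T + eps)
        H *= (W.T @ (R * X)) / (W.T @ (R * (W @ H)) + eps)
    return W, H
```

Under these assumptions, the columns of `W` play the role of learned basis vectors, and a document could be assigned to a cluster with `H.argmax(axis=0)`. The kernel width `sigma` controls how strongly large residuals are down-weighted and would need tuning for real data.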
Pages: 850-855
Page count: 6