Analysis of Non-Negative Double Singular Value Decomposition Initialization Method on Eigenspace-based Fuzzy C-Means Algorithm For Indonesian Online News Topic Detection

Cited by: 0
Authors
Sutrisman, Raden Trivan [1 ]
Murfi, Hendri [1 ]
Affiliations
[1] Univ Indonesia, Dept Math, Depok 16424, Indonesia
Source
2018 6TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY (ICOICT) | 2018
Keywords
topic detection; fuzzy c-means; eigenspace; initialization;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid growth of online news in Indonesia creates a need for news analysis that extracts information as quickly as possible. Topics are basic components often used to analyze textual data such as news articles. Topic modeling detects topics automatically in large collections of news documents, a task that is difficult to perform manually. One topic modeling approach is the clustering-based method, i.e., Eigenspace-based Fuzzy C-Means (EFCM). EFCM is commonly initialized at random. However, random initialization usually produces different topics on each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method for EFCM. Besides the advantage of being non-random, our simulations show that the NNDSVD method gives better accuracy in terms of interpretability score than the random method.
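The NNDSVD idea mentioned in the abstract can be illustrated with a minimal NumPy sketch that follows the general scheme of Boutsidis and Gallopoulos [4]. It is not the authors' implementation: the function name `nndsvd`, the toy matrix `A`, and the rank `k` are assumptions chosen for illustration, and the abstract does not say how the resulting factors are mapped onto EFCM's initial state, so that step is left out.

```python
import numpy as np

def nndsvd(A, k):
    """Illustrative NNDSVD sketch: deterministic non-negative factors W (m x k), H (k x n)."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    W = np.zeros((A.shape[0], k))
    H = np.zeros((k, A.shape[1]))

    # For a non-negative A the leading singular vectors can be taken
    # non-negative, so their absolute values are used directly.
    W[:, 0] = np.sqrt(S[0]) * np.abs(U[:, 0])
    H[0, :] = np.sqrt(S[0]) * np.abs(Vt[0, :])

    for j in range(1, k):
        x, y = U[:, j], Vt[j, :]
        # Split each singular vector into its positive and negative parts.
        xp, xn = np.maximum(x, 0), np.maximum(-x, 0)
        yp, yn = np.maximum(y, 0), np.maximum(-y, 0)
        xp_norm, yp_norm = np.linalg.norm(xp), np.linalg.norm(yp)
        xn_norm, yn_norm = np.linalg.norm(xn), np.linalg.norm(yn)
        mp, mn = xp_norm * yp_norm, xn_norm * yn_norm

        # Keep whichever pair (positive or negative parts) carries more weight.
        if mp >= mn:
            u, v, un, vn, sigma = xp, yp, xp_norm, yp_norm, mp
        else:
            u, v, un, vn, sigma = xn, yn, xn_norm, yn_norm, mn

        if sigma > 0:  # otherwise leave this column/row at zero
            scale = np.sqrt(S[j] * sigma)
            W[:, j] = scale * u / un
            H[j, :] = scale * v / vn
    return W, H

# Toy non-negative matrix standing in for a word-document matrix (hypothetical data).
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [4.0, 0.0, 5.0],
              [0.0, 6.0, 0.0]])
W1, H1 = nndsvd(A, k=2)
W2, H2 = nndsvd(A, k=2)
same_every_run = np.array_equal(W1, W2) and np.array_equal(H1, H2)
```

Because the factors are computed from the SVD alone, repeated calls on the same matrix return the same starting point, which is the non-randomness property the abstract contrasts with random initialization.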
Pages: 55-60
Number of pages: 6
Related papers
13 entries in total
[1]  
Bezdek J. C., 1981, Pattern recognition with fuzzy objective function algorithms
[2]   Probabilistic Topic Models [J].
Blei, David M. .
COMMUNICATIONS OF THE ACM, 2012, 55 (04) :77-84
[3]   Latent Dirichlet allocation [J].
Blei, DM ;
Ng, AY ;
Jordan, MI .
JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) :993-1022
[4]   SVD based initialization: A head start for nonnegative matrix factorization [J].
Boutsidis, C. ;
Gallopoulos, E. .
PATTERN RECOGNITION, 2008, 41 (04) :1350-1362
[5]  
Burden RL, 2011, Numerical analysis
[6]  
Fitriyani SR, 2016, 2016 4TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY (ICOICT)
[7]  
Lau Jey Han, 2014, P 14 C EUROPEAN CHAP, P530, DOI DOI 10.3115/V1/E14-1056
[8]   Learning the parts of objects by non-negative matrix factorization [J].
Lee, DD ;
Seung, HS .
NATURE, 1999, 401 (6755) :788-791
[9]   Eigenspace-Based Fuzzy C-Means for Sensing Trending Topics in Twitter [J].
Muliawati, T. ;
Murfi, H. .
INTERNATIONAL SYMPOSIUM ON CURRENT PROGRESS IN MATHEMATICS AND SCIENCES 2016 (ISCPMS 2016), 2017, 1862
[10]  
Murfi H., 2014, INT J INTELLIGENT IN, V5