Word Embedding based Clustering to Detect Topics in Social Media

Cited by: 29
Authors
Comito, Carmela [1]
Forestiero, Agostino [1]
Pizzuti, Clara [1]
Affiliations
[1] Nat Res Council Italy CNR, Inst High Performance Comp & Networking ICAR, Arcavacata Di Rende, Italy
Source
2019 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2019) | 2019
Keywords
Social Media; Topic Detection; Word Embedding; Clustering;
DOI
10.1145/3350546.3352518
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Social media are playing an increasingly important role in reporting major events happening in the world. However, detecting events and topics of interest from social media is a challenging task due to the sheer volume of the data and the complex semantics of the language being processed. This paper proposes an online topic-discovery algorithm that incrementally groups short texts by combining their textual content with latent feature vector representations of the words they contain. These vectors are trained on very large corpora to improve the check-in topic mapping learnt on a smaller corpus. Experimental results show that, by using information from the external corpora, the approach obtains significant improvements over classical topic detection methods.
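The idea of incrementally grouping short texts by their word vectors can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic threshold-based online clustering over averaged word vectors, and the names `TOY_EMBEDDINGS`, `embed`, `OnlineClusterer`, and the `threshold` value are all hypothetical choices made for illustration. A real system would load vectors pretrained on a large external corpus instead of the tiny hand-written table used here.

```python
import math

# Hypothetical toy embeddings (assumption): stand-ins for vectors
# pretrained on a very large external corpus.
TOY_EMBEDDINGS = {
    "earthquake": [0.9, 0.1, 0.0],
    "quake": [0.85, 0.15, 0.0],
    "tremor": [0.8, 0.2, 0.1],
    "football": [0.0, 0.9, 0.1],
    "match": [0.1, 0.85, 0.2],
    "goal": [0.05, 0.9, 0.15],
}


def embed(text, embeddings):
    """Average the vectors of the known words; None if no word is known."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class OnlineClusterer:
    """Assigns each incoming text to the most similar cluster, or opens a new one."""

    def __init__(self, embeddings, threshold=0.8):
        self.embeddings = embeddings
        self.threshold = threshold  # assumed similarity cut-off, not from the paper
        self.centroids = []         # one running-mean vector per cluster
        self.sizes = []             # number of texts in each cluster

    def add(self, text):
        """Process one text and return its cluster index (None if not embeddable)."""
        vec = embed(text, self.embeddings)
        if vec is None:
            return None
        best, best_sim = None, -1.0
        for idx, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= self.threshold:
            # Update the centroid as a running mean of its members.
            n = self.sizes[best]
            self.centroids[best] = [
                (c * n + v) / (n + 1) for c, v in zip(self.centroids[best], vec)
            ]
            self.sizes[best] = n + 1
            return best
        # No cluster is similar enough: start a new one.
        self.centroids.append(vec)
        self.sizes.append(1)
        return len(self.centroids) - 1
```

Feeding texts one at a time through `OnlineClusterer.add` mirrors the streaming setting: each text either joins the closest existing group or starts a new one, so the number of topics does not have to be fixed in advance. The threshold controls how readily new topics are opened and would need tuning on real data.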
Pages: 192-199
Page count: 8