Real-Time Novel Event Detection from Social Media

Cited by: 38
Authors
Li, Quanzhi [1]
Nourbakhsh, Armineh [1]
Shah, Sameena [1]
Liu, Xiaomo [1]
Affiliations
[1] Thomson Reuters, Res & Dev, 3 Times Sq, New York, NY 10036 USA
Source
2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017) | 2017
Keywords
event detection; event novelty; novel event; temporal identification; temporal information; semantic class; social media
DOI
10.1109/ICDE.2017.157
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we present a new approach for detecting novel events from social media, specifically Twitter, in real time. An event is usually defined by who, what, where and when, and an event tweet usually contains terms corresponding to these aspects. To exploit this information, we propose a method that incorporates simple semantics by splitting the tweet term space into groups of terms of the same semantic type. These groups are called semantic categories (classes), and each reflects one or more event aspects. The semantic classes are named entity, mention, location, hashtag, verb, noun and embedded link. To group tweets about the same event into the same cluster, similarity is measured by computing class-wise similarities and then aggregating them. Users of a real-time event detection system are usually interested only in novel (new) events, which are happening now or happened a short time ago. To meet this requirement, a temporal identification module filters out event clusters that are about old stories. The clustering module also computes a novelty score for each event cluster, which reflects how novel the event is compared to previous events. We evaluated our event detection method using multiple quality metrics and a large-scale event corpus containing millions of tweets. The experimental results show that the proposed online event detection method achieves state-of-the-art performance. Our experiments also show that the temporal identification module can effectively detect old events.
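The class-wise similarity described in the abstract can be illustrated with a minimal sketch. The abstract names the semantic classes but does not say how the per-class similarities are computed or aggregated, so everything below is an assumption: cosine similarity per class, equal weights (`ASSUMED_WEIGHTS`), a weighted mean as the aggregate, and tweets that arrive with their terms already sorted into classes. The function names are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import sqrt

# The seven semantic classes listed in the abstract.
SEMANTIC_CLASSES = (
    "named_entity", "mention", "location", "hashtag", "verb", "noun", "link",
)

# Hypothetical equal weights; the abstract does not specify any weighting.
ASSUMED_WEIGHTS = {cls: 1.0 for cls in SEMANTIC_CLASSES}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency bags."""
    if not a or not b:
        return 0.0
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)


def aggregate_similarity(x: dict, y: dict, weights: dict = ASSUMED_WEIGHTS) -> float:
    """Weighted mean of class-wise cosine similarities.

    Classes that are empty on both sides are skipped so that they do not
    dilute the score (an assumption, not something the abstract states).
    """
    total = 0.0
    weight_sum = 0.0
    for cls in SEMANTIC_CLASSES:
        bag_x = Counter(x.get(cls, ()))
        bag_y = Counter(y.get(cls, ()))
        if not bag_x and not bag_y:
            continue
        total += weights[cls] * cosine(bag_x, bag_y)
        weight_sum += weights[cls]
    return total / weight_sum if weight_sum else 0.0


# Made-up example tweets, already split into semantic classes.
tweet_a = {"named_entity": ["obama"], "location": ["chicago"], "verb": ["visit"]}
tweet_b = {
    "named_entity": ["obama"],
    "location": ["chicago"],
    "verb": ["arrive"],
    "hashtag": ["potus"],
}
score = aggregate_similarity(tweet_a, tweet_b)
```

In this sketch a tweet would join an existing cluster when its aggregate score against the cluster exceeds some threshold; the threshold and the cluster representation are left out because the abstract does not describe them.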
Pages: 1129-1139
Page count: 11