Big Data Analysis for Event Detection in Microblogs

Cited by: 6
Authors
Cherichi, Soumaya [1]
Faiz, Rim [2]
Affiliations
[1] Univ Tunis, ISG, LARODEC, Tunis, Tunisia
[2] Univ Carthage, IHEC, LARODEC, Tunis, Tunisia
Source
RECENT DEVELOPMENTS IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS | 2016 / Vol. 642
Keywords
Microblogs; Relevant information; NLP; Event detection; Big data;
DOI
10.1007/978-3-319-31277-4_27
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The growing complexity of the Twitter microblogging service in terms of size, number of users, and variety of blogger relationships has generated big data that requires innovative approaches to analyse, extract and detect non-obvious and popular events. In this context, this paper aims to use big data analytics on Twitter to enable real-time event detection. These challenges present a major opportunity for Natural Language Processing (NLP) and Information Extraction (IE) technology to enable new large-scale data-analysis applications. Taking all of these difficulties into account, this paper proposes a new metric to improve search results in microblogs. The metric combines content relevance, tweet relevance and author relevance. The paper also develops an NLP method for extracting the temporal information of events from posts, specifically tweets. Our approach is based on a methodology of temporal marker classes and on a contextual exploration method. To evaluate our model, we built a knowledge management system, using a collection of 10 thousand tweets about current events in 2014 and 2015.
Pages: 309-319
Number of pages: 11