Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news

Cited by: 90
Authors
Kim, Erin Hea-Jin [1]
Jeong, Yoo Kyung [1]
Kim, Yuyoung [1]
Kang, Keun Young [1]
Song, Min [1]
Affiliations
[1] Yonsei Univ, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Ebola; sentiment analysis; text-mining; topic models; ONLINE; TRACKING;
DOI
10.1177/0165551515608733
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The present study investigates the topic coverage and sentiment dynamics of two different media sources, Twitter and news publications, on the hot health issue of Ebola. We conduct content and sentiment analysis by: (1) applying vocabulary control to the collected datasets; (2) employing the n-gram LDA topic modeling technique; (3) adopting entity extraction and entity networks; and (4) introducing the concept of topic-based sentiment scores. With the query term 'Ebola' or 'Ebola virus', we collected 16,189 news articles from 1,006 different publications and 7,106,297 tweets with the Twitter stream API. The experiments indicate that the topic coverage of Twitter is narrower and blurrier than that of the news media. In terms of sentiment dynamics, the life span of sentiment on Twitter is shorter, and its variance smaller, than in the news. In addition, we observe that news articles focus more on event-related entities such as person, organization and location, whereas Twitter covers more time-oriented entities. Based on the results, we report on the characteristics of Twitter and news media as two distinct news outlets in terms of content coverage and sentiment dynamics.
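The abstract names topic-based sentiment scores without defining them, so the following is only a minimal sketch of one plausible reading, not the authors' formula: each document's sentiment is weighted by its topic proportions (as LDA would produce) and averaged per topic. The lexicon `LEXICON`, the functions `doc_sentiment` and `topic_sentiment_scores`, and the toy data are all hypothetical and introduced purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy polarity lexicon (an assumption, not from the paper);
# a real study would use an established sentiment resource.
LEXICON = {
    "fear": -1.0, "death": -1.0, "outbreak": -0.5,
    "cure": 1.0, "recover": 1.0, "hope": 0.5,
}


def doc_sentiment(tokens):
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def topic_sentiment_scores(docs, doc_topic):
    """Topic-weighted mean sentiment (an assumed definition).

    docs:      list of token lists, one per document
    doc_topic: list of {topic_id: proportion} dicts, e.g. from an LDA model
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for tokens, theta in zip(docs, doc_topic):
        s = doc_sentiment(tokens)
        for topic, weight in theta.items():
            weighted_sum[topic] += weight * s
            weight_total[topic] += weight
    return {k: weighted_sum[k] / weight_total[k]
            for k in weight_total if weight_total[k] > 0}


# Toy input: three tokenized documents and made-up topic proportions.
docs = [
    ["ebola", "outbreak", "fear"],
    ["vaccine", "cure", "hope"],
    ["patient", "recover"],
]
doc_topic = [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}, {0: 0.5, 1: 0.5}]
scores = topic_sentiment_scores(docs, doc_topic)
```

Computing such scores per time window and per source would be one way to compare sentiment dynamics between tweets and news articles, again under the assumptions stated above.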
Pages: 763-781
Page count: 19