Signals of Public Opinion in Online Communication: A Comparison of Methods and Data Sources

Cited by: 64
Authors
Gonzalez-Bailon, Sandra [1]
Paltoglou, Georgios [2]
Affiliations
[1] Univ Penn, Annenberg Sch Commun, Philadelphia, PA 19104 USA
[2] Wolverhampton Univ, Sch Math & Comp Sci, Wolverhampton, W Midlands, England
Keywords
content analysis; text mining; sentiment analysis; language formality; information diversity; lexicon-based methods; machine learning; SENTIMENT;
DOI
10.1177/0002716215569192
Chinese Library Classification number
D0 [Political science, political theory];
Discipline classification code
0302 ; 030201 ;
Abstract
This study offers a systematic comparison of automated content analysis tools. The ability of different lexicons to correctly identify affective tone (e.g., positive vs. negative) is assessed in different social media environments. Our comparisons examine the reliability and validity of publicly available, off-the-shelf classifiers. We use datasets from a range of online sources that vary in the diversity and formality of the language used, and we apply different classifiers to extract information about the affective tone in these datasets. We first measure agreement (reliability test) and then compare their classifications with the benchmark of human coding (validity test). Our analyses show that validity and reliability vary with the formality and diversity of the text; we also show that ready-to-use methods leave much space for improvement when analyzing domain-specific content and that a machine-learning approach offers more accurate predictions across communication domains.
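The two tests named in the abstract can be illustrated with a minimal sketch. It assumes that reliability is summarized as chance-corrected agreement between two classifiers (Cohen's kappa) and validity as accuracy against human-coded labels; the abstract does not say which statistics the authors used, so both choices are assumptions. The label lists and the names `lexicon_a`, `lexicon_b`, and `human` are hypothetical toy data, not results from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which two label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Chance-corrected agreement between two classifiers (assumed reliability measure)."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Expected agreement if both classifiers labeled independently
    # according to their own marginal label frequencies.
    p_expected = sum(counts_a[label] * counts_b[label]
                     for label in set(a) | set(b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both classifiers always emit the same single label
    return (p_observed - p_expected) / (1 - p_expected)


def accuracy(predicted, gold):
    """Share of predictions matching the human-coded benchmark (assumed validity measure)."""
    return percent_agreement(predicted, gold)


# Hypothetical toy labels for six messages; not data from the paper.
lexicon_a = ["pos", "neg", "pos", "neg", "pos", "neg"]
lexicon_b = ["pos", "neg", "neg", "neg", "pos", "pos"]
human = ["pos", "neg", "pos", "neg", "neg", "neg"]

print("reliability (kappa, A vs. B):", round(cohens_kappa(lexicon_a, lexicon_b), 3))
print("validity (accuracy, A vs. human):", round(accuracy(lexicon_a, human), 3))
print("validity (accuracy, B vs. human):", round(accuracy(lexicon_b, human), 3))
```

Keeping the two measures separate matters because classifiers can agree with each other and still disagree with human coders, which is why the abstract treats reliability and validity as distinct tests.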
Pages: 95-107
Number of pages: 13