Harnessing Twitter 'Big Data' for Automatic Emotion Identification

Cited by: 171

Authors:

Wang, Wenbo [1]

Chen, Lu [1]

Thirunarayan, Krishnaprasad [1]

Sheth, Amit P. [1]

Affiliations:

[1] Wright State Univ, Kno.e.sis Ctr, Dayton, OH 45435 USA

Source:

PROCEEDINGS OF 2012 ASE/IEEE INTERNATIONAL CONFERENCE ON PRIVACY, SECURITY, RISK AND TRUST AND 2012 ASE/IEEE INTERNATIONAL CONFERENCE ON SOCIAL COMPUTING (SOCIALCOM/PASSAT 2012) | 2012

Funding:

National Science Foundation (USA)

DOI:

10.1109/SocialCom-PASSAT.2012.119

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

User-generated content on Twitter (produced at an enormous rate of 340 million tweets per day) provides a rich source for gleaning people's emotions, which is necessary for a deeper understanding of people's behaviors and actions. Extant studies on emotion identification lack comprehensive coverage of "emotional situations" because they use relatively small training datasets. To overcome this bottleneck, we have automatically created a large emotion-labeled dataset (of about 2.5 million tweets) by harnessing emotion-related hashtags available in the tweets. We have applied two different machine learning algorithms for emotion identification to study the effectiveness of various feature combinations as well as the effect of the size of the training data on the emotion identification task. Our experiments demonstrate that a combination of unigrams, bigrams, sentiment/emotion-bearing words, and part-of-speech information is most effective for gleaning emotions. The highest accuracy (65.57%) is achieved with training data containing about 2 million tweets.
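The hashtag-based labeling and the unigram/bigram features mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hashtag-to-emotion map, the tokenizer, and the function names `label_tweet` and `ngram_features` are hypothetical, and the sentiment/emotion-lexicon and part-of-speech features are left out.

```python
import re
from collections import Counter

# Hypothetical hashtag-to-emotion map (an assumption for illustration;
# the hashtags actually used in the paper are not listed in this record).
EMOTION_HASHTAGS = {
    "#happy": "joy",
    "#sad": "sadness",
    "#angry": "anger",
    "#scared": "fear",
}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return (cleaned_text, emotion), or None if the tweet does not
    carry hashtags for exactly one emotion."""
    tags = [t.lower() for t in HASHTAG_RE.findall(text)]
    emotions = {EMOTION_HASHTAGS[t] for t in tags if t in EMOTION_HASHTAGS}
    if len(emotions) != 1:
        return None  # unlabeled or ambiguous tweet: skip it
    # Drop the label-bearing hashtags so a classifier cannot read the label
    # directly from the text; other hashtags are kept.
    cleaned = HASHTAG_RE.sub(
        lambda m: "" if m.group().lower() in EMOTION_HASHTAGS else m.group(),
        text,
    )
    return " ".join(cleaned.split()), emotions.pop()


def ngram_features(text):
    """Count unigrams and bigrams using a simple lowercase tokenizer."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    feats = Counter(tokens)
    feats.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return feats


if __name__ == "__main__":
    labeled = label_tweet("Finally got the job #happy")
    print(labeled)
    print(ngram_features(labeled[0]))
```

Tweets with no emotion hashtag, or with hashtags for more than one emotion, are skipped here; that is a simplifying choice of this sketch and is not stated in the abstract.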

Pages: 587-592

Page count: 6
