Microblog Sentiment Analysis with Emoticon Space Model

Cited by: 48
Authors
Jiang, Fei [1, 2, 3]
Liu, Yi-Qun [1, 2, 3]
Luan, Huan-Bo [1, 2, 3]
Sun, Jia-Shen [4]
Zhu, Xuan [4]
Zhang, Min [1, 2, 3]
Ma, Shao-Ping [1, 2, 3]
Affiliations
[1] State Key Laboratory of Intelligent Technology and Systems, Beijing 100084, People's Republic of China
[2] Tsinghua National Laboratory for Information Science and Technology, Beijing 100084, People's Republic of China
[3] Department of Computer Science and Technology, Tsinghua University, Beijing 100084, People's Republic of China
[4] Language Computing Lab, Samsung R&D Institute China, Beijing 100028, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
microblog sentiment analysis; emoticon space; polarity classification; subjectivity classification; emotion classification
DOI
10.1007/s11390-015-1587-1
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Emoticons have been widely employed to express different types of moods, emotions, and feelings in microblog environments. They are therefore regarded as one of the most important signals for microblog sentiment analysis. Most existing studies use several emoticons that convey clear emotional meanings as noisy sentiment labels or similar sentiment indicators. However, in practical microblog environments, tens or even hundreds of emoticons are frequently adopted and all emoticons have their own unique emotional meanings. Besides, a considerable number of emoticons do not have clear emotional meanings. An improved sentiment analysis model should not overlook these phenomena. Instead of manually assigning sentiment labels to several emoticons that convey relatively clear meanings, we propose the emoticon space model (ESM) that leverages more emoticons to construct word representations from a massive amount of unlabeled data. By projecting words and microblog posts into an emoticon space, the proposed model helps identify subjectivity, polarity, and emotion in microblog environments. The experimental results for a public microblog benchmark corpus (NLP&CC 2013) indicate that ESM effectively leverages emoticon signals and outperforms previous state-of-the-art strategies and benchmark best runs.
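The projection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the details. The sketch assumes that words and emoticons already have vectors in a shared embedding space, that a word's coordinate along each emoticon is its cosine similarity to that emoticon's vector, and that a post is represented by averaging the coordinates of its words. The function names, the toy vectors, and the choice of averaging are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project_word(word_vec, emoticon_vecs):
    """Assumed projection: one coordinate per emoticon, given by cosine similarity."""
    return [cosine(word_vec, e) for e in emoticon_vecs]


def project_post(tokens, embeddings, emoticon_vecs):
    """Assumed post representation: mean of the projected coordinates of known words."""
    coords = [project_word(embeddings[t], emoticon_vecs)
              for t in tokens if t in embeddings]
    if not coords:
        return [0.0] * len(emoticon_vecs)
    return [sum(col) / len(coords) for col in zip(*coords)]


# Toy data (hypothetical): 2-dimensional embeddings and two emoticon axes.
embeddings = {"happy": [1.0, 0.0], "sad": [0.0, 1.0], "day": [1.0, 1.0]}
emoticon_vecs = [[1.0, 0.0], [0.0, 1.0]]  # stand-ins for a positive and a negative emoticon

post_vec = project_post(["happy", "day", "unknown"], embeddings, emoticon_vecs)
```

In this toy case the post lands closer to the first emoticon axis than to the second. A vector like post_vec could then be passed to any standard classifier for subjectivity, polarity, or emotion classification.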
Pages: 1120-1129
Page count: 10