A Statistical Parsing Framework for Sentiment Classification

Cited by: 26
Authors
Dong, Li [1, 2]
Wei, Furu [3 ]
Liu, Shujie [3 ]
Zhou, Ming [3 ]
Xu, Ke [1 ]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Microsoft Res Asia, Nat Language Comp Grp, Beijing 100080, Peoples R China
Keywords
Formal Description - Polarity models - Sentence level - Sentiment classification - Statistical parser - Statistical parsing - Syntactic annotation - Syntactic parsing
DOI
10.1162/COLI_a_00221
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a statistical parsing framework for sentence-level sentiment classification in this article. Unlike previous works that use syntactic parsing results for sentiment analysis, we develop a statistical parser to directly analyze the sentiment structure of a sentence. We show that complicated phenomena in sentiment analysis (e.g., negation, intensification, and contrast) can be handled the same way as simple and straightforward sentiment expressions in a unified and probabilistic way. We formulate the sentiment grammar upon Context-Free Grammars (CFGs), and provide a formal description of the sentiment parsing framework. We develop the parsing model to obtain possible sentiment parse trees for a sentence, from which the polarity model is proposed to derive the sentiment strength and polarity, and the ranking model is dedicated to selecting the best sentiment tree. We train the parser directly from examples of sentences annotated only with sentiment polarity labels but without any syntactic annotations or polarity annotations of constituents within sentences. Therefore we can obtain training data easily. In particular, we train a sentiment parser, s.parser, from a large amount of review sentences with users' ratings as rough sentiment polarity labels. Extensive experiments on existing benchmark data sets show significant improvements over baseline sentiment classification approaches.
Pages: 293-336
Page count: 44