A Sentiment Classification Model Using Group Characteristics of Writing Style Features

Cited by: 6
Authors
Zhao, Huan [1 ]
Zhang, Xixiang [1 ]
Li, Keqin [2 ]
Affiliations
[1] Hunan Univ, Sch Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Keywords
IMDb; machine learning; sentiment classification; writing style; WORDS;
DOI
10.1142/S021800141756016X
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sentiment analysis is becoming increasingly important mainly because of the growth of web comments. Sentiment polarity classification is a popular process in this field. Writing style features, such as lexical and word-based features, are often used in the authorship identification and gender classification of online messages. However, writing style features were only used in feature selection for sentiment classification. This research presents an exploratory study of the group characteristics of writing style features on the Internet Movie Database (IMDb) movie sentiment data set. Furthermore, this study utilizes the specific group characteristics of writing style in improving the performance of sentiment classification. We determine the optimum clustering number of user reviews based on writing style features distribution. According to the classification model trained on a training subset with specific writing style clustering tags, we determine that the model trained on the data set of a specific writing style group has an optimal effect on the classification accuracy, which is better than the model trained on the entire data set in a particular positive or negative polarity. Through the polarity characteristics of specific writing style groups, we propose a general model in improving the performance of the existing classification approach. Results of the experiments on sentiment classification using the IMDb data set demonstrate that the proposed model improves the performance in terms of classification accuracy.
Pages: 19