Text feature selection for sentiment classification of Chinese online reviews

Cited by: 22
Authors
Wang, Hongwei [1]
Yin, Pei [1]
Yao, Jiani [1]
Liu, James N. K. [2]
Affiliations
[1] Tongji Univ, Sch Econ & Management, Shanghai 200092, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Keywords
feature selection method; text classification; sentiment classification; Chinese online reviews;
DOI
10.1080/0952813X.2012.721139
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To meet the demand for customised services in online communities, sentiment classification has been applied to unstructured online reviews to identify users' opinions on particular products. This article selects features for sentiment classification of Chinese online reviews using techniques that perform well in traditional text classification. First, adjectives, adverbs and verbs are identified as candidate text features carrying sentiment information. Then, four statistical feature selection methods, namely document frequency (DF), information gain (IG), chi-squared statistic (CHI) and mutual information (MI), are used to select features. Next, Boolean weighting is applied to set feature weights and construct a vector space model. Finally, a support vector machine (SVM) classifier is used to predict the sentiment polarity of online reviews. Comparative experiments are conducted on Chinese-language online hotel reviews. The results indicate that the highest classification accuracy is achieved when adjectives, adverbs and verbs are used together as features. In addition, the feature selection methods differ in performance: DF performs best, followed by CHI and then IG, whereas MI is not suitable for sentiment classification of Chinese online reviews. These findings should help improve the accuracy of sentiment classification and inform further research.
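The selection and weighting steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the function names (`term_scores`, `select_top_k`, `boolean_vector`) and the use of the standard textbook definitions of DF, CHI, MI and IG for a two-class problem are all assumptions made here for illustration. English glosses stand in for segmented Chinese words that would already have been filtered to adjectives, adverbs and verbs.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def pmi(joint, class_total, df, n):
    """Pointwise mutual information between a term and one class."""
    if joint == 0:
        return float("-inf")
    return math.log(joint * n / (class_total * df))

def term_scores(docs, labels, positive="pos"):
    """Score every term by DF, CHI, MI and IG on a two-class corpus.

    docs   : list of token lists (hypothetical, pre-segmented reviews)
    labels : one class label per document
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    doc_sets = [set(d) for d in docs]
    scores = {}
    for t in sorted(set().union(*doc_sets)):
        # 2x2 contingency table: a, b = term present in pos/neg documents;
        # c, d = term absent in pos/neg documents.
        a = sum(1 for s, y in zip(doc_sets, labels) if t in s and y == positive)
        b = sum(1 for s, y in zip(doc_sets, labels) if t in s and y != positive)
        c, d = n_pos - a, n_neg - b
        df = a + b
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        chi = n * (a * d - c * b) ** 2 / denom if denom else 0.0
        # MI is taken as the maximum over the two classes (one common choice).
        mi = max(pmi(a, n_pos, df, n), pmi(b, n_neg, df, n))
        ig = (entropy([n_pos, n_neg])
              - (df / n) * entropy([a, b])
              - ((n - df) / n) * entropy([c, d]))
        scores[t] = {"df": df, "chi": chi, "mi": mi, "ig": ig}
    return scores

def select_top_k(scores, method, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda t: (-scores[t][method], t))
    return ranked[:k]

def boolean_vector(doc, features):
    """Boolean weighting: 1 if the feature occurs in the document, else 0."""
    present = set(doc)
    return [1 if f in present else 0 for f in features]

# Hypothetical toy corpus of four hotel reviews.
docs = [
    ["clean", "comfortable", "like"],
    ["clean", "satisfied"],
    ["dirty", "noisy", "disappointed"],
    ["noisy", "clean"],
]
labels = ["pos", "pos", "neg", "neg"]

scores = term_scores(docs, labels)
features = select_top_k(scores, "df", 2)
vectors = [boolean_vector(d, features) for d in docs]
```

The resulting Boolean vectors, together with `labels`, could then be passed to any SVM implementation to train the polarity classifier; that step is left out here to keep the sketch dependency-free.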
Pages: 425-439
Page count: 15