Text feature selection for sentiment classification of Chinese online reviews

Cited by: 22
Authors
Wang, Hongwei [1]
Yin, Pei [1]
Yao, Jiani [1]
Liu, James N. K. [2]
Affiliations
[1] Tongji Univ, Sch Econ & Management, Shanghai 200092, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Keywords
feature selection method; text classification; sentiment classification; Chinese online reviews;
DOI
10.1080/0952813X.2012.721139
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To meet the demand for customised services in online communities, sentiment classification has been applied to unstructured online reviews to identify users' opinions on particular products. The purpose of this article is to select features for sentiment classification of Chinese online reviews using techniques that perform well in traditional text classification. First, adjectives, adverbs and verbs are identified as the potential text features that carry sentiment information. Then, four statistical feature selection methods, namely document frequency (DF), information gain (IG), chi-squared statistic (CHI) and mutual information (MI), are used to select features. After that, the Boolean weighting method is applied to set feature weights and construct a vector space model. Finally, a support vector machine (SVM) classifier is employed to predict the sentiment polarity of online reviews. Comparative experiments are conducted on Chinese online hotel reviews. The results indicate that the highest classification accuracy is achieved by taking adjectives, adverbs and verbs together as features. In addition, the feature selection methods perform differently on sentiment classification: DF performs best, CHI follows and IG ranks last of the three, whereas MI is not suitable for sentiment classification of Chinese online reviews. These conclusions can help improve the accuracy of sentiment classification and inform further research.
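The four selection statistics and the Boolean weighting step named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy corpus, the English stand-in tokens, the function names, and the choice of the common textbook forms of DF, CHI, MI and IG computed from a term/class contingency table. The article's exact formulas, part-of-speech filtering and SVM training are not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy corpus: already-segmented reviews with polarity labels.
# English tokens stand in for segmented Chinese words.
DOCS = [
    (["clean", "comfortable", "recommend"], "pos"),
    (["clean", "friendly", "like"], "pos"),
    (["comfortable", "very", "like"], "pos"),
    (["dirty", "noisy", "very"], "neg"),
    (["dirty", "rude", "complain"], "neg"),
    (["noisy", "very", "complain"], "neg"),
]

def contingency(term, cls, docs):
    """Count documents: (term & cls, term & other, no term & cls, no term & other)."""
    a = b = c = d = 0
    for tokens, label in docs:
        has_term = term in tokens
        if has_term and label == cls:
            a += 1
        elif has_term:
            b += 1
        elif label == cls:
            c += 1
        else:
            d += 1
    return a, b, c, d

def df_score(term, docs):
    """Document frequency: number of documents containing the term."""
    return sum(1 for tokens, _ in docs if term in tokens)

def chi_score(term, cls, docs):
    """Chi-squared statistic of a term with respect to one class."""
    a, b, c, d = contingency(term, cls, docs)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def mi_score(term, cls, docs):
    """Pointwise mutual information log(P(t, c) / (P(t) * P(c)))."""
    a, b, c, d = contingency(term, cls, docs)
    if a == 0:
        return float("-inf")  # term never co-occurs with the class
    n = a + b + c + d
    return math.log(a * n / ((a + c) * (a + b)))

def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log2(k / total) for k in counts if k)

def ig_score(term, docs):
    """Information gain: H(C) - P(t) H(C|t) - P(not t) H(C|not t)."""
    labels = Counter(label for _, label in docs)
    with_t = Counter(label for tokens, label in docs if term in tokens)
    without_t = Counter(label for tokens, label in docs if term not in tokens)
    n = len(docs)
    n_t = sum(with_t.values())
    return (entropy(list(labels.values()))
            - n_t / n * entropy(list(with_t.values()))
            - (n - n_t) / n * entropy(list(without_t.values())))

def select_by_df(docs, k):
    """Keep the k terms with the highest DF (ties broken alphabetically)."""
    vocab = {t for tokens, _ in docs for t in tokens}
    return sorted(vocab, key=lambda t: (-df_score(t, docs), t))[:k]

def boolean_vector(tokens, features):
    """Boolean weighting: 1 if the feature occurs in the review, else 0."""
    present = set(tokens)
    return [1 if f in present else 0 for f in features]

if __name__ == "__main__":
    features = select_by_df(DOCS, 3)
    print(features)
    print([boolean_vector(tokens, features) for tokens, _ in DOCS])
```

In a full pipeline, the Boolean vectors produced this way would be the inputs to an SVM classifier; that step is left out here to keep the sketch free of external dependencies.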
Pages: 425 - 439
Number of pages: 15
Related papers
  • [1] Feature Selection for Chinese Online Reviews Sentiment Classification
    Chen, Xian
    Ma, Jing
    Lu, Yueming
    2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PROBLEM-SOLVING (ICCP), 2013, : 79 - 82
  • [2] Sentimental feature selection for sentiment analysis of Chinese online reviews
    Zheng, Lijuan
    Wang, Hongwei
    Gao, Song
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2018, 9 (01) : 75 - 84
  • [3] Sentimental feature selection for sentiment analysis of Chinese online reviews
    Lijuan Zheng
    Hongwei Wang
    Song Gao
    International Journal of Machine Learning and Cybernetics, 2018, 9 : 75 - 84
  • [4] A hybrid method of feature selection for Chinese text sentiment classification
    Wang, Suge
    Wei, Yingjie
    Li, Deyu
    Zhang, Wu
    Li, Wei
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 3, PROCEEDINGS, 2007, : 435 - +
  • [5] Sentiment Feature Identification from Chinese Online Reviews
    Yao, Jiani
    Wang, Hongwei
    Yin, Pei
    ADVANCES IN INFORMATION TECHNOLOGY AND EDUCATION, PT I, 2011, 201 : 315 - 322
  • [6] Sentiment Analysis in Online Reviews Classification using Text Mining Techniques
    Agueda, M.
    Rita, P.
    Guerreiro, P.
    2019 14TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI), 2019,
  • [7] Sentiment Analysis of Movie Reviews: A study on Feature Selection & Classification Algorithms
    Sahu, Tirath Prasad
    Ahuja, Sanjeev
    2016 INTERNATIONAL CONFERENCE ON MICROELECTRONICS, COMPUTING AND COMMUNICATIONS (MICROCOM), 2016,
  • [8] Sentiment classification of Chinese online reviews: a comparison of factors influencing performances
    Wang, Hongwei
    Zheng, Lijuan
    ENTERPRISE INFORMATION SYSTEMS, 2016, 10 (02) : 228 - 244
  • [9] Feature selection and text classification for Chinese web documents
    Xu, JC
    Liu, DY
    Hu, M
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 1304 - 1309
  • [10] Feature selection and machine learning algorithms for Uyghur text sentiment classification
    Turhuntay, Raxida
    Slamu, Wushour
    Dawut, Abdusalam
    Hamdulla, Askar
    Turhun, Erxat
    Boletin Tecnico/Technical Bulletin, 2017, 55 (13) : 56 - 66