Leveraging sentiment analysis at the aspects level to predict ratings of reviews

Cited by: 40
Authors
Qiu, Jiangtao [1, 2, 3]
Liu, Chuanhui [4]
Li, Yinghong [5]
Lin, Zhangxi [6, 7]
Affiliations
[1] Southwestern Univ Finance & Econ, Sch Informat, Chengdu, Sichuan, Peoples R China
[2] Southwestern Univ Finance & Econ, Res Ctr Big Data, Chengdu, Sichuan, Peoples R China
[3] Key Lab Financial Intelligence & Financial Engn S, Chengdu, Sichuan, Peoples R China
[4] Southwestern Univ Finance & Econ, Sch Econ, Chengdu, Sichuan, Peoples R China
[5] Southwestern Univ Finance & Econ, Sch Humanities, Chengdu, Sichuan, Peoples R China
[6] Xihua Univ, Sch Econ, Chengdu, Sichuan, Peoples R China
[7] Texas Tech Univ, Rawls Coll Business Adm, Lubbock, TX 79409 USA
Funding
National Natural Science Foundation of China; Science Foundation of the Ministry of Education of China
Keywords
Sentiment analysis; Class imbalance; Ratings of reviews; Business intelligence
DOI
10.1016/j.ins.2018.04.009
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Online reviews are an important asset for users who are deciding whether to buy a product, see a movie, or go to a restaurant, and for managers who are making business decisions. Reviews on e-commerce websites usually carry ratings, which make them easier for users to learn from. However, many reviews spread across forums and social media are written in plain text without ratings; this paper calls them non-rated reviews. From the perspective of aspect-level sentiment analysis, this study develops a predictive framework for calculating ratings for non-rated reviews. The framework starts from an observation: the sentiment of an aspect is determined by its context, and the rating of a review depends on the sentiment of its aspects and on the number of positive and negative aspects it contains. Viewing term pairs that co-occur with aspects as their context, we devised a variant of the Conditional Random Field model, called SentiCRF, for generating term pairs and calculating their sentiment scores from a training set. We then developed a cumulative logit model that uses the aspects in a review and their sentiments to predict the review's rating. Calculating the sentiment scores of term pairs also raised the challenge of class imbalance, which we tackled with a heuristic re-sampling algorithm. Experiments on the Yelp dataset demonstrate that the predictive framework is feasible and effective at predicting the ratings of reviews. Published by Elsevier Inc.
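Illustrative note (not part of the published abstract): the cumulative logit step mentioned above can be sketched as follows. This is a minimal sketch of a generic cumulative logit (proportional odds) model, not the authors' implementation. The feature layout (counts of positive and negative aspects), the weights, and the thresholds are all hypothetical values chosen for illustration; in practice they would be estimated from rated reviews.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def rating_probabilities(features, weights, thresholds):
    """Return [P(rating = 1), ..., P(rating = K)] for K = len(thresholds) + 1.

    Cumulative logit model: P(rating <= k | x) = sigmoid(thresholds[k-1] - w.x).
    The thresholds must be strictly increasing so every probability is positive.
    """
    score = sum(w * x for w, x in zip(weights, features))
    # Cumulative probabilities for k = 1..K-1, plus P(rating <= K) = 1.
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(rating = k) = P(<= k) - P(<= k-1)
        previous = c
    return probs


def predict_rating(features, weights, thresholds):
    """Return the most probable rating on a 1..K scale."""
    probs = rating_probabilities(features, weights, thresholds)
    return 1 + max(range(len(probs)), key=probs.__getitem__)


# Hypothetical parameters (assumptions, not values from the paper):
# features = (number of positive aspects, number of negative aspects)
WEIGHTS = [0.8, -0.9]
THRESHOLDS = [-2.0, -0.7, 0.7, 2.0]  # four cut points for a 5-star scale

if __name__ == "__main__":
    for review in ([3, 0], [1, 1], [0, 3]):
        print(review, predict_rating(review, WEIGHTS, THRESHOLDS))
```

With these assumed parameters, a review with more positive aspects gets a higher linear score, which shifts probability mass toward the higher ratings; a review dominated by negative aspects shifts it toward the lower ones.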
Pages: 295-309
Number of pages: 15
Related papers
28 in total
[1]  
[Anonymous], 2001, PROC 18 INT C MACH L
[2]  
[Anonymous], 2006, Proceedings of the LREC-06, 5th conference on language resources and evaluation, noeth, DOI 10.1155/2015/715730
[3]   A hybrid approach to the sentiment analysis problem at the sentence level [J].
Appel, Orestes ;
Chiclana, Francisco ;
Carter, Jenny ;
Fujita, Hamido .
KNOWLEDGE-BASED SYSTEMS, 2016, 108 :110-124
[4]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[5]   Learning User and Product Distributed Representations Using a Sequence Model for Sentiment Analysis [J].
Chen, Tao ;
Xu, Ruifeng ;
He, Yulan ;
Xia, Yunqing ;
Wang, Xuan .
IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2016, 11 (03) :35-45
[6]   Evaluation of classifiers for an uneven class distribution problem [J].
Daskalaki, S ;
Kopanas, I ;
Avouris, N .
APPLIED ARTIFICIAL INTELLIGENCE, 2006, 20 (05) :381-417
[7]   A decision-theoretic generalization of on-line learning and an application to boosting [J].
Freund, Y ;
Schapire, RE .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1997, 55 (01) :119-139
[8]   On the effectiveness of preprocessing methods when dealing with different levels of class imbalance [J].
Garcia, V. ;
Sanchez, J. S. ;
Mollineda, R. A. .
KNOWLEDGE-BASED SYSTEMS, 2012, 25 (01) :13-21
[9]  
Kalchbrenner N., 2014, SENTENCE MODEL BASED
[10]   Sentiment classification of movie reviews using contextual valence shifters [J].
Kennedy, Alistair ;
Inkpen, Diana .
COMPUTATIONAL INTELLIGENCE, 2006, 22 (02) :110-125