Semantic Features for Optimizing Supervised Approach of Sentiment Analysis on Product Reviews

Cited by: 8
Authors
Rintyarna, Bagus Setya [1, 2]
Sarno, Riyanarto [1 ]
Fatichah, Chastine [1 ]
Affiliations
[1] Inst Teknol Sepuluh Nopember, Dept Informat, Surabaya 60111, Indonesia
[2] Univ Muhammadiyah Jember, Dept Elect Engn, Jember 68124, Indonesia
Keywords
sentiment analysis; product reviews; machine learning; INFORMATION
DOI
10.3390/computers8030055
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The growth of e-commerce has made online reviews a rich source of product information. Revealing consumer sentiment from these reviews through Sentiment Analysis (SA) is an important task in online product review analysis. Two popular approaches to SA are the supervised approach and the lexicon-based approach. In the supervised approach, the machine learning (ML) algorithm employed is not the only factor that influences the results of SA; the text features used also play an important role in determining the performance of SA tasks. In this regard, we propose a method for extracting text features that takes the semantics of words into account. We argue that these semantic features can improve the results of supervised SA tasks compared with commonly used features, i.e., bag-of-words (BoW). To extract the features, we assigned the correct sense to each word in a review sentence by adopting a Word Sense Disambiguation (WSD) technique. Several WordNet similarity algorithms were used, and the correct sentiment values were assigned to the words. Accordingly, we generated text features for product review documents. To evaluate the performance of our text features in the supervised approach, we conducted experiments using several ML algorithms and feature selection methods. The results of the experiments using 10-fold cross-validation indicated that our proposed semantic features increased the performance of SA by 10.9%, 9.2%, and 10.6% in precision, recall, and F-Measure, respectively, compared with baseline methods.
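The general idea of sense-aware features can be illustrated with a minimal sketch. It is not the authors' method: the abstract mentions WordNet similarity algorithms, whereas this sketch swaps in a simplified gloss-overlap (Lesk-style) rule so that it runs without external data. The sense inventory `SENSES`, its glosses, its sentiment scores, and the helper names `tokenize`, `disambiguate`, and `semantic_features` are all hypothetical and invented for illustration.

```python
from collections import Counter

# Hypothetical toy sense inventory: word -> [(sense_id, gloss, pos_score, neg_score)].
# Glosses and scores are invented for this sketch; they are NOT taken from
# WordNet or SentiWordNet.
SENSES = {
    "cheap": [
        ("cheap.low_price", "low price inexpensive good value money", 0.5, 0.0),
        ("cheap.poor_quality", "poor quality flimsy plastic badly made", 0.0, 0.75),
    ],
    "light": [
        ("light.low_weight", "low weight easy carry portable", 0.375, 0.0),
        ("light.illumination", "lamp brightness illumination screen glow", 0.0, 0.0),
    ],
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return [t for t in tokens if t]


def disambiguate(word, context_tokens):
    """Pick the sense whose gloss shares the most tokens with the context.

    This is a simplified Lesk-style rule (an assumption of this sketch).
    Ties are resolved in favor of the first listed sense.
    """
    context = set(context_tokens) - {word}
    best, best_overlap = None, -1
    for sense in SENSES[word]:
        overlap = len(context & set(sense[1].split()))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best


def semantic_features(review):
    """Map a review to {sense_id: accumulated (positive - negative) score}."""
    tokens = tokenize(review)
    features = Counter()
    for tok in tokens:
        if tok in SENSES:
            sense_id, _gloss, pos, neg = disambiguate(tok, tokens)
            features[sense_id] += pos - neg
    return dict(features)


if __name__ == "__main__":
    print(semantic_features("The case feels cheap, flimsy plastic and badly made."))
    print(semantic_features("Really cheap price and good value for the money."))
```

In this toy setup the word "cheap" contributes a different feature and a different sentiment value depending on its context, which a plain bag-of-words representation would collapse into a single token count.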
Pages: 16