Cross-D-vectorizers: a set of feature-spaces for cross-domain sentiment analysis from consumer review

Authors
Atanu Dey
Mamata Jenamani
Jitesh J. Thakkar
Affiliation
[1] Indian Institute of Technology Kharagpur
Source
Multimedia Tools and Applications | 2019 / Vol. 78
Keywords
Sentiment analysis; N-grams; Cross domain; Maximum entropy; TFIDF
Abstract
Supervised sentiment classification approaches require labeled training (source) and testing (target) datasets. Generating such datasets demands substantial time and effort, but cross-domain classification reduces that effort by drawing the source and target datasets from two different domains. In this paper, we propose Cross-D-Vectorizers, a set of three sentiment n-gram feature-spaces (Lexical-TFIDF, Lex-Delta-TFIDF and SEND) for cross-domain analysis. We construct the features by extracting sentiment unigrams, in combination with intensifiers and negations, from the source dataset. Using an existing lexicon, we compute the feature scores by three different procedures: the sentiment value of each feature is multiplied by the corresponding TFIDF rating, Delta-TFIDF rating, or feature-importance-value (FIV), respectively. The importance-value of each SEND (Sentiment wEight of N-grams in Dataset) feature is calculated by multiplying the number of times the feature appears in the review by the logarithm of its inverse frequency in the corpus. We experiment with Maximum Entropy, Support Vector Machine and K-Nearest Neighbors classifiers on three benchmark datasets and one proposed dataset for cross-domain classification. The proposed approach shows improved results in comparison with existing methods. Its advantage is that system complexity is reduced by treating sentiment n-grams, rather than arbitrary n-grams, as domain-independent features.
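The SEND scoring described in the abstract can be illustrated with a minimal Python sketch. This is one reading of the abstract's wording, not the authors' implementation: it assumes "inverse frequency in the corpus" means the number of reviews divided by the number of reviews containing the feature, and the lexicon values, the greedy bigram matcher, and the function names `extract_features` and `send_style_vectors` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (sentiment n-gram -> polarity); not taken from the paper.
LEXICON = {"good": 0.6, "very good": 0.9, "not good": -0.6, "bad": -0.7}


def extract_features(review, lexicon):
    """Count lexicon unigrams and bigrams in a whitespace-tokenized review.

    A bigram such as an intensifier or negation plus a sentiment word
    takes precedence over the unigram it contains.
    """
    tokens = review.lower().split()
    counts = Counter()
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and bigram in lexicon:
            counts[bigram] += 1
            i += 2
        else:
            if tokens[i] in lexicon:
                counts[tokens[i]] += 1
            i += 1
    return counts


def send_style_vectors(reviews, lexicon):
    """Score each feature as sentiment value * count in review * log(N / df).

    Assumption: N is the number of reviews and df is the number of
    reviews that contain the feature.
    """
    per_review = [extract_features(r, lexicon) for r in reviews]
    n_reviews = len(reviews)
    doc_freq = Counter()
    for counts in per_review:
        doc_freq.update(counts.keys())
    return [
        {f: lexicon[f] * c * math.log(n_reviews / doc_freq[f])
         for f, c in counts.items()}
        for counts in per_review
    ]
```

Under this reading, a feature that occurs in every review receives an importance-value of zero, because the logarithm of 1 is 0. The abstract does not say whether the paper applies any smoothing to avoid that.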
Pages: 23141-23159
Page count: 18