Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks

Cited by: 308
Authors
Onan, Aytug [1]
Affiliations
[1] Izmir Katip Celebi Univ, Fac Engn & Architecture, Dept Comp Engn, TR-35620 Izmir, Turkey
Keywords
deep learning; LSTM; sentiment analysis; weighted word embeddings
DOI
10.1002/cpe.5909
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Sentiment analysis is one of the major tasks of natural language processing, in which attitudes, thoughts, opinions, or judgments toward a particular subject are extracted. The Web is a rich, unstructured source of information containing many text documents with opinions and reviews. Recognizing sentiment can be helpful for individual decision makers, business organizations, and governments. In this article, we present a deep learning-based approach to sentiment analysis on product reviews obtained from Twitter. The presented architecture combines TF-IDF weighted GloVe word embeddings with a CNN-LSTM architecture. The CNN-LSTM architecture consists of five layers: a weighted embedding layer, a convolution layer (with 1-gram, 2-gram, and 3-gram convolutions), a max-pooling layer, an LSTM layer, and a dense layer. In the empirical analysis, the predictive performance of different word embedding schemes (i.e., word2vec, fastText, GloVe, LDA2vec, and DOC2vec) with several weighting functions (i.e., inverse document frequency, TF-IDF, and the smoothed inverse document frequency function) has been evaluated in conjunction with conventional deep neural network architectures. The empirical results indicate that the proposed deep learning architecture outperforms the conventional deep learning methods.
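The TF-IDF weighting of word embeddings mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the TF-IDF variant (relative term frequency times the natural log of N/df), the function names `tfidf_weights` and `weighted_embedding_sequence`, and the two-dimensional toy vectors standing in for pretrained GloVe embeddings are all assumptions made for illustration. The sketch only shows how each token's vector could be scaled by its TF-IDF weight before being passed to later layers.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def weighted_embedding_sequence(doc, doc_weights, vectors):
    """Scale each in-vocabulary token's vector by that token's weight."""
    return [[doc_weights[t] * x for x in vectors[t]] for t in doc if t in vectors]


# Hypothetical toy data: two short "reviews" and 2-d stand-ins for GloVe vectors.
docs = [["great", "phone"], ["bad", "phone"]]
vectors = {"great": [1.0, 0.0], "bad": [-1.0, 0.0], "phone": [0.0, 1.0]}

weights = tfidf_weights(docs)
sequences = [
    weighted_embedding_sequence(doc, w, vectors) for doc, w in zip(docs, weights)
]
```

In this toy corpus, "phone" occurs in every document, so its weight is zero and its vector is suppressed, while the sentiment-bearing tokens keep a nonzero weight. In a full model, sequences like these would feed the convolution, max-pooling, LSTM, and dense layers listed in the abstract.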
Pages: 12