Evaluation of Sentiment Analysis via Word Embedding and RNN Variants for Amazon Online Reviews

Cited by: 27

Authors:

Alharbi, Najla M. [1]

Alghamdi, Norah S. [2]

Alkhammash, Eman H. [3]

Al Amri, Jehad F. [4]

Affiliations:

[1] King Abdulaziz City Sci & Technol, Riyadh, Saudi Arabia

[2] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Riyadh 11671, Saudi Arabia

[3] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, At Taif 21944, Saudi Arabia

[4] Taif Univ, Coll Comp & Informat Technol, Dept Informat Technol, At Taif 21944, Saudi Arabia

Source:

MATHEMATICAL PROBLEMS IN ENGINEERING | 2021 / Vol. 2021

Keywords:

Sentiment analysis - Extraction - Feature extraction - Brain - Embeddings;

DOI:

10.1155/2021/5536560

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Consumer feedback is valuable to businesses for assessing their performance and to customers for knowing what to expect from new products. This research evaluates different deep learning approaches for accurately predicting customer opinion from mobile phone reviews obtained from Amazon.com. Each review is analysed and categorized as positive, negative, or neutral. The algorithms implemented and evaluated are the simple RNN and four of its variants: Long Short-Term Memory Networks (LRNN), Group Long Short-Term Memory Networks (GLRNN), gated recurrent unit (GRNN), and update recurrent unit (UGRNN). All evaluated algorithms are combined with word embedding as the feature extraction approach for sentiment analysis, including GloVe, word2vec, and FastText by skip-grams. The five algorithms with the three feature extraction methods are evaluated on accuracy, recall, precision, and F1-score for both balanced and unbalanced datasets. For the unbalanced dataset, the GLRNN algorithm with FastText feature extraction scored the highest accuracy, 93.75%, which is also the highest accuracy on this dataset when compared with other methods mentioned in the literature. For the balanced dataset, the highest accuracy achieved was 88.39%, by the LRNN algorithm.
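The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch for a three-class sentiment task. The abstract does not say how the per-class scores were averaged, so macro-averaging is an assumption here, and the function name `evaluate`, the label strings, and the toy labels are hypothetical and not taken from the paper.

```python
from collections import Counter

# Assumed class names; the abstract only says positive, negative, or neutral.
LABELS = ("negative", "neutral", "positive")


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption;
    the abstract does not state which averaging the authors used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not the paper's data.
    truth = ["positive", "positive", "positive", "positive", "neutral", "negative"]
    preds = ["positive", "positive", "positive", "neutral", "neutral", "negative"]
    for name, value in evaluate(truth, preds).items():
        print(f"{name}: {value:.4f}")
```

On an unbalanced dataset, accuracy can be dominated by the majority class, which is one reason precision, recall, and F1 are usually reported alongside it.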

References

Pages: 10

17 entries in total

[1] Ali NM, 2019, International Journal of Data Mining Knowledge Management Process Vol, V9

[2] Bansal Barkha, 2018, Procedia Computer Science, V132, P1147, DOI 10.1016/j.procs.2018.05.029

[3] Bojanowski P., 2017, Transactions of the association for computational linguistics, V5, P135, DOI 10.1162/TACL_A_00051

[4] Collins J., 2017, P ICLR 2017 TOUL FRA

[5] Fu R, 2016, 2016 31ST YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), P324, DOI 10.1109/YAC.2016.7804912

[6] Kieu Que Anh; Nagai, Yukari; Le Minh Nguyen. Extracting Customer Reviews from Online Shopping and Its Perspective on Product Design. VIETNAM JOURNAL OF COMPUTER SCIENCE, 2019, 6 (01): 43-56

[7] Korovkinas, Konstantinas; Danenas, Paulius; Garsva, Gintautas. SVM and k-Means Hybrid Method for Textual Data Sentiment Analysis. BALTIC JOURNAL OF MODERN COMPUTING, 2019, 7 (01): 47-60

[8] Kumar, H. I. Keerthi; Harish, B. S.; Darshan, I. I. K. Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2019, 5 (05): 109-114

[9] Lakshmi B.S., 2017, Int. J. Pure Appl. Math., V114, P47

[10] Mikolov Tomas, 2013, Efficient estimation of word representations in vector space
