Sentiment Analysis on Reviews of Amazon Products Using Different Machine Learning Algorithms

Cited by: 1
Authors
Tasci, Merve Esra [1]
Rasheed, Jawad [2]
Ozkul, Tarik [2]
Affiliations
[1] Istanbul Sabahattin Zaim Univ, Dept Software Engn, Istanbul, Turkiye
[2] Istanbul Sabahattin Zaim Univ, Dept Comp Engn, Istanbul, Turkiye
Source
FORTHCOMING NETWORKS AND SUSTAINABILITY IN THE AIOT ERA, VOL 2, FONES-AIOT 2024 | 2024 / Vol. 1036
Keywords
Natural Language Processing; Data Mining; Sentiment Analysis; Machine Learning Algorithms;
DOI
10.1007/978-3-031-62881-8_26
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Major e-commerce sites such as Amazon and eBay list thousands of products, each with hundreds of reviews. Customers often browse positive and negative reviews before making a purchase decision, but reading hundreds of reviews for a single product can be time-consuming and overwhelming. Sentiment analysis has been identified as an approach to address this issue. This study applies several machine learning algorithms to sentiment analysis of Amazon product reviews. For this purpose, supervised learning, online learning, and ensemble learning algorithms were applied to Amazon product reviews obtained from Kaggle. Natural language processing and data mining techniques were applied to the dataset, beginning with natural language processing for data preprocessing. The dataset was then split into 80% for training and 20% for testing. Term Frequency-Inverse Document Frequency (TF-IDF) vectorization was employed to create word vectors. Model implementation, the crucial step, used the Passive Aggressive (PA), Support Vector Machine (SVM), Random Forest (RF), AdaBoost, K-Nearest Neighbor (KNN), and XGBoost algorithms. The models were compared on accuracy rates, cross-validation scores, confusion matrices, and classification reports. The Random Forest algorithm achieved the highest prediction accuracy, 96.13%.
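The pipeline described in the abstract (an 80/20 split, TF-IDF vectorization, then a classifier) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation: the toy reviews, the function names, the smoothed IDF formula, the cosine-similarity KNN, and the choice of k = 3 are all assumptions made here, and the study itself would have used a much larger Kaggle dataset and most likely library implementations of the six algorithms.

```python
import math
import random
from collections import Counter

# Toy reviews invented for illustration (1 = positive, 0 = negative).
REVIEWS = [
    ("great product works perfectly", 1),
    ("excellent quality highly recommend", 1),
    ("love it great value", 1),
    ("works great excellent purchase", 1),
    ("perfectly good quality love it", 1),
    ("terrible product broke quickly", 0),
    ("poor quality waste of money", 0),
    ("broke after one day terrible", 0),
    ("waste of money do not buy", 0),
    ("poor design broke quickly", 0),
]


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and hold out test_ratio of them for testing."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_ratio))
    return rows[n_test:], rows[:n_test]


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc.split()))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(doc, idf):
    """Return an L2-normalised sparse TF-IDF vector; unseen terms are ignored."""
    tf = Counter(term for term in doc.split() if term in idf)
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def knn_predict(vec, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to vec."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


train, test = train_test_split(REVIEWS)            # 80% train, 20% test
idf = fit_idf([text for text, _ in train])         # fit on training data only
train_vecs = [tfidf_vector(text, idf) for text, _ in train]
train_labels = [label for _, label in train]

predictions = [
    knn_predict(tfidf_vector(text, idf), train_vecs, train_labels)
    for text, _ in test
]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

The IDF weights are fitted on the training portion only, so no information from the held-out reviews leaks into the vectorizer. With ten toy reviews the accuracy figure is not meaningful; it only shows where the comparison metric reported in the abstract would be computed.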
Pages: 318-327
Page count: 10