Machine learning-based new approach to films review

Cited by: 0
Authors
Mustafa Abdalrassual Jassim
Dhafar Hamed Abd
Mohamed Nazih Omri
Affiliations
[1] University of Sousse, MARS Research Laboratory
[2] University of Monastir, Monastir Faculty of Science
[3] Al-Muthanna University, Department of Computer Science
[4] Al-Maaref University College
Source
Social Network Analysis and Mining, vol. 13
Keywords
Sentiment analysis; Movie review; Machine learning; Word selection; Decision-making; Text analysis; Data science
DOI
Not available
Abstract
The main purpose of Sentiment Analysis (SA) is to derive useful insights from large amounts of unstructured data compiled from various sources. This analysis helps to interpret and classify textual data using different techniques applied in machine learning (ML) models. In this paper, we compare simple and ensemble ML methods as classifiers for SA: Random Forest, K-Nearest Neighbor, Artificial Neural Network, Gradient Boosting, Support Vector Machine (SVM), AdaBoost, Extreme Gradient Boosting, Decision Tree, LightGBM, Stochastic Gradient Descent and Bagging. For this, we use a dataset of 50,000 movie reviews, of which 25,000 are labeled positive and 25,000 negative. We select 20,000 words that influence the sentiment of the documents. This work aims to propose a new rating prediction approach based on textual customer reviews. We use term frequency (TF) and term frequency-inverse document frequency (TF-IDF) features in large-scale serial trials to compare the results obtained by the various classifiers under each feature extraction technique. For the decision phase, we apply the Fuzzy Decision by Opinion Score Method (FDOSM), one of the most recent methods for multi-criteria decision-making. To evaluate and quantify the performance of the different ML methods, we apply six standard measures: precision, accuracy, recall, F-score, AUC, and the Kappa measure. The experimental results indicate that the SVM classifier performs best, with a precision of 88.333%, followed by the FDOSM method, with 0.800 for the same measure.
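The evaluation measures named in the abstract can be illustrated with a small sketch. This is not the authors' code: the function name `binary_metrics`, the label encoding (1 = positive, 0 = negative), and the toy labels are assumptions made for illustration. It computes accuracy, precision, recall, F-score, and Cohen's kappa from a binary confusion matrix; AUC is left out because it requires classifier scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, F1 and Cohen's kappa for binary labels.

    Hypothetical helper for illustration; labels equal to `positive` are
    treated as the positive class, everything else as negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e,
    # where p_e is derived from the marginal totals of the confusion matrix.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Toy example (made-up labels, not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

On a balanced dataset such as the one described (25,000 positive and 25,000 negative reviews), chance agreement is close to 0.5, so kappa is roughly twice the margin by which accuracy exceeds 0.5.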