Machine learning-based new approach to films review

Cited by: 3
Authors
Jassim, Mustafa Abdalrassual [1, 2, 3]
Abd, Dhafar Hamed [4]
Omri, Mohamed Nazih [1]
Affiliations
[1] Univ Sousse, MARS Res Lab, Sousse, Tunisia
[2] Univ Monastir, Monastir Fac Sci, Monastir, Tunisia
[3] Al Muthanna Univ, Samawah, Iraq
[4] Al Maaref Univ Coll, Dept Comp Sci, Alanbar, Iraq
Funding
UK Research and Innovation (UKRI)
Keywords
Sentiment analysis; Movie review; Machine learning; Word selection; Decision-making; Text analysis; Data science; Fuzzy TOPSIS; Selection
DOI
10.1007/s13278-023-01042-7
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The main purpose of sentiment analysis (SA) is to derive useful insights from large amounts of unstructured data compiled from various sources. This analysis helps to interpret and classify textual data using different techniques applied in machine learning (ML) models. In this paper, we compare simple and ensemble ML methods as classifiers for SA: Random Forest, K-Nearest Neighbor, Artificial Neural Network, Gradient Boosting, Support Vector Machine (SVM), AdaBoost, Extreme Gradient Boosting, Decision Tree, LightGBM, Stochastic Gradient Descent, and Bagging. We use a dataset of 50,000 movie reviews, of which 25,000 are rated positive and 25,000 negative, and select 20,000 words that influence the sentiment of the documents. This work aims to propose a new rating prediction approach based on textual customer reviews. We use term frequency (TF) and term frequency-inverse document frequency (TF-IDF) features, in large-scale and serial trials, to compare the results obtained by the various classifiers. For the decision phase, we apply the Fuzzy Decision by Opinion Score Method (FDOSM), one of the most recent methods for multi-criteria decision-making. To evaluate and quantify the performance of the ML methods, we apply six standard measures: precision, accuracy, recall, F-score, AUC, and Kappa. The results of our experiments indicate that the SVM classifier is the best, with a precision of 88.333%, followed by the FDOSM method, with 0.800 for the same measure.
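The abstract names TF-IDF features and several evaluation measures without defining them. The following is a minimal Python sketch of how these quantities are commonly computed, not the authors' implementation. The function names `tf_idf` and `binary_metrics` are hypothetical, the unsmoothed `log(N / df)` form of IDF is one variant among several, and reading "Kappa" as Cohen's kappa is an assumption. AUC is left out because it needs classifier scores rather than hard labels.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document.

    Assumed variant: TF = count / document length, IDF = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Chance agreement, from the marginal totals of the confusion matrix.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }
```

With two toy documents such as `[["good", "movie"], ["bad", "movie"]]`, a term that appears in every document ("movie") receives weight zero under this IDF variant, which is why TF-IDF tends to favour discriminative words over frequent ones.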
Pages: 17