A Multi-Criteria Approach for Arabic Dialect Sentiment Analysis for Online Reviews: Exploiting Optimal Machine Learning Algorithm Selection

Cited by: 22

Authors:

Abo, Mohamed Elhag Mohamed [1]

Idris, Norisma [1]

Mahmud, Rohana [1]

Qazi, Atika [2]

Hashem, Ibrahim Abaker Targio [3]

Maitama, Jaafar Zubairu [1,4]

Naseem, Usman [5]

Khan, Shah Khalid [6]

Yang, Shuiqing [7]

Affiliations:

[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Artificial Intelligence, Kuala Lumpur 50603, Malaysia

[2] Univ Brunei Darussalam, Ctr Lifelong Learning, BE-1410 Gadong, Brunei

[3] Univ Sharjah, Dept Comp Sci, Coll Comp & Informat, Sharjah 27272, U Arab Emirates

[4] Bayero Univ, Fac Comp Sci & Informat Technol, Dept Informat Technol, Kano 3011, Nigeria

[5] Univ Sydney, Sch Comp Sci, Sydney, NSW 2006, Australia

[6] RMIT Univ, Sch Engn, Carlton, Vic 3053, Australia

[7] Zhejiang Univ Finance & Econ, Sch Informat Management & Artificial Intelligence, Hangzhou 310018, Peoples R China

Source:

SUSTAINABILITY | 2021 / Vol. 13 / Issue 18

Keywords:

multiple-criteria; Arabic dialect; sentiment analysis; machine learning; performance evaluation; OF-THE-ART; CLASSIFICATION; NETWORKS; CRITERIA;

DOI:

10.3390/su131810018

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Sentiment analysis of Arabic text is an important task in many commercial applications, such as Twitter. This study introduces a multi-criteria method to empirically assess and rank classifiers for Arabic sentiment analysis. Prominent machine learning algorithms were deployed to build classification models for Arabic sentiment analysis. Moreover, the performance measures of the top five machine learning classifiers were assessed in order to rank the classifiers. We integrated the top five ranking methods with evaluation metrics of machine learning classifiers such as accuracy, recall, precision, F-measure, CPU time, classification error, and area under the curve (AUC). The method was tested on Saudi Arabic product reviews to compare five popular classifiers. Our results suggest that deep learning and support vector machine (SVM) classifiers perform best, with accuracy of 85.25% and 82.30%; precision of 85.30% and 83.87%; recall of 88.41% and 83.89%; F-measure of 86.81% and 83.87%; classification error of 14.75% and 17.70%; and AUC of 0.93 and 0.90, respectively. They outperform decision tree, K-nearest neighbours (K-NN), and Naive Bayes classifiers.

Pages: 20
