A Multi-Criteria Approach for Arabic Dialect Sentiment Analysis for Online Reviews: Exploiting Optimal Machine Learning Algorithm Selection

Cited by: 22
Authors
Abo, Mohamed Elhag Mohamed [1]
Idris, Norisma [1]
Mahmud, Rohana [1]
Qazi, Atika [2]
Hashem, Ibrahim Abaker Targio [3]
Maitama, Jaafar Zubairu [1,4]
Naseem, Usman [5]
Khan, Shah Khalid [6]
Yang, Shuiqing [7]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Artificial Intelligence, Kuala Lumpur 50603, Malaysia
[2] Univ Brunei Darussalam, Ctr Lifelong Learning, BE-1410 Gadong, Brunei
[3] Univ Sharjah, Dept Comp Sci, Coll Comp & Informat, Sharjah 27272, U Arab Emirates
[4] Bayero Univ, Fac Comp Sci & Informat Technol, Dept Informat Technol, Kano 3011, Nigeria
[5] Univ Sydney, Sch Comp Sci, Sydney, NSW 2006, Australia
[6] RMIT Univ, Sch Engn, Carlton, Vic 3053, Australia
[7] Zhejiang Univ Finance & Econ, Sch Informat Management & Artificial Intelligence, Hangzhou 310018, Peoples R China
Keywords
multiple-criteria; Arabic dialect; sentiment analysis; machine learning; performance evaluation; OF-THE-ART; CLASSIFICATION; NETWORKS; CRITERIA;
DOI
10.3390/su131810018
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Sentiment analysis of Arabic texts is an important task for many commercial applications on platforms such as Twitter. This study introduces a multi-criteria method to empirically assess and rank classifiers for Arabic sentiment analysis. Prominent machine learning algorithms were used to build classification models for Arabic sentiment analysis. The performance measures of the top five machine learning classifiers were then assessed in order to rank the classifiers. We integrated the top five ranking methods with classifier evaluation metrics such as accuracy, recall, precision, F-measure, CPU time, classification error, and area under the curve (AUC). The method was tested on Saudi Arabic product reviews to compare five popular classifiers. Our results suggest that the deep learning and support vector machine (SVM) classifiers perform best, with accuracy of 85.25% and 82.30%; precision of 85.30% and 83.87%; recall of 88.41% and 83.89%; F-measure of 86.81% and 83.87%; classification error of 14.75% and 17.70%; and AUC of 0.93 and 0.90, respectively. They outperform the decision tree, K-nearest neighbours (K-NN), and Naive Bayes classifiers.
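The reported figures can be cross-checked against one another. The sketch below assumes that the F-measure in the abstract is the standard F1 score (the harmonic mean of precision and recall) and that classification error is 100% minus accuracy; neither definition is spelled out in the abstract, so both are assumptions. The helper names `f_measure` and `classification_error` are illustrative and do not come from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


def classification_error(accuracy: float) -> float:
    """Error rate in percent, assumed to be the complement of accuracy."""
    return 100.0 - accuracy


# Values copied from the abstract, all in percent.
reported = {
    "deep learning": {"accuracy": 85.25, "precision": 85.30, "recall": 88.41,
                      "f_measure": 86.81, "error": 14.75},
    "SVM": {"accuracy": 82.30, "precision": 83.87, "recall": 83.89,
            "f_measure": 83.87, "error": 17.70},
}

for name, m in reported.items():
    f1 = f_measure(m["precision"], m["recall"])
    err = classification_error(m["accuracy"])
    # Print recomputed values next to the reported ones for comparison.
    print(f"{name}: F1 recomputed {f1:.2f} vs reported {m['f_measure']:.2f}; "
          f"error recomputed {err:.2f} vs reported {m['error']:.2f}")
```

Under these assumptions the recomputed values agree with the reported ones to within a few hundredths of a percentage point. A small gap of that size may be caused by rounding or by per-class averaging, which the abstract does not describe.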
Pages: 20