Sentiment lexicon for sentiment analysis of Saudi dialect tweets

Cited by: 24
Authors
Al-Thubaity, Abdulmohsen [1]
Alqahtani, Qubayl [2 ]
Aljandal, Abdulaziz [2 ]
Affiliations
[1] King Abdulaziz City Sci & Technol, Riyadh, Saudi Arabia
[2] King Saud Univ, AlMuzahmiyah Branch, Riyadh, Saudi Arabia
Source
ARABIC COMPUTATIONAL LINGUISTICS | 2018 / Vol. 142
Keywords
Arabic sentiment analysis; Arabic sentiment lexicon; Arabic text mining; Arabic language resources; Saudi dialect; EXTRACTION; REPUTATION;
DOI
10.1016/j.procs.2018.10.494
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Twitter is one of the most widely used social media platforms in Saudi Arabia and is a rich source for mining the public's attitude towards political, social, and economic matters. Sentiment analysis is a technique used for identifying the polarity (positive, negative, or neutral) of a given tweet, using either machine learning approaches or sentiment lexicons. This paper presents two resources. The first is the Saudi dialect sentiment lexicon (SauDiSenti), which is a sentiment lexicon for sentiment analysis of Saudi dialect tweets. SauDiSenti comprises 4431 words and phrases from modern standard Arabic (MSA) and Saudi dialects, manually extracted from a previously labelled dataset of tweets obtained from trending hashtags in Saudi Arabia. The second is a testing dataset comprising 1500 tweets evenly distributed over three classes: positive, negative, and neutral. To evaluate the performance of SauDiSenti, we used precision, recall, and F-measure and compared it to AraSenTi, a larger Arabic sentiment dictionary. The data suggest that AraSenTi outperforms SauDiSenti only when both positive and negative tweets are considered, whereas SauDiSenti outperforms AraSenTi when positive, negative, and neutral tweets are considered. Despite the small size of SauDiSenti, its use for sentiment analysis of Saudi dialect tweets shows promising results in comparison to the automatically constructed larger dictionary AraSenTi. SauDiSenti and the testing dataset are available for download at http://corpus.kacstedu.sa/more_info.jsp. (C) 2018 The Authors. Published by Elsevier B.V.
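To illustrate the kind of lexicon-based classification and evaluation the abstract describes, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not state the scoring rule, so the majority-count rule below is an assumption, and `TOY_LEXICON`, `classify`, and `precision_recall_f1` are hypothetical names with English placeholder entries rather than actual SauDiSenti content.

```python
from collections import Counter

# Hypothetical toy lexicon (term -> polarity). These are NOT SauDiSenti entries;
# they only stand in for a lexicon of words and short phrases.
TOY_LEXICON = {
    "great": "positive",
    "love": "positive",
    "bad": "negative",
    "awful": "negative",
    "not good": "negative",
}


def classify(tweet, lexicon, max_ngram=2):
    """Label a tweet by counting lexicon matches over word n-grams.

    Assumed rule: the polarity with more matches wins;
    a tie or no match at all yields "neutral".
    """
    tokens = tweet.lower().split()
    counts = Counter()
    for n in range(1, max_ngram + 1):
        for i in range(len(tokens) - n + 1):
            polarity = lexicon.get(" ".join(tokens[i:i + n]))
            if polarity:
                counts[polarity] += 1
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    return "neutral"


def precision_recall_f1(gold, predicted, label):
    """Per-class precision, recall, and F-measure (F1) for one label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

With a rule like this, any tweet that matches no lexicon entry falls into the neutral class, so a three-class evaluation depends on lexicon coverage as well as on the polarity labels themselves.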
Pages: 301-307
Number of pages: 7