AraSenTi-Tweet: A Corpus for Arabic Sentiment Analysis of Saudi Tweets

Cited by: 87
Authors
Al-Twairesh, Nora [1]
Al-Khalifa, Hend [1]
Al-Salman, AbdulMalik [1]
Al-Ohali, Yousef [1]
Affiliation
[1] King Saud Univ, Coll Comp & Informat Sci, Riyadh, Saudi Arabia
Source
ARABIC COMPUTATIONAL LINGUISTICS (ACLING 2017) | 2017 / Vol. 117
Keywords
Sentiment Analysis; Arabic NLP; Corpus Sentiment Annotation; Arabic tweets; Saudi Dialect; RESOURCES;
DOI
10.1016/j.procs.2017.10.094
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Arabic Sentiment Analysis is an active research area these days. However, the Arabic language still lacks sufficient language resources to enable the tasks of sentiment analysis. In this paper, we present the details of collecting and constructing a large dataset of Arabic tweets. The techniques used in cleaning and pre-processing the collected dataset are explained. A corpus of Arabic tweets annotated for sentiment analysis was extracted from this dataset. The corpus consists mainly of tweets written in Modern Standard Arabic and the Saudi dialect. The corpus was manually annotated for sentiment. The annotation process is explained in detail and the challenges during the annotation are highlighted. The corpus contains 17,573 tweets labelled with four labels for sentiment: positive, negative, neutral and mixed. Baseline experiments were conducted to provide benchmark results for future work. (c) 2017 The Authors. Published by Elsevier B.V.
Pages: 63-72
Number of pages: 10