MSTD: Moroccan Sentiment Twitter Dataset

Authors
Mihi, Soukaina [1]
Ali, Brahim Ait Ben [1]
El Bazi, Ismail [2]
Arezki, Sara [1]
Laachfoubi, Nabil [1]
Affiliations
[1] Univ Hassan First Settat Morocco, Settat, Morocco
[2] Univ Moulay Slimane Beni Mellal, Beni Mellal, Morocco
Keywords
Sentiment analysis; Moroccan dialect; machine learning; stemming; lemmatization; feature extraction
DOI
10.14569/IJACSA.2020.0111045
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
With the proliferation of social media and Internet accessibility, a massive amount of data has been produced. In most cases, the textual data available on the web comes mainly from people expressing their views in informal language. The Arabic language is one of the hardest Semitic languages to deal with because of its complex morphology. In this paper, a new contribution to Arabic resources is presented: a large Moroccan dataset retrieved from Twitter and carefully annotated by native speakers. To the best of our knowledge, this dataset is the largest Moroccan dataset for sentiment analysis. It is distinguished by its size, its quality, ensured by the commitment of the annotators, and its accessibility to the research community. Furthermore, the MSTD (Moroccan Sentiment Twitter Dataset) is benchmarked through experiments carried out for 4-way classification as well as polarity classification (positive, negative). Various machine-learning algorithms are combined with feature extraction techniques to find optimal settings. This work also presents the effect of stemming and lemmatization on the obtained accuracies.
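The kind of experiment the abstract describes, a classifier paired with a feature extractor, with and without stemming, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the four English training sentences are made-up placeholders (not MSTD data), `light_stem` is a toy suffix stripper standing in for a real Arabic stemmer, and multinomial Naive Bayes over n-gram counts is only one possible pairing of algorithm and features.

```python
import math
from collections import Counter, defaultdict


def light_stem(token):
    # Toy suffix stripper (hypothetical stand-in for an Arabic light stemmer).
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text, use_stemming=False, ngram_max=1):
    # Bag of word n-grams (counts) for n = 1 .. ngram_max.
    tokens = text.lower().split()
    if use_stemming:
        tokens = [light_stem(t) for t in tokens]
    feats = list(tokens)
    for n in range(2, ngram_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(feats)


class NaiveBayes:
    # Multinomial Naive Bayes with add-one (Laplace) smoothing.
    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(docs, labels):
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat, count in feats.items():
                score += count * math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up placeholder examples, NOT drawn from MSTD.
TRAIN = [
    ("loved the match", "positive"),
    ("great service today", "positive"),
    ("hated the delays", "negative"),
    ("terrible service again", "negative"),
]


def train(use_stemming):
    docs = [extract_features(text, use_stemming) for text, _ in TRAIN]
    labels = [label for _, label in TRAIN]
    return NaiveBayes().fit(docs, labels)


stemmed_model = train(use_stemming=True)
print(stemmed_model.predict(extract_features("hating it", use_stemming=True)))
```

In this toy setting, stemming maps "hated" and "hating" to the same feature, so the stemmed model has evidence for the negative class. Without stemming, "hating" is unseen in the training data and the two classes receive identical scores. Whether stemming helps on real Moroccan tweets is an empirical question that the paper's experiments address.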
Pages: 363-372
Number of pages: 10