Standardization of Dialect Comments in Social Networks in View of Sentiment Analysis : Case of Tunisian Dialect

Cited by: 0
Authors
Kchaou, Sameh [1]
Boujelbane, Rahma [1]
Fsih, Emna [1]
Belguith, Lamia Hadrich [1]
Affiliations
[1] Univ Sfax, MIRACL Lab FSEGS, ANLP Res Grp, Sfax, Tunisia
Source
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022
Keywords
Dialect Identification; Neural Machine Translation; Sentiment Analysis; Tunisian Dialect; Modern Standard Arabic;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
With growing access to the internet, spoken Arabic dialects have become informal languages written on social media. Most users post comments in their own dialect. This linguistic situation inhibits mutual understanding between internet users and makes it difficult to use computational approaches, since most Arabic resources are intended for the formal language, Modern Standard Arabic (MSA). In this paper, we present a pipeline that standardizes texts written on social networks by translating them into MSA. We first fine-tune a BERT-based identification model to select Tunisian Dialect (TD) comments from among MSA and other dialects. The selected comments are then translated using a neural translation model. Each of these steps was evaluated on the same test corpus. To test the effectiveness of the approach, we compared two opinion analysis models: the first is intended for Sentiment Analysis (SA) of dialect texts and the second for MSA texts. We concluded that standardization yields the best score.
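The three stages named in the abstract (dialect identification, translation to MSA, sentiment analysis) can be pictured as a simple composition of functions. The sketch below is only an illustration under stated assumptions: the function name `standardize_and_classify`, the label strings, and the three callables are hypothetical placeholders, not the authors' models or code. In practice each callable would wrap a trained model.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical stage signatures (assumptions, not the paper's API):
#   identify:  comment -> variety label such as "TD", "MSA", or "OTHER"
#   translate: Tunisian Dialect text -> Modern Standard Arabic text
#   classify:  MSA text -> sentiment label
Identify = Callable[[str], str]
Translate = Callable[[str], str]
Classify = Callable[[str], str]


def standardize_and_classify(
    comments: Iterable[str],
    identify: Identify,
    translate: Translate,
    classify: Classify,
    target_label: str = "TD",
) -> List[Tuple[str, str, str]]:
    """Keep comments identified as the target dialect, translate them,
    and run sentiment analysis on the standardized text.

    Returns (original comment, standardized text, sentiment) triples.
    """
    results = []
    for comment in comments:
        # Stage 1: keep only comments identified as the target dialect.
        if identify(comment) != target_label:
            continue
        # Stage 2: standardize by translating into MSA.
        standardized = translate(comment)
        # Stage 3: apply the sentiment model to the standardized text.
        results.append((comment, standardized, classify(standardized)))
    return results
```

Passing the stages in as callables keeps the sketch independent of any particular model library, so each stage can be swapped or evaluated separately, as the abstract describes.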
Pages: 5436 - 5443
Page count: 8
Related papers
50 items in total
  • [21] Comparison of Sentiment Analysis Approaches Using Modern Arabic and Sudanese Dialect
    Hussien, Intisar O.
    Dashtipour, Kia
    Hussain, Amir
    ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, BICS 2018, 2018, 10989 : 615 - 624
  • [22] AfriDial: African Dialect Model based on Deep Learning for Sentiment Analysis
    Sassi, Ameni
    Tonga, Junior
    Poaty, Stephanie
    Steve, Sanon
    Adjid, Djibrine Idriss Abakar
    Cherif, Moukhtar
    Ouarda, Wael
    20TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC 2024, 2024, : 1248 - 1254
  • [23] Sentiment Analysis of Arabic Tweets in Smart Cities: A Review of Saudi Dialect
    Alotaibi, Shoayee
    Mehmood, Rashid
    Katib, Iyad
    2019 FOURTH INTERNATIONAL CONFERENCE ON FOG AND MOBILE EDGE COMPUTING (FMEC), 2019, : 330 - 335
  • [24] Developing Lexicon-based Algorithms and Sentiment Lexicon for Sentiment Analysis of Saudi Dialect Tweets
    Al-Ghaith, Waleed
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (11) : 83 - 88
  • [25] Sentiment Analysis of Social Networks' Comments to Predict Stock Return
    Cheng, Juan
    Fu, Jiaolong
    Kang, Yan
    Zhu, Hua
    Dai, Weihui
    HUMAN CENTERED COMPUTING, 2019, 11956 : 67 - 74
  • [26] Resources building for sentiment analysis of content disseminated by Tunisian medias in social networks
    Fsih, Emna
    Boujelbane, Rahma
    Belguith, Lamia Hadrich
    LANGUAGE RESOURCES AND EVALUATION, 2025, 59 (01) : 51 - 76
  • [27] Towards enhancement of a lexicon-based approach for Saudi dialect sentiment analysis
    Assiri, Adel
    Emam, Ahmed
    Al-Dossari, Hmood
    JOURNAL OF INFORMATION SCIENCE, 2018, 44 (02) : 184 - 202
  • [28] A systematic assessment of sentiment analysis models on Iraqi dialect-based texts
    Hussein, Hafedh Hameed
    Lakizadeh, Amir
    SYSTEMS AND SOFT COMPUTING, 2025, 7
  • [29] Thai Comments Sentiment Analysis on Social Networks with Deep Learning Approach
    Piyaphakdeesakun, Chayapol
    Facundes, Nuttanart
    Polvichai, Jumpol
    2019 34TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2019), 2019, : 381 - 384
  • [30] Bias Aware Lexicon-Based Sentiment Analysis of Malay Dialect on Social Media Data: A Study on The Sabah Language
    Hijazi, Mohd Hanafi Ahmad
    Libin, Lyndia
    Alfred, Rayner
    Coenen, Frans
    PROCEEDINGS OF 2016 2ND INTERNATIONAL CONFERENCE ON SCIENCE IN INFORMATION TECHNOLOGY (ICSITECH) - INFORMATION SCIENCE FOR GREEN SOCIETY AND ENVIRONMENT, 2016, : 356 - 361