Leveraging ParsBERT for cross-domain polarity sentiment classification of Persian social media comments

Cited by: 2
Authors
Nigjeh, Mahnaz Panahandeh [1]
Ghanbari, Shirin [2]
Affiliations
[1] Amirkabir Univ Technol, Dept Comp Engn & Informat Technol, Tehran, Iran
[2] IRIB Univ, Dept ICT, Tehran, Iran
Keywords
Cross-domain sentiment analysis; ParsBERT; Pre-trained language model; Transformer-based model; Persian multi-domain sentiment dataset; STRENGTH DETECTION;
DOI
10.1007/s11042-023-16067-5
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Sentiment analysis is the computational study of human emotions, attitudes and opinions through the extraction of meaningful information. Social media platforms, which allow consumers to share and publish content, are rich in opinionated information, yet much current analytical research is limited to a specific domain. This research presents an architecture for analyzing a low-resource language, Persian, and focuses on social media content consisting of informal comments across different domains. The proposed model applies a transformer-based model, ParsBERT, to classify the sentiment of social media comments. Because these comments span different domains, the model must classify sentiment across domains. ParsBERT has been fine-tuned on a Persian corpus generated for the purpose of this study. The corpus comprises 28,710 Instagram comments from different topic domains, each labeled as either negative or positive. The proposed model has been evaluated on test data from different time periods and topic domains, and the results have been compared with recent sentiment analysis methods in three different scenarios. Results show that when the training and test data come from different domains, an accuracy of 68% is achieved, which is higher than that of other shallow and deep learning methods for determining the sentiment of social media comments in different domains.
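The cross-domain scenario mentioned in the abstract (training and test data drawn from different domains) can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: the toy English comments, the domain names, the word-vote classifier and the function names are assumptions made for illustration. The paper itself fine-tunes ParsBERT on Persian Instagram comments, which is not reproduced here; only the hold-out-one-domain evaluation idea is shown.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (domain, comment, label), label 1 = positive, 0 = negative.
# The real corpus (28,710 Persian Instagram comments) is not reproduced here.
COMMENTS = [
    ("sports", "great match", 1),
    ("sports", "terrible referee", 0),
    ("sports", "great goal", 1),
    ("politics", "terrible decision", 0),
    ("politics", "great speech", 1),
    ("politics", "awful debate", 0),
]


def train_word_votes(rows):
    """Stand-in for fine-tuning: count how often each word occurs with each label."""
    votes = defaultdict(Counter)
    for _, text, label in rows:
        for word in text.split():
            votes[word][label] += 1
    return votes


def predict(votes, text, default=1):
    """Sum per-word label counts; fall back to `default` on a tie or unseen words."""
    tally = Counter()
    for word in text.split():
        tally.update(votes.get(word, {}))
    if tally[1] == tally[0]:
        return default
    return 1 if tally[1] > tally[0] else 0


def cross_domain_accuracy(rows, test_domain):
    """Train on every domain except `test_domain`, then score on `test_domain` only."""
    train = [r for r in rows if r[0] != test_domain]
    test = [r for r in rows if r[0] == test_domain]
    votes = train_word_votes(train)
    correct = sum(predict(votes, text) == label for _, text, label in test)
    return correct / len(test)


if __name__ == "__main__":
    for domain in ("sports", "politics"):
        print(domain, cross_domain_accuracy(COMMENTS, domain))
```

The point of the sketch is the split: no comment from the test domain is seen during training, so words that only occur in that domain carry no learned signal, which is one reason cross-domain accuracy tends to be lower than in-domain accuracy.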
Pages: 10677-10694
Number of pages: 18
Related papers
36 in total
  • [1] Akhoundzade R., 2019, 9 INT C COMP KNOWL E, DOI 10.1109/ICCKE48569.2019.8964692
  • [2] Alimardani, 2015, J INF SYST TELECOMMU, V3, P135
  • [3] Amiri F., 2015, Proceedings of Recent Advances in Natural Language Processing, Hissar, P9
  • [4] [Anonymous], 2013, 21 IR C EL ENG ICEE, DOI 10.1109/IRANIANCEE.2013.6599671
  • [5] The Impact of Sentiment Features on the Sentiment Polarity Classification in Persian Reviews
    Asgarian, Ehsan
    Kahani, Mohsen
    Sharifi, Shahla
    [J]. COGNITIVE COMPUTATION, 2018, 10 (01) : 117 - 135
  • [6] HOMPer: A new hybrid system for opinion mining in the Persian language
    Basiri, Mohammad Ehsan
    Kabiri, Arman
    [J]. JOURNAL OF INFORMATION SCIENCE, 2020, 46 (01) : 101 - 117
  • [7] Bojanowski Piotr, 2017, T ASSOC COMPUT LING, V5, P135, DOI 10.1162/tacl_a_00051
  • [8] Dashtipour K., 2021, Progresses in Artificial Intelligence and Neural Systems, P207, DOI 10.1007/978-981-15-5093
  • [9] PerSent 2.0: Persian Sentiment Lexicon Enriched with Domain-Specific Words
    Dashtipour, Kia
    Raza, Ali
    Gelbukh, Alexander
    Zhang, Rui
    Cambria, Erik
    Hussain, Amir
    [J]. ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, 2020, 11691 : 497 - 509
  • [10] A hybrid Persian sentiment analysis framework: Integrating dependency grammar based rules and deep neural networks
    Dashtipour, Kia
    Gogate, Mandar
    Li, Jingpeng
    Jiang, Fengling
    Kong, Bin
    Hussain, Amir
    [J]. NEUROCOMPUTING, 2020, 380 : 1 - 10