A comparative study of cross-lingual sentiment analysis

Cited by: 6
Authors
Priban, Pavel [1 ,2 ]
Smid, Jakub [1 ]
Steinberger, Josef [1 ]
Mistera, Adam [1 ]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Comp Sci & Engn, Univ 8, Plzen 30100, Czech Republic
[2] NTIS New Technol Informat Soc, Univ 8, Plzen 30100, Czech Republic
Keywords
Sentiment analysis; Zero-shot cross-lingual classification; Linear transformation; Transformers; Large language models; Transfer learning
DOI
10.1016/j.eswa.2024.123247
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
This paper presents a detailed comparative study of zero-shot cross-lingual sentiment analysis. Namely, we use modern multilingual Transformer-based models and linear transformations combined with CNN and LSTM neural networks. We evaluate their performance in Czech, French, and English. We aim to compare and assess the models' ability to transfer knowledge across languages and discuss the trade-off between their performance and training/inference speed. We build strong monolingual baselines comparable with the current SotA approaches, achieving state-of-the-art results in Czech (96.0% accuracy) and French (97.6% accuracy). Next, we compare our results with the latest large language models (LLMs), i.e., Llama 2 and ChatGPT. We show that the large multilingual Transformer-based XLM-R model consistently outperforms all other cross-lingual approaches in zero-shot cross-lingual sentiment classification, surpassing them by at least 3%. Next, we show that the smaller Transformer-based models are comparable in performance to older but much faster methods with linear transformations. The best-performing model with linear transformation achieved an accuracy of 92.1% on the French dataset, compared to 90.3% achieved by the smaller XLM-R model. Notably, this performance is achieved with only about 1% of the training time required for the XLM-R model, underscoring the potential of linear transformations as a pragmatic alternative to resource-intensive and slower Transformer-based models in real-world applications. The LLMs achieved impressive results that are on par with or better than the other approaches by 1%-3%, but they come with additional hardware requirements and limitations. Overall, this study contributes to the understanding of cross-lingual sentiment analysis and provides valuable insights into the strengths and limitations of cross-lingual approaches to sentiment analysis.
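The abstract does not spell out how the linear transformations between languages are computed, but a standard way to map one monolingual embedding space onto another (so that a classifier trained on source-language embeddings can score target-language inputs zero-shot) is the orthogonal Procrustes solution fitted on a small bilingual dictionary. The sketch below is illustrative only, with toy data; it is not the paper's exact method.

```python
import numpy as np

def fit_orthogonal_map(X_src: np.ndarray, Y_tgt: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: find orthogonal W minimizing ||X_src @ W - Y_tgt||_F.

    X_src, Y_tgt are (n, d) matrices of paired embeddings, e.g. from a
    bilingual word dictionary. The closed-form solution is W = U @ Vt,
    where U, S, Vt = SVD(X_src.T @ Y_tgt).
    """
    u, _, vt = np.linalg.svd(X_src.T @ Y_tgt)
    return u @ vt

# Toy setup: target embeddings are a rotated copy of source embeddings.
rng = np.random.default_rng(0)
d = 4
true_W = np.linalg.qr(rng.normal(size=(d, d)))[0]   # a random orthogonal map
X = rng.normal(size=(100, d))                       # "source-language" vectors
Y = X @ true_W                                      # paired "target" vectors

W = fit_orthogonal_map(X, Y)
# Mapped source vectors now live in the target space, so a target-side
# classifier can be applied to them without retraining.
print(np.allclose(X @ W, Y, atol=1e-6))
```

Because the map is a single d-by-d matrix obtained in closed form, fitting it costs one SVD, which is consistent with the abstract's point that such methods need only a tiny fraction of a Transformer's training time.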
Pages: 39