Vietnamese Sentiment Analysis: An Overview and Comparative Study of Fine-tuning Pretrained Language Models

Cited by: 12
Authors
Dang Van Thin [1 ,2 ]
Duong Ngoc Hao [1 ,2 ]
Ngan Luu-Thuy Nguyen [1 ,2 ]
Affiliations
[1] Univ Informat Technol, Ho Chi Minh City, Vietnam
[2] Vietnam Natl Univ, Ho Chi Minh City, Vietnam
Keywords
Vietnamese Sentiment Analysis; fine-tuning language models; monolingual BERT model; multilingual BERT model; T5 architecture
DOI
10.1145/3589131
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Sentiment Analysis (SA) is one of the most active research areas in Natural Language Processing (NLP) due to its potential value for business and society. With the development of language representation models, fine-tuning pre-trained language models has proven effective across many NLP downstream tasks. For Vietnamese, many pre-trained language models have been released, both monolingual and multilingual. However, these models differ in architecture, pre-training data, and pre-processing steps, so fine-tuning them can be expected to yield different levels of effectiveness. Moreover, no study to date has evaluated these models on the same datasets for the SA task. This article presents a fine-tuning approach to investigate the performance of different pre-trained language models on the Vietnamese SA task. The experimental results show that the monolingual PhoBERT model and the ViT5 model outperform previous studies and establish new state-of-the-art results on five benchmark Vietnamese SA datasets. To the best of our knowledge, this study is the first to investigate the performance of fine-tuning Transformer-based models on five datasets of different domains and sizes for the Vietnamese SA task.
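To make the fine-tuning setup described in the abstract concrete, the sketch below shows how one such run might look with the Hugging Face Transformers library and the publicly available vinai/phobert-base checkpoint. This is a minimal illustration, not the authors' code: the CSV file names, the "text"/"label" column names, the three-class label scheme, and the hyperparameters are all assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact setup):
# fine-tune the monolingual PhoBERT model for Vietnamese sentiment analysis.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Publicly available monolingual Vietnamese checkpoint.
tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/phobert-base", num_labels=3  # e.g., negative / neutral / positive
)

# Hypothetical CSV files with a "text" column (PhoBERT expects
# word-segmented Vietnamese input) and an integer "label" column.
data = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

data = data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="phobert-sa",
    learning_rate=2e-5,              # typical BERT fine-tuning range
    per_device_train_batch_size=16,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,             # enables dynamic padding by default
)
trainer.train()
print(trainer.evaluate())
```

Comparing models in the way the article describes would then amount to swapping the checkpoint name (e.g., a multilingual BERT such as bert-base-multilingual-cased); T5-style models such as ViT5 are typically fine-tuned in a text-to-text fashion instead of with a classification head.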
Pages: 27