BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Cited by: 9
Authors
Nguyen Luong Tran [1 ]
Duong Minh Le [1 ]
Dat Quoc Nguyen [1 ]
Affiliations
[1] VinAI Research, Hanoi, Vietnam
Source
INTERSPEECH 2022, 2022
Keywords
BARTpho; Sequence-to-Sequence; Vietnamese; Pre-trained models; Text summarization; Capitalization; Punctuation restoration
DOI
10.21437/Interspeech.2022-10177
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
We present BARTpho in two versions, BARTpho(syllable) and BARTpho(word), the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, and is therefore especially suited to generative NLP tasks. We conduct experiments comparing BARTpho with its competitor mBART on the downstream task of Vietnamese text summarization and show that, in both automatic and human evaluations, BARTpho outperforms the strong baseline mBART and improves the state of the art. We further evaluate and compare BARTpho and mBART on Vietnamese capitalization and punctuation restoration, and again find BARTpho more effective than mBART on these two tasks. We publicly release BARTpho to facilitate future research and applications in generative Vietnamese NLP.
Pages: 1751-1755
Number of pages: 5
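As a complement to the record above, the following is a minimal sketch (not taken from the paper) of how the publicly released BARTpho checkpoints can be loaded with the Hugging Face transformers library. The model identifiers "vinai/bartpho-syllable" and "vinai/bartpho-word" are assumed from the public release, and the Vietnamese input sentence is purely illustrative; downstream tasks such as summarization or punctuation restoration would still require task-specific fine-tuning on top of these checkpoints.

```python
# Minimal sketch, assuming the released checkpoints are available on the
# Hugging Face Hub as "vinai/bartpho-syllable" and "vinai/bartpho-word".
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "vinai/bartpho-syllable"  # or "vinai/bartpho-word" (expects word-segmented input)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

# Encode a Vietnamese sentence and run the sequence-to-sequence decoder.
# Without fine-tuning, the pre-trained denoising model will mostly
# reconstruct its input; summarization needs fine-tuning on labeled data.
text = "Chúng tôi là những nghiên cứu viên."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```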
Related papers
50 records in total
  • [1] Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models. Wu, Di; Ahmad, Wasi Uddin; Chang, Kai-Wei. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023: 6642-6658.
  • [2] Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation? Lee, En-Shiun Annie; Thillainathan, Sarubi; Nayak, Shravan; Ranathunga, Surangika; Adelani, David Ifeoluwa; Su, Ruisi; McCarthy, Arya D. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 58-67.
  • [3] A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese. Liao, Xianwen; Huang, Yongzhong; Yang, Peng; Chen, Lei. ACM Transactions on Asian and Low-Resource Language Information Processing, 2022, 21(3).
  • [4] PhoBERT: Pre-trained language models for Vietnamese. Dat Quoc Nguyen; Anh Tuan Nguyen. Findings of the Association for Computational Linguistics: EMNLP 2020, 2020: 1037-1042.
  • [5] BERT-NAR-BERT: A Non-Autoregressive Pre-Trained Sequence-to-Sequence Model Leveraging BERT Checkpoints. Sohrab, Mohammad Golam; Asada, Masaki; Rikters, Matiss; Miwa, Makoto. IEEE Access, 2024, 12: 23-33.
  • [6] Bio-K-Transformer: A pre-trained transformer-based sequence-to-sequence model for adverse drug reactions prediction. Qiu, Xihe; Shao, Siyue; Wang, Haoyu; Tan, Xiaoyu. Computer Methods and Programs in Biomedicine, 2025, 260.
  • [7] Sparse Sequence-to-Sequence Models. Peters, Ben; Niculae, Vlad; Martins, Andre F. T. 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 2019: 1504-1519.
  • [8] Leveraging Pre-trained Checkpoints for Sequence Generation Tasks. Rothe, Sascha; Narayan, Shashi; Severyn, Aliaksei. Transactions of the Association for Computational Linguistics, 2020, 8: 264-280.
  • [9] Active Learning with Deep Pre-trained Models for Sequence Tagging of Clinical and Biomedical Texts. Shelmanov, Artem; Liventsev, Vadim; Kireev, Danil; Khromov, Nikita; Panchenko, Alexander; Fedulova, Irina; Dylov, Dmitry V. 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2019: 482-489.
  • [10] Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates. Shelmanov, Artem; Puzyrev, Dmitri; Kupriyanova, Lyubov; Belyakov, Denis; Larionov, Daniil; Khromov, Nikita; Kozlova, Olga; Artemova, Ekaterina; Dylov, Dmitry V.; Panchenko, Alexander. 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 2021: 1698-1712.