BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Cited by: 9
Authors
Nguyen Luong Tran [1 ]
Duong Minh Le [1 ]
Dat Quoc Nguyen [1 ]
Affiliations
[1] VinAI Research, Hanoi, Vietnam
Source
INTERSPEECH 2022, 2022
Keywords
BARTpho; Sequence-to-Sequence; Vietnamese; Pre-trained models; Text summarization; Capitalization; Punctuation restoration
DOI
10.21437/Interspeech.2022-10177
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
We present BARTpho in two versions, BARTpho(syllable) and BARTpho(word), the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, and is therefore especially suited to generative NLP tasks. We conduct experiments comparing BARTpho with its competitor mBART on the downstream task of Vietnamese text summarization and show that, in both automatic and human evaluations, BARTpho outperforms the strong baseline mBART and improves the state of the art. We further evaluate and compare BARTpho and mBART on Vietnamese capitalization and punctuation restoration, and again find BARTpho more effective than mBART on these two tasks. We publicly release BARTpho to facilitate future research and applications in generative Vietnamese NLP.
Pages: 1751-1755
Number of pages: 5
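As a complement to the record above, the following is a minimal sketch (not taken from the paper) of how the publicly released BARTpho checkpoints can be loaded with the Hugging Face transformers library. The model identifiers "vinai/bartpho-syllable" and "vinai/bartpho-word" are assumed from the public release, and the Vietnamese input sentence is purely illustrative; downstream tasks such as summarization or punctuation restoration would still require task-specific fine-tuning on top of these checkpoints.

```python
# Minimal sketch, assuming the released checkpoints are available on the
# Hugging Face Hub as "vinai/bartpho-syllable" and "vinai/bartpho-word".
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "vinai/bartpho-syllable"  # or "vinai/bartpho-word" (expects word-segmented input)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

# Encode a Vietnamese sentence and run the sequence-to-sequence decoder.
# Without fine-tuning, the pre-trained denoising model will mostly
# reconstruct its input; summarization needs fine-tuning on labeled data.
text = "Chúng tôi là những nghiên cứu viên."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```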
Related papers
50 records in total
  • [1] Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models. Wu, Di; Ahmad, Wasi Uddin; Chang, Kai-Wei. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023: 6642-6658.
  • [2] Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation? Lee, En-Shiun Annie; Thillainathan, Sarubi; Nayak, Shravan; Ranathunga, Surangika; Adelani, David Ifeoluwa; Su, Ruisi; McCarthy, Arya D. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 58-67.
  • [3] A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese. Liao, Xianwen; Huang, Yongzhong; Yang, Peng; Chen, Lei. ACM Transactions on Asian and Low-Resource Language Information Processing, 2022, 21(3).
  • [4] PhoBERT: Pre-trained language models for Vietnamese. Dat Quoc Nguyen; Anh Tuan Nguyen. Findings of the Association for Computational Linguistics: EMNLP 2020, 2020: 1037-1042.
  • [5] BERT-NAR-BERT: A Non-Autoregressive Pre-Trained Sequence-to-Sequence Model Leveraging BERT Checkpoints. Sohrab, Mohammad Golam; Asada, Masaki; Rikters, Matiss; Miwa, Makoto. IEEE Access, 2024, 12: 23-33.
  • [6] Bio-K-Transformer: A pre-trained transformer-based sequence-to-sequence model for adverse drug reactions prediction. Qiu, Xihe; Shao, Siyue; Wang, Haoyu; Tan, Xiaoyu. Computer Methods and Programs in Biomedicine, 2025, 260.
  • [7] Sparse Sequence-to-Sequence Models. Peters, Ben; Niculae, Vlad; Martins, Andre F. T. 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 2019: 1504-1519.
  • [8] Leveraging Pre-trained Checkpoints for Sequence Generation Tasks. Rothe, Sascha; Narayan, Shashi; Severyn, Aliaksei. Transactions of the Association for Computational Linguistics, 2020, 8: 264-280.
  • [9] Active Learning with Deep Pre-trained Models for Sequence Tagging of Clinical and Biomedical Texts. Shelmanov, Artem; Liventsev, Vadim; Kireev, Danil; Khromov, Nikita; Panchenko, Alexander; Fedulova, Irina; Dylov, Dmitry V. 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2019: 482-489.
  • [10] Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates. Shelmanov, Artem; Puzyrev, Dmitri; Kupriyanova, Lyubov; Belyakov, Denis; Larionov, Daniil; Khromov, Nikita; Kozlova, Olga; Artemova, Ekaterina; Dylov, Dmitry V.; Panchenko, Alexander. 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 2021: 1698-1712.