Alleviating repetitive tokens in non-autoregressive machine translation with unlikelihood training

Cited by: 0
Authors
Shuheng Wang
Shumin Shi
Heyan Huang
Institutions
[1] Nanyang Institute of Technology, School of Computer and Software
[2] Beijing Institute of Technology, School of Computer Science and Technology
Source
Soft Computing | 2024 / Volume 28
Keywords
Machine translation; Non-autoregressive; Repetitive tokens; Unlikelihood training;
DOI: Not available
Abstract
In recent years, significant progress has been made in the field of non-autoregressive machine translation. However, the accuracy of non-autoregressive models still lags behind their autoregressive counterparts. This discrepancy can be attributed to the abundance of repetitive tokens in the target sequences generated by non-autoregressive models. In this study, we delve into this phenomenon and propose a novel approach that trains a non-autoregressive model with an unlikelihood loss. We evaluate our method on three widely used benchmark tasks. The experimental results demonstrate that our proposed approach significantly reduces the number of repetitive tokens while improving the overall performance of non-autoregressive machine translation. Compared to the baseline model "Mask-Predict", the average number of repetitions on the IWSLT 14 DE→EN validation set is reduced from 0.48 to 0.17, a decrease of roughly 65%.
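To make the idea concrete, below is a minimal PyTorch sketch of token-level unlikelihood training (in the sense of Welleck et al., 2020) applied to the repetition problem the abstract describes: alongside the usual cross-entropy, each position is penalized for assigning probability to the token predicted at its left neighbor. The function name nat_unlikelihood_loss, the adjacent-duplicate candidate selection, and the weight alpha are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def nat_unlikelihood_loss(logits, targets, alpha=1.0, pad_id=0):
    """Cross-entropy plus an unlikelihood penalty on adjacent repeats.

    logits:  (batch, seq_len, vocab) decoder outputs
    targets: (batch, seq_len) reference token ids
    alpha:   weight of the unlikelihood term (hypothetical default)
    """
    log_probs = F.log_softmax(logits, dim=-1)

    # Likelihood term: standard NLL of the reference tokens.
    nll = F.nll_loss(
        log_probs.transpose(1, 2), targets,
        ignore_index=pad_id, reduction="mean",
    )

    # Negative candidates: for each position t >= 1, the token the model
    # predicts at position t-1. Lowering its probability at position t
    # discourages two adjacent positions from emitting the same token.
    preds = log_probs.argmax(dim=-1)                  # (batch, seq_len)
    neg = preds[:, :-1].unsqueeze(-1)                 # candidate per position
    p_neg = log_probs[:, 1:, :].exp().gather(-1, neg).squeeze(-1)

    # Unlikelihood term: -log(1 - p(candidate)), clamped for stability.
    ul = -torch.log((1.0 - p_neg).clamp(min=1e-6))

    # Average the penalty over non-padding positions only.
    mask = (targets[:, 1:] != pad_id).float()
    ul_loss = (ul * mask).sum() / mask.sum().clamp(min=1.0)

    return nll + alpha * ul_loss
```

The extra term leaves ordinary maximum-likelihood training untouched while making it expensive for two neighboring decoder positions to concentrate probability on the same token, which is exactly the repetition pattern the abstract attributes to non-autoregressive decoding.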
Pages: 4681 - 4688
Number of pages: 7
Related Papers
50 in total
  • [21] Efficient Domain Adaptation for Non-Autoregressive Machine Translation
    You, Wangjie
    Guo, Pei
    Li, Juntao
    Chen, Kehai
    Zhang, Min
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 13657 - 13670
  • [22] Aligned Cross Entropy for Non-Autoregressive Machine Translation
    Ghazvininejad, Marjan
    Karpukhin, Vladimir
    Zettlemoyer, Luke
    Levy, Omer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML), 2020,
  • [23] Uncertainty-aware non-autoregressive neural machine translation
    Liu, Chuanming
    Yu, Jingqi
    COMPUTER SPEECH AND LANGUAGE, 2023, 78
  • [24] Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation
    Huang, Chenyang
    Huang, Fei
    Zheng, Zaixiang
    Zaiane, Osmar
    Zhou, Hao
    Mou, Lili
    13TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING AND THE 3RD CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, IJCNLP-AACL 2023, 2023, : 161 - 170
  • [25] Selective Knowledge Distillation for Non-Autoregressive Neural Machine Translation
    Liu, Min
    Bao, Yu
    Zhao, Chengqi
    Huang, Shujian
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 13246 - 13254
  • [26] Improving Non-autoregressive Neural Machine Translation with Monolingual Data
    Zhou, Jiawei
    Keung, Phillip
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 1893 - 1898
  • [27] Non-autoregressive neural machine translation with auxiliary representation fusion
    Du, Quan
    Feng, Kai
    Xu, Chen
    Xiao, Tong
    Zhu, Jingbo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (06) : 7229 - 7239
  • [28] Non-Autoregressive Machine Translation with a Novel Masked Language Model
    Li, Ke
    Li, Jie
    Wang, Jun
    2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,
  • [29] Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade
    Gu, Jiatao
    Kong, Xiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 120 - 133
  • [30] A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
    Xiao, Y.
    Wu, L.
    Guo, J.
    Li, J.
    Zhang, M.
    Qin, T.
    Liu, T.-Y.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 11407 - 11427