Alleviating repetitive tokens in non-autoregressive machine translation with unlikelihood training

Cited by: 0
Authors
Shuheng Wang
Shumin Shi
Heyan Huang
Affiliations
[1] Nanyang Institute of Technology, School of Computer and Software
[2] Beijing Institute of Technology, School of Computer Science and Technology
Source
Soft Computing | 2024, Vol. 28
Keywords
Machine translation; Non-autoregressive; Repetitive tokens; Unlikelihood training
Abstract
In recent years, significant progress has been made in the field of non-autoregressive machine translation. However, the accuracy of non-autoregressive models still lags behind that of their autoregressive counterparts. This discrepancy can be attributed to the abundance of repetitive tokens in the target sequences generated by non-autoregressive models. In this study, we investigate this phenomenon and propose a novel approach that trains a non-autoregressive model with an unlikelihood loss. We evaluate our method on three widely used benchmark tasks. The experimental results demonstrate that our proposed approach significantly reduces the number of repetitive tokens while improving the overall performance of non-autoregressive machine translation. Compared to the baseline model "Mask-Predict", the average number of repetitions on the IWSLT 14 DE→EN validation set is reduced from 0.48 to 0.17, a remarkable 62% decrease.
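The abstract describes the method only at a high level, so the sketch below illustrates the general idea of unlikelihood training applied to repetition in non-autoregressive translation: alongside the usual cross-entropy, tokens in a negative-candidate set are pushed toward low probability via a -log(1 - p) penalty. This is a minimal PyTorch sketch under stated assumptions, not the paper's implementation; the function name nat_unlikelihood_loss, the choice of the previous gold token as the negative candidate, and the weight alpha are illustrative assumptions.

import torch
import torch.nn.functional as F

def nat_unlikelihood_loss(logits, targets, alpha=0.5, pad_id=0):
    """Combine NAT cross-entropy with an unlikelihood penalty on repeats.

    logits:  (batch, seq_len, vocab) raw decoder outputs
    targets: (batch, seq_len) gold token ids
    """
    log_probs = F.log_softmax(logits, dim=-1)                  # (B, T, V)

    # Standard likelihood term: token-level negative log-likelihood.
    nll = F.nll_loss(log_probs.transpose(1, 2), targets,
                     ignore_index=pad_id)

    # Assumed negative-candidate set: the gold token of the previous
    # position, discouraging the model from repeating its left neighbor.
    neg = targets.roll(shifts=1, dims=1)                       # (B, T)
    p_neg = log_probs.gather(-1, neg.unsqueeze(-1)).squeeze(-1).exp()

    # Unlikelihood term: -log(1 - p(candidate)), clamped for stability.
    ul = -torch.log((1.0 - p_neg).clamp(min=1e-6))

    # Skip pad positions and places where the reference itself repeats,
    # so the penalty never fights the likelihood term on legitimate repeats.
    mask = (targets != pad_id) & (neg != targets)
    mask[:, 0] = False  # roll() wraps around at position 0; ignore it
    ul = (ul * mask.float()).sum() / mask.float().sum().clamp(min=1.0)

    return nll + alpha * ul

Masking positions where the reference legitimately repeats a token is the key design choice here: without it, the unlikelihood term would directly contradict the cross-entropy on those positions.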
Pages: 4681–4688
Number of pages: 7
Related papers
50 items in total
  • [1] Alleviating repetitive tokens in non-autoregressive machine translation with unlikelihood training
    Wang, Shuheng
    Shi, Shumin
    Huang, Heyan
    SOFT COMPUTING, 2024, 28 (5) : 4681 - 4688
  • [2] Hint-Based Training for Non-Autoregressive Machine Translation
    Li, Zhuohan
    Lin, Zi
    He, Di
    Tian, Fei
    Qin, Tao
    Wang, Liwei
    Liu, Tie-Yan
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5708 - 5713
  • [3] Sequence-Level Training for Non-Autoregressive Neural Machine Translation
    Shao, Chenze
    Feng, Yang
    Zhang, Jinchao
    Meng, Fandong
    Zhou, Jie
    COMPUTATIONAL LINGUISTICS, 2021, 47 (04) : 891 - 925
  • [4] Integrating Translation Memories into Non-Autoregressive Machine Translation
    Xu, Jitao
    Crego, Josep
    Yvon, Francois
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1326 - 1338
  • [5] Enhanced encoder for non-autoregressive machine translation
    Wang, Shuheng
    Shi, Shumin
    Huang, Heyan
    MACHINE TRANSLATION, 2021, 35 (04) : 595 - 609
  • [6] Rephrasing the Reference for Non-autoregressive Machine Translation
    Shao, Chenze
    Zhang, Jinchao
    Zhou, Jie
    Feng, Yang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 13538 - 13546
  • [7] Acyclic Transformer for Non-Autoregressive Machine Translation
    Huang, Fei
    Zhou, Hao
    Liu, Yang
    Li, Hang
    Huang, Minlie
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [8] Non-Autoregressive Machine Translation with Auxiliary Regularization
    Wang, Yiren
    Tian, Fei
    He, Di
    Qin, Tao
    Zhai, ChengXiang
    Liu, Tie-Yan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 5377 - 5384
  • [9] A Survey of Non-Autoregressive Neural Machine Translation
    Li, Feng
    Chen, Jingxian
    Zhang, Xuejun
    ELECTRONICS, 2023, 12 (13)
  • [10] Non-Autoregressive Machine Translation as Constrained HMM
    Li, Haoran
    Jie, Zhanming
Lu, Wei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 12361 - 12372