Aligned Cross Entropy for Non-Autoregressive Machine Translation

Cited by: 0
|
Authors
Ghazvininejad, Marjan [1 ]
Karpukhin, Vladimir [1 ]
Zettlemoyer, Luke [1 ]
Levy, Omer [1 ]
Affiliations
[1] Facebook AI Res, Menlo Pk, CA 94025 USA
Keywords
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Non-autoregressive machine translation models significantly speed up decoding by allowing for parallel prediction of the entire target sequence. However, modeling word order is more challenging due to the lack of autoregressive factors in the model. This difficulty is compounded during training with cross entropy loss, which can heavily penalize small shifts in word order. In this paper, we propose aligned cross entropy (AXE) as an alternative loss function for training non-autoregressive models. AXE uses a differentiable dynamic program to assign loss based on the best possible monotonic alignment between target tokens and model predictions. AXE-based training of conditional masked language models (CMLMs) substantially improves performance on major WMT benchmarks, while setting a new state of the art for non-autoregressive models.
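The monotonic-alignment dynamic program behind AXE can be illustrated with a small sketch. The toy version below is simplified and non-differentiable (it takes a hard `min` over alignment costs, whereas the paper uses a soft, differentiable formulation), and `skip_penalty` is a made-up fixed cost standing in for the model-derived costs AXE assigns to unaligned tokens:

```python
import math

def axe_loss(log_probs, target, skip_penalty=4.0):
    """Best-monotonic-alignment loss between prediction positions and
    target tokens, computed with a dynamic program.

    log_probs    : list of dicts mapping token -> log-probability,
                   one dict per prediction position.
    target       : list of target tokens.
    skip_penalty : hypothetical fixed cost for skipping a prediction
                   or a target token (a stand-in for the probabilities
                   the paper assigns to unaligned tokens).
    """
    n, m = len(log_probs), len(target)
    INF = float("inf")
    floor = math.log(1e-9)  # log-prob assigned to tokens the model missed
    # dp[i][j]: min cost of aligning the first i predictions
    # with the first j target tokens.
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if dp[i][j] == INF:
                continue
            if i < n and j < m:  # align prediction i with target j
                step = dp[i][j] - log_probs[i].get(target[j], floor)
                dp[i + 1][j + 1] = min(dp[i + 1][j + 1], step)
            if i < n:            # skip prediction i (unaligned output slot)
                dp[i + 1][j] = min(dp[i + 1][j], dp[i][j] + skip_penalty)
            if j < m:            # skip target j (unaligned reference token)
                dp[i][j + 1] = min(dp[i][j + 1], dp[i][j] + skip_penalty)
    return dp[n][m]
```

Under this loss, a prediction that matches the target but is shifted by one position pays only a single skip cost, rather than being penalized at every position as plain position-wise cross entropy would, which is the motivation given in the abstract.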
Pages: 9