JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

Cited: 0
|
Authors
Mao, Zhuoyuan [1 ]
Cromieres, Fabien [1 ]
Dabre, Raj [2 ]
Song, Haiyue [1 ]
Kurohashi, Sadao [1 ]
Affiliations
[1] Kyoto Univ, Kyoto, Japan
[2] Natl Inst Informat & Commun Technol, Kyoto, Japan
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
pre-training; neural machine translation; bunsetsu; low resource;
DOI
None
CLC Classification
TP39 [Computer applications];
Discipline Code
081203 ; 0835 ;
Abstract
Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning, which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers, which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training, which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese-English and News Commentary Japanese-Russian translation, we show that JASS can give results that are competitive with, if not better than, those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods, indicating their complementary nature. We will release our code, pre-trained models and bunsetsu-annotated data as resources for researchers to use in their own NLP tasks.
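The MASS-style objective the abstract builds on can be illustrated with a short sketch: a contiguous span of the input is masked on the encoder side and becomes the decoder's prediction target. This is a minimal illustration, not the authors' released code; the function name, the `[MASK]` token, and the example sentence are assumptions. The paper's BMASS variant would differ only in choosing span boundaries that align with bunsetsu boundaries rather than arbitrary token positions.

```python
def make_mass_example(tokens, start, length, mask_token="[MASK]"):
    """Build one MASS-style pre-training example (illustrative sketch).

    The span tokens[start:start+length] is replaced by mask tokens in
    the encoder input, and that same span is what the decoder must
    reconstruct during pre-training.
    """
    span = tokens[start:start + length]
    encoder_input = (
        tokens[:start] + [mask_token] * length + tokens[start + length:]
    )
    decoder_target = span
    return encoder_input, decoder_target


# Hypothetical example sentence, tokenized naively:
tokens = ["the", "model", "reconstructs", "the", "masked", "span"]
enc, dec = make_mass_example(tokens, start=2, length=2)
# enc: ["the", "model", "[MASK]", "[MASK]", "masked", "span"]
# dec: ["reconstructs", "the"]
```

In BMASS, the (`start`, `length`) pair would be constrained so that the masked region covers whole bunsetsus, injecting syntactic-unit information into the otherwise language-agnostic objective.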
Pages: 3683 - 3691
Page count: 9
Related Papers
50 records
  • [41] Moment matching training for neural machine translation: An empirical study
    Nguyen, Long H. B.
    Pham, Nghi T.
    Duc, Le D. C.
    Cong Duy Vu Hoang
    Dien Dinh
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (03) : 2633 - 2645
  • [42] Adversarial Training for Unknown Word Problems in Neural Machine Translation
    Ji, Yatu
    Hou, Hongxu
    Chen, Junjie
    Wu, Nier
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (01)
  • [43] Effectively training neural machine translation models with monolingual data
    Yang, Zhen
    Chen, Wei
    Wang, Feng
    Xu, Bo
    NEUROCOMPUTING, 2019, 333 : 240 - 247
  • [44] Training with Additional Semantic Constraints for Enhancing Neural Machine Translation
    Ji, Yatu
    Hou, Hongxu
    Chen, Junjie
    Wu, Nier
    PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2019, 11670 : 300 - 313
  • [45] ZeUS: An Unified Training Framework for Constrained Neural Machine Translation
    Yang, Murun
    IEEE ACCESS, 2024, 12 : 124695 - 124704
  • [46] Pre-Training Graph Neural Networks for Cold-Start Users and Items Representation
    Hao, Bowen
    Zhang, Jing
    Yin, Hongzhi
    Li, Cuiping
    Chen, Hong
    WSDM '21: PROCEEDINGS OF THE 14TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2021, : 265 - 273
  • [47] Layer-wise Pre-training Mechanism Based on Neural Network for Epilepsy Detection
    Lin, Zichao
    Gu, Zhenghui
    Li, Yinghao
    Yu, Zhuliang
    Li, Yuanqing
    2020 12TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2020, : 224 - 227
  • [48] Neural Networks for Sequential Data: a Pre-training Approach based on Hidden Markov Models
    Pasa, Luca
    Testolin, Alberto
    Sperduti, Alessandro
    NEUROCOMPUTING, 2015, 169 : 323 - 333
  • [49] VOICE CONVERSION USING DEEP NEURAL NETWORKS WITH SPEAKER-INDEPENDENT PRE-TRAINING
    Mohammadi, Seyed Hamidreza
    Kain, Alexander
    2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 19 - 23
  • [50] Dynamic Pre-training of Deep Recurrent Neural Networks for Predicting Environmental Monitoring Data
    Ong, Bun Theang
    Sugiura, Komei
    Zettsu, Koji
    2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014, : 760 - 765