JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

Cited by: 0
Authors
Mao, Zhuoyuan [1 ]
Cromieres, Fabien [1 ]
Dabre, Raj [2 ]
Song, Haiyue [1 ]
Kurohashi, Sadao [1 ]
Affiliations
[1] Kyoto Univ, Kyoto, Japan
[2] Natl Inst Informat & Commun Technol, Kyoto, Japan
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
pre-training; neural machine translation; bunsetsu; low resource
DOI
Not available
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning, which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers, which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence pre-training, as a novel alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training, both of which focus on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese-English and News Commentary Japanese-Russian translation, we show that JASS gives results that are competitive with, if not better than, those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training significantly surpasses either method alone, indicating their complementary nature. We will release our code, pre-trained models, and bunsetsu-annotated data as resources for researchers to use in their own NLP tasks.
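To make the two pre-training objectives concrete, here is a minimal Python sketch of how BMASS and BRSS training pairs could be constructed. It is an illustration under stated assumptions, not the authors' released code: the function names (bmass_pair, brss_pair), the [MASK] token string, the 50% mask ratio (borrowed from MASS's convention), and the direction of the BRSS pair are assumptions; the example sentence is segmented into bunsetsus by hand, whereas the paper obtains bunsetsu boundaries from a syntactic analyzer; and a uniform random shuffle stands in for the paper's syntax-aware reordering.

    import random

    # A bunsetsu is a Japanese phrasal unit: a content word plus any
    # following function words. Both objectives below operate on whole
    # bunsetsus rather than on arbitrary subword spans.
    MASK = "[MASK]"

    def bmass_pair(bunsetsus, mask_ratio=0.5, seed=0):
        """BMASS sketch: MASS-style masking whose span boundaries follow
        bunsetsus. A contiguous block of bunsetsus is masked in the
        encoder input; the decoder must reconstruct exactly that block."""
        rng = random.Random(seed)
        n = max(1, int(len(bunsetsus) * mask_ratio))
        start = rng.randrange(len(bunsetsus) - n + 1)
        masked = bunsetsus[start:start + n]
        source = bunsetsus[:start] + [MASK] * n + bunsetsus[start + n:]
        return " ".join(source), " ".join(masked)

    def brss_pair(bunsetsus, seed=0):
        """BRSS sketch: the bunsetsus of the input are reordered (here by
        random shuffle, as a stand-in for syntax-driven reordering), and
        the model is trained to emit them in the original order."""
        rng = random.Random(seed)
        shuffled = bunsetsus[:]
        rng.shuffle(shuffled)
        return " ".join(shuffled), " ".join(bunsetsus)

    if __name__ == "__main__":
        # "私は / 新しい / 本を / 買った" ("I bought a new book"),
        # pre-segmented into four bunsetsus.
        sentence = ["私は", "新しい", "本を", "買った"]
        print("BMASS pair:", bmass_pair(sentence))
        print("BRSS pair: ", brss_pair(sentence))

Both functions yield (source, target) string pairs that a standard encoder-decoder toolkit can consume as pseudo-parallel pre-training data, which is what allows these objectives to be combined with each other and with plain MASS.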
Pages: 3683-3691
Page count: 9