JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

Cited by: 0
Authors
Mao, Zhuoyuan [1 ]
Cromieres, Fabien [1 ]
Dabre, Raj [2 ]
Song, Haiyue [1 ]
Kurohashi, Sadao [1 ]
Affiliations
[1] Kyoto Univ, Kyoto, Japan
[2] Natl Inst Informat & Commun Technol, Kyoto, Japan
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
pre-training; neural machine translation; bunsetsu; low resource
DOI
Not available
Chinese Library Classification
TP39 [Computer Applications]
Discipline Classification Codes
081203; 0835
Abstract
Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning, which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers, which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training, which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese-English and News Commentary Japanese-Russian translation, we show that JASS can give results that are competitive with, if not better than, those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods, indicating their complementary nature. We will release our code, pre-trained models and bunsetsu-annotated data as resources for researchers to use in their own NLP tasks.
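To make the bunsetsu-level objectives described in the abstract more concrete, below is a minimal, illustrative Python sketch (not the authors' released implementation) of how BMASS-style and BRSS-style training pairs could be built from a sentence that has already been segmented into bunsetsus. The 50% masking ratio, the per-token [MASK] placeholder, and the random-shuffle reordering are simplifying assumptions made here for illustration; the objectives in the paper differ in detail.

```python
import random

MASK = "[MASK]"

def bmass_pair(bunsetsus, mask_ratio=0.5, seed=0):
    """Toy BMASS-style pair: whole bunsetsus (not arbitrary token spans)
    are hidden on the encoder side; the decoder reconstructs them."""
    rng = random.Random(seed)
    n_mask = max(1, int(len(bunsetsus) * mask_ratio))
    masked = set(rng.sample(range(len(bunsetsus)), n_mask))
    source, target = [], []
    for i, bunsetsu in enumerate(bunsetsus):
        if i in masked:
            source.extend([MASK] * len(bunsetsu))  # keep length, hide content
            target.extend(bunsetsu)                # decoder predicts the hidden bunsetsu
        else:
            source.extend(bunsetsu)
    return source, target

def brss_pair(bunsetsus, seed=0):
    """Toy BRSS-style pair: the encoder sees the bunsetsus in a permuted
    order and the decoder restores the original sentence."""
    rng = random.Random(seed)
    order = list(range(len(bunsetsus)))
    rng.shuffle(order)
    source = [tok for i in order for tok in bunsetsus[i]]
    target = [tok for bunsetsu in bunsetsus for tok in bunsetsu]
    return source, target

if __name__ == "__main__":
    # A sentence pre-segmented into bunsetsus (each bunsetsu is a list of tokens),
    # e.g. as produced by a Japanese dependency parser.
    sentence = [["彼", "は"], ["新しい", "本", "を"], ["図書館", "で"], ["借りた", "。"]]
    print(bmass_pair(sentence))
    print(brss_pair(sentence))
```

In this sketch, pairs from both objectives can be mixed within a single training run, mirroring the joint BMASS and BRSS pre-training that the abstract refers to as JASS.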
Pages: 3683-3691
Number of pages: 9
Related papers (50 in total)
  • [1] Pre-Training on Mixed Data for Low-Resource Neural Machine Translation
    Zhang, Wenbo
    Li, Xiao
    Yang, Yating
    Dong, Rui
    INFORMATION, 2021, 12 (03)
  • [2] Pre-training neural machine translation with alignment information via optimal transport
    Su, Xueping
    Zhao, Xingkai
    Ren, Jie
    Li, Yunhong
    Raetsch, Matthias
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (16) : 48377 - 48397
  • [3] Pre-training neural machine translation with alignment information via optimal transport
    Su, Xueping
    Zhao, Xingkai
    Ren, Jie
    Li, Yunhong
    Rätsch, Matthias
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 : 48377 - 48397
  • [4] Low-Resource Neural Machine Translation Using XLNet Pre-training Model
    Wu, Nier
    Hou, Hongxu
    Guo, Ziyue
    Zheng, Wei
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 503 - 514
  • [5] TWO-STAGE PRE-TRAINING FOR SEQUENCE TO SEQUENCE SPEECH RECOGNITION
    Fan, Zhiyun
    Zhou, Shiyu
    Xu, Bo
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [6] Exploring the Role of Monolingual Data in Cross-Attention Pre-training for Neural Machine Translation
    Khang Pham
    Long Nguyen
    Dien Dinh
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2023, 2023, 14162 : 179 - 190
  • [7] SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations
    Niu, Changan
    Li, Chuanyi
    Ng, Vincent
    Ge, Jidong
    Huang, Liguo
    Luo, Bin
    2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022, : 2006 - 2018
  • [8] Character-Aware Low-Resource Neural Machine Translation with Weight Sharing and Pre-training
    Cao, Yichao
    Li, Miao
    Feng, Tao
    Wang, Rujing
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2019, 2019, 11856 : 321 - 333
  • [9] Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation
    Mao, Zhuoyuan
    Chu, Chenhui
    Kurohashi, Sadao
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (04)
  • [10] Unifying Event Detection and Captioning as Sequence Generation via Pre-training
    Zhang, Qi
    Song, Yuqing
    Jin, Qin
    COMPUTER VISION, ECCV 2022, PT XXXVI, 2022, 13696 : 363 - 379