LKMT: Linguistics Knowledge-Driven Multi-Task Neural Machine Translation for Urdu and English

Times Cited: 0
Authors
Hassan, Muhammad Naeem Ul [1 ,2 ]
Yu, Zhengtao [1 ,2 ]
Wang, Jian [1 ,2 ]
Li, Ying [1 ,2 ]
Gao, Shengxiang [1 ,2 ]
Yang, Shuwan [1 ,2 ]
Mao, Cunli [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650500, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Artificial Intelligence, Kunming 650500, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024, Vol. 81, No. 1
Funding
National Natural Science Foundation of China
Keywords
Urdu NMT (neural machine translation); Urdu natural language processing; Urdu linguistic features; low-resource language; linguistic-feature pre-trained model
DOI
10.32604/cmc.2024.054673
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Thanks to the strong representation capability of pre-trained language models, supervised machine translation models have achieved outstanding performance. However, the performance of these models drops sharply when the parallel training corpus is limited. Since pre-trained language models excel at monolingual representation, the key challenge for machine translation is to build a deep relationship between the source and target languages by injecting lexical and syntactic information into them. To alleviate the dependence on parallel corpora, we propose a Linguistics Knowledge-Driven Multi-Task (LKMT) approach that injects part-of-speech and syntactic knowledge into pre-trained models, thereby enhancing translation performance. On the one hand, we integrate part-of-speech and dependency labels into the embedding layer and exploit a large-scale monolingual corpus to update all parameters of the pre-trained language model, ensuring that the updated model encodes latent lexical and syntactic information. On the other hand, we add an extra self-attention layer to explicitly inject linguistic knowledge into the pre-trained-language-model-enhanced translation model. Experiments on the benchmark dataset show that LKMT improves Urdu-English translation accuracy by 1.97 points and English-Urdu translation accuracy by 2.42 points, highlighting the effectiveness of the framework. Detailed ablation experiments confirm the positive impact of part-of-speech tagging and dependency parsing on machine translation.
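As a rough illustration of the mechanism the abstract describes, the sketch below sums token, part-of-speech, and dependency-label embeddings before the encoder and then lets encoder states attend to the fused linguistic embeddings through one extra self-attention layer. This is a minimal PyTorch sketch under stated assumptions, not the authors' released implementation; all module names, label-inventory sizes, and dimensions are invented for illustration.

```python
# Hypothetical sketch of the embedding-level fusion and the extra
# self-attention layer described in the abstract. All sizes are illustrative.
import torch
import torch.nn as nn

class LinguisticEmbedding(nn.Module):
    """Sum token, POS, and dependency-label embeddings (one vector per token)."""
    def __init__(self, vocab_size, pos_size, dep_size, d_model):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(pos_size, d_model)
        self.dep = nn.Embedding(dep_size, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, tok_ids, pos_ids, dep_ids):
        return self.norm(self.tok(tok_ids) + self.pos(pos_ids) + self.dep(dep_ids))

class LinguisticFusionLayer(nn.Module):
    """Extra self-attention layer that re-injects linguistic embeddings
    into encoder states via a residual connection."""
    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, encoder_states, linguistic_emb):
        # Encoder states (queries) attend to the linguistically enriched
        # embeddings (keys/values), then add back residually.
        mixed, _ = self.attn(encoder_states, linguistic_emb, linguistic_emb)
        return self.norm(encoder_states + mixed)

# Toy usage with made-up batch and label-inventory sizes.
emb = LinguisticEmbedding(vocab_size=32000, pos_size=40, dep_size=50, d_model=512)
fuse = LinguisticFusionLayer(d_model=512)
tok = torch.randint(0, 32000, (2, 16))
pos = torch.randint(0, 40, (2, 16))
dep = torch.randint(0, 50, (2, 16))
x = emb(tok, pos, dep)        # fused embeddings, shape (2, 16, 512)
h = torch.randn(2, 16, 512)   # stand-in for pre-trained encoder output
out = fuse(h, x)              # linguistically enhanced states, (2, 16, 512)
```

Summing the label embeddings into the token embedding (rather than concatenating) keeps the model dimension unchanged, so the pre-trained weights can be updated in place, which matches the abstract's claim that all parameters of the pre-trained model are fine-tuned on monolingual data.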
Pages: 951-969
Number of Pages: 19